About Me
I'm a second-year Ph.D. student at Mila and the University of Montreal, advised by Professor Aaron Courville.
My life goal is to understand and build general intelligence, so my research interests evolve as my beliefs about the best path toward it change.
Currently, I'm interested in long-range sequence modeling with linear recurrent models and its applications in RL (a small sketch follows at the end of this section).
Previously, I worked on learning object-centric representations using structured generative models with Professor Sungjin Ahn at Rutgers University. I started my research career in computer vision in Professor Xiaowei Zhou's group at Zhejiang University.
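To make the first interest above concrete, here is a minimal sketch of a diagonal linear recurrent layer in the spirit of models such as S4 or the LRU. The function name, dimensions, and parameter values are illustrative assumptions, not any particular published architecture.

```python
import numpy as np

def linear_recurrence(x, lam, B, C):
    """x: (T, d_in) input sequence; lam: (d_h,) per-channel decay in (0, 1);
    B: (d_h, d_in) input map; C: (d_out, d_h) readout. All hypothetical."""
    h = np.zeros(lam.shape[0])
    ys = []
    for t in range(x.shape[0]):
        # The state update is linear in h: no nonlinearity between steps.
        h = lam * h + B @ x[t]
        ys.append(C @ h)
    return np.stack(ys)

# Toy usage with made-up sizes.
rng = np.random.default_rng(0)
T, d_in, d_h, d_out = 16, 4, 8, 2
y = linear_recurrence(
    rng.normal(size=(T, d_in)),
    lam=np.full(d_h, 0.95),                    # slow decay -> long memory
    B=rng.normal(size=(d_h, d_in)) / np.sqrt(d_in),
    C=rng.normal(size=(d_out, d_h)) / np.sqrt(d_h),
)
print(y.shape)  # (16, 2)
```

Because the state update is linear in h, the whole sequence can in principle be computed with a parallel scan rather than a step-by-step loop, which is part of what makes these models attractive for long sequences.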
About Mind/Intelligence
What I believe:
Eventually, agents need online data; in other words, they need to interact with the world. Static datasets are very useful, but they have limitations.
Agents should have a memory system. Architecturally, this is the most important component.
Agents should be compared in terms of scalability, just as we compare algorithms with big-O notation (a toy sketch of this follows the list).
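As a sketch of what such a comparison could look like, the snippet below fits a power-law scaling exponent to the performance curves of two agents; the exponent plays the role of the big-O growth rate. The agents, numbers, and functional form are all illustrative assumptions.

```python
import numpy as np

# Compare agents by how fast error shrinks with experience, not by a
# single score. The data points below are made up for illustration.
env_steps = np.array([1e5, 1e6, 1e7, 1e8])
error_agent_a = np.array([0.50, 0.30, 0.18, 0.11])   # hypothetical
error_agent_b = np.array([0.40, 0.32, 0.26, 0.21])   # hypothetical

def scaling_exponent(n, err):
    # Fit err ~ c * n^(-alpha) via linear regression in log-log space.
    slope, _ = np.polyfit(np.log(n), np.log(err), deg=1)
    return -slope

print(f"agent A exponent: {scaling_exponent(env_steps, error_agent_a):.2f}")
print(f"agent B exponent: {scaling_exponent(env_steps, error_agent_b):.2f}")
```

In this made-up example, agent A starts off worse but has the larger exponent, which is exactly the kind of difference a comparison at a single fixed budget can hide.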
Thoughts:
About RL: I believe the true value of RL is that it provides the right ultimate objective to optimize. However, recent advances in deep learning (foundation models) suggest that, at least for now, the best way to optimize this objective may be to optimize some other auxiliary objective instead. These auxiliary objectives may provide most of the learning signal. The main considerations are (1) the amount of data available, (2) the amount of useful learning signal, and (3) how easy the objective is to optimize. A toy sketch of this trade-off follows.
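Here is a minimal, made-up illustration of that trade-off: a dense supervised (auxiliary) objective versus a sparse RL reward on the same toy prediction problem. The model, data, and reward below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 8                                  # vocabulary size
logits = np.zeros(V)                   # uniform toy policy / predictor
p = np.exp(logits) / np.exp(logits).sum()
target = 3

# Auxiliary (supervised) objective: cross-entropy on the observed symbol.
# Every single example yields a dense gradient over all V logits.
grad_aux = p - np.eye(V)[target]

# RL objective: REINFORCE, grad log pi(a) * reward, with reward only when
# the sampled action happens to match the target. Rollouts with reward 0
# contribute no gradient at all.
informative = 0
for _ in range(1000):
    a = rng.choice(V, p=p)
    reward = float(a == target)        # sparse scalar signal
    informative += reward > 0

print("auxiliary: every example is informative, grad:", np.round(grad_aux, 2))
print(f"RL: {informative}/1000 rollouts carried any learning signal")
```

With the dense objective, every example updates all parameters, while most RL rollouts here return zero reward and thus no gradient; this is the sense in which auxiliary objectives can carry most of the learning signal.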
Language: we seem to think in language (and images) rather than in some hidden, internal representation; at least, that is what we perceive. This is interesting because thinking in language is very inefficient.