About Me
I'm a third-year Ph.D. student at Mila and the University of Montreal, advised by Professor Aaron Courville.
My life goal is to understand and build general intelligence. My research interests therefore evolve as my beliefs about general intelligence change.
I'm currently interested in long-context sequence models (particularly linear-complexity models) and their potential applications in RL.
Previously, I worked on learning object-centric representations using structured generative models with Professor Sungjin Ahn at Rutgers University. I started my research career in computer vision, in Professor Xiaowei Zhou's group at Zhejiang University.
About Mind/Intelligence
What I believe:
Eventually, the agent needs online data; in other words, it needs to generate data for itself. Static datasets are very useful, but they have limitations.
The agent should have a memory system, either implicit (e.g., stored in the weights of a neural network) or explicit. This is the most important part of the agent.
Thoughts:
About RL: The true value of RL might be that it provides the right ultimate objective to optimize. However, recent advances in deep learning (e.g., LLMs) suggest that the best way to optimize this objective might be to optimize some auxiliary objective at the same time, either to obtain a reasonable initial policy or to get sufficient learning signal. The main considerations are (1) the amount of data available, (2) the amount of useful learning signal, and (3) how easy the objective is to optimize.
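To make this concrete, here is a minimal, hypothetical PyTorch sketch of that joint optimization: a REINFORCE-style policy-gradient loss combined with a supervised auxiliary loss. Everything here (the toy model, the random data, the weight lambda_aux) is invented for illustration and is not a specific published method.

```python
# Minimal sketch: optimize an RL objective jointly with an auxiliary
# objective. All names and data are toy illustrations.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(8, 4)                    # toy policy over 4 actions
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_aux = 0.5                           # weight on the auxiliary signal

obs = torch.randn(16, 8)                   # a batch of observations
actions = torch.randint(0, 4, (16,))       # actions taken in those states
returns = torch.randn(16)                  # observed returns (toy values)
aux_targets = torch.randint(0, 4, (16,))   # auxiliary supervised targets

logits = model(obs)
log_probs = F.log_softmax(logits, dim=-1)

# RL objective: REINFORCE-style policy gradient weighted by returns.
rl_loss = -(log_probs[torch.arange(16), actions] * returns).mean()

# Auxiliary objective: dense supervised signal that remains useful
# even when the returns are sparse or noisy.
aux_loss = F.cross_entropy(logits, aux_targets)

optimizer.zero_grad()
(rl_loss + lambda_aux * aux_loss).backward()
optimizer.step()
```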
Language: Most of the time, we think in language rather than in some hidden, internal representation; at least, that's how it feels. This is interesting because language seems like a very inefficient medium for thought.
Human brains are limited in many ways. For example, most (if not all) people have geometric intuition only for 3D spaces. Can neural networks develop geometric intuition for high-dimensional vector spaces (or even more abstract spaces)? A more general question: are there fundamental abilities (the most interesting being mathematical intuition) that a neural network can develop but humans cannot?