About Me

I'm a third-year Ph.D. student at Mila and the University of Montreal, advised by Professor Aaron Courville.

My life goal is to understand and build general intelligence, and my research interests reflect my overall beliefs about intelligence. Currently, I'm interested in long-context sequence models (particularly linear-complexity models) and their potential applications in RL.

Previously, I worked on learning object-centric representations using structured generative models with Professor Sungjin Ahn at Rutgers University. I started my research career in computer vision, in Professor Xiaowei Zhou's group at Zhejiang University.

About Intelligence

What I believe:

Eventually, the agent needs online experience. This could mean interaction with the real world or an internal thinking process.
I think of an agent as a sequence model. I believe the most important aspect of intelligence is related to the temporal, sequential nature of agent-environment interaction (memory, the experience stream, learning, credit assignment...). In terms of the architecture, the most essential part is the mechanism that handles temporal dependencies.

Thoughts:

About RL: RL seems necessary for achieving superhuman intelligence. However, simple future prediction, rather than pure reward-based learning, will likely remain an essential part of agent learning. The main concern with reward alone is the amount of learning signal available.
About language: Most of the time, we think and reason in natural language rather than in some hidden, internal representation. At least, that's what we perceive. This is interesting because it is very inefficient.
Human brains are limited in many ways. For example, most (if not all) people only have geometric intuition for 3D spaces. Can neural networks develop geometric intuition for high-dimensional vector spaces (or some even more abstract spaces)? A more general question is, what are some fundamental abilities (the most interesting being mathematical intuition) that a neural network can develop but humans cannot?