Zhixuan Lin

Github CV Email

zxlin.cs [AT] gmail.com

About Me

I recently graduated from Zhejiang University. I will join Mila as an MSc student next year.

My life goal is to understand mind/consciousness/intelligence and to build one (so-called AGI). I believe the important thing is to keep thinking, learning, and exploring, so I won't limit my research to any particular topic until I find an approach that I truly believe in.

Previously, I worked on learning object-centric representations using structured generative models with Professor Sungjin Ahn. I also have experience with computer vision from when I started my research career in Professor Xiaowei Zhou's group.

Currently, I'm generally interested in memory-augmented neural networks.

About Mind/Intelligence

What seems to be true:

  • The agent has to deal with a sequential world.

  • The formulation is world-agnostic.

  • The agent should be self-improving.

  • The agent should have a memory. It can be in any form though.

  • In terms of implementation, it has to be scalable in some way.

  • Consciousness is crucial. Actually, I am not sure about this; it is more of a belief or intuition.

    • The two most central questions: 1) why are we having this subjective experience (what's the advantage of being conscious?), and 2) how is this subjective experience produced from purely physical processes?

    • Some (weak) justifications:

      • We are weak in terms of computational ability, yet we are self-conscious. There is no reason why a higher life form would not be self-conscious.

      • Cleverer animals tend to exhibit more consciousness.

      • An agent can be stupid while still being an AGI. Consider a human baby.

      • Evolution brought us here, so consciousness must be useful. And through the years we have been getting more and more self-conscious.

What I don't believe:

  • Pure SGD will work. There must be meta-level learning.

  • Causal learning is important. Instead, causal reasoning should be something that emerges naturally once we have chosen the right paradigm. Actually, I seriously doubt that we should study it explicitly (if our goal is general AI).

Thoughts:

  • Towards a non-reactive, thinking agent:

    • Its performance should scale with the time and computing resources available.

    • The way it scales must not be hard-coded, like SGD or some predefined search routine.

    • Consider a Go agent. Suppose we parametrize the policy network with an RNN. The RNN accepts the current board configuration and is allowed to perform any number of steps of computation. Is there any way to train it such that, at test time, the agent's performance scales with the number of steps allowed?

      • We know that there is a way to scale: use MCTS. This is a highly systematic procedure, but unfortunately it is hard-coded. Can the RNN learn this kind of systematic scaling? (A rough sketch of such a variable-computation policy appears after this list.)

  • Arguments against "RL is the computational theory of intelligence" (which is the position I currently hold):

    • The "reward is part of the observation" assumption is weird. We don't receive rewards. Our observations are just $o_t$ . There is no $r_t$. Someone might say that reward should be internal. But, the point is, we don't need rewards to learn.

      • We can learn just by reading text and watching videos. Think about which part of an RL agent implements this kind of learning. It is actually the state-update function $s_{t+1} = f(o_t, s_t)$, which is totally non-essential in the current RL framework: it is often implemented with some kind of RNN (in MuZero, for example), and its purpose is basically just to handle partial observability. Yet learning by reading and watching is probably the most important form of learning for us. (A rough sketch of reward-free learning through the state-update function appears after this list.)

      • The online, continual-learning aspects of RL rely on the fact that agents can receive rewards online.

      • To me, the meta-RL setting (without rewards as input) would make more sense. There could be an evaluation function that quantifies the performance of the agent and can be used to update the agent in the outer loop, but the agent should not be aware of it. Note that in this case, RL algorithms can still be applied in the outer loop, but they should not be considered part of the agent's learning process. Only things that happen in the agent's state-update function should be considered "learning".

      • Where do the reward signals come from, after all?

    • As Rich points out, mathematics is not world knowledge and thus not a kind of predictive knowledge, so I assume that RL cannot deal with it. But from my perspective, the ability to understand math is essential.

      • Can you imagine an RL agent that, just by interacting with the world (assuming the right reward is somehow available) and without access to all the math textbooks we have, could develop the same level of understanding of modern mathematics as we did? It could probably learn Go that way, but not math.

    • This is more of a personal thing: I cannot imagine what kind of role consciousness would play in an RL agent.

  • Arguments for "RL is the computational theory of intelligence":

    • It is beautiful, pure and elegant.

      • We only need to predict a scalar value. This is easy to learn.

      • Long-term, highly uncertain predictions are "compressed" in the value function. Neat.

      • Partial observability is handled entirely by the state-update function, so the policy can be reactive. RL starts from MDPs, so this might look trivial, but it actually is not.

    • It matches our intuition of "agents" (except for the reward part). There is a sequential world; the agent receives observations and acts, and it learns by interacting with the world.

    • All learning and knowledge are grounded in experience, making it fully general and scalable.

    • Learning is, at least in theory, online and continual.

    • There is currently no alternative to it.

  • Languages:

    • Yann LeCun says language is just an inefficient way to express your thoughts, but language may not be a trivial thing. It can be used to represent anything. It is interesting that it can be so primitive yet so useful. Most thought processes are basically like speaking to yourself.

    • We think using language (and images) instead of some hidden, internal representation. At least that's what we perceive. This is interesting because it is very inefficient; there must be some reason.

    • How are you supposed to convey the same amount of information in a paper to an agent without language?

  • Internal/subjective experience:

    • If you consider the part of our brain that implements consciousness to be the agent, then you will see that it does not interact with the world directly. It interacts with the rest of our brain:

      • Observations: raw observations are processed, parsed, and integrated before consciousness gets access to them. Also, observations include past memories, chemicals released by our brain that induce emotions, and so on.

      • Actions: we get to move our body, but at the same time we get to query our memory and write to it. In that sense, the memory in our brain is external.

  • Adversarial learning might be important. Being better than yourself always seems to be something learnable.

    • Think about it: if AlphaZero were trained against a very strong human player from the start, it would not improve. This is somewhat similar to the exploration problem in RL. The balance issue between the discriminator and the generator in GANs is also similar. Also, this is how evolution works: all creatures started in simple forms and jointly evolved into today's complex forms. This might hint at a solution to the exploration problem in RL.

  • About consciousness:

    • First, we need to admit that the problem is not well formulated. But to be clear, the problem is not to answer "for what problem are consciousness and thinking the optimal solution" (though that is a useful question to consider). The problem that I'm really interested in is "what consciousness and thinking are", and then "how to build them". This problem is interesting in itself. I don't actually care whether consciousness is optimal in any sense, although I do believe it will turn out to be the optimal solution to some problem.

    • Currently, even though we don't know what objective we are trying to optimize, we have some clues about what the solution will be like: it is multi-step, sequential, with a memory, with attention, and so on. More ideas and experiments are needed.

  • If you think about it, the agent is not part of the environment. For example, in robotics, everything about the robot (its position, etc.) is considered part of the environment/external world. This is very unnatural. I'm not saying that the robot's position should not be considered part of the external world state; I'm just saying that maybe we should reconsider this formulation.

  • (Bengio's Consciousness Prior): Given that 1) the mind is sequential, 2) memory is huge (so attention is required for useful read and write operations), and 3) our consciousness can only handle a small amount of information at a time, there are reasons to believe that the thinking process acts like a Turing machine, or like a modern CPU, where a controller acts on an external memory using attention. (A rough sketch of such a controller appears below.)
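
On the variable-computation Go policy mentioned above: the question can be made a bit more concrete with a pondering-style recurrent policy that is handed the same board at every internal step and can be run for an arbitrary number of steps at test time. This is only a minimal sketch under my own assumptions (PyTorch; the name PonderingPolicy, the linear board encoder, and all dimensions are illustrative, not from any existing system). How to train it so that more steps reliably mean better moves is exactly the open question; MCTS achieves this kind of scaling by construction, but it is hard-coded.

    import torch
    import torch.nn as nn

    class PonderingPolicy(nn.Module):
        # Hypothetical sketch: a policy that may spend a variable number of
        # internal computation steps on one board before committing to a move.
        def __init__(self, board_dim, hidden_dim, n_moves):
            super().__init__()
            self.encoder = nn.Linear(board_dim, hidden_dim)  # stand-in for a real board encoder
            self.cell = nn.GRUCell(hidden_dim, hidden_dim)   # one internal "thinking" step
            self.policy_head = nn.Linear(hidden_dim, n_moves)

        def forward(self, board, n_steps):
            x = torch.relu(self.encoder(board))
            h = torch.zeros(board.shape[0], x.shape[-1])
            for _ in range(n_steps):  # the test-time computation budget
                h = self.cell(x, h)
            return torch.log_softmax(self.policy_head(h), dim=-1)

    policy = PonderingPolicy(board_dim=361, hidden_dim=256, n_moves=362)  # 19x19 board, moves + pass
    board = torch.zeros(1, 361)
    fast = policy(board, n_steps=1)    # reactive answer
    slow = policy(board, n_steps=32)   # same position, much more "thinking"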
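
On the "we don't need rewards to learn" bullet: here is a minimal sketch of what learning purely through the state-update function $s_{t+1} = f(o_t, s_t)$ could look like, with next-observation prediction standing in for "reading and watching". Everything here is my own illustrative assumption (PyTorch, a GRU as $f$, squared error as the prediction loss, the name StateUpdateLearner); it is not how any particular system does it. The point is only that no reward appears anywhere in the learning signal.

    import torch
    import torch.nn as nn

    class StateUpdateLearner(nn.Module):
        # Learning without rewards: the agent state is updated from the observation
        # stream alone, and the only training signal is next-observation prediction.
        def __init__(self, obs_dim, state_dim):
            super().__init__()
            self.update = nn.GRUCell(obs_dim, state_dim)  # plays the role of s_{t+1} = f(o_t, s_t)
            self.predict = nn.Linear(state_dim, obs_dim)  # guess o_{t+1} from the updated state

        def forward(self, observations):
            # observations: (T, batch, obs_dim), e.g. frames of a video
            s = torch.zeros(observations.shape[1], self.update.hidden_size)
            loss = 0.0
            for t in range(observations.shape[0] - 1):
                s = self.update(observations[t], s)  # state update from observations only
                loss = loss + ((self.predict(s) - observations[t + 1]) ** 2).mean()
            return loss  # no reward term anywhere

    model = StateUpdateLearner(obs_dim=64, state_dim=128)
    video = torch.randn(10, 4, 64)  # ten "frames", batch of four
    model(video).backward()         # the agent learns just by watching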
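
On the last bullet (the Turing-machine / CPU picture): a minimal sketch of a sequential controller that reads a small, attention-selected slice of a large memory and writes an update back. The names (AttentiveController), the GRU controller, and the soft attention over memory slots are all my own illustrative choices, loosely in the spirit of the Consciousness Prior and memory-augmented networks rather than a faithful implementation of either.

    import torch
    import torch.nn as nn

    class AttentiveController(nn.Module):
        # A sequential controller that, at each step, reads an attention-selected
        # slice of a large external memory and writes an update back to those slots.
        def __init__(self, slot_dim, ctrl_dim):
            super().__init__()
            self.cell = nn.GRUCell(slot_dim, ctrl_dim)        # the sequential "conscious" controller
            self.read_query = nn.Linear(ctrl_dim, slot_dim)   # what to look for in memory
            self.write_value = nn.Linear(ctrl_dim, slot_dim)  # what to write back

        def step(self, memory, h):
            # memory: (n_slots, slot_dim); h: (1, ctrl_dim)
            attn = torch.softmax(memory @ self.read_query(h).squeeze(0), dim=0)  # (n_slots,)
            read = attn @ memory                                # tiny summary of a huge memory
            h = self.cell(read.unsqueeze(0), h)                 # one sequential "thought" step
            memory = memory + attn.unsqueeze(1) * self.write_value(h)  # attentive write
            return memory, h

    controller = AttentiveController(slot_dim=32, ctrl_dim=64)
    memory = torch.randn(1024, 32)   # large external memory
    h = torch.zeros(1, 64)
    for _ in range(8):               # a few steps of the controller acting on memory
        memory, h = controller.step(memory, h)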

Publications

Improving Generative Imagination in Object-Centric World Models

Zhixuan Lin, Yi-Fu Wu, Skand Vishwanath Peri, Bofeng Fu, Jindong Jiang, Sungjin Ahn

ICML 2020

SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition

Zhixuan Lin*, Yi-Fu Wu*, Skand Vishwanath Peri*, Weihao Sun, Gautam Singh, Fei Deng, Jindong Jiang, Sungjin Ahn

ICLR 2020

[Project] [Code] [Paper]

GIFT: Learning Transformation Invariant Dense Visual Descriptors via Group CNNs

Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, Xiaowei Zhou

NeurIPS 2019

[Project] [Code] [Paper]