Novelty detection improves performance of reinforcement learners in fluctuating, partially observable environments.


Physics of Living Systems Group, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, United States; W. M. Keck Science Department, Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, CA 91711, United States.


Evolved and engineered organisms must adapt to fluctuating environments that are often only partially observed. We show that adapting to a second environment can be significantly harder after adapting to a first, completely unrelated environment, even when using second-order learning algorithms and a constant learning rate. In effect, the organism's performance lacks fading memory: what it learned in the first environment continues to affect how well it performs in the second. However, organisms can adapt well to the second environment by incorporating a simple novelty detection algorithm that signals when the environment has changed and, when it does, reinitializes the parameters that define their behavior. We propose that it may be fruitful to look for signs of this novelty detection in biological organisms, and to engineer novelty detection algorithms into artificial organisms.
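As a rough illustration of the reset-on-novelty idea described above, the following toy sketch may help. It is not the model or algorithm used in this work: the two-armed bandit, the sample-average value estimates, the deterministic exploration schedule, the windowed prediction-error detector, and every name and parameter value (`NoveltyResetLearner`, `window`, `threshold`, `explore_every`, `run_two_environments`) are assumptions chosen only to keep the example small and reproducible.

```python
from collections import deque


class NoveltyResetLearner:
    """Toy bandit learner that reinitializes itself when prediction error spikes.

    Hypothetical illustration only; not the learner studied in the paper.
    """

    def __init__(self, n_actions=2, window=10, threshold=0.5, explore_every=5):
        self.n_actions = n_actions
        self.window = window            # number of recent errors to average
        self.threshold = threshold      # mean error above this signals novelty
        self.explore_every = explore_every
        self.t = 0                      # global step counter (never reset)
        self.resets = 0
        self._reinitialize()

    def _reinitialize(self):
        # Forget everything that defines current behavior.
        self.values = [0.0] * self.n_actions
        self.counts = [0] * self.n_actions
        self.errors = deque(maxlen=self.window)

    def greedy_action(self):
        # Ties are broken in favor of the lowest action index.
        return max(range(self.n_actions), key=self.values.__getitem__)

    def act(self):
        self.t += 1
        if self.t % self.explore_every == 0:
            # Deterministic round-robin exploration (an assumption of this sketch).
            return (self.t // self.explore_every) % self.n_actions
        return self.greedy_action()

    def update(self, action, reward):
        # Record how surprising the reward was before learning from it.
        self.errors.append(abs(reward - self.values[action]))
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]
        # Novelty detector: a full window of recent errors with a high mean.
        if (len(self.errors) == self.window
                and sum(self.errors) / self.window > self.threshold):
            self.resets += 1
            self._reinitialize()


def run_two_environments(threshold, steps_per_env=100):
    """Train in one environment, then in an unrelated one with reversed rewards."""
    learner = NoveltyResetLearner(threshold=threshold)
    second_env_reward = 0.0
    for env_index, rewards in enumerate(([1.0, 0.0], [0.0, 1.0])):
        for _ in range(steps_per_env):
            action = learner.act()
            reward = rewards[action]
            learner.update(action, reward)
            if env_index == 1:
                second_env_reward += reward
    return learner, second_env_reward


if __name__ == "__main__":
    with_detector, reward_with = run_two_environments(threshold=0.5)
    without_detector, reward_without = run_two_environments(threshold=float("inf"))
    print("resets with detector:", with_detector.resets)
    print("second-environment reward with detector:", reward_with)
    print("second-environment reward without detector:", reward_without)
```

Setting `threshold` to infinity disables the detector, which gives a baseline whose old value estimates linger after the switch. In this toy setting the lingering arises from the sample-average update, so it only loosely mirrors the lack of fading memory reported here for constant learning rates.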


Fading memory, Novelty detection, Reinforcement learning