See also
- An interactive blog post
- A Keras/TensorFlow implementation
- A TensorFlow implementation
  - This one seems to have a more stable VAE implementation.
Overview
- Vision Model (V)
  - A variational autoencoder (VAE) that learns a compressed latent representation, z.
- Memory RNN (M)
  - An RNN with a Mixture Density Network (MDN) on top that predicts the distribution of the next z, given the current z and the previous action.
- Controller (C)
  - A simple network that maps the current z plus M's internal hidden vector to an action.
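To make the data flow between V, M and C concrete, here is a minimal NumPy sketch of a single rollout. It is not taken from any of the implementations listed above: every name (`encode`, `controller`, `memory_step`, `mdn_params`, `run_rollout`), every dimension, and the random untrained weights are hypothetical stand-ins. A plain tanh cell replaces whatever RNN a real implementation uses, and the step ordering (C picks an action from z and h, then M updates with that action) is an assumption.

```python
import numpy as np

# Illustrative sizes only; none of these come from the notes above.
OBS_DIM, Z_DIM, H_DIM, A_DIM, N_MIX = 64, 8, 16, 3, 5
rng = np.random.default_rng(0)

def init(shape):
    """Small random weights standing in for trained parameters."""
    return 0.1 * rng.standard_normal(shape)

W_enc = init((2 * Z_DIM, OBS_DIM))            # V: encoder half of the VAE
W_rnn = init((H_DIM, Z_DIM + A_DIM + H_DIM))  # M: recurrent cell
W_mdn = init((3 * N_MIX * Z_DIM, H_DIM))      # M: MDN head
W_ctl = init((A_DIM, Z_DIM + H_DIM))          # C: small policy network

def encode(obs):
    """V: map an observation to (mu, log_var) and sample z from them."""
    out = W_enc @ obs
    mu, log_var = out[:Z_DIM], out[Z_DIM:]
    return mu + np.exp(0.5 * log_var) * rng.standard_normal(Z_DIM)

def controller(z, h):
    """C: map current z plus M's hidden vector to an action in [-1, 1]."""
    return np.tanh(W_ctl @ np.concatenate([z, h]))

def memory_step(z, a, h):
    """M: update the hidden vector from z, the action and the old hidden vector."""
    return np.tanh(W_rnn @ np.concatenate([z, a, h]))

def mdn_params(h):
    """M's MDN head: per-dimension mixture weights, means and std devs for next z."""
    logits, mu, log_sigma = (W_mdn @ h).reshape(3, N_MIX, Z_DIM)
    logits = logits - logits.max(axis=0)               # numerically stable softmax
    pi = np.exp(logits) / np.exp(logits).sum(axis=0)
    return pi, mu, np.exp(log_sigma)

def run_rollout(steps):
    """Run V -> C -> M for a few steps on placeholder observations."""
    h = np.zeros(H_DIM)
    actions, mixtures = [], []
    for _ in range(steps):
        obs = rng.random(OBS_DIM)   # placeholder for an environment frame
        z = encode(obs)
        a = controller(z, h)
        h = memory_step(z, a, h)
        actions.append(a)
        mixtures.append(mdn_params(h))
    return actions, mixtures
```

Calling `run_rollout(steps=4)` returns four actions and four `(pi, mu, sigma)` tuples describing the predicted distribution over the next z. In a real setup the observation would come from the environment and the weights from training each component.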
Still trying to get the implementation to run all the way through...
