Access consciousness and its relation to general intelligence
Fastai Transforms for DonkeyCar
Fastai transforms don't seem to help DonkeyCar training
Effect of Dropout Layers in a VAE
Dropout layers make VAE output worse
Stable Baselines Algorithms
Table of algorithms in the Stable Baselines repo
Unicorn: Continual learning with a universal, off-policy agent
Continual learning with a universal, off-policy agent.
Sample-Efficient Deep RL with Generative Adversarial Tree Search
Learned dynamics model with a GAN for image generation and MCTS for planning.
Learning Real-World Robot Policies by Dreaming
Unsupervised learning of image encoding, dynamics and reward models.
Unsupervised Predictive Memory in a Goal-Directed Agent
Unsupervised training of a memory that is used for prediction of state and reward.
World Models
Unsupervised learning of image encoding and dynamics model.
First Real DAgger Results
Using DAgger to improve MaLPi's training, while MaLPi is driving
Initial DAgger Results
Using DAgger to improve MaLPi's training
First attempts at Hyperparameter Optimization
Hyperparameter optimization using hyperopt on racetrack data.
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations.
Normalizing image data before training - GRU version
Testing image normalization when using a GRU.
Normalizing image data before training - LSTM version
Testing image normalization when using an LSTM.
Tensorizing LSTMs
Tensorizing LSTMs to make them wider and deeper without adding parameters and with minimal extra compute costs.
Normalizing image data before training
I had completely forgotten to normalize the images I'm feeding into MaLPi's network, so I thought I'd try to be a bit more formal about it than my usual.
Mastering the game of Go without human knowledge
AlphaGo Zero, all RL self-play.
Getting a Keras LSTM layer to work on MaLPi
Training on batch sizes and/or sequence lengths longer than one, while still being able to run one image at a time on the robot.
Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning
Pre-train using supervised learning on human-provided demonstrations.
