Brief description of migrating a static website from GitHub to Codeberg
On the link between conscious function and general intelligence in humans and machines
Access consciousness and its relation to general intelligence
Fastai Transforms for DonkeyCar
Fastai transforms don't seem to help DonkeyCar training
Effect of Dropout Layers in a VAE
Dropout layers make VAE output worse
Endgame Speculation
My speculation on how Avengers Endgame could go
Stable Baselines Algorithms
Table of algorithms in the Stable Baselines repo
Unicorn: Continual learning with a universal, off-policy agent
Continual learning with a universal, off-policy agent.
Sample-Efficient Deep RL with Generative Adversarial Tree Search
Learned dynamics model with a GAN for image generation and MCTS for planning.
Learning Real-World Robot Policies by Dreaming
Unsupervised learning of image encoding, dynamics and reward models.
Unsupervised Predictive Memory in a Goal-Directed Agent
Unsupervised training of a memory that is used for prediction of state and reward.
World Models
Unsupervised learning of image encoding and dynamics model.
First Real DAgger Results
Using DAgger to improve MaLPi's training, while MaLPi is driving
Initial DAgger Results
Using DAgger to improve MaLPi's training
First (Second) Fully Autonomous Full Lap
One of MaLPi's first fully autonomous laps
First attempts at Hyperparameter Optimization
Hyperparameter optimization using hyperopt on racetrack data.
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations.
Normalizing image data before training - GRU version
Testing image normalization when using a GRU.
Normalizing image data before training - LSTM version
Testing image normalization when using an LSTM.
Tensorizing LSTMs
Tensorizing LSTMs to make them wider and deeper without adding parameters and with minimal extra compute costs.
Normalizing image data before training
I had completely forgotten to normalize the images I'm feeding into MaLPi's network, so I thought I'd try to be a bit more formal about it than my usual.
Mastering the game of Go without human knowledge
AlphaGo Zero, all RL self-play.
Getting a Keras LSTM layer to work on MaLPi
Training on batch sizes and/or sequence lengths longer than one, while still being able to run one image at a time on the robot.
Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning
Pre-train using supervised learning on human-provided demonstrations.
Experimenting with OpenAI's Baselines code
I forked OpenAI's Baselines code and made a few changes. This was my first full run before I started playing around with the model architecture.
Changes from OpenAI's:
- Turned on logging, including TensorBoard output
- Log rewards
- Add a command line option for setting the number of CPUs
Code: Commit used …
An Overview of Multi-Task Learning in Deep Neural Networks
An overview of multi-task learning in deep neural networks.
Deep Learning in Robotics: A Review of Recent Research
A long review of the use of DL in robotics
Eligibility Traces
Notes on using Eligibility Traces with neural networks
One Model To Learn Them All
A single ML model used for very different tasks.
A simple neural network module for relational reasoning
Relationships between objects.
Questions and Intuition for Tackling Deep Learning Problems
Five questions to ask about your deep learning project.
Reinforcement Learning with Unsupervised Auxiliary Tasks
Speed up learning in a reinforcement learning system with auxiliary tasks.
Attention and Augmented Recurrent Neural Networks
Overview (with references) of attention and several types of augmentation for RNNs.
Reinforcement Learning: An Introduction (2nd Edition)
In-progress second edition of an RL textbook.
Deep Reinforcement Learning with Double Q-learning
Improved Q-value estimation by reducing overestimates of Deep Q-networks.
Human-level control through deep reinforcement learning
One of the first deep reinforcement learning papers.
Motors
Motors and controllers and a breadboard
Off-Policy Actor-Critic
Off-policy actor-critic with linear state features. Includes eligibility traces.
Lego Chassis
Some progress on the hardware front.
The chassis really needs to be wider but I have a limited selection of Legos. I still don't have any motors or any way to control them but it's far enough along that I can try manually taking some images with the webcam and …
Endurance Test
Test how long the PowerGen battery can run MaLPi on a single charge.
I ran an endurance test with MaLPi, running the Pi, the webcam, a shell script that logged uptime every ten seconds, and the motion program (/usr/bin/motion) in an attempt to detect changes in the …
MaLPi Intro
MaLPi (Machine Learning Pi)
First, the hardware. This is my current setup, and although I've tested each piece separately, I haven't had them all working together yet.
- Raspberry Pi, model B v7 (or 0xf, I'm not sure how to read /proc/cpuinfo)
- 3D printed case (sorry, I can't remember where …
Horde: A Scalable Real-time Architecture for Learning Knowledge from Unsupervised Sensorimotor Interaction
Using RL value functions to encode semantic knowledge, specifically by a robot.