Ashwin Balakrishna
I am a Research Scientist at Google DeepMind working on building generally intelligent robots. Broadly speaking, I am excited about bridging the general-purpose capabilities of vision and language foundation models with algorithmic paradigms for learning from interaction, such as imitation learning and reinforcement learning. I've previously spent time at Toyota Research Institute and Nuro, where I worked on large-scale robot learning for robotic manipulation and autonomous vehicle planning respectively.
I did my PhD in Computer Science in the AUTOLAB at UC Berkeley, where I developed a number of algorithms for safe and efficient online robot learning and studied applications to deformable manipulation, industrial automation, and robot grasping. I did my bachelor's degree at Caltech in Electrical Engineering, where I worked on a number of applications of machine learning and signal processing to scientific problems.
Email  / 
CV  / 
Google Scholar  / 
Twitter  / 
LinkedIn  
|
|
Reinforcement Learning, Imitation Learning, and Large-Scale Pre-training
|
|
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim*,
Karl Pertsch*,
Siddharth Karamcheti*,
Ted Xiao,
Ashwin Balakrishna,
Suraj Nair,
Rafael Rafailov,
Ethan Foster,
Grace Lam,
Pannag Sanketi,
Quan Vuong,
Thomas Kollar,
Benjamin Burchfiel,
Russ Tedrake,
Dorsa Sadigh,
Sergey Levine,
Percy Liang,
Chelsea Finn
Preprint.
Website /
PDF /
Bibtex
Fully open-source vision-language-action model for general-purpose robotic manipulation. Exhibits strong performance both in the zero-shot and finetuning regimes.
|
|
CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous Driving
Jonathan Booher,
Khashayar Rohanimanesh,
Junhong Xu,
Vladislav Isenbaev,
Ashwin Balakrishna,
Ishan Gupta,
Wei Liu,
Aleksandr Petiushko
Preprint.
Website /
PDF /
Bibtex
Develops a safe reinforcement learning system for autonomous motion selection as part of the NuroDriver. Tested extensively in simulation and real-world autonomous trials!
|
|
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration (>150 authors)
International Conference on Robotics and Automation (ICRA), 2024 - Best Conference Paper
Website /
PDF /
Bibtex
A large-scale cross-embodiment dataset and results suggesting positive policy transfer across robot embodiments.
|
|
DROID: A Large-Scale In-the-Wild Robot Manipulation Dataset
Alexander Khazatsky*,
Karl Pertsch*,
Suraj Nair,
Ashwin Balakrishna,
Sudeep Dasari,
Siddharth Karamcheti,
Soroush Nasiriany,
Mohan Kumar Srirama,
et al. (~100 authors)
Robotics Science and Systems (RSS), 2024
Website /
PDF /
Bibtex
A large-scale, in-the-wild robot manipulation dataset of 75K+ trajectories collected in various settings such as homes, labs, offices and more across multiple continents! Initial experiments suggest that co-training policies with DROID significantly improves policy robustness and OOD generalization.
|
|
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Siddharth Karamcheti,
Suraj Nair,
Ashwin Balakrishna,
Percy Liang,
Thomas Kollar,
Dorsa Sadigh
International Conference on Machine Learning (ICML), 2024
Website /
PDF /
Bibtex
A thorough investigation of what design choices matter most for building performant visually-conditioned language models. We release optimized and hackable training code, evaluation code, and all trained models for the community to build on.
|
|
Monte Carlo Actor Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations
Albert Wilcox,
Ashwin Balakrishna,
Jules Dedieu,
Wyame Benslimane,
Daniel Brown,
Ken Goldberg
Conference on Neural Information Processing Systems (NeurIPS), 2022
Website /
PDF /
Bibtex
A simple reinforcement learning algorithm that can be applied to any existing actor-critic method to accelerate exploration for sparse-reward tasks.
|
|
Dynamics-Aware Comparison of Learned Reward Functions
Blake Wulfe,
Ashwin Balakrishna,
Logan Ellis,
Jean Mercat,
Rowan McAllister,
Adrien Gaidon
International Conference on Learning Representations (ICLR), 2022 - Spotlight Presentation
Website /
PDF /
Bibtex
An algorithm for robust off-policy comparison of learned reward functions.
|
|
MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance
Michael Luo,
Ashwin Balakrishna,
Brijen Thananjeyan,
Suraj Nair,
Julian Ibarz,
Jie Tan,
Chelsea Finn,
Ion Stoica,
Ken Goldberg
NeurIPS Workshop on Safe and Robust Control of Uncertain Systems, 2021
Website /
PDF /
Bibtex
Safe exploration by meta-learning risk measures across environments with different dynamics.
|
|
ThriftyDAgger: Budget-Aware Novelty and Risk Gating for Interactive Imitation Learning
Ryan Hoque,
Ashwin Balakrishna,
Ellen Novoseller,
Albert Wilcox,
Daniel S. Brown,
Ken Goldberg
Conference on Robot Learning (CoRL), 2021 - Oral Presentation
Website /
PDF /
Bibtex
An algorithm for query-efficient interactive imitation learning which learns to cede control to a supervisor when (1) in novel states or (2) in bottlenecks where task success is unlikely.
|
|
LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks
Albert Wilcox*,
Ashwin Balakrishna*,
Brijen Thananjeyan,
Joseph E. Gonzalez,
Ken Goldberg
Conference on Robot Learning (CoRL), 2021
Website /
PDF /
Bibtex
Safe and efficient RL from image observations by leveraging suboptimal demonstrations to structure exploration and examples of constraint violations to satisfy user-specified constraints.
|
|
LazyDAgger: Reducing Context Switching in Interactive Imitation Learning
Ryan Hoque,
Ashwin Balakrishna,
Carl Putterman,
Michael Luo,
Daniel S. Brown,
Daniel Seita,
Brijen Thananjeyan,
Ellen Novoseller,
Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2021
PDF /
Bibtex
An algorithm for reducing supervisor burden by limiting context switches in interactive imitation learning.
|
|
Policy Gradient Bayesian Robust Optimization for Imitation Learning
Zaynah Javed*,
Daniel Brown*,
Satvik Sharma,
Jerry Zhu,
Ashwin Balakrishna,
Marek Petrik,
Anca D. Dragan,
Ken Goldberg
International Conference on Machine Learning (ICML), 2021
Website /
PDF /
Bibtex
A scalable and robust RL algorithm which optimizes for a combination of expected performance and tail risk under a distribution over learned reward functions.
|
|
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
Brijen Thananjeyan*,
Ashwin Balakrishna*,
Suraj Nair,
Michael Luo,
Krishnan Srinivasan,
Minho Hwang,
Joseph E. Gonzalez,
Julian Ibarz,
Chelsea Finn,
Ken Goldberg
Robotics and Automation Letters (RA-L) Journal and International Conference on Robotics and Automation (ICRA), 2021 - Mentioned in Google AI Year in Review
Website /
PDF /
Bibtex
An algorithm for safe reinforcement learning that uses a set of offline data to learn about constraints before policy learning, and a pair of policies that separate the often conflicting objectives of task-directed exploration and constraint satisfaction, to learn contact-rich and visuomotor control tasks.
|
|
ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions
Brijen Thananjeyan*,
Ashwin Balakrishna*,
Ugo Rosolia,
Joseph E. Gonzalez,
Aaron Ames,
Ken Goldberg
Algorithmic Foundations of Robotics (WAFR), 2020 - Invited to IJRR Special Issue
Website /
PDF /
Bibtex
An MPC-based algorithm for robotic control (ABC-LMPC) with (1) performance and safety guarantees for stochastic nonlinear systems and (2) the ability to continuously explore the environment and expand the controller domain.
|
|
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks
Brijen Thananjeyan*,
Ashwin Balakrishna*,
Ugo Rosolia,
Felix Li,
Rowan McAllister,
Joseph E. Gonzalez,
Sergey Levine,
Francesco Borrelli,
Ken Goldberg
Robotics and Automation Letters (RA-L) Journal and International Conference on Robotics and Automation (ICRA), 2020
Website /
PDF /
Bibtex
A new algorithm for safe and efficient reinforcement learning (SAVED) which leverages a small set of suboptimal demonstrations and prior task successes to structure exploration. SAVED also provides a mechanism for handling state-space constraints by leveraging probabilistic estimates of system dynamics.
|
|
On-Policy Robot Imitation Learning from a Converging Supervisor
Ashwin Balakrishna*,
Brijen Thananjeyan*,
Jonathan Lee,
Felix Li,
Arsh Zahed,
Joseph E. Gonzalez,
Ken Goldberg
Conference on Robot Learning (CoRL), 2019 - Oral Presentation
PDF /
Bibtex
A new formulation of imitation learning from a non-stationary supervisor, associated theoretical analysis, and a practical algorithm that applies this formulation to develop an RL algorithm combining the sample efficiency of model-based RL with the fast policy evaluation enabled by model-free policies.
|
Algorithms for Specific Robotic Manipulation Tasks
|
|
LEGS: Learning Efficient Grasping Sets for Exploratory Grasping
Letian Fu,
Michael Danielczuk,
Ashwin Balakrishna,
Daniel S. Brown,
Jeffrey Ichnowski,
Eugen Solowjow,
Ken Goldberg
International Conference on Robotics and Automation (ICRA), 2022
Website /
PDF /
Bibtex
An algorithm for rapidly exploring large sets of grasps on objects with challenging geometries.
|
|
Disentangling Dense Multiple-Cable Knots
Vainavi Viswanath*,
Jennifer Grannen*,
Priya Sundaresan*,
Brijen Thananjeyan,
Ashwin Balakrishna,
Ellen Novoseller,
Jeffrey Ichnowski,
Michael Laskey,
Joseph E. Gonzalez,
Ken Goldberg
International Conference on Intelligent Robots and Systems (IROS), 2021
Website /
PDF /
Bibtex
Learning to disentangle multiple cables in a variety of different knotted configurations.
|
|
Kit-Net: Self-Supervised Learning to Kit Novel 3D Objects into Novel 3D Cavities
Shivin Devgon,
Jeffrey Ichnowski,
Michael Danielczuk,
Daniel S. Brown,
Ashwin Balakrishna,
Shirin Joshi,
Eduardo M. C. Rocha,
Eugen Solowjow,
Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2021
Website /
PDF /
Bibtex
Learning to kit novel 3D objects in complex 3D cavities with self-supervised rotation estimation.
|
|
Untangling Dense Non-Planar Knots by Learning Manipulation Features and Recovery Policies
Priya Sundaresan*,
Jennifer Grannen*,
Brijen Thananjeyan,
Ashwin Balakrishna,
Jeffrey Ichnowski,
Ellen Novoseller,
Minho Hwang,
Michael Laskey,
Joseph E. Gonzalez,
Ken Goldberg
Robotics Science and Systems (RSS), 2021
Website /
PDF /
Bibtex
Learning robust untangling policies with learned progress detection and recovery behaviors.
|
|
Learning Dense Visual Correspondences in Simulation to Smooth and Fold Real Fabrics
Aditya Ganapathi,
Priya Sundaresan,
Brijen Thananjeyan,
Ashwin Balakrishna,
Daniel Seita,
Jennifer Grannen,
Minho Hwang,
Ryan Hoque,
Joseph E. Gonzalez,
Nawid Jamali,
Katsu Yamane,
Soshi Iba,
Ken Goldberg
International Conference on Robotics and Automation (ICRA), 2021
Website /
PDF /
Bibtex
A general method for multi-task fabric manipulation using learned visual correspondences which can be applied across different robots to manipulate fabrics of varying shapes and colors.
|
|
Exploratory Grasping: Asymptotically Optimal Algorithms for Grasping Challenging Polyhedral Objects
Michael Danielczuk*,
Ashwin Balakrishna*,
Daniel Brown,
Shivin Devgon,
Ken Goldberg
Conference on Robot Learning (CoRL), 2020
Website /
PDF /
Bibtex
An asymptotically optimal algorithm for rapidly learning to grasp new objects through online exploration.
|
|
Untangling Dense Knots by Learning Task-Relevant Keypoints
Jennifer Grannen*,
Priya Sundaresan*,
Brijen Thananjeyan,
Jeffrey Ichnowski,
Ashwin Balakrishna,
Minho Hwang,
Vainavi Viswanath,
Michael Laskey,
Joseph E. Gonzalez,
Ken Goldberg
Conference on Robot Learning (CoRL), 2020 - Oral Presentation
Website /
PDF /
Bibtex
Algorithms for untying dense knots in cables in the real world.
|
|
Deep Imitation Learning of Sequential Fabric Smoothing from an Algorithmic Supervisor
Daniel Seita,
Aditya Ganapathi,
Ryan Hoque,
Minho Hwang,
Edward Cen,
Ajay Kumar Tanwani,
Ashwin Balakrishna,
Brijen Thananjeyan,
Jeffrey Ichnowski,
Nawid Jamali,
Katsu Yamane,
Soshi Iba,
John F. Canny,
Ken Goldberg
International Conference on Intelligent Robots and Systems (IROS), 2020
Website /
PDF /
Bibtex
A new fabric simulator for learning fabric smoothing policies and learned policies which successfully smooth fabric in simulation and transfer to physical robotic systems.
|
|
MMGSD: Multi-Modal Gaussian Shape Descriptors for Correspondence Matching in 1D and 2D Deformable Objects
Aditya Ganapathi*,
Priya Sundaresan*,
Brijen Thananjeyan,
Ashwin Balakrishna,
Daniel Seita,
Ryan Hoque,
Joseph E. Gonzalez,
Ken Goldberg
International Conference on Intelligent Robots and Systems (IROS) Workshop on Managing Deformation, 2020
PDF /
Bibtex
A new algorithm for learning symmetry-aware visual representations for deformable objects.
|
|
Accelerating Grasp Exploration by Leveraging Learned Priors
Katherine Li*,
Michael Danielczuk*,
Ashwin Balakrishna*,
Vishal Satish,
Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2020
PDF /
Bibtex
An algorithm for leveraging priors from general-purpose grasping systems to accelerate online grasp exploration on novel, difficult-to-grasp objects.
|
|
Orienting Novel 3D Objects Using Self-Supervised Learning of Rotation Transforms
Shivin Devgon,
Jeffrey Ichnowski,
Ashwin Balakrishna,
Harry Zhang,
Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2020
Website /
PDF /
Bibtex
A self-supervised algorithm which learns to orient unseen objects with unknown geometry given only a depth image observation of the desired orientation.
|
|
VisuoSpatial Foresight (VSF) for Multi-Step, Multi-Task Fabric Manipulation
Ryan Hoque*,
Daniel Seita*,
Ashwin Balakrishna,
Aditya Ganapathi,
Ajay Tanwani,
Nawid Jamali,
Katsu Yamane,
Soshi Iba,
Ken Goldberg
Autonomous Robots Journal Special Issue, 2021 and Robotics Science and Systems (RSS), 2020
Website /
PDF /
Bibtex
A method for multi-task fabric manipulation by leveraging recent advances in video prediction and depth sensing.
|
|
Learning Interpretable and Transferable Rope Manipulation Policies Using Depth Sensing and Dense Object Descriptors
Priya Sundaresan,
Jennifer Grannen,
Brijen Thananjeyan,
Ashwin Balakrishna,
Michael Laskey,
Kevin Stone,
Joseph E. Gonzalez,
Ken Goldberg
International Conference on Robotics and Automation (ICRA), 2020
Website /
PDF /
Bibtex
An algorithm for learning visual correspondences for highly deformable objects and an associated controller which is used to manipulate rope into a variety of different arrangements either by learning from demonstrations or by designing interpretable geometric policies on top of the learned visual representation.
|
|
Automating Planar Object Singulation by Linear Pushing with Single-point and Multi-point Contacts
Zisu Dong,
Sanjay Krishnan,
Sona Dolasia,
Ashwin Balakrishna,
Michael Danielczuk,
Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2019
PDF /
Bibtex
An efficient geometric algorithm (ClusterPush) for singulating a set of clustered planar objects.
|
|
Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter
Michael Danielczuk*,
Andrey Kurenkov*,
Ashwin Balakrishna,
Matthew Matl,
David Wang,
Roberto Martin-Martin,
Animesh Garg,
Silvio Savarese,
Ken Goldberg
International Conference on Robotics and Automation (ICRA), 2019
Website /
PDF /
Bibtex
Formulation and algorithms for the problem of efficiently identifying and retrieving a specific object in a cluttered environment.
|
Machine Learning for Physical Sciences
|
|
Fabry-Pérot optical sensor and portable detector for monitoring high-resolution ocular hemodynamics
Jeong Oen Lee,
Vinayak Narasimhan,
Ashwin Balakrishna,
Marcus R. Smith,
Juan Du,
David Stretavan,
Hyuck Choo
Photonics Technology Letters, 2019
PDF /
Bibtex
High-resolution measurement of both intraocular pressure and ocular pulsation profiles using an implantable micro-optical sensor and portable optical detector.
|
|
Reliable Real-time Seismic Signal/Noise Discrimination with Machine Learning
Men-Andrin Meier,
Zachary E. Ross,
Anshul Ramachandran,
Ashwin Balakrishna,
Suraj Nair,
Peter Kundzicz,
Zefeng Li,
Jennifer Andrews,
Egill Hauksson,
Yisong Yue
Journal of Geophysical Research: Solid Earth, 2018
PDF /
Bibtex
Algorithms for rapid and reliable earthquake detection for earthquake early warning systems.
|
|
Predicting Electric Vehicle Charging Station Usage: Using Machine Learning to Estimate Individual Station Statistics from Physical Configurations of Charging Station Networks
Anshul Ramachandran,
Ashwin Balakrishna,
Peter Kundzicz,
Anirudh Neti
Preprint, 2018
PDF /
Bibtex
Algorithms for predicting electric vehicle power usage for different charging network designs.
|
|
Machine Learning Methods for Rapid, Real-Time Pressure Readout from an Optics-Based Intraocular Pressure Sensor
Ashwin Balakrishna,
Jeong Oen Lee,
Hyuck Choo
Preprint, 2018
PDF /
Bibtex
An evaluation of different machine learning algorithms for intraocular pressure measurement.
|
|
A microscale optical implant for continuous in vivo monitoring of intraocular pressure
Jeong Oen Lee,
Haeri Park,
Juan Du,
Ashwin Balakrishna,
Oliver Chen,
David Stretavan,
Hyuck Choo
Nature: Microsystems and Nanoengineering, 2017
PDF /
Bibtex
A new microscale implantable sensor and associated algorithms for accurate and convenient measurement of intraocular pressure.
|
|
Novel positioning sensor with real-time feedback for improved postoperative positioning: pilot study in control subjects
Frank Brodie,
David Ramirez*,
Sundar Pandian*,
Kelly Woo,
Ashwin Balakrishna,
Eugene De Juan,
Hyuck Choo,
Robert H Grubbs
Clinical Ophthalmology, 2017
PDF /
Bibtex
A new wearable, wireless sensor to aid postoperative recovery from retinal detachment surgery.
|
|
Validation of sensor for postoperative positioning with intraocular gas
Frank Brodie,
Kelly Woo,
Ashwin Balakrishna,
Hyuck Choo,
Robert H Grubbs
Clinical Ophthalmology, 2016
PDF /
Bibtex
A simple, wearable electromechanical sensor to aid postoperative recovery from retinal detachment surgery.
|
|
A Neural Network Approach to Monitor Intraocular Pressure for Glaucoma Diagnosis
Ashwin Balakrishna,
Oliver Chen,
Jeong Oen Lee,
Hyuck Choo
Progress In Electromagnetics Research Symposium (PIERS), 2016
PDF /
Bibtex
A neural-network-based algorithm to efficiently extract accurate intraocular pressure measurements from the reflection spectra of an optics-based intraocular pressure sensor.
|
|
In vivo Intraocular Pressure Monitoring using Implantable Optomechanical Sensor
Jeong Oen Lee,
Haeri Park,
Juan Du,
Vinayak Narasimhan,
Ashwin Balakrishna,
Oliver Chen,
David Stretavan,
Hyuck Choo
International Symposium on Optomechatronic Technology (ISOT), 2016
PDF /
Bibtex
Evaluation of a new optics-based intraocular pressure sensor in live rabbits.
|
|
Efficient Power Generation from Vocal Folds Vibrations for Medical Electronic Implants
Hyunjun Cho,
Ashwin Balakrishna,
Yuan Ma,
Jeong Oen Lee,
Hyuck Choo
International Conference on Micro-Electro-Mechanical Systems (MEMS), 2016
PDF /
Bibtex
A piezoelectric-based device to harvest power from human vocal cord vibrations.
|
|
Optimal Control Strategies for Trajectory Optimization with Applications to Continuous Solar Flight
Ashwin Balakrishna
INFORMS Annual Meeting, High School Mathematical Science Journal, Intel Science Talent Search Semifinalist, 2014
PDF /
Bibtex
Mathematical model and algorithm for controlling continuously flying solar aircraft.
|
|