Archit Sharma
PhD Student, Stanford University
Direct preference optimization: Your language model is secretly a reward model
R Rafailov, A Sharma, E Mitchell, CD Manning, S Ermon, C Finn
Advances in Neural Information Processing Systems 36, 2024
Dynamics-aware unsupervised discovery of skills
A Sharma, S Gu, S Levine, V Kumar, K Hausman
International Conference on Learning Representations (ICLR), 2020
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
A Padalkar, A Pooley, A Jain, A Bewley, A Herzog, A Irpan, A Khazatsky, ...
arXiv preprint arXiv:2310.08864, 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
K Tian, E Mitchell, A Zhou, A Sharma, R Rafailov, H Yao, C Finn, ...
arXiv preprint arXiv:2305.14975, 2023
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
A Sharma, M Ahn, S Levine, V Kumar, K Hausman, S Gu
Robotics: Science and Systems (RSS), 2020
Variational empowerment as representation learning for goal-based reinforcement learning
J Choi, A Sharma, H Lee, S Levine, SS Gu
arXiv preprint arXiv:2106.01404, 2021
Autonomous Reinforcement Learning via Subgoal Curricula
A Sharma, A Gupta, S Levine, K Hausman, C Finn
Thirty-Fifth Conference on Neural Information Processing Systems, 2021
Autonomous Reinforcement Learning: Formalism and Benchmarking
A Sharma, K Xu, N Sardana, A Gupta, K Hausman, S Levine, C Finn
arXiv preprint arXiv:2112.09605, 2021
Waypoint-Based Imitation Learning for Robotic Manipulation
LX Shi, A Sharma, TZ Zhao, C Finn
arXiv preprint arXiv:2307.14326, 2023
A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning
A Sharma, R Ahmad, C Finn
arXiv preprint arXiv:2205.05212, 2022
You Only Live Once: Single-Life Reinforcement Learning
A Chen, A Sharma, S Levine, C Finn
Advances in Neural Information Processing Systems 35, 14784-14797, 2022
An Emulator for Fine-Tuning Large Language Models using Small Language Models
E Mitchell, R Rafailov, A Sharma, C Finn, CD Manning
arXiv preprint arXiv:2310.12962, 2023
Preference fine-tuning of LLMs should leverage suboptimal, on-policy data
F Tajwar, A Singh, A Sharma, R Rafailov, J Schneider, T Xie, S Ermon, ...
arXiv preprint arXiv:2404.14367, 2024
When to ask for help: Proactive interventions in autonomous reinforcement learning
A Xie, F Tajwar, A Sharma, C Finn
Advances in Neural Information Processing Systems 35, 16918-16930, 2022
Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning
A Sharma, AM Ahmed, R Ahmad, C Finn
arXiv preprint arXiv:2303.01488, 2023
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
A Khazatsky, K Pertsch, S Nair, A Balakrishna, S Dasari, S Karamcheti, ...
arXiv preprint arXiv:2403.12945, 2024
A flexible probabilistic framework for large-margin mixture of experts
A Sharma, S Saxena, P Rai
Machine Learning 108 (8-9), 1369-1393, 2019
Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning
J Yang, MS Mark, B Vu, A Sharma, J Bohg, C Finn
arXiv preprint arXiv:2310.15145, 2023
Yell At Your Robot: Improving On-the-Fly from Language Corrections
LX Shi, Z Hu, TZ Zhao, A Sharma, K Pertsch, J Luo, S Levine, C Finn
arXiv preprint arXiv:2403.12910, 2024
A Critical Evaluation of AI Feedback for Aligning Large Language Models
A Sharma, S Keh, E Mitchell, C Finn, K Arora, T Kollar
arXiv preprint arXiv:2402.12366, 2024