Abhinav Muraleedharan

I am a graduate student at the University of Toronto. At UofT, I work under the supervision of Prof. Nathan Wiebe and Prof. Roger Grosse. My research interests span Quantum Algorithms, Reinforcement Learning, and Alignment of large language models.

I invented Score-life programming.

Email  /  CV  /  Bio  /  Google Scholar  /  Twitter  /  Github  /  MS Thesis

Research

My reinforcement learning research focuses on developing efficient algorithms for training generally intelligent agents. I also work on developing efficient quantum algorithms for training large-scale machine learning models.

Beyond Dynamic Programming
Abhinav Muraleedharan
Research Paper (Under Review), 2023
bibtex

In this paper, I introduce Score-life programming, a novel theoretical approach for solving reinforcement learning problems. Unlike classical dynamic programming-based methods, the methods in this work can search over non-stationary policy functions and can directly compute optimal infinite-horizon action sequences from a given state.

Misc
Finalist, Indian Innovation Challenge 2017



Course Instructor, UofT Engineering Outreach Office, DEEP Summer Academy
Course Instructor, Math Outreach Office, UofT
Blog Posts
(Philosophical)
NA
The Measuring Instrument

This website code is borrowed from: source code. Do not scrape the HTML from this page itself, as it includes analytics tags that you do not want on your own website — use the github code instead. Also, consider using Leonid Keselman's Jekyll fork of this page.