I'm interested in building self-improving AI systems that are capable of formal reasoning (e.g., proving mathematical theorems, synthesizing and verifying programs) and open-ended discovery. This requires combining ideas from type theory (to define a game of formal theorem proving), reinforcement learning (to become steadily better at this game), language models (to represent policies and value functions, and to leverage informal reasoning), program induction (to discover useful symbolic abstractions such as lemmas, tactics, or definitions), and the whole toolbox from game-playing AI (e.g., tree search, self-play). Ultimately, I believe formal systems should also be useful for human learning, provided we understand how to make pedagogical use of both their feedback on one's reasoning (rather than only on final answers) and their support for open-ended exploration.
If these topics sound interesting to you, and you're thinking about pursuing a PhD in Computer Science, consider applying to UMich and mentioning my name in your application!
COLM 2025 |
From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models [Link] |
OOPSLA 2025 |
Automated Discovery of Tactic Libraries for Interactive Theorem Proving [arXiv] |
ICML 2025 Spotlight |
Position: Formal Mathematical Reasoning - A New Frontier in AI [arXiv] |
ICLR 2025 |
h4rm3l: A Language for Composable Jailbreak Attack Synthesis [Link] [arXiv] [Code] |
POPL Dafny Workshop 2025 |
dafny-annotator: AI-Assisted Verification of Dafny Programs [arXiv] [Code] [Blog post] |
NeurIPS 2024 Oral |
Learning Formal Mathematics From Intrinsic Motivation [arXiv] [Code] |
ICML 2024 |
When do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions [Link] [arXiv] |
TMLR 2024 Invited to ICLR |
Certified Deductive Reasoning with Language Models [arXiv] |
ICLR 2024 |
Hypothesis Search: Inductive Reasoning with Language Models [arXiv] |
Phil. Trans. of the Royal Society A 2023 |
Peano: Learning Formal Mathematical Reasoning [arXiv] |
NeurIPS 2023 Spotlight |
Parsel🐍: Algorithmic Reasoning with Language Models by Composing Decompositions [arXiv] |
NeurIPS Math-AI 2022 |
Lemma: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions [Link] [arXiv] [PDF] |
CogSci 2022 |
Left to the Reader: Abstracting Solutions in Mathematical Reasoning [PDF] |
ICLR 2022 |
Synchromesh: Reliable Code Generation from Pre-trained Language Models [PDF] |
NeurIPS 2021 |
Contrastive Reinforcement Learning of Symbolic Reasoning Domains [arXiv] |
EMNLP 2021 |
Open-domain clarification question generation without question examples [Link] [PDF] |
AAAI 2021 |
Pragmatic Code Autocomplete [Link] [PDF] [Code] |
OOPSLA 2020 |
Dynamic Dispatch of Context-Sensitive Optimizations [Link] [PDF] [Code] |
OOPSLA 2017 |
Static Placement of Computation on Heterogeneous Devices [Link] [PDF] |
ECML/PKDD 2014 |
A Lossless Data Reduction for Mining Constrained Patterns in n-ary Relations [Link] [PDF] |
Programming contests. I used to be an ACM-ICPC competitor (world finalist in 2015), and I have been involved in programming contests in various other ways. In particular, I authored 3 problems for the ACM-ICPC Latin American regionals (2 in 2017 and one in 2020), with another one upcoming in 2023. I've also coached several teams, taught at training camps in Latin America, and co-authored the problems used to select the high schoolers who represented Brazil in the International Olympiad in Informatics in 2018.
Data musicalization. I've been having a lot of fun creating music from data, as a powerful way to subjectively experience information. One recently finished project along these lines was a musicalization of the COVID deaths in Brazil, along with the equally disturbing reactions from our president at the time, which you can watch on YouTube.