Saraswathy Amjith

SAGE: Self-play Adversarial Games Enhance Large Language Model Reasoning Capabilities

Amjith, S., Wang, M.X., Lynch, J., Gundlach, H., & Thompson, N.

RSI @ ICLR 2026

A framework for improving LLM reasoning through adversarial self-play where a Setter generates challenging problems and a Solver attempts to solve them, achieving up to +10% on MATH and +8% on MBPP with cross-domain transfer.

Paper

Self-Questioning Vision-Language Models: Reinforcement Learning for Compositional Visual Reasoning

Amjith, S.

E23D @ CVPR 2026

A self-questioning framework that trains a VLM to decompose compositional visual questions into sub-questions using GRPO, without any reasoning demonstrations. Applied to a 3B-parameter model on CLEVR and A-OKVQA, both self-questioning and standard RL substantially improve accuracy over the untrained model.

Paper Code

In-Context Learning for Esoteric Programming Languages: Evaluating and Enhancing LLM Reasoning Without Fine-Tuning

Amjith, S., Kolla, A., Wang, M., Lynch, J., & Thompson, N.

DL4C @ NeurIPS 2025

Investigating how large language models can reason about and generate code in esoteric programming languages through in-context learning, without requiring fine-tuning.

Paper

A Novel Integrated ML Approach Utilizing Radar & Satellite Imagery for Selective Logging Detection

Amjith, S. & Fan, J.

TCCML @ NeurIPS 2025; Regeneron STS Finalist 2024

A multimodal deep learning approach integrating radar and satellite imagery to detect illegal selective logging activities with high accuracy.

Paper

Flawed Chain-of-Thought Reinforcement Learning Training for Mathematical Reasoning

Amjith, S., Dusad, M., Muramalla, M., & Shah, S.

Preprint

Teaching models, via reinforcement learning, to identify and recover from flawed reasoning chains in order to improve mathematical problem solving.

Paper
