Shubhashis Roy Dipta

sroydip1@umbc.edu


Amazon Science (Alexa)
Applied Scientist Intern (Incoming)
Summer 2026
Manager: Dr. Lichao Wang
Seattle, WA
Amazon Science (Alexa)
Applied Scientist Intern
Summer 2025
Manager: Dr. Lichao Wang
Seattle, WA
Mentor: Dr. Daniel Bis
Scale AI
Machine Learning Research Intern
Summer 2024
Manager: Dr. Adrian Lam
San Francisco, CA
Mentor: Vijay Kalmath
University of Maryland, Baltimore County
Ph.D. in Computer Science
Fall 2023 - Present
Advisor: Dr. Frank Ferraro
Grade: 4.00/4.00
Publications: See Here (From 2023)
University of Maryland, Baltimore County
M.Sc. in Computer Science
Spring 2021 - Spring 2023
Awards: Phi Kappa Phi
Grade: 4.00/4.00
Morgan State University
Research Assistant
2017 - 2019
Advisor: Dr. Iman Dehzangi
Publications: 4 Journal
UniShopr.com
Founder
2017 - 2021

Upcoming Travel

  • ACL 2026 in San Diego, CA (Jul 3-7)
Previous
  • ✅ NeurIPS 2025 in San Diego, CA (Dec 2-7)
  • ❌ AACL 2025 in Mumbai, India (Dec 20-24) (canceled)
👋 I'm open to meeting! Email me to schedule a chat!

Peer Review

Reviewed 28+ papers across top venues (2023–2025).

Conferences
*SEM, COLING, NAACL, ACL, NeurIPS
Workshops
W-NUT, SemEval, SRW, TrustNLP, ELVM
Journals
Computational and Structural Biotechnology Journal, Plant Methods, Scientific Reports, BMC Bioinformatics

Shubhashis is a final-year Ph.D. researcher in Computer Science at the University of Maryland, Baltimore County (UMBC), advised by Dr. Frank Ferraro. Modern LLMs answer fluently but fail predictably - under complex reasoning, under tool-use, and under modality conflict. He targets each failure mode: structured decomposition for verifiable reasoning, RL-trained agents for robust tool-use, and benchmarks that surface multimodal calibration gaps.

  • Reasoning & Decomposition. Presupposition-free question decomposition makes claim verification robust [De-Presuppose]; hierarchical variational structure yields compositional event representations [SHEM]; distractor-aware computational graphs match reasoning-specialized models at 89% lower token cost [DAGGER]; curriculum-driven GRPO brings competitive math reasoning to under-resourced Bengali [GanitLLM].

  • Reinforcement Learning & LLM Agents. Policy-aware tool-calling agents outperform in-context baselines by 16 points while using 40% fewer tokens via RL alignment [PA3]; multi-agent benchmarks expose where good agents become bad collaborators [AgentCollabBench]; pattern-aware tool-integrated reasoning learns how to use tools, not just when [Double-TIR]; rock-token analysis explains why on-policy distillation works [Rock Tokens].

  • Multimodal Learning & Evaluation. Fact-aware reference-free evaluators flag unreliable video captions [VC-Inspector]; benchmarks that break modality consensus probe calibrated abstention in omni-modal models [OMD]; query-to-event decomposition enables zero-shot multilingual text-to-video retrieval [Q2E]; a survey maps multimodal unlearning across vision, language, video, and audio [Survey].

Industry Experience

  • Amazon Science (Alexa AI) - Applied Scientist Intern, Summer 2025 + Summer 2026. Extended Deliberative Alignment to tool-calling LLMs in PA3, cutting tokens by 40% while gaining 16 points over in-context baselines.
  • Scale AI - ML Research Intern, Summer 2024. Trained text-to-SQL with a small LLM-as-a-judge using Online-KTO (Scale AI blog).

Graduating Spring 2027 - actively seeking Research Scientist roles in NLP / Multimodal AI. Please reach out if you have an opening.

Beyond Research

Shubhashis founded UniShopr (2017–2021), a cross-border e-commerce platform serving consumers in Bangladesh.

He has also competed internationally in robotics and algorithms - placing 9th at the University Rover Challenge 2015 (Utah, USA) and 22nd at the European Rover Challenge 2016 (Poland), ranking 8th out of 300+ teams at the 2018 ACM ICPC Asia Dhaka Regional with multiple regional and national placements, and reaching the top 70 on Kaggle 🥉 in the Birdcall Identification competition. See more.

Recent News

Apr 7, 2026 Successfully defended my PhD Proposal. I am now officially ABD (All But Dissertation). Slides of the proposal can be found here.
Apr 6, 2026 🥳 3 of my papers got accepted to ACL 2026 (VC Inspector, GanitLLM, and Survey on Multimodal Unlearning).
Jan 28, 2026 🥳 2 of my mentored papers got accepted at LoResLM Workshop at EACL 2026 [ 1, 2 ]
Dec 3, 2025 ✈ Heading to San Diego for NeurIPS 2025. Email me to schedule a chat!

Featured Publications

Check out Google Scholar for a full list of my publications.

  1. Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation
    Submitted
    Yuxuan Jiang*, Runchao Li*, Shubhashis Roy Dipta*, and 2 more authors
    Preprint 2026
    * Equal contribution
  2. AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators
    Submitted
    Aritra Mazumder, Shubhashis Roy Dipta, Nusrat Jahan Lia, and 10 more authors
    Preprint 2026
  3. PA3: Policy-Aware Agent Alignment through Chain-of-Thought
    Submitted
    Shubhashis Roy Dipta, Daniel Bis, Kun Zhou, and 4 more authors
    Preprint 2026
    Work done during internship at Amazon Alexa AI
  4. †DAGGER: Distractor-Aware Graph Generation for Executable Reasoning in Math Problems
    Submitted
    Zabir Al Nazi, Shubhashis Roy Dipta, and Sudipta Kar
    Preprint 2026
  5. Omni-Modal Dissonance Benchmark: Systematically Breaking Modality Consensus to Probe Robustness and Calibrated Abstention
    Submitted
    Zabir Al Nazi*, Shubhashis Roy Dipta*, and Md Rizwan Parvez
    Preprint 2026
    * Equal contribution
  6. GanitLLM: Difficulty-Aware Bengali Mathematical Reasoning through Curriculum-GRPO
    ACL
    Shubhashis Roy Dipta, Khairul Mahbub, and Nadia Najjar
    ACL 2026
  7. Advancing Reference-free Evaluation of Video Captions with Factual Analysis
    ACL
    Shubhashis Roy Dipta, Tz-Ying Wu, and Subarna Tripathi
    ACL 2026
  8. Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks
    ACL
    Nobin Sarwar, Shubhashis Roy Dipta, Zheyuan Liu, and 1 more author
    ACL 2026
  9. Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
    AACL
    Shubhashis Roy Dipta and Francis Ferraro
    AACL 2025
  10. If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
    *SEM
    Shubhashis Roy Dipta and Francis Ferraro
    *SEM 2025
  11. Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning
    Preprint
    Ningning Xu, Yuxuan Jiang, and Shubhashis Roy Dipta
    MathAI @NeurIPS 2025
  12. Semantically-informed Hierarchical Event Modeling
    *SEM
    Shubhashis Roy Dipta, Mehdi Rezaee, and Francis Ferraro
    *SEM 2023