Reviewed 28+ papers across top venues (2023–2025).
Conferences
*SEM, COLING, NAACL, ACL, NeurIPS
Workshops
W-NUT, SemEval, SRW, TrustNLP, ELVM
Journals
Computational and Structural Biotechnology Journal, Plant Methods, Scientific Reports, BMC Bioinformatics
Shubhashis is a final-year Ph.D. researcher in Computer Science at the University of Maryland, Baltimore County (UMBC), advised by Dr. Frank Ferraro. Modern LLMs answer fluently but fail predictably - under complex reasoning, under tool-use, and under modality conflict. He targets each failure mode: structured decomposition for verifiable reasoning, RL-trained agents for robust tool-use, and benchmarks that surface multimodal calibration gaps.
Reasoning & Decomposition. Presupposition-free question decomposition makes claim verification robust [De-Presuppose]; hierarchical variational structure yields compositional event representations [SHEM]; distractor-aware computational graphs match reasoning-specialized models at 89% lower token cost [DAGGER]; curriculum-driven GRPO brings competitive math reasoning to under-resourced Bengali [GanitLLM].
Reinforcement Learning & LLM Agents. Policy-aware tool-calling agents outperform in-context baselines by 16 points while using 40% fewer tokens via RL Alignment [PA3]; multi-agent benchmarks expose where good agents become bad collaborators [AgentCollabBench]; pattern-aware tool-integrated reasoning learns when to fire, not just how [Double-TIR]; rock-token analysis explains why on-policy distillation works [Rock Tokens].
Multimodal Learning & Evaluation. Fact-aware reference-free evaluators flag unreliable video captions [VC-Inspector]; benchmarks that break modality consensus probe calibrated abstention in omni-modal models [OMD]; query-to-event decomposition enables zero-shot multilingual text-to-video retrieval [Q2E]; a survey maps multimodal unlearning across vision, language, video, and audio [Survey].
Industry Experience
Amazon Science (Alexa AI) - Applied Scientist Intern, Summer 2025 and Summer 2026. Extended Deliberative Alignment to tool-calling LLMs in PA3, cutting tokens by 40% while gaining 16 points over in-context baselines.
Scale AI - ML Research Intern, Summer 2024. Trained text-to-SQL with a small LLM-as-a-judge using Online-KTO (Scale AI blog).
Graduating Spring 2027 - actively seeking Research Scientist roles in NLP / Multimodal AI. Please reach out if you have an opening.
Beyond Research
Shubhashis founded UniShopr (2017–2021), a cross-border e-commerce platform serving consumers in Bangladesh.
He has also competed internationally in robotics and algorithms: he placed 9th at the University Rover Challenge 2015 (Utah, USA) and 22nd at the European Rover Challenge 2016 (Poland), ranked 8th out of 300+ teams at the 2018 ACM ICPC Asia Dhaka Regional (alongside multiple regional and national placements), and reached the top 70 on Kaggle in the Birdcall Identification competition.