Harshit Bokadia

Research Associate, SUTD (Singapore University of Technology and Design)

Research Associate · SUTD | Canadian Citizen

I am a Research Associate at SUTD (Singapore University of Technology and Design). My research builds AI pipelines that estimate learners' internal world models in real time from motion-capture kinematic data. The pipeline models movement exploration as a Markov Decision Process, runs Bayesian inverse RL to recover reward structure and state-transition dynamics, and uses diffusion models together with the inferred world models to scaffold open-ended embodied creativity. This work connects foundational questions in world model learning and generative modelling to the physical, embodied setting.

My background spans a deliberate sequence of research and engineering roles, each building toward my current focus:

2008–2014: BTech, Mechanical Engineering, followed by nearly two years as a Project Engineer in Industry 4.0 manufacturing in India.
2016–2018: MS, Industrial and Systems Engineering, Rutgers. Depth in optimization, statistics, and systems thinking; thesis on deep learning for semiconductor process control.
2018–2020: Computational neuroscience and cognitive science, Rutgers. Electroencephalography (EEG), motor control and sensorimotor integration, Brain-Machine Interfaces (BMI), neural connectivity analysis, and digital biomarkers for autism including the Autism Diagnostic Observation Schedule (ADOS).
2020–2021: Explainable AI (XAI) research under DARPA (Defense Advanced Research Projects Agency) / AFRL (Air Force Research Laboratory), Rutgers-Newark. Novel perceptual and semantic interpretability metrics for medical imaging.
2021: NeuroAI Intern, Mila. Neural ODE modeling, EEG analysis, and generative modeling of neural dynamics.
2022: Medical AI group, University of Waterloo. Vector-symbolic architectures for structured knowledge encoding in clinical NLP.
2023–2025: Research Engineer, Holland Bloorview Kids Rehabilitation Hospital. Clinical data science for neurodivergent populations; medication recommendation AI; biosignal processing; large-scale POND (Province of Ontario Neurodevelopmental Disorders) network clinical trial data.
2025: Data Scientist, Waypoint Centre for Mental Health Care. Production AI deployment in healthcare; EHR integration via Azure ML; regulatory-compliant ML pipelines.

My current work sits at the intersection of these threads and points toward where I want to go next: robot learning, robot foundation models, and diffusion-based generative models for embodied agents; world model architectures that generalize across tasks and embodiments; and the post-training and inference machinery that makes large models deployable and trustworthy. My interpretability work, building evaluation frameworks for explanation faithfulness under DARPA/AFRL, gives me a concrete foothold in mechanistic interpretability of LLMs: understanding what circuits and features actually encode, not just what outputs look like.

Research interests

Embodied AI / Robotics: World model learning and representation in agents · robot learning and robot foundation models · diffusion models for embodied action generation · model-based RL for generalization under partial observability · inferring latent dynamics from high-dimensional behavioral data
LLMs / MLLMs: Post-training (Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimization (DPO), instruction tuning) · inference efficiency (speculative decoding, vLLM, quantization) · knowledge encoding structured by vector-symbolic architectures (VSA)
Interpretability: Mechanistic interpretability of transformer circuits · feature geometry and superposition in LLMs · evaluation frameworks for explanation faithfulness (background: saliency metrics, semantic overlap, DARPA XAI)
Clinical AI: Biosignal processing (EEG, accelerometry) · medication recommendation in neurodivergent populations · regulatory-compliant ML deployment

Publications

Google Scholar · 66 citations · h-index 4 ↗

Embodied AI · World Models · Physical AI

SUTD · Science of Learning Project · 2026–ongoing

Bokadia H. (Research Associate) · Uchiyama R. et al.

AI-Driven Scaffolding of Embodied Creativity

End-to-end pipeline for real-time estimation of learners' internal world models from motion-capture data. Human movement exploration is modelled as a Markov Decision Process (MDP); Bayesian inverse RL recovers reward structure and state-transition dynamics; diffusion models generate adaptive goal states; UMAP-clustered movement embeddings drive scaffolding decisions. MuJoCo humanoid simulation grounds model learning in physically plausible motor behaviour.

Methods: Bayesian inverse RL · MDP · Diffusion models · MuJoCo · UMAP · wearable physiological sensors
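
As a rough illustration of the "state-transition dynamics" piece only, the sketch below estimates a transition matrix from discretized movement sequences using a Dirichlet-multinomial posterior mean. It is not the project's Bayesian inverse RL pipeline: the discrete state labels (for example, cluster indices of movement embeddings), the function name `posterior_mean_transitions`, and the symmetric prior strength `alpha` are all assumptions made for this example.

```python
def posterior_mean_transitions(trajectories, n_states, alpha=1.0):
    """Posterior mean of P(next state | state) under a symmetric Dirichlet(alpha) prior.

    trajectories: iterable of sequences of integer state labels in [0, n_states).
    Returns an n_states x n_states list of rows, each summing to 1.
    """
    # Count observed transitions s -> s_next across all trajectories.
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for s, s_next in zip(traj, traj[1:]):
            counts[s][s_next] += 1

    # Dirichlet-multinomial conjugacy: add alpha pseudo-counts to every cell.
    probs = []
    for row in counts:
        total = sum(row) + alpha * n_states
        probs.append([(c + alpha) / total for c in row])
    return probs


# Hypothetical state-label sequences from two short sessions (illustrative data).
sessions = [[0, 1, 2, 0, 1], [1, 2, 2, 0]]
for row in posterior_mean_transitions(sessions, n_states=3):
    print([round(p, 3) for p in row])
```

With `alpha=1.0`, the first printed row is `[0.2, 0.6, 0.2]`: state 0 was followed by state 1 twice, and the prior keeps the unobserved transitions from collapsing to zero. States that never appear fall back to a uniform row.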

ACM MOCO 2020 · 5 citations · DOI ↗

Bokadia H. (First Author), Cole J., Torres E.B.

Neural Connectivity Evolution during Adaptive Learning with and without Proprioception

Network-theoretic analysis of EEG-derived neural connectivity as participants learn novel motor tasks with and without proprioceptive feedback. Reveals how sensorimotor deprivation reshapes cortical communication graphs, directly informing world-model design for embodied agents learning from impoverished sensory streams.

Methods: EEG · EEGLAB · Cross-coherence · Graph-theoretic connectivity analysis

ACM DL ↗

XAI · Interpretability · Medical Imaging

[Figure: Saliency map comparison grid for melanoma]

Applied AI Letters · 2022 · 13 citations · DARPA/AFRL funded · DOI ↗

Bokadia H. (First Author), Yang S.C.H., Li Z., Folke T., Shafto P.

Evaluating Perceptual and Semantic Interpretability of Saliency Methods: A Case Study of Melanoma

Designed two novel evaluation metrics for saliency-based XAI: visual incoherence (perceptual coherence of attribution maps) and textbook feature overlap (semantic alignment to dermatologist-defined ABCDE features). Benchmarked six saliency methods on VGG-16 across the ISIC melanoma dataset. Enables adaptive method selection for high-stakes clinical AI, a framework directly transferable to mechanistic interpretability of LLMs and VLMs.

Methods: VGG-16 · GradCAM · SHAP · LIME · RISE · Occlusion · PyTorch · ISIC dataset

Wiley ↗

LLMs · Vector-Symbolic Architectures · Knowledge Representation

ACM KDD Workshop on Knowledge-Infused Learning · 2024 · 1 citation · OpenReview ↗

Hu B.X., Yu T., Tuinstra T., Rezai R., Bokadia H. (Co-Author), DiMaio R., Tripp B.P.

Encoding Medical Ontologies with Holographic Reduced Representations for Transformers

Introduces holographic reduced representations (HRRs), a vector-symbolic architecture, as a structure-preserving method to encode medical ontologies into transformer token spaces. HRRBase embeddings outperform unstructured embeddings on out-of-distribution disease prediction, including patients with entirely unseen ICD codes. Directly relevant to knowledge-infused LLM post-training, structured inference, and multimodal grounding.

Methods: HRR · VSA · SNOMED CT ontology · HRRBERT · MIMIC-IV · ICD coding

OpenReview ↗ Co-Author · Waterloo collaboration
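
For readers unfamiliar with holographic reduced representations, the sketch below shows the generic HRR operations: binding by circular convolution and approximate unbinding through the involution of the role vector. It is a minimal, standard-library illustration and not the paper's encoding or code. The role and filler names (`is_a`, `finding_site`, `skin_disorder`, `skin`) are hypothetical stand-ins for ontology relations and concepts, and the dimension of 512 is an arbitrary choice.

```python
import math
import random


def random_vector(dim, rng):
    # i.i.d. N(0, 1/dim) entries give an expected squared norm of 1.
    sigma = 1.0 / math.sqrt(dim)
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


def bind(a, b):
    # Circular convolution: c[k] = sum_i a[i] * b[(k - i) mod n].
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]


def involution(a):
    # Approximate inverse under binding: inv[i] = a[(-i) mod n].
    n = len(a)
    return [a[(-i) % n] for i in range(n)]


def unbind(trace, role):
    # Binding with the involution of the role recovers a noisy copy of the filler.
    return bind(trace, involution(role))


def add(a, b):
    # Superposition: element-wise sum of two vectors.
    return [x + y for x, y in zip(a, b)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
DIM = 512  # arbitrary; a naive O(n^2) bind is fast enough at this size

# Hypothetical role vectors (relations) and filler vectors (concepts).
is_a = random_vector(DIM, rng)
finding_site = random_vector(DIM, rng)
skin_disorder = random_vector(DIM, rng)
skin = random_vector(DIM, rng)
unrelated = random_vector(DIM, rng)

# One concept vector encodes two role-filler pairs in superposition.
concept = add(bind(is_a, skin_disorder), bind(finding_site, skin))

# Querying with a role returns a vector close to its filler, far from others.
recovered = unbind(concept, is_a)
print("similarity to bound filler:", round(cosine(recovered, skin_disorder), 2))
print("similarity to unrelated:   ", round(cosine(recovered, unrelated), 2))
```

The recovered vector is noisy, so in practice it is compared against a vocabulary of known concept vectors and the nearest one is taken. Production code would normally compute the convolution with an FFT instead of the quadratic loop used here.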

Computational Neuroscience · Biosignals · Digitized ADOS

Journal of Personalized Medicine · 2020 · 20 citations · DOI ↗

Bokadia H. (First Author), Rai R., Torres E.B.

Digitized ADOS: Social Interactions beyond the Limits of the Naked Eye

Applied wearable biosensors and statistical signal processing to the Autism Diagnostic Observation Schedule (ADOS), extracting high-dimensional sub-second kinematic features from socio-motor dyads invisible to human observers. Network connectivity analysis of cross-coherence matrices reveals dynamic social coordination patterns, a biosignals foundation for embodied AI systems that model social and adaptive behaviour.

Methods: Wearable inertial sensors · Cross-coherence · Network connectivity · ADOS-2 protocol

MDPI ↗

Clinical AI · Precision Health · Biosignals

medRxiv · 2025 · 1 citation · Holland Bloorview / McMaster

Vandewouw M.M., Niroomand K., Bokadia H. (Co-Author), Lenz S., et al.

A Precision Health Approach to Medication Management in Neurodivergence

Multi-cohort international study (four datasets) developing and validating ML models for medication management in neurodevelopmental conditions. Combines structured clinical features with probabilistic modeling to support individualized treatment recommendations.

medRxiv ↗

Clinical Child and Family Psychology Review · 2024 · 26 citations

Mahjoob M., Paul T., Carbone J., Bokadia H. (Co-Author), et al.

Predictors of Health-Related Quality of Life in Neurodivergent Children: A Systematic Review

Systematic review synthesizing evidence on quality-of-life predictors across neurodevelopmental conditions. Informed precision health AI pipelines at Holland Bloorview.

Springer ↗

SSRN Preprint · 2025 · Holland Bloorview / University of Toronto

Syed B., Vandewouw M.M., Cardy R.E., Carbone J., Niroomand K., Bokadia H. (Co-Author), Paul T., Monga S., Kushki A. et al.

The Contributions of Autism Traits, Physiological Arousal, and Emotion Dysregulation to Anxiety: A Structural Equation Modeling Study

Structural equation modeling study examining how autism features, physiological arousal, and emotion dysregulation jointly predict anxiety in children. Contributed biosignals analysis using wearable physiological sensor data from the POND Network dataset.

Methods: Structural equation modeling · respiratory sinus arrhythmia (RSA) / physiological arousal · wearable biosensors · POND Network

SSRN ↗

[Figure: Zuhair Qureshi research poster on sociodemographic bias]

Holland Bloorview Kids Rehabilitation Hospital · 2024–2025 · Research Poster

Qureshi Z., Bokadia H. (Mentor), Kushki A.

Characterizing Sociodemographic Biases in Adaptive Functioning Data in Neurodivergent Children (Mentorship)

Mentored intern Zuhair Qureshi (McMaster) on this study using the POND Network dataset (n=1,254). XGBoost with 15-fold cross-validation identified statistically significant disparities in adaptive functioning composite scores across socioeconomic status, sex, and ethnicity subgroups. The findings inform bias-aware precision health tools for pediatric populations.

Methods: XGBoost · Mann-Whitney testing · Feature importance · POND Network dataset

Poster PDF ↗

Projects

To be updated soon.

Blog

To be updated soon.

CV

Career Timeline

2026 – present | Research Associate | SUTD, Singapore | Embodied AI · World Models

2024 – present | MS Artificial Intelligence | University of Texas at Austin | Graduate Studies

May–Sep 2025 | Data Scientist | Waypoint Centre for Mental Health Care, ON | Production AI · Deployment

Jan 2023 – May 2025 | Research Engineer | Holland Bloorview Kids Rehabilitation Hospital, Toronto | Clinical AI · Biosignals · Precision Health

May–Dec 2022 | Research Assistant II | Medical AI Group, University of Waterloo | Vector-Symbolic Architectures · LLMs

Jul–Nov 2021 | Intern, NeuroAI | Mila – Quebec AI Institute, Montreal | Neural ODEs · EEG · Generative Models

Dec 2020 – Jul 2021 | Research Staff (Coadjutant) | Cognitive & Data Science Lab, Rutgers–Newark | DARPA AFRL · XAI · Bayesian Machine Teaching

Sep 2018 – Nov 2020 | Research Staff (Coadjutant) | Rutgers Centre for Cognitive Science, New Brunswick | EEG · BMI · ADOS · Neural Connectivity

2016 – 2018 | MS Industrial & Systems Engineering | Rutgers University | Optimization · Statistics · Deep Learning

2012 – 2014 | Project Engineer | UltraTech, Aditya Birla Group, India | Industry 4.0 · Process Optimization

2008 – 2012 | BTech Mechanical Engineering | Rajasthan Technical University, India | Foundation

Education

MS Artificial Intelligence | University of Texas at Austin · 2024–2027
MS Industrial & Systems Engineering | Rutgers University · 2016–2018 | Thesis: Deep learning based virtual metrology for semiconductor manufacturing processes
BTech Mechanical Engineering | Rajasthan Technical University · 2008–2012

Research Experience

Research Associate | SUTD, Singapore · Mar 2026–present | Embodied AI · World model estimation · Bayesian inverse RL · MuJoCo
Research Assistant II | Medical AI Group, University of Waterloo · May–Dec 2022 | Vector-Symbolic Architectures · language models for EHR · SNOMED-CT · MIMIC-IV · transformer fine-tuning · FHIR
Intern, NeuroAI | Mila – Quebec AI Institute, Montreal · Jul–Nov 2021 (remote) | Neural ODE modeling · EEG analysis (MNE) · neural dynamics · generative modeling
Research Staff (Coadjutant) | Cognitive & Data Science Lab, Rutgers–Newark · Dec 2020–Jul 2021 | DARPA AFRL · Bayesian Machine Teaching · PLDA (Probabilistic Linear Discriminant Analysis) · XAI for medical imaging
Research Staff (Coadjutant) | Rutgers Centre for Cognitive Science, New Brunswick · Sep 2018–Nov 2020 | EEG · wearable biosensors · ADOS digital biomarkers · socio-motor dyads · network connectivity · Brain-Machine Interface

Industry Experience

Data Scientist | Waypoint Centre for Mental Health Care, ON · May–Sep 2025 | Production AI deployment · Azure ML · EHR integration · scalable ML pipelines · clinical decision support
Research Engineer | Holland Bloorview Kids Rehabilitation Hospital, Toronto · Jan 2023–May 2025 | Precision health · medication recommendation AI · biosignal processing · multi-site clinical trials · POND network · REDCap · PostgreSQL
Project Engineer | UltraTech, Aditya Birla Group, India · Sep 2012–Apr 2014 | Industry 4.0 · IoT sensor data analysis · process monitoring · supply chain analytics · Arena simulation

Technical Skills

Languages & Core

Python · MATLAB · SQL (PostgreSQL) · Bash

ML / DL Frameworks

PyTorch · Hugging Face Transformers · Scikit-learn · NumPy · Pandas

World Models & Embodied / Physical AI

World Model Learning · Embodied AI · Diffusion Models · MuJoCo · Bayesian Inverse RL · Model-based RL

LLM Training & Inference

LoRA / QLoRA · Mixed-precision (bf16/fp8) · FlashAttention · ZeRO-stage Distributed Training · Speculative Decoding · PagedAttention / vLLM · RLHF · DPO · Instruction Tuning · Quantization (int4/int8)

Representation & Knowledge

Vector-Symbolic Architectures · Graph Neural Networks · Probabilistic Modeling

Biosignals & Neuroscience

EEG / EEGLAB / MNE · Neural ODEs · Network Connectivity Analysis · Wearable Biosensors

Infrastructure & MLOps

Azure ML · Git / GitHub · REDCap · Linux

CV available on request — harshitbokadia [at] gmail.com
