Research Associate · SUTD | Canadian Citizen
I am a Research Associate at the Singapore University of Technology and Design (SUTD). My research builds AI pipelines that estimate learners' internal world models in real time from motion-capture kinematic data. The pipelines model movement exploration as a Markov Decision Process, use Bayesian inverse RL to recover reward structure and state-transition dynamics, and combine diffusion models with the inferred world models to scaffold open-ended embodied creativity. This work connects foundational questions in world-model learning and generative modelling to the physical, embodied setting.
My background spans a deliberate sequence of research and engineering roles, each building toward my current focus:
My current work sits at the intersection of these threads and points toward where I want to go next: robot learning, robot foundation models, and diffusion-based generative models for embodied agents; world-model architectures that generalize across tasks and embodiments; and the post-training and inference machinery that makes large models deployable and trustworthy. My interpretability work, building evaluation frameworks for explanation faithfulness under DARPA/AFRL, gives me a concrete foothold in mechanistic interpretability of LLMs: understanding what circuits and features actually encode, not just what outputs look like.
Research interests
SUTD · Science of Learning Project
AI-Driven Scaffolding of Embodied Creativity
End-to-end pipeline for real-time estimation of learners' internal world models from motion-capture data. Human movement exploration is modelled as a Markov Decision Process (MDP); Bayesian inverse RL recovers reward structure and state-transition dynamics; diffusion models generate adaptive goal states; UMAP-clustered movement embeddings drive scaffolding decisions. MuJoCo humanoid simulation grounds model learning in physically plausible motor behaviour.
Methods: Bayesian inverse RL · MDP · Diffusion models · MuJoCo · UMAP · wearable physiological sensors
ACM MOCO 2020 · DOI ↗
Neural Connectivity Evolution during Adaptive Learning with and without Proprioception
Network-theoretic analysis of EEG-derived neural connectivity as participants learn novel motor tasks with and without proprioceptive feedback. Reveals how sensorimotor deprivation reshapes cortical communication graphs, directly informing world-model design for embodied agents learning from impoverished sensory streams.
Methods: EEG · EEGLAB · Cross-coherence · Graph-theoretic connectivity analysis
Applied AI Letters · DOI ↗
Evaluating Perceptual and Semantic Interpretability of Saliency Methods: A Case Study of Melanoma
Designed two novel evaluation metrics for saliency-based XAI: visual incoherence (perceptual coherence of attribution maps) and textbook feature overlap (semantic alignment to dermatologist-defined ABCDE features). Benchmarked six saliency methods on VGG-16 across the ISIC melanoma dataset. Enables adaptive method selection for high-stakes clinical AI, a framework directly transferable to mechanistic interpretability of LLMs and VLMs.
Methods: VGG-16 · GradCAM · SHAP · LIME · RISE · Occlusion · PyTorch · ISIC dataset
ACM KDD Workshop on Knowledge-Infused Learning · OpenReview ↗
Encoding Medical Ontologies with Holographic Reduced Representations for Transformers
Introduces holographic reduced representations (HRRs), a vector-symbolic architecture, as a structure-preserving method to encode medical ontologies into transformer token spaces. HRRBase embeddings outperform unstructured embeddings on out-of-distribution disease prediction, including patients with entirely unseen ICD codes. Directly relevant to knowledge-infused LLM post-training, structured inference, and multimodal grounding.
Methods: HRR · VSA · SNOMED CT ontology · HRRBERT · MIMIC-IV · ICD coding
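For readers unfamiliar with HRRs, the core binding operation can be sketched in a few lines. This is a minimal illustration of generic HRR binding and unbinding (circular convolution with an approximate inverse), not the paper's implementation. The relation and concept names (`is_a`, `finding_site`, `disease`, `heart`) and the random atom vectors are hypothetical stand-ins for ontology terms.

```python
import numpy as np

def bind(a, b):
    """Bind two vectors with circular convolution, computed via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def approx_inverse(a):
    """HRR involution: index i maps to -i mod n, an approximate inverse under bind."""
    return np.concatenate(([a[0]], a[:0:-1]))

def unbind(c, a):
    """Recover a noisy copy of whatever was bound to `a` inside `c`."""
    return bind(c, approx_inverse(a))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
dim = 1024

# Hypothetical atoms: random vectors with variance 1/dim, the usual HRR choice.
names = ["is_a", "finding_site", "disease", "heart"]
atoms = {name: rng.normal(0.0, 1.0 / np.sqrt(dim), dim) for name in names}

# Encode a toy concept as a sum of (relation bound to filler) pairs.
concept = (bind(atoms["is_a"], atoms["disease"])
           + bind(atoms["finding_site"], atoms["heart"]))

# Query the structure: what fills the finding_site relation?
recovered = unbind(concept, atoms["finding_site"])
best = max(names, key=lambda name: cosine(recovered, atoms[name]))
print(best, round(cosine(recovered, atoms[best]), 2))
```

The recovered vector is only a noisy approximation of the filler, so the last step compares it against the known atoms by cosine similarity. Because the composite vector has the same dimension as its parts, structure of this kind can in principle sit in a fixed-width embedding space.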
Journal of Personalized Medicine · DOI ↗
Digitized ADOS: Social Interactions beyond the Limits of the Naked Eye
Applied wearable biosensors and statistical signal processing to the Autism Diagnostic Observation Schedule (ADOS), extracting high-dimensional sub-second kinematic features from socio-motor dyads invisible to human observers. Network connectivity analysis of cross-coherence matrices reveals dynamic social coordination patterns, a biosignals foundation for embodied AI systems that model social and adaptive behaviour.
Methods: Wearable inertial sensors · Cross-coherence · Network connectivity · ADOS-2 protocol
medRxiv
A Precision Health Approach to Medication Management in Neurodivergence
Multi-cohort international study (four datasets) developing and validating ML models for medication management in neurodevelopmental conditions. Combines structured clinical features with probabilistic modeling to support individualized treatment recommendations.
Clinical Child and Family Psychology Review
Predictors of Health-Related Quality of Life in Neurodivergent Children: A Systematic Review
Systematic review synthesizing evidence on quality-of-life predictors across neurodevelopmental conditions. Informed precision health AI pipelines at Holland Bloorview.
SSRN Preprint
The Contributions of Autism Traits, Physiological Arousal, and Emotion Dysregulation to Anxiety: A Structural Equation Modeling Study
Structural equation modeling study examining how autism features, physiological arousal, and emotion dysregulation jointly predict anxiety in children. Contributed biosignals analysis using wearable physiological sensor data from the POND Network dataset.
Methods: Structural equation modeling · respiratory sinus arrhythmia (RSA) / physiological arousal · wearable biosensors · POND Network
Holland Bloorview Kids Rehabilitation Hospital
Characterizing Sociodemographic Biases in Adaptive Functioning Data in Neurodivergent Children · Mentorship
Mentored intern Zuhair Qureshi (McMaster) on this study using the POND Network dataset (n=1,254). XGBoost models with 15-fold cross-validation, together with Mann-Whitney testing, identified statistically significant disparities in adaptive functioning composite scores across socioeconomic status, sex, and ethnicity subgroups. The findings inform bias-aware precision health tools for pediatric populations.
Methods: XGBoost · Mann-Whitney testing · Feature importance · POND Network dataset
To be updated soon.
CV available on request — harshitbokadia [at] gmail.com
Open to research collaborations, thesis advising discussions, and roles in AI research and engineering across the US, Canada, and Singapore.