Harshit Bokadia

Research Associate, SUTD (Singapore University of Technology and Design)

Research Associate · SUTD  |  Canadian Citizen

I am a Research Associate at SUTD (Singapore University of Technology and Design). My research builds AI pipelines that estimate learners' internal world models in real time from motion-capture kinematic data. The pipelines model movement exploration as a Markov Decision Process, run Bayesian inverse RL to recover reward structure and state-transition dynamics, and use diffusion models together with the inferred world models to scaffold open-ended embodied creativity in real time. This work connects foundational questions in world model learning and generative modelling to the physical, embodied setting.
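As a rough illustration of the Bayesian inverse RL step, the toy sketch below infers which state an agent is seeking from a few observed state-action pairs. It is not the pipeline described above: the five-state chain, the known deterministic dynamics, the one-hot reward hypotheses, the uniform prior, and the names `GAMMA`, `BETA`, and `posterior_over_goals` are all assumptions made for the example, and only the reward (not the transition dynamics) is inferred.

```python
import math

N_STATES = 5          # toy chain of states 0..4 (assumed for illustration)
ACTIONS = (-1, +1)    # move left, move right
GAMMA = 0.9           # discount factor (assumed)
BETA = 2.0            # Boltzmann rationality of the demonstrator (assumed)


def step(state, action):
    """Deterministic, known dynamics: move along the chain, clipped at the ends."""
    return min(max(state + action, 0), N_STATES - 1)


def q_values(reward, n_iters=200):
    """Value iteration; the reward is received on entering the next state."""
    v = [0.0] * N_STATES
    q = []
    for _ in range(n_iters):
        q = [[reward[step(s, a)] + GAMMA * v[step(s, a)] for a in ACTIONS]
             for s in range(N_STATES)]
        v = [max(row) for row in q]
    return q


def log_likelihood(demos, reward):
    """Log-probability of the demonstrations under a Boltzmann-rational policy."""
    q = q_values(reward)
    total = 0.0
    for s, a in demos:
        logits = [BETA * x for x in q[s]]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += logits[ACTIONS.index(a)] - log_z
    return total


def posterior_over_goals(demos):
    """Posterior over one-hot reward hypotheses under a uniform prior."""
    log_post = []
    for goal in range(N_STATES):
        reward = [1.0 if s == goal else 0.0 for s in range(N_STATES)]
        log_post.append(log_likelihood(demos, reward))
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return [w / z for w in weights]


# Synthetic demonstration: the agent walks right from state 0.
demos = [(0, +1), (1, +1), (2, +1), (3, +1)]
posterior = posterior_over_goals(demos)
map_goal = max(range(N_STATES), key=lambda g: posterior[g])
print("posterior over goal states:", [round(p, 3) for p in posterior])
print("most probable goal state:", map_goal)
```

In a realistic setting the hypothesis space is continuous and the posterior is usually approximated by sampling; the enumeration here only keeps the example short.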

My background spans a deliberate sequence of research and engineering roles, each building toward my current focus:

  • 2008–2014 · BTech, Mechanical Engineering, followed by nearly two years as a Project Engineer in Industry 4.0 manufacturing in India.
  • 2016–2018 · MS, Industrial and Systems Engineering, Rutgers: depth in optimization, statistics, and systems thinking; thesis on deep learning for semiconductor process control.
  • 2018–2020 · Computational neuroscience and cognitive science, Rutgers: electroencephalography (EEG), motor control and sensorimotor integration, Brain-Machine Interfaces (BMI), neural connectivity analysis, and digital biomarkers for autism including the Autism Diagnostic Observation Schedule (ADOS).
  • 2020–2021 · Explainable AI (XAI) research under DARPA (Defense Advanced Research Projects Agency) / AFRL (Air Force Research Laboratory), Rutgers-Newark: novel perceptual and semantic interpretability metrics for medical imaging.
  • 2021 · NeuroAI Intern, Mila: neural ODE modeling, EEG analysis, and generative modeling of neural dynamics.
  • 2022 · Medical AI group, University of Waterloo: vector-symbolic architectures for structured knowledge encoding in clinical NLP.
  • 2023–2025 · Research Engineer, Holland Bloorview Kids Rehabilitation Hospital: clinical data science for neurodivergent populations; medication recommendation AI; biosignal processing; large-scale POND (Province of Ontario Neurodevelopmental Disorders) network clinical trial data.
  • 2025 · Data Scientist, Waypoint Centre for Mental Health Care: production AI deployment in healthcare; EHR integration via Azure ML; regulatory-compliant ML pipelines.

My current work sits at the intersection of these threads and points toward where I want to go next: robot learning and robot foundation models; diffusion-based generative models for embodied agents; world model architectures that generalize across tasks and embodiments; and the post-training and inference machinery that makes large models deployable and trustworthy. My interpretability work, building evaluation frameworks for explanation faithfulness under DARPA/AFRL, gives me a concrete foothold in mechanistic interpretability of LLMs: understanding what circuits and features actually encode, not just what outputs look like.

Research interests

  • Embodied AI / Robotics: World model learning and representation in agents · robot learning and robot foundation models · diffusion models for embodied action generation · model-based RL for generalization under partial observability · inferring latent dynamics from high-dimensional behavioral data
  • LLMs / MLLMs: Post-training (RLHF, Reinforcement Learning from Human Feedback; DPO, Direct Preference Optimization; instruction tuning) · inference efficiency (speculative decoding, vLLM, quantization) · knowledge encoding structured with vector-symbolic architectures (VSA)
  • Interpretability: Mechanistic interpretability of transformer circuits · feature geometry and superposition in LLMs · evaluation frameworks for explanation faithfulness (background: saliency metrics, semantic overlap, DARPA XAI)
  • Clinical AI: Biosignal processing (EEG, accelerometry) · medication recommendation in neurodivergent populations · regulatory-compliant ML deployment

Embodied AI · World Models · Physical AI

XAI · Interpretability · Medical Imaging

LLMs · Vector-Symbolic Architectures · Knowledge Representation

Figure: HRR t-SNE embedding visualization

ACM KDD Workshop on Knowledge-Infused Learning · 2024 · 1 citation · OpenReview

Hu B.X., Yu T., Tuinstra T., Rezai R., Bokadia H. (co-author), DiMaio R., Tripp B.P.

Encoding Medical Ontologies with Holographic Reduced Representations for Transformers

Introduces holographic reduced representations (HRRs), a vector-symbolic architecture, as a structure-preserving method to encode medical ontologies into transformer token spaces. HRRBase embeddings outperform unstructured embeddings on out-of-distribution disease prediction, including patients with entirely unseen ICD codes. Directly relevant to knowledge-infused LLM post-training, structured inference, and multimodal grounding.

Methods: HRR · VSA · SNOMED CT ontology · HRRBERT · MIMIC-IV · ICD coding
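For readers unfamiliar with HRRs, the sketch below shows the basic mechanism in isolation: role and filler vectors are bound by circular convolution, several bindings are superposed into one concept vector, and a filler is recovered approximately by circular correlation. It is a generic illustration, not the paper's encoding. The dimension, the random seed, the toy role and filler names, and the helper names `bind`, `unbind`, and `decode` are assumptions made for the example.

```python
import numpy as np


def random_vector(dim, rng):
    """Random HRR vector with i.i.d. N(0, 1/dim) entries (expected unit norm)."""
    return rng.normal(0.0, 1.0 / np.sqrt(dim), size=dim)


def bind(a, b):
    """Bind two vectors by circular convolution, computed with the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.shape[0])


def unbind(bound, a):
    """Approximately invert binding with `a` by circular correlation."""
    return np.fft.irfft(np.fft.rfft(bound) * np.conj(np.fft.rfft(a)), n=a.shape[0])


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


rng = np.random.default_rng(0)
dim = 1024  # assumed; larger dimensions give cleaner retrieval

# Hypothetical toy vocabulary: two relation (role) vectors and three fillers.
roles = ["is_a", "finding_site"]
fillers = ["lung_disease", "lung", "heart"]
vec = {name: random_vector(dim, rng) for name in roles + fillers}

# A concept is the superposition of its role-filler bindings.
concept = (bind(vec["is_a"], vec["lung_disease"])
           + bind(vec["finding_site"], vec["lung"]))


def decode(concept_vec, role):
    """Return the filler most similar to the noisy result of unbinding `role`."""
    noisy = unbind(concept_vec, vec[role])
    return max(fillers, key=lambda name: cosine(noisy, vec[name]))


print("is_a ->", decode(concept, "is_a"))
print("finding_site ->", decode(concept, "finding_site"))
```

Because unbinding returns a noisy vector, retrieval relies on comparing it against a known vocabulary; the structure of the concept survives in a single fixed-width vector, which is what makes it usable as a token embedding.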

Computational Neuroscience · Biosignals · Digitized ADOS

Clinical AI · Precision Health · Biosignals

medRxiv · 2025 · 1 citation · Holland Bloorview / McMaster

Vandewouw M.M., Niroomand K., Bokadia H. (co-author), Lenz S., et al.

A Precision Health Approach to Medication Management in Neurodivergence

Multi-cohort international study (four datasets) developing and validating ML models for medication management in neurodevelopmental conditions. Combines structured clinical features with probabilistic modeling to support individualized treatment recommendations.

Clinical Child and Family Psychology Review · 2024 · 26 citations

Mahjoob M., Paul T., Carbone J., Bokadia H. (co-author), et al.

Predictors of Health-Related Quality of Life in Neurodivergent Children: A Systematic Review

Systematic review synthesizing evidence on quality-of-life predictors across neurodevelopmental conditions. Informed precision health AI pipelines at Holland Bloorview.

SSRN Preprint · 2025 · Holland Bloorview / University of Toronto

Syed B., Vandewouw M.M., Cardy R.E., Carbone J., Niroomand K., Bokadia H. (co-author), Paul T., Monga S., Kushki A., et al.

The Contributions of Autism Traits, Physiological Arousal, and Emotion Dysregulation to Anxiety: A Structural Equation Modeling Study

Structural equation modeling study examining how autism features, physiological arousal, and emotion dysregulation jointly predict anxiety in children. Contributed biosignals analysis using wearable physiological sensor data from the POND Network dataset.

Methods: Structural equation modeling · respiratory sinus arrhythmia (RSA) / physiological arousal · wearable biosensors · POND Network

Figure: Zuhair Qureshi research poster on sociodemographic bias

Holland Bloorview Kids Rehabilitation Hospital · 2024–2025 · Research Poster

Qureshi Z., Bokadia H. (mentor), Kushki A.

Characterizing Sociodemographic Biases in Adaptive Functioning Data in Neurodivergent Children (mentorship)

Mentored intern Zuhair Qureshi (McMaster) on this study using the POND Network dataset (n=1,254). XGBoost with 15-fold cross-validation identified statistically significant disparities in adaptive functioning composite scores across socioeconomic status, sex, and ethnicity subgroups. The findings inform bias-aware precision health tools for pediatric populations.

Methods: XGBoost · Mann-Whitney testing · Feature importance · POND Network dataset

Projects

To be updated soon.

Blog

To be updated soon.

CV

Career Timeline

2026 – present | Research Associate | SUTD, Singapore | Embodied AI · World Models
2024 – present | MS Artificial Intelligence | University of Texas at Austin | Graduate Studies
May–Sep 2025 | Data Scientist | Waypoint Centre for Mental Health Care, ON | Production AI · Deployment
Jan 2023 – May 2025 | Research Engineer | Holland Bloorview Kids Rehabilitation Hospital, Toronto | Clinical AI · Biosignals · Precision Health
May–Dec 2022 | Research Assistant II | Medical AI Group, University of Waterloo | Vector-Symbolic Architectures · LLMs
Jul–Nov 2021 | Intern, NeuroAI | Mila – Quebec AI Institute, Montreal | Neural ODEs · EEG · Generative Models
Dec 2020 – Jul 2021 | Research Staff (Coadjutant) | Cognitive & Data Science Lab, Rutgers–Newark | DARPA AFRL · XAI · Bayesian Machine Teaching
Sep 2018 – Nov 2020 | Research Staff (Coadjutant) | Rutgers Centre for Cognitive Science, New Brunswick | EEG · BMI · ADOS · Neural Connectivity
2016 – 2018 | MS Industrial & Systems Engineering | Rutgers University | Optimization · Statistics · Deep Learning
2012 – 2014 | Project Engineer | UltraTech, Aditya Birla Group, India | Industry 4.0 · Process Optimization
2008 – 2012 | BTech Mechanical Engineering | Rajasthan Technical University, India | Foundation

Education

  • MS Artificial Intelligence, University of Texas at Austin  ·  2024–2027
  • MS Industrial & Systems Engineering, Rutgers University  ·  2016–2018  ·  Thesis: Deep learning based virtual metrology for semiconductor manufacturing processes
  • BTech Mechanical Engineering, Rajasthan Technical University  ·  2008–2012

Research Experience

  • Research Associate, SUTD, Singapore  ·  Mar 2026–present: Embodied AI · World model estimation · Bayesian inverse RL · MuJoCo
  • Research Assistant II, Medical AI Group, University of Waterloo  ·  May–Dec 2022: Vector-Symbolic Architectures · language models for EHR · SNOMED-CT · MIMIC-IV · transformer fine-tuning · FHIR
  • Intern, NeuroAI, Mila – Quebec AI Institute, Montreal  ·  Jul–Nov 2021 (remote): Neural ODE modeling · EEG analysis (MNE) · neural dynamics · generative modeling
  • Research Staff (Coadjutant), Cognitive & Data Science Lab, Rutgers–Newark  ·  Dec 2020–Jul 2021: DARPA AFRL · Bayesian Machine Teaching · PLDA (Probabilistic Linear Discriminant Analysis) · XAI for medical imaging
  • Research Staff (Coadjutant), Rutgers Centre for Cognitive Science, New Brunswick  ·  Sep 2018–Nov 2020: EEG · wearable biosensors · ADOS digital biomarkers · socio-motor dyads · network connectivity · Brain-Machine Interface

Industry Experience

  • Data Scientist, Waypoint Centre for Mental Health Care, ON  ·  May–Sep 2025: Production AI deployment · Azure ML · EHR integration · scalable ML pipelines · clinical decision support
  • Research Engineer, Holland Bloorview Kids Rehabilitation Hospital, Toronto  ·  Jan 2023–May 2025: Precision health · medication recommendation AI · biosignal processing · multi-site clinical trials · POND network · REDCap · PostgreSQL
  • Project Engineer, UltraTech, Aditya Birla Group, India  ·  Sep 2012–Apr 2014: Industry 4.0 · IoT sensor data analysis · process monitoring · supply chain analytics · Arena simulation

Technical Skills

Languages & Core
Python · MATLAB · SQL (PostgreSQL) · Bash
ML / DL Frameworks
PyTorch · Hugging Face Transformers · Scikit-learn · NumPy · Pandas
World Models & Embodied / Physical AI
World Model Learning · Embodied AI · Diffusion Models · MuJoCo · Bayesian Inverse RL · Model-based RL
LLM Training & Inference
LoRA / QLoRA · Mixed-precision (bf16/fp8) · FlashAttention · ZeRO-stage Distributed Training · Speculative Decoding · PagedAttention / vLLM · RLHF · DPO · Instruction Tuning · Quantization (int4/int8)
Representation & Knowledge
Vector-Symbolic Architectures · Graph Neural Networks · Probabilistic Modeling
Biosignals & Neuroscience
EEG / EEGLAB / MNE · Neural ODEs · Network Connectivity Analysis · Wearable Biosensors
Infrastructure & MLOps
Azure ML · Git / GitHub · REDCap · Linux

CV available on request  —  harshitbokadia [at] gmail.com

Contact

Open to research collaborations, thesis advising discussions, and roles in AI research and engineering across the US, Canada, and Singapore.