Hi, I'm Dhrubo
Building intelligent systems with LLMs, NLP, and production ML pipelines. Currently pursuing M.Sc. Data Science at Deakin University with a focus on LLM operations and reliable AI systems.
About Me
I'm a Data Scientist and ML Engineer with hands-on experience deploying LLMs in production at enterprise scale — including document classification systems serving DHL. My work spans the full ML lifecycle: from dataset curation and model fine-tuning to building FastAPI backends with Elasticsearch analytics.
I'm currently pursuing my Master's at Deakin University, with research interests in Active Inference, multi-agent LLM systems, and reliable LLM operations. My work on Bangla sign language recognition was published in Nature Scientific Reports.
Experience
Data Scientist
Sep 2024 – Jul 2025
AIDocbuilder INC / Inteliweave Ltd. · Toronto, Canada (Remote)
Designed a document classification pipeline using Llama 3.2 3B with vLLM, reaching 95% training accuracy and 90% evaluation accuracy, and reported macro-F1 alongside accuracy for a more robust evaluation.
Maintained a production NLP codebase used by DHL for document ingestion and classification — diagnosed misclassifications, OCR errors, and pipeline regressions, then implemented targeted fixes across preprocessing, rule-based cues, and spaCy components.
Built an Elasticsearch-based missing-key detection platform with a FastAPI backend supporting batch analysis and JSON reports, plus Kibana dashboards visualizing field-wise gaps and red-alert files.
Refactored legacy classification scripts into a modular architecture with type hints, unit/integration tests, structured logging, and centralized config — reducing production issues and improving developer velocity.
Managed the document classification department — maintaining the master taxonomy, key dictionaries, and spaCy patterns, and triaging production tickets to keep classification accurate and stable.
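The classification work above reports macro-F1 alongside accuracy. As a minimal sketch of why the two are reported together, the snippet below computes both in plain Python. The function name and the document labels ("invoice", "waybill") are hypothetical and chosen only for illustration; they are not taken from the production pipeline.

```python
from collections import Counter


def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-F1) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    # Macro-F1 averages per-class F1 with equal weight per class,
    # so a rare class that is always missed pulls the score down.
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical labels: a model that always predicts the majority class.
y_true = ["invoice"] * 8 + ["waybill"] * 2
y_pred = ["invoice"] * 10
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
```

In this toy case accuracy looks healthy while macro-F1 is much lower, because the minority class is never predicted. That gap is the reason to track both metrics on imbalanced document types.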
Projects
Full-stack LLM operations platform with four production tools (Profiler, Evaluator, Drift Monitor, Cost Analyzer) powered by a custom fine-tuned Llama 3.1 8B — no hosted APIs. Achieved 96/100 LLM-as-judge quality after LoRA fine-tuning on 10K synthetic profiles across 14 domains. FastAPI + LangChain backend with semantic drift detection and hallucination flagging; Next.js 15 dashboard with multi-temperature comparison and cross-provider cost projections across 11 models.
Fine-tuned Llama 3.1 8B with Unsloth + LoRA on a 10,547-example emotion-tagged TOON dataset (9 emotion classes, final loss 0.024). Hybrid voice/text emotion pipeline fuses emotion2vec+ with DistilRoBERTa via reliability-weighted late fusion with emotional momentum. XTTS v2 fine-tuned on 16 minutes of curated samples for character-specific speech preserving emotional prosody. ChromaDB + Whisper enable long-term semantic memory and real-time multimodal interaction.
Converts YouTube lectures into structured timestamped notes with key concepts, definitions, and formulas. Recommends credible external resources aligned to each extracted concept. Enables transcript-grounded chat and adaptive quizzes with explanations to assess understanding and guide revision.
Open-source Socratic dialogue web app with FastAPI backend and Next.js frontend, integrating Gemini for conversational reasoning. Implemented NLP preprocessing (tokenization, lemmatization) and a scikit-learn decision tree to categorize user inputs. Designed for swappable LLM providers and editable training data; deployed on Vercel.
TensorFlow/Keras Transformer for Bangla sign language recognition using MediaPipe Holistic landmarks as inputs, processing video clips into framewise pose and hand features with positional embeddings and encoder layers. Reproducible Jupyter pipeline with training, evaluation, and per-class precision–recall outputs. Published in Nature Scientific Reports.
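The voice/text emotion project above mentions reliability-weighted late fusion with emotional momentum. The source does not give the formulas, so the following is only one plausible reading: each modality produces a probability distribution over emotions, the two are blended in proportion to a per-modality reliability score, and the result is smoothed against the previous turn with an exponential moving average. All function names, parameters, and numbers here are assumptions made for illustration.

```python
def late_fuse(voice_probs, text_probs, voice_reliability, text_reliability):
    """Blend two emotion distributions, weighted by modality reliability."""
    total = voice_reliability + text_reliability
    if total <= 0:
        raise ValueError("at least one reliability score must be positive")
    voice_weight = voice_reliability / total
    labels = set(voice_probs) | set(text_probs)
    return {
        label: voice_weight * voice_probs.get(label, 0.0)
        + (1.0 - voice_weight) * text_probs.get(label, 0.0)
        for label in labels
    }


def apply_momentum(previous, current, momentum=0.5):
    """Smooth the current distribution toward the previous one (assumed EMA)."""
    if previous is None:
        return dict(current)
    labels = set(previous) | set(current)
    return {
        label: momentum * previous.get(label, 0.0)
        + (1.0 - momentum) * current.get(label, 0.0)
        for label in labels
    }


# Hypothetical single turn: the voice model is trusted three times as much as text.
fused = late_fuse(
    voice_probs={"joy": 0.8, "sadness": 0.2},
    text_probs={"joy": 0.4, "sadness": 0.6},
    voice_reliability=3.0,
    text_reliability=1.0,
)
# Hypothetical previous state carried over from the last turn.
smoothed = apply_momentum({"joy": 0.2, "sadness": 0.8}, fused, momentum=0.5)
```

Fusing at the output level keeps the two models independent, so either can be retrained or swapped without touching the other. The momentum term trades responsiveness for stability between turns.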
Technical Skills
Languages
Frameworks
ML & AI
Data & DevOps
Education & Achievements
Master of Data Science
Deakin University
Melbourne, Australia · Jul 2025 – Jul 2027
B.Sc. in Computer Science
BRAC University
Dhaka, Bangladesh · Jan 2020 – Jan 2024
Get in Touch
Have an opportunity, collaboration idea, or just want to say hello? Drop me a message or reach me at tahsinul.haque.dhrubo@gmail.com.