Projects
- Mixture of Experts Small Language Model
From-scratch implementation of the Mixture of Experts architecture for efficient small language models. Implements sparse expert routing, gating networks, and conditional computation to increase model capacity without a proportional increase in compute cost. Explores scaling laws and expert specialization patterns.
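The core idea — a gating network picks the top-k experts per token, and only those experts run — can be sketched in a few lines of NumPy. This is an illustrative toy under assumed names (`top_k_gating`, `moe_forward`, and the expert callables are hypothetical), not the project's PyTorch implementation:

```python
import numpy as np

def top_k_gating(x, w_gate, k=2):
    """Route each token to its top-k experts via a softmax gate.

    x: (n_tokens, d_model) token representations
    w_gate: (d_model, n_experts) gating weights
    Returns (indices, weights): top-k expert ids and renormalized gate scores.
    """
    logits = x @ w_gate                                   # (n_tokens, n_experts)
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = z / z.sum(axis=-1, keepdims=True)             # softmax over experts
    idx = np.argsort(probs, axis=-1)[:, -k:]              # ids of the k largest gates
    top = np.take_along_axis(probs, idx, axis=-1)
    top = top / top.sum(axis=-1, keepdims=True)           # renormalize to sum to 1
    return idx, top

def moe_forward(x, w_gate, experts, k=2):
    """Conditional computation: each token is processed only by its top-k experts."""
    idx, gates = top_k_gating(x, w_gate, k)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(k):
            out[t] += gates[t, j] * experts[idx[t, j]](x[t])
    return out
```

The compute saving comes from the inner loop: each token touches k experts rather than all of them, so capacity (number of experts) and per-token FLOPs are decoupled.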
- nanoGPT - Transformer Implementation
Clean, production-quality GPT architecture implemented from first principles in PyTorch. Explores autoregressive language modeling, multi-head self-attention, positional encodings, and layer normalization. Educational deep-dive into transformer mechanics and training dynamics.
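Of the pieces listed, positional encoding is the easiest to show compactly. Below is the classic sinusoidal scheme from "Attention Is All You Need" as a NumPy sketch — note this is one option among several, and GPT-style models (nanoGPT included) typically use learned position embeddings instead:

```python
import numpy as np

def sinusoidal_positions(n_pos, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(same angle)."""
    pos = np.arange(n_pos)[:, None]                 # (n_pos, 1)
    i = np.arange(0, d_model, 2)[None, :]           # even feature indices
    angles = pos / (10000 ** (i / d_model))         # (n_pos, d_model // 2)
    pe = np.zeros((n_pos, d_model))
    pe[:, 0::2] = np.sin(angles)                    # even dims: sine
    pe[:, 1::2] = np.cos(angles)                    # odd dims: cosine
    return pe
```

Each position gets a unique, bounded vector, and relative offsets correspond to fixed linear transforms of these vectors — the property that lets attention reason about order.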
- Model-Agnostic Meta-Learning (MAML)
Research implementation of the MAML algorithm for few-shot learning, based on Finn et al. (2017). Enables rapid adaptation to new tasks with minimal examples through second-order gradient-based meta-learning. Explores inner/outer loop optimization and task distribution strategies.
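The inner/outer loop structure can be sketched on a toy linear-regression task family. For brevity this sketch uses the first-order approximation (FOMAML) rather than the full second-order update the project implements, and `fomaml_step` / the support-query task format are assumed names:

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of mean-squared error for a linear model y_hat = X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def fomaml_step(w, tasks, inner_lr=0.01, outer_lr=0.001, inner_steps=1):
    """One meta-update. tasks: list of ((Xs, ys), (Xq, yq)) support/query splits."""
    meta_grad = np.zeros_like(w)
    for (Xs, ys), (Xq, yq) in tasks:
        w_task = w.copy()
        for _ in range(inner_steps):                  # inner loop: adapt to this task
            w_task -= inner_lr * mse_grad(w_task, Xs, ys)
        meta_grad += mse_grad(w_task, Xq, yq)         # outer grad, evaluated post-adaptation
    return w - outer_lr * meta_grad / len(tasks)      # outer loop: improve the initialization
```

The point of the outer step is that it optimizes the initialization for *post-adaptation* query loss, so a few inner gradient steps from the meta-learned weights already fit a new task.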
- PrivAI Cloud - Privacy-Preserving ML
Production framework for privacy-preserving machine learning in cloud environments. Implements federated learning (client-server architecture), differential privacy (ε-δ guarantees), and secure aggregation for distributed model training without centralizing sensitive data.
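The interaction between clipping, averaging, and calibrated noise (DP-FedAvg style) can be illustrated in a few lines. This is a simplified sketch with assumed names and parameters (`dp_federated_average`, `clip_norm`, `noise_mult`) — the actual ε-δ accounting and secure aggregation in the framework are much more involved:

```python
import numpy as np

def dp_federated_average(client_updates, clip_norm=1.0, noise_mult=1.1, rng=None):
    """Average clipped client updates with Gaussian noise (DP-FedAvg style sketch).

    Each update is clipped to L2 norm <= clip_norm (bounding any one client's
    influence), averaged, and Gaussian noise scaled to that sensitivity is added.
    """
    if rng is None:
        rng = np.random.default_rng()
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    sigma = noise_mult * clip_norm / len(client_updates)   # noise std per coordinate
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

Clipping is what makes the noise meaningful: it caps the sensitivity of the average to any single client, which is the quantity the Gaussian mechanism's guarantee is stated against.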
- Attention Mechanisms from Scratch
Comprehensive implementation of attention mechanisms from first principles: scaled dot-product attention, multi-head attention, cross-attention, and self-attention. Includes visualizations of attention weights and exploration of key/query/value transformations—foundational building blocks for understanding transformer architectures.
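The first of those building blocks, scaled dot-product attention, fits in a dozen lines of NumPy (a minimal sketch of the standard formula, not the repo's code):

```python
import numpy as np

def softmax(x, axis=-1):
    z = np.exp(x - x.max(axis=axis, keepdims=True))   # shift for numerical stability
    return z / z.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)    # (..., n_q, n_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)         # large negative => ~zero weight
    weights = softmax(scores, axis=-1)                # each query's weights sum to 1
    return weights @ V, weights
```

Self-attention is the case Q = K = V (after per-role linear projections); cross-attention uses queries from one sequence against keys/values from another; a causal (lower-triangular) mask gives the autoregressive variant.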
- Variational Autoencoder (VAE)
From-scratch implementation of Variational Autoencoders for generative modeling. Explores latent space representation learning, the reparameterization trick, KL-divergence regularization, and probabilistic generation. Includes latent space interpolation and disentanglement experiments.
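The two VAE-specific ingredients — the reparameterization trick and the closed-form KL term against a standard normal prior — reduce to a few lines (a NumPy sketch of the standard formulas, with assumed function names):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """z = mu + sigma * eps: moves the randomness into eps so that z stays
    differentiable with respect to the encoder outputs mu and log_var."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over batch."""
    return np.mean(-0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=-1))
```

The KL term is zero exactly when the posterior matches the prior (mu = 0, log_var = 0) and grows as the encoder drifts away — the regularizer that keeps the latent space usable for sampling and interpolation.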
- TextFusion - Advanced Text Generation
Text generation and fusion system that combines multiple NLP techniques to produce coherent long-form content.
- ML Optimization Algorithms
Collection of optimization algorithms implemented from scratch, including Adam, SGD with momentum, RMSprop, and advanced techniques for neural network training.
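As an example of the from-scratch style, here is a single Adam update in NumPy (the textbook Kingma & Ba formulation; `adam_step` and the `state` dict are assumed names, not the collection's API):

```python
import numpy as np

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; state holds moment estimates m, v and step count t."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad       # first moment (momentum)
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad**2    # second moment (RMS scaling)
    m_hat = state["m"] / (1 - beta1 ** state["t"])             # bias correction: early
    v_hat = state["v"] / (1 - beta2 ** state["t"])             # estimates start at zero
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)
```

Adam is essentially SGD-with-momentum (the m term) combined with RMSprop's per-parameter scaling (the v term), plus bias correction — which is why those three appear together in the collection.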
- Langevin Dynamics Sampling
Implementation of Langevin dynamics for sampling from complex distributions. Applications in Bayesian inference and generative modeling.
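The unadjusted Langevin update — gradient ascent on log-density plus injected Gaussian noise — is compact enough to show whole (a minimal sketch; `langevin_sample` is an assumed name, and a real sampler would add Metropolis correction or step-size schedules):

```python
import numpy as np

def langevin_sample(grad_log_p, x0, step=0.01, n_steps=1000, rng=None):
    """Unadjusted Langevin: x_{t+1} = x_t + step * grad_log_p(x_t) + sqrt(2*step) * noise.

    The drift pulls samples toward high-density regions; the noise keeps the
    chain exploring, so its stationary distribution approximates p."""
    if rng is None:
        rng = np.random.default_rng()
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        x = x + step * grad_log_p(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
        samples.append(x.copy())
    return np.array(samples)
```

Only the score (gradient of log-density) is needed, never the normalizing constant — which is what makes the method usable for Bayesian posteriors and score-based generative models.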
- FlashFeat - Fast Feature Extraction
High-performance feature extraction pipeline for ML preprocessing. Optimized for speed and scalability in production environments.
- AI-Driven Course Generation Platform
🏆 Hackathon Winner: Auto-generates comprehensive training courses for enterprises within seconds using fine-tuned LLM system. Reduced content creation time by 90% (hours → minutes). Architected serverless backend using AWS Lambda, S3, and API Gateway handling real-time course assembly and delivery at scale.
- Automated Multi-Cloud Deployment
🏆 Hackathon Winner: Intelligent multi-tier, multi-cloud deployment automation using Terraform and AWS Lambda. Achieved 30% faster release cycles through infrastructure-as-code templating and automated cost analysis that informs spending decisions across AWS/GCP/Azure environments.
- Multi-Cloud Observability Tool
Production-grade serverless monitoring solution for multi-cloud infrastructure. Achieved 65% reduction in operational costs and 30% increase in system availability through event-driven serverless architecture. Aggregates metrics, logs, and traces across AWS/GCP environments in real-time.
- nanoChat - Lightweight LLM Chat
Minimal chat interface powered by small language models. Focus on efficiency and low-latency inference for production deployment.
- LSTM Networks from Scratch
Educational implementation of Long Short-Term Memory networks. Explores sequential modeling and gradient flow in recurrent architectures.
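A single LSTM step makes the gradient-flow point concrete: the cell state is updated additively through the forget/input gates rather than repeatedly squashed. A NumPy sketch (assumed names and a stacked-gate weight layout, not the project's exact code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, b):
    """One LSTM step. W: (d_in + d_hidden, 4 * d_hidden), gates stacked [i, f, o, g]."""
    d = h_prev.shape[-1]
    z = np.concatenate([x, h_prev], axis=-1) @ W + b
    i = sigmoid(z[..., :d])          # input gate: how much new content to write
    f = sigmoid(z[..., d:2*d])       # forget gate: how much old state to keep
    o = sigmoid(z[..., 2*d:3*d])     # output gate: how much state to expose
    g = np.tanh(z[..., 3*d:])        # candidate cell content
    c = f * c_prev + i * g           # additive update => gradients flow along c
    h = o * np.tanh(c)
    return h, c
```

The `c = f * c_prev + i * g` line is the whole story for vanishing gradients: when f is near 1, gradients pass through the cell state largely unattenuated across time steps.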
- makemore - Character-Level LM
Character-level language model for generating text. Builds understanding of autoregressive modeling and neural language generation from first principles.
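The simplest autoregressive character model is a bigram table: count character transitions, normalize into conditional probabilities, and sample one character at a time. A NumPy sketch in that spirit (counts-based, with assumed names — makemore itself progresses from this to neural variants):

```python
import numpy as np

def fit_bigram(text):
    """Character-level bigram model: P(next | current) from transition counts."""
    chars = sorted(set(text))
    stoi = {c: i for i, c in enumerate(chars)}
    counts = np.ones((len(chars), len(chars)))         # add-one smoothing
    for a, b in zip(text, text[1:]):
        counts[stoi[a], stoi[b]] += 1
    probs = counts / counts.sum(axis=1, keepdims=True) # each row is a distribution
    return probs, stoi, chars

def sample(probs, stoi, chars, start, n, rng):
    """Autoregressive generation: each character conditions only on the previous one."""
    out = [start]
    for _ in range(n):
        out.append(rng.choice(chars, p=probs[stoi[out[-1]]]))
    return "".join(out)
```

Every neural language model in this list is a refinement of the same loop — model P(next token | context), then sample from it repeatedly.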
- Grawl - Data Collection Framework
Web crawling and data extraction framework for ML training data collection. Scalable architecture for gathering and processing large-scale datasets.
- KAUTILYA - ML Strategy Framework
Strategic framework for ML system design and deployment. Named after the ancient Indian strategist, focuses on principled approaches to AI engineering.