Experience

  1. Senior AI Software Solutions Engineer

    Intel Corporation
    • LLM Enablement & Distributed Training: Enabled and scaled LLM training on Intel Gaudi 2 & 3 accelerators using DeepSpeed and Megatron-LM with various distributed parallelism strategies; productionized distributed checkpoint conversion to Hugging Face-compatible formats for external customers.
    • Optimized Edge AI: Designed and deployed Intel Automated Vision Checkout at retail sites in India, achieving low-latency (~70 ms), scalable deep learning inference on Intel’s integrated GPU and supporting 100+ daily transactions at the storefront.
    • Federated AI: Contributed to Linux Foundation’s securefederatedai/openfl by introducing JAX/FLAX support, a timeout feature for long- and short-lived federation components, and interactive examples for secure, efficient federated learning on private medical records.
    • Data Pipelines: Built an event-driven, cost-optimized AWS data pipeline, reducing costs by 8x and publishing product data in 5 minutes (down from 6 hours), supporting ~40M API hits monthly.
    • LLM Fine-Tuning as a Service: Designed a ZenML/cnvrg.io-based fine-tuning framework for internal use as a premium offering.
    • Performance Profiling: Experienced in accelerator compute/memory profiling and device & host trace analysis to optimize distributed training workloads, and in application-level optimization for edge inference.
    • Public Speaking: Presented demos and delivered technical talks on cloud cost optimization and product innovations at Intel India Innovation day, Intel ConnectiON and Intel India tech talk series (Cloud Community of Practice) events.
  2. Software Engineer

    Western Digital (SanDisk)
    • As part of the Competitive Analysis team, developed Performance Studio, a distributed client-server application, from concept to delivery for internal teams to benchmark, test, validate & qualify uSD cards, SD cards & enterprise SSDs.
    • Configured and chained 48- and 24-port Brocade network switches and tuned port power classes to support 80+ PoE cameras for automated uSD/SD card testing.
  3. Software Engineer Trainee

    sketchmyroom (Rhythm of space)
    • Full-stack web application development in the interior and architecture design domain. Primarily responsible for developing a backend application that exposes in-house portfolios to clients via APIs, giving them access to a variety of architecture design offerings and an online customization option.
  4. Engineering Trainee/Intern

    Bharat Sanchar Nigam Limited (BSNL)
    • Acquired practical experience in operational testing of wireless equipment and in configuring routers, modems and main distribution frames. Gained hands-on exposure to optical fiber fusion splicing and to monitoring signal parameters at various stages using OTDR devices.
    • Foundational training in Digital Switching & Transmission Systems, Telecom support infrastructure, Optical Fiber technology, Networking & mobile communications.

Education

  1. MSc. Artificial Intelligence

    Liverpool John Moores University
    Thesis on “Effectiveness of Knowledge Distillation on various neural network architectures”. Supervised by Bharath K. Bolla.
  2. 7th Summer School, Machine learning

    IIIT Hyderabad
  3. PGD in Machine Learning & AI

    IIIT Bangalore

    Coursework
    Probability & Statistics
    Exploratory Data Analysis
    Supervised & Unsupervised Machine Learning
    Natural Language Processing
    Deep Learning for Computer Vision
    Reinforcement Learning
    MLOps

    Capstone: CycleGAN - Style Transfer using Generative Adversarial Network (GAN). Built a generative adversarial model (modified U-Net architecture) that generates artificial MRI images of a different contrast level, T1 to T2 and vice versa, from existing MRI scans.

  4. B.E. in Electronics & Communication

    Visvesvaraya Technological University
Awards/Certs
Certification on Accelerators for Deep Learning
IIT Roorkee ∙ November 2024
Executive certificate on Accelerators for Deep Learning covering deep learning algorithms and computer architecture with an emphasis on AI acceleration on various computing systems, such as FPGAs, mobile/desktop GPUs, smartphones, ASICs, DSPs and CPUs.
Rockstar of the Year
Intel Corporation ∙ December 2023
Recognized as ‘Rockstar of the Year’ for consistently delivering on commitments and exceeding performance goals within a short span.
Projects
Custom & efficient CNN architectures from scratch
  • Designed and implemented parameter-efficient custom CNN architectures with 25k, 143k, 340k, 600k & 1M parameters.
  • After knowledge distillation, these custom CNNs surpass the accuracy of ResNet-18/34/152 baselines on FMNIST & CIFAR-10 with 10-20x fewer model parameters.
Intel AI Everywhere Conference
  • Performed feature engineering using VIF, RFE, and PCA. Built hyperparameter-tuned logistic regression, random forest, and XGBoost models to predict the outcome (pass or fail) of new turn-ins from historical turn-in records.
  • Selectively filtered turn-ins based on their predicted outcome to save execution time in the DevOps pipeline for HW design validation.
Grad-CAM Visualization of CIFAR-10 dataset with Albumentations
  • Built and trained a ResNet-18 model on the CIFAR-10 dataset.
  • Implemented data augmentation using the Albumentations library, a custom dataset loader, train and test loss curve plots, Grad-CAM visualization of randomly sampled misclassified images, and visualization of misclassified images with labels and appropriate legends.
Style Transfer using Generative Adversarial Network (GAN)
  • Built a generative adversarial model (modified U-Net architecture) that generates artificial MRI images of a different contrast level, T1 to T2 and vice versa, from existing MRI scans.