Ashutosh Tiwari Masters in Computational Data Science student at Indiana University, Bloomington

I am a Machine Learning Engineer with a Master's degree in Computational Data Science from Indiana University, Bloomington, and over 7 years of experience in designing, building, and deploying scalable machine learning systems. My expertise spans ML Data Platforms, Feature Stores, Generative AI, Natural Language Processing, Time Series Forecasting, and Search Relevance, honed at industry leaders like Adobe, EvolutionIQ, Swiggy, and Flipkart. I am passionate about developing robust, efficient, and fair AI solutions, with a research focus on Fairness Aware Graph Recommendation Systems. I enjoy tackling complex real-world problems and contributing to the ML community through research and competitive data science.


Let's Connect!

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.

  • Aug 2021 - May 2023
    Indiana University, Bloomington (Graduating: 1st Week May 2023)

    MS (Computational Data Science)

    GPA - 3.86/4.0

  • Jul 2011 - Jun 2015
    National Institute of Technology, Patna

    B.Tech (Computer Science & Engineering)

    CGPA - 8.32/10.0

Education
  • May 2023 - Present
    Indiana University Network Science Institute

    Working on novel model training methods to produce "Fairness Aware Graph Recommendation" models

  • Aug 2023 - Present
    Kelley School of Business

    Working as a paid research assistant on "User Intent as a Network". The project is a collaboration with Luddy and is funded by the Kelley School of Business.

  • Aug 2021 - May 2023
    NLP LAB @ IUB (Fall 2021)

    Contributed extensively to the design of TieML and Events' Timeline modelling

Research Experience
  • Jul 2024 – Present
    Adobe

    Machine Learning Engineer (ML Data Platform)

    • Currently part of the Machine Learning Data Platform team, focused on offline inference for asset enrichment and the feature store.
    • Part of a small team that wrote the first version of our feature store.
    • Our feature store supports training large machine learning models on billions of assets.
    • Wrote the first data quality framework used for asset enrichment. It is used across enrichment pipelines to ensure the correctness of output.
    • Working on offering the feature store as an on-demand service, as we move towards bring-your-own-model feature generation.
  • Sep 2023 – Jul 2024
    EvolutionIQ

    Senior Software Engineer (Data Platform)

    • Was part of the Scoring and Quality sub-team, working at the intersection of generative AI, fintech, and the health sector.
    • Was responsible for writing data pipelines that ingest and process data used to train our machine learning models.
    • Wrote pipelines that train our models to predict a claimant's expected time to return to work, ICD extractions from diagnosis notes, alternate VOC recommendations, etc.
    • Worked on a framework to evaluate whether our models are unbiased and fair to different demographic groups.
  • Jan 2019 – Jul 2021
    Swiggy

    Software Dev Engineer II (ML Platform)

    Bengaluru, India

    • Was part of the team that worked on the Feature Store and the pipeline that feeds on-demand features to deployed ML models at production scale (4Bn rows, 10K QPS). The pipeline supported multichannel ingestion, e.g. Spark, Flink, and user files.
    • Founding member of the Forecasting and Correlation Platform, which many teams adopted to forecast their time series. These forecasts power critical scaling decisions across organizations in real time.
    • Led DAQ, a tool for scraping APIs at scale, used to collect data for analysis and model training at a rate of 15M rows daily.
  • Sep 2017 – Jan 2019
    Flipkart

    Software Development Engineer (Search Relevance)

    Bengaluru, India

    • Was responsible for building and improving search intent models (CRF and neural network based): identifying error classes, coming up with solutions, and fixing them. These models power user search and discovery for millions every day.
    • Implemented a FastText-based query store classifier, which predicts the category of a tail query.
    • Implemented the first workflow to automate training and deployment of various search models at Flipkart. It was first written using Luigi and later migrated to Airflow.
    • Wrote a generic framework using Airflow that creates DAGs at runtime for different ML models and orchestrates their training-to-deployment flow, including data and model validations.
    • Implemented large-scale (4Bn+ data points) pipelines using Cascading/HDFS to extract data from user events and transform it for training these models.
  • Sep 2016 – Sep 2017
    Groupon

    Software Development Engineer

    Bengaluru, India

    • Worked on a component called Cyclops, an interface between customer representatives and internal services.
    • This service is live in all countries in which Groupon operates.
  • Sep 2015 – Aug 2016
    Netspeed Systems

    Software Engineer

    Bengaluru, India

    • Led engineering efforts on modules like Polarity based Arbitration, Multi-Cast Filtering, Structural Latency Breakdown, etc.
Work Experience

Stocks Prediction (16th Rank AnalyticsVidhya)

A project based on a competition held by Analytics Vidhya.


Topic modelling (19th Rank AnalyticsVidhya)

In this contest, contestants had to come up with a solution to a multiclass text classification problem.


Hospitalization Period Prediction (115th Rank)

This was another Analytics Vidhya contest, in which contestants had to predict how long a patient would be hospitalized.


Workation Price Prediction Challenge

MachineHack Workation Price Prediction Challenge.


JOB-A-THON (186 / 2362, top ~8%)

Analytics Vidhya JOB-A-THON.


Humana-Mays Healthcare Analytics Case Competition (11th on leaderboard)




Competitive D.S. Experience

BiasNet: Learning to fight in StreetFighter II with induced Relational Bias from Differential Scenes

A novel deep on-policy, model-free, actor-critic reinforcement learning approach to acting in a large action space using only the difference between scenes.


Bias Manifolds: Investigating Structure of Bias Manifolds and Bias Evolution

A survey of different approaches to studying the structure of bias manifolds in different datasets, along with a novel approach to studying how bias in a dataset evolves over time.


BlindNet: Distilling world knowledge in Neural Networks

A survey of possible neural network architectures for learning to understand the world, using the MS COCO dataset.


DeepFoodie: Clustering Food Items using Ingredient Embeddings

A novel approach to clustering food items using deep self-supervised learning over ingredient embeddings.


Flipkart Hackday 9 Ekart Winner Hack 2018

Built a multiclass model (inspired by Inception v3) to annotate images of lifestyle products. We also used the Google OCR API to extract selected text from the tag. The end goal was to find the top candidate FSNs. Text from the tag was used primarily for features such as price and brand; other, more important features came from the annotations (color, type, cloth type, etc.). To search for the product, we formed a query from this information and ran predictions with a CRF model trained on Flipkart's clickstream data, using features generated from Flipkart's catalog. The hack was appreciated enough that it is in the process of going to production, which is why no code pointer is provided here. We used differential learning rates in the last stages of training to reach 99.6% validation accuracy.


Groupon Geekon 2017

Deal recommendations for a user based on NSVD, used as an unsupervised collaborative filtering algorithm. Language: Python. Packages: TensorFlow, py2neo, pandas, and NumPy. Database: Neo4j. Dataset: MovieLens.


Continuous Dominant Set in a Graph

Selection and simulation of a continuous dominant set (CDS) in a distributed sensor network, and recovery from the failure of one or multiple dominant nodes.


Snake and Ladders

An Android game (a variant of Snake and Ladders). Language: Java (Android framework).



Projects
  • Jan 2019
    Udacity Advanced Machine Learning Engineer Nanodegree

    Udacity

  • Jan 2018 - May 2018
    P.G. Diploma (Deep Learning)

    Indian Institute of Science, Bangalore

  • Mar 2019 - Sep 2019
    External Internship

    School of AI

  • Sep 2020
    Natural Language Processing with Deep Learning in Python

    Udemy

Certifications