Data Scientist

Location:

San Jose, CA

Posted:

November 09, 2023

Resume:

Dharun Suryaa Nagarajan

• San Jose, California • 669-***-**** • ***************@*****.***

• LinkedIn • GitHub

EDUCATION:

Northeastern University, CA, USA

Master of Science, Data Science Sept 2023 – May 2025

Amrita Vishwa Vidyapeetham, Coimbatore, India

Bachelor of Technology, Engineering: 3.6 GPA July 2016 – May 2020

TECHNICAL SKILLS:

Techniques: Predictive Modeling, Hypothesis Testing, Feature Engineering, Regression, Clustering, Exploratory Data Analysis, Time Series Forecasting, Query Optimization, Data Warehousing, Data Visualization, Causal Inference.

Languages and Tools: Python, SQL, JavaScript, PowerBI, Snowflake, Databricks, Google Sheets, Excel, Jupyter Notebook, MS Office.

Python Packages: NumPy, Pandas, scikit-learn, SciPy, matplotlib, seaborn, statsmodels.

WORK EXPERIENCE:

SWIGGY, Bengaluru, India - Food Delivery with 1.5M+ orders/day

Data Scientist: May 2022 - July 2023

• Achieved 40,000 incremental clicks/day and a 16% reduction in uninstalls for the PN recommendation model by improving the model's feature diversity through feature analysis and feature splitting, and by removing poorly performing campaigns identified through performance analysis. Leveraged Databricks, PySpark, AWS S3 buckets, Airflow, and streamlined SQL jobs.

• Slashed discount cost by 15% by devising a rule-based statistical model at 82% efficiency. Filtered out four parameters from a large (200M+) dataset with nine features for the rule-based model, using Linear Regression and MSE to find the best fit. Leveraged Snowflake for EDA, and Python's scikit-learn and PySpark for modeling.

• Boosted NU Activation/day by 5%. Analyzed pain points in the app flow using hypothesis testing based on the t-distribution. Conducted A/B testing on 250k+ users to validate the results statistically. Utilized the statsmodels and NumPy packages of Python.

• Curtailed manual work of SHs from 4 days to 6 hours by automating the referral funnel in real time with incoming traffic of 2.4M/day. Used PowerBI to monitor drops at each stage of the funnel across different demographic/KPI clusters.

Google – TCS Contract, Bengaluru, India

Data Scientist: Oct 2020 – Apr 2022

• Improved recurring revenue prediction accuracy from 65% to 78% using ARIMA forecasting on the proprietary Trix clusters, with SQL and Google Sheets connectors.

• Decreased MRR RCA average duration from 1 day to 4 hrs. Broke the RCAs down into four metrics contributing to MRR change using descriptive analysis in SQL. Automated updates on changes in these metrics using Dremel and Google Connectors.

• Shrank pipeline execution time by 18 mins for the Partner and Sales team scripts migrated from Dremel to Flex Warehouse (Google proprietary tools) using SQL optimization techniques such as indexing and memory-optimized tables.

ACADEMIC PROJECTS:

• Google Extension Email Classification: Used Gaussian Naive Bayes to classify spam emails and ad emails at 91% accuracy for personal Gmail accounts, with an option to unsubscribe. Used JavaScript for the Google extension design and JSON objects for Python integration.

• Career Fair Website: Helps high school students connect with professionals in their career of choice. Used a Hidden Markov Model via the hmmlearn library in Python. Developed a complete end-to-end project using Laravel and PHP.

ACHIEVEMENTS AND LEADERSHIP:

• Awarded the Most Valuable Player for the months of Jan and Feb 2023 by Swiggy.

• Received the Digital Cadre recognition as one of the top performers.

• Managed a 13-member cross-functional (Finance, Shelter, Resource) team and raised 75,000 INR for the shelter children of Coimbatore.

• Lead the Data Science Club of Northeastern University; conducted various webinars on ML in the real world.
