ad467s@r.postjobfree.com
Haddonfield, NJ, 08033
https://www.linkedin.com/in/sunillaudari/
https://github.com/sunil7634
EDUCATION
University of Alabama in Huntsville Aug 2017 – May 2022
Ph.D. in Physics (Astrophysics) Huntsville, AL
G.P.A. 4.0/4.0
Relevant coursework: Data Analysis Math I & II
SKILLS
Programming: Python, R, SQL, Bash
Optimization: Gurobi, Pyomo
Machine Learning: Cross Validation, PCA, Logistic Regression, KNN, Random Forest, Gradient Boosting, SVM,
K-means Clustering
Data Science Tools: Pandas, NumPy, Scikit-learn, Keras, TensorFlow, PyTorch
Visualization: Matplotlib, Seaborn, Plotly, Tableau
DevOps Tools: Jira, Confluence, Jenkins, ELK, Git, GitHub, SourceTree, Postman, PyCharm, Databricks
EXPERIENCE
Client: Comcast Corporation Sep 2022 – Mar 2024
Data Scientist/Python Developer Philadelphia, PA
Customized channel distribution with SQL, increasing solver speed threefold and reducing compute costs
for the Wi-Fi mesh network.
Spearheaded the design, development, and deployment of ML solutions to optimize business decisions, saving
$1M in workforce expenses by accurately forecasting radio channels.
Architected and implemented an ensemble model integrating Scikit-learn random forest and XGBoost
algorithms, achieving 97% accuracy in predicting pipe seam type.
Achieved 95% information retention while reducing data dimensionality from 27 to 15 features for 30k data points with PCA.
Leveraged operational data sources and optimization techniques to create tools for developing scenarios with
cost and enrollment optimization, delivering related real-world data insights.
Delivered actionable insights to senior management through compelling data visualization and comparative
analysis of 1M+ observations, facilitating data-driven decision-making using Matplotlib, Plotly, and Tableau.
Collaborated with cross-functional teams to understand business requirements and translate them into
practical ML solutions.
University of Alabama in Huntsville (UAH) Aug 2020 - May 2022
Research Aide/Research Assistant Huntsville, AL
Featured on the cover page of Nature Astronomy, showcasing a significant galaxy mosaic crafted with Python
libraries including Pandas, Seaborn, and Plotly, leveraging 100 GB of Hubble Space Telescope data.
Built a machine learning model (Laplacian Edge Detection Algorithm) to remove cosmic rays and artifacts from
Hubble Space Telescope data, reducing computation time by 10 min per filter (image).
Developed a tracking system for nearby galaxies with Astroquery (like SQL) to improve catalogue accuracy,
reducing error by 20%.
Enhanced data quality by 17% through cleaning and preprocessing using a comprehensive suite of Python
libraries, including NumPy, Pandas, Scikit-learn, and additional tools.
Client: BBVA Bank Feb 2017 - May 2019
Junior Data Scientist Birmingham, AL
Automated classifier models such as Random Forest and SVM for specific segments of a customer base, saving 22
hours of labor per month.
Constructed operational reporting and data visualization tools, reducing contractor scheduling costs by 10% in
the annual budget.
Deployed Auto-Sklearn to automate machine learning model selection, reducing modeling time by 2 hours per
session.
Devised scalable solutions for Amazon EC2-based cloud environments, boosting storage efficiency by 20% and
accelerating data analysis tools’ processing speed by 10% within AWS infrastructure.
Adapted configurations to align with client requirements, improving system functionality and raising
overall performance by 7%.
TRAINING
Pragmatic Institute May 2022 - July 2022
Data Science Fellow Remote
Employed NLTK on thousands of scraped Reddit posts to train classification models, reaching 92% accuracy
with the top-performing model (Naïve Bayes with CountVectorizer).
Forecasted the success of bank marketing campaigns using various machine learning techniques. The best
model (Logistic Regression) achieved 92% accuracy, 93% precision, and 97% recall.
Developed multiple ML models for predicting customer churn in the European banking industry, with the
Random Forest model demonstrating the best performance (F1=87%, recall=83%, precision=91%).
Achieved an average classification accuracy of 90% in genre categorization using Natural Language Processing (NLP)
techniques, including Count Vectorizer/Hash Vectorizer, Term Frequency-Inverse Document Frequency (TF-IDF),
tokenizing/stemming, and Multinomial Naïve Bayes.