SHARFUDDIN ALAM
Charlotte, NC — 929-***-**** — **********.********@*****.***
GitHub: github.com/sharfuddinalam — LinkedIn: linkedin.com/in/ri88
Summary
Experienced Data Science professional who interprets and extracts intelligence from data and solves complex business problems with Data Visualization, Data Modeling, Statistical and Machine Learning. Proficient in furnishing executive leadership teams with insights, analytics, reports, and recommendations enabling effective strategic planning across all business units, distribution channels, and product lines. Published a technical blog regarding Anomaly Detection on LinkedIn.
Technical Skills
Analytics Programs: Python (NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, Keras, TensorFlow), SQL (MySQL, PostgreSQL, Teradata), Git, GitHub, Bitbucket, Tableau, PowerBI, Apache Spark.
Real-world experience with: Regression, Classification, Anomaly Detection, Time-series Analysis, Linear/Logistic Regression, Decision Trees, Ensemble Learning (AdaBoost, Random Forest, XGBoost, Blending, Stacking), Clustering Algorithms (K-means, K-NN, K-prototype, Agglomerative, DBSCAN), Dimensionality Reduction (PCA, t-SNE), Statistical Analysis, Regularization, Hyperparameter Tuning, Deep Learning (Multilayer NNs, CNNs, RNNs), Explainable AI (LIME, SHAP).
Data Science Skills: Data Wrangling/Collection, Hypothesis Testing, KPI Design, Exploratory Data Analysis, Data Visualization, Data Storytelling, A/B Testing.
Professional Experience
Sunnova Energy; Sr. Data Scientist 2024 - Present
– Developed time-series forecasting models to predict energy demand across 90,000+ systems, covering 11 time zones, contributing to securing $3M in government funding.
– Engineered predictive models to forecast solar production using data from solar, grid, and battery import/export systems.
– Conducted exploratory data analysis (EDA) to identify missing values, detect causal patterns, and design SQL-based stratified sampling for training datasets.
– Modified and optimized the Sentient Energy Optimization Model, enhancing battery, solar, and grid interactions to reduce energy costs and CO2 emissions.
– Built a scalable Digital Twin framework to simulate residential solar energy consumption, generation, and storage across various battery vendors.
– Enhanced energy efficiency with predictive battery management systems and advanced forecasting models, improving system reliability and reducing costs.
Tabner Inc. (Charter Communications); Sr. Data Scientist 2022 - 2023
– Developed advanced algorithms for anomaly detection and predictive modeling across business verticals using statistical modeling and machine learning.
– Implemented Random Forest and XGBoost models, optimizing operational costs and outlier detection.
– Aggregated data from diverse sources, defined KPIs, and used advanced analytics for reporting and decision-making.
– Deployed machine learning models using the DITTO Framework and Teradata SQL, communicating insights to stakeholders via SHAP scores and Tableau.
B.C. Forward (John Deere); Data Scientist 2022
– Implemented expert alerts that trigger across thousands of machines and provide analysis to dealers and customers.
– Developed a Spark Streaming pipeline for real-time reporting of machine issues within seconds of data being read from the warehouse, reducing response time from hours to seconds.
Collabra (United HealthCare; Optum); Data Scientist 2021
– Alleviated end-user pain points across the enterprise through innovative solutions.
– Utilized Neural Networks and Machine Learning to drive the operational data science team’s objective.
– Supported EEPS organization’s initiatives for high application availability via data-driven automation.
– Developed machine learning solutions specifically tailored for anomaly detection in internal end-user applications.
Daybreak IT Solutions (U.S. Department of Veterans Affairs); Data Scientist 2021
– Collaborated with clinical researchers to analyze the statistical significance of health improvement protocols for Veterans, focusing on conditions like Hypertension, Blood Pressure, Overweight, and Covid-19 effects.
– Conducted nonparametric Wilcoxon Signed Rank Test and derived key statistics (W-value, p-value) for granular-level comparisons among Veterans.
– Implemented Spark RDD framework and optimized queries for scalable data processing, while developing custom UDFs for statistical calculations and visualizing significance with color-coded columns.
– Anonymized Veterans’ data using Python Faker library for confidentiality, and collaborated with Data Engineers to build data integration and pre-processing pipelines for final dashboard creation.
Strategic Staffing Solutions (Duke Energy); Data Scientist 2019 - 2021
– Created a Work Order Prioritization classifier using word embeddings from descriptions and feature engineering to identify and prioritize high-priority tasks.
– Utilized NLP techniques to compute sentence similarity scores between long and short descriptions of work orders as part of feature engineering.
– Implemented Object Detection Models (RetinaNet and YOLOv5) leveraging Deep Neural Networks to detect faults in Solar Panel Images and for Total Plant Inspections.
– Developed a document summarizer and classifier using NLP techniques and Topic analysis to streamline information processing.
– Trained Time Series models, including LSTM and RNN as well as traditional algorithms such as ARIMA, for forecasting purposes.
– Utilized the Amazon Lex service to build a conversational chatbot that assists customers with billing, usage, and outage events.
Mentorship
Great Learning, Post Graduate Program in AI and DS; Mentor 2021 - Present
– Assess learners’ comprehension levels and tailor teaching strategies accordingly.
– Encourage active participation and critical thinking among learners.
Education
Data Science Career Track Certification 2018
Masters in Computer Science, Lamar University; Texas, USA 2017
Bachelors of Electrical Engineering, Carleton University; ON, Canada 2014