Data Analytics Analyst

Location:
Seattle, WA
Posted:
September 29, 2025

Resume:

RITU GOTHWAL

***********@*****.*** 646-***-**** https://www.linkedin.com/in/ritugothwal https://github.com/gothwalritu

EXECUTIVE SUMMARY

Experienced researcher and data scientist with over 5 years of leadership in building quantitative solutions and developing predictive models. Proven track record of managing cross-disciplinary teams and applying expertise in data analytics, statistical modeling, and machine learning to drive projects and innovation. Hold a Ph.D. in Environmental Engineering from the Indian Institute of Technology Hyderabad and a Data Analytics and Visualizations Certificate from the University of California, Berkeley.

EDUCATION

Data Analytics and Visualizations Certificate, University of California, Berkeley

Doctoral Research (PhD), Indian Institute of Technology Hyderabad

M.Tech, Indian Institute of Technology Delhi

PROFESSIONAL EXPERIENCE

Stok Carbon & Energy Data Engineer July 2024 – August 2025

Built ML solutions for energy forecasting and spend categorization with GHG emission estimation, applying agentic AI (rule-based, fuzzy matching, NLP embeddings) and advanced time-series models (XGBoost, ARIMA). Reviewed deployment of scalable pipelines with Azure ML and interactive dashboards (Streamlit, Power BI) for business users, achieving high automation and accuracy.

Skills: Python, SQL, Azure ML, XGBoost, ARIMA, NLP, Streamlit, Power BI

Energy Consumption Prediction

• Data Ingestion & Storage: Automated collection of 15-minute interval energy data, building metadata, and weather feeds; performed anomaly detection with Python and stored cleaned datasets in Azure SQL Database.

• Feature Engineering: Queried and aggregated data to create time-series features (rolling averages, lag values, seasonal trends), integrating occupancy and calendar patterns.

• Model Development: Trained predictive models (XGBoost, ARIMA) in Azure ML.

• Deployment & Monitoring: Reviewed the method for deploying the model as a REST API on Azure services.

Smart Spend Categorization & Emissions Estimation (Agentic AI + NLP Project)

• Built agentic AI pipeline (rule-based, fuzzy matching, NLP embeddings) to map spend labels to standardized categories.

• Ingested and cleaned monthly Excel spend data, integrated it with PostgreSQL datasets, and logged anomalies with Python.

• Calculated GHG emissions by linking spend categories with emissions factors, enabling accurate CO₂ and CH₄ reporting.

• Delivered an interactive dashboard (Streamlit) for spend uploads and AI-generated mappings.

• Automated >85% of classification and emissions tasks, cutting manual effort and improving accuracy in enterprise workflows.

Mozilla Firefox Open-Source Software Development Contributor October 2023 – January 2024

Enhanced maintenance effectiveness in Mozilla's software development processes with time series analysis. Streamlined Firefox's testing process using machine learning, leading to enhanced efficiency and performance.

Skills: Python, Git Version Control, Machine Learning, Data Visualization

• Conducted feature ablation study to assess impact of feature removal on software performance, influencing project design.

• Collected and analyzed data from Bugzilla on key metrics such as the maintenance effectiveness index (MEI), burn-down time, and the number of incoming and closed issues.

• Developed a script for calculating yearly MEI values and led the data analysis and visualization efforts to develop time series charts.

International Water Management Institute (IWMI) Consultant Researcher August 2021 – December 2021

Applied data science methods to environmental research by developing a water quality modeling framework to study antimicrobial resistance in water, supporting global decision-making aligned with the Sustainable Development Goals.

Skills: Python, Data Collection & Exploration, Statistical Analysis, Visualization, Reporting, Cross-functional Collaboration

• Developed data-driven water quality modeling framework integrating statistical analysis and predictive modeling.

• Collaborated with cross-functional international teams to design analytics pipelines and deliver high-impact insights.

• Collected, cleaned, and analyzed large environmental datasets; applied exploratory data analysis (EDA) to identify trends and drivers of resistance.

• Built visualizations and metrics for leadership reporting, translating complex scientific results into actionable insights.

Indian Institute of Technology Hyderabad Doctoral Researcher January 2013 – September 2018

Developed a deterministic predictive water quality mathematical model using hydrological and water quality data for rivers and incorporated uncertainties in the model by employing stochasticity in environmental parameters.

Skills: R, MATLAB, GIS, Predictive Modeling, Statistical Analysis, Data Handling, Research Methodology

• Designed and implemented a deterministic predictive model for water quality using environmental data.

• Incorporated stochastic modeling to capture uncertainties in environmental parameters, enhancing model robustness.

• Managed end-to-end research workflow: data collection, statistical analysis, model development, validation, and reporting.

• Published peer-reviewed research and presented at international conferences, demonstrating expertise in communicating technical results.

PROJECTS on GitHub (https://github.com/gothwalritu)

• Emission Factor Unit Conversion Tool (Python, Streamlit): Developed and deployed a Python-based tool on Streamlit to automate emission factor unit conversions, with code hosted and version-controlled on GitHub for reproducibility and accessibility.

• Water Quality Time Series Forecasting Model (Python, ML): Created a time-series forecasting tool using ARIMA for the water quality index of lakes in King County, WA.

• Moja global open-source project (Geospatial Analysis, Python, Research, Documentation): Performed geospatial analysis using Python to monitor greenhouse gas emissions and removals from land use and land-use change; integrated the land-use sector with the Sustainable Development Goals.

• Fraud Detection Using Machine Learning (Imbalanced-learn, Scikit-learn, Random Forest Classifier): Applied supervised machine learning to a credit card dataset to recognize and predict fraud.

• Talent Retaining Analysis (QuickDBD, PostgreSQL, pgAdmin, SQL): Developed an entity-relationship diagram (ERD) and schema; identified retiring employees and those eligible for mentorship.

• Amazon Vine Review Analysis (PySpark, AWS RDS, SQL): Performed ETL and analysis of Amazon Vine video game reviews with PySpark, storing results in AWS RDS. Identified bias, with ~51% of paid Vine reviews rated 5 stars vs. 38% of unpaid reviews.

TECHNICAL SKILLS:

Data Science Tools: Pandas, NumPy, Exploratory Data Analysis, VS Code, RStudio, Azure Cloud Services

Data Visualization: Power BI, Tableau, Matplotlib, Excel, Streamlit

Statistical Methods: Predictive Analysis, Hypothesis Testing, Statistical Inferences, Optimization, Stochastic Modeling

Languages: SQL, Python, MATLAB, R, VBA, PySpark

Building Data Science Models: Machine Learning, Neural Network, Time-Series Forecasting, Data Preprocessing, Linear & Logistic Regression, Decision Trees, Model Performance
