Swathi Suddala
Data Scientist
ResearchGate | Google Scholar
Wheeling, IL, USA 331-***-**** *******************@*****.***
Professional Summary
Highly analytical and process-oriented Data Scientist with 8+ years of IT experience specializing in technology and innovation across Data Science, Data Analysis, and Data Engineering, providing data-driven, action-oriented solutions to business problems.
Proficient in the entire Data Science life cycle. Worked on Data Analysis, Data Preparation, Data Manipulation, and Data Exploration & Visualization using Pandas, NumPy, Matplotlib, and Seaborn.
Hands-on experience in Machine Learning (ML): model inspection, model analysis, model training, model evaluation, and improving model performance using scikit-learn, PyTorch, and TensorFlow.
Hands-on experience building analyses and visualizations in R, Python, and Tableau.
Experience using multiple IDEs such as PyStudio, PyCharm, and Jupyter Notebook for Python development.
Worked on developing models based on Machine Learning techniques and algorithms such as k-NN, Naïve Bayes, Decision Trees, Random Forests, Linear and Logistic Regression, PCA, SVM, Clustering, and Dimensionality Reduction.
Hands-on experience in working with large sets of complex datasets, converting unstructured data into structured and workable datasets.
Proficient in creating interactive data visualizations, dashboards, and reports using Tableau, Power BI Desktop, and Power BI Service.
Experience in predictive modeling techniques such as Linear Regression, Multiple Linear Regression, and Logistic Regression analysis.
Experience in R, SQL, Tableau, Python, and MS Excel (VLOOKUP, Pivot Charts, Macros) for statistical analysis of data.
Education
Master of Science in Information Technology Management University of Wisconsin, Milwaukee, USA Dec 2023
Master of Technology in Environmental Geomatics JNTU University, Hyderabad, India Dec 2014
Skills
Programming Languages: Python, R, SQL
IDE: PyCharm, PyStudio, Jupyter Notebook, R Studio, Anaconda
Python Libraries: Pandas, NumPy, SciPy, Scikit-Learn, Matplotlib, Seaborn, TensorFlow
Data Modeling: Entity Relationship Diagrams (ERD), Snowflake schema, Star schema
Data Visualization Tools: Tableau, Power BI, Microsoft Excel, Visio
Data Management: Data Cleaning, Data Manipulation, Statistical Modeling, Data Privacy and Security Compliance
ML Engineering: CI/CD pipelines, MLOps
Generative AI: RAG Architecture, LLM evaluation, Semantic search
Databases: Microsoft SQL Server, Spark SQL
Version Control: Git, GitHub
Others: Microsoft Office Suite, PowerPoint
Certifications
Salesforce Certified Data Cloud Consultant.
Work Experience
Charles Schwab September 2023 – Present
Role: Data Scientist
Performed in-depth data analysis on large-scale financial datasets to uncover inefficiencies and performance gaps, driving a 20% improvement in business throughput through strategic recommendations.
Designed and automated KPI dashboards and performance reports in Power BI to monitor business health and identify trends, enabling real-time decision-making across key departments.
Designed and implemented RAG pipelines using LLMs to enhance document summarization and answering systems across financial datasets, significantly improving response relevance and accuracy.
Built and deployed machine learning models (e.g., Linear Regression, Multivariate Regression, Naive Bayes) using Python (Pandas, NumPy, Scikit-learn) to support predictive analytics and risk assessment in the finance domain.
Integrated semantic search capabilities using vector embeddings and FAISS to enable context-aware document retrieval, improving knowledge base precision and search experience.
Conducted advanced statistical analyses using Python, R, and SQL in cloud-based environments, providing actionable insights that informed business and marketing strategies.
Collaborated cross-functionally with business leaders, analysts, and engineering teams to understand data requirements, performance levers, and operational goals, ensuring alignment of analytics solutions with business objectives.
Developed and fine-tuned Large Language Models using TensorFlow and PyTorch, applying few-shot learning and prompt engineering techniques to reduce training data dependency while maintaining high accuracy.
Conducted LLM evaluation and hallucination analysis, implementing confidence scoring and reranking strategies to mitigate misleading outputs in generative responses.
Integrated CI/CD pipelines for ML model monitoring and maintenance, ensuring scalable and compliant deployment of data science solutions across production environments.
Presented data-driven insights and strategic recommendations to senior stakeholders, contributing to decision-making across financial planning, credit analysis, and digital marketing initiatives.
Client: IT Intellect Micro Solutions Pvt. Ltd. Mar 2020 – July 2022
Role: Data Scientist
Responsibilities:
Analyzed large datasets related to supply chain operations, including inventory levels, demand forecasts, and transportation costs, to identify inefficiencies and optimization opportunities.
Participated in all phases of Data Mining, including data collection, cleaning, validation, model development, visualization, and gap analysis, to uncover actionable insights and support strategic decision-making in supply chain management.
Built predictive models for the segmentation of customers and personalized marketing using Python libraries such as Pandas, NumPy, Scikit-learn, and TensorFlow.
Designed and deployed predictive models for sales forecasting and inventory management across multiple locations, achieving significant accuracy improvements in the supply chain cycle.
Created complex SQL queries and stored procedures for data extraction and transformation, enhancing accessibility for analytical reporting and business intelligence applications.
Developed statistical models and machine-learning algorithms to analyze customer behavior and refine menu offerings for improved customer satisfaction and retention.
Designed and maintained interactive dashboards using Power BI and Tableau, enabling visualization of key performance indicators to improve data-driven decision-making.
Worked closely with data engineering teams to optimize data pipelines using big data technologies such as Spark and Hadoop, improving data processing speed and reliability.
Worked closely with suppliers and logistics teams to ensure data accuracy, streamline operations, and improve overall supply chain reliability.
Deployed machine learning models in production environments using cloud platforms like AWS SageMaker and Azure Machine Learning, ensuring scalability and performance.
Client: Cred Avenue, India July 2014 – Sep 2017
Role: Data Scientist
Responsibilities:
Utilized Python collections for efficient manipulation and iteration of user-defined objects, streamlining data processing tasks.
Developed business logic using Python to implement planning and tracking functions, enabling better project and resource management.
Conducted Gap Analysis to identify discrepancies between current data sets and desired outcomes, addressing critical areas for improvement in data processes and analytics frameworks.
Collaborated with stakeholders to define project requirements, performed impact assessments, and aligned data solutions with business needs, enhancing data accuracy and usability.
Designed and maintained relational databases (SQL Server, Oracle) by ensuring data normalization, indexing, and performance tuning for optimized data retrieval.
Partnered with the QA team to build and populate databases, ensuring adherence to data standards and quality benchmarks.
Used Python’s Unit Test library to test programs and validate code functionality, ensuring the robustness and reliability of developed solutions.
Client: Hoch Technologies, India Aug 2012- June 2014
Role: Data Analyst
Responsibilities:
Accountable for improving data quality and for designing and presenting conclusions drawn from data analysis, using Microsoft Excel as a statistical tool.
Worked on data analysis to evaluate the data quality and resolve data-related issues.
Participated in identifying reporting needs and helping the business in creating specifications and worked closely with the report development team to achieve the reporting goal.
Used various professional statistical techniques to analyze and interpret data from customers and partners.
Used SQL and Excel to ensure data quality and integrity. Identified and eliminated duplicates and inaccurate data records.
Created SQL queries to validate ETL transmissions and validate functional requirements.
Identified and reported any data issues, conducted detailed weekly reports, and proactively participated in team meetings.
Worked on the development and implementation of predictive models to stabilize the business and maximize efficiency.
Developed SQL scripts on relational and non-relational databases, and worked on query optimization and data modeling.
Created technical documentation and presented information to senior management.
Academic Projects
Appointment scheduling for vaccines:
Developed a service-oriented analysis and service-oriented design for the project. This comprehensive project entailed creating use cases, object models, interface designs, and database designs.
The primary objective was to implement a robust system that optimizes vaccine appointment scheduling through service-oriented architecture principles. This approach ensured scalability and flexibility in managing and booking appointments efficiently.
Forest fire prediction:
Applied data science techniques, including regression analysis, to identify key factors influencing the spread of forest fires using U.S. wildfire occurrence data.
Conducted the analysis in R Studio, Python, and Excel, achieving a prediction accuracy of 80% in identifying the variables most responsible for fire spread.
Social Network Analysis using Gephi
Utilized Gephi to design, develop, and analyze complex social network structures. Performed in-depth network analysis to uncover connectivity patterns, node centrality, and relationships between entities.
Visualized large-scale networks to identify key influencers and community clusters, enhancing insights into overall network behavior.
Gained hands-on experience in graph theory concepts and network metrics, such as degree distribution, modularity, and betweenness centrality.
Development of web-based solutions:
Developed multiple web-based solutions by creating dynamic and interactive user interfaces using JavaScript.
Implemented required functionalities to improve responsiveness, usability, and overall user experience across various applications.
Gained hands-on experience with JavaScript frameworks and libraries such as React and jQuery, resulting in scalable and maintainable web solutions.
SAP Data Modeling
Applied Dimensional Modeling and data acquisition techniques using Eclipse and SAP modeling tools to design optimized data structures for analytical reporting.
Developed data models tailored for efficient querying and analysis, ensuring high performance and reliability in business intelligence environments.
Focused on enhancing data accessibility and streamlining reporting processes by creating scalable and well-structured data models aligned with organizational requirements.
Web scraping:
Leveraged Python scripting for efficient web scraping to systematically extract and structure data from different websites.
Developed regression models to analyze data trends and uncover relationships between variables for deeper analytical insight.
Utilized libraries such as Pandas, NumPy, and Matplotlib for data preprocessing, statistical analysis, and result visualization.
Box Office Analysis:
Led a team of four in a data analysis project focused on Box Office performance, guiding the end-to-end analytical process.
Conducted correlation analysis, examined multicollinearity, and explored relationships between dependent and independent variables using Microsoft Excel.
Developed a predictive data model that estimated movie performance with 70% accuracy by analyzing key influencing variables and patterns within the dataset.
Geographic Data Manipulation Using Python:
Utilized specialized Python libraries such as GeoPandas, Matplotlib, and Scikit-learn to handle, analyze, and visualize geographic data.
Performed tasks like reading and parsing spatial data formats (e.g., Shapefiles), conducting spatial operations (e.g., buffering, intersection), and transforming coordinate systems.
Created map-based visualizations to interpret spatial patterns and trends and applied machine learning techniques such as clustering and regression for geospatial analysis.
Leveraged Python’s geospatial ecosystem to support data-driven insights across domains like environmental science, urban planning, and geospatial intelligence.
Development of Soya Paneer (Tofu) as a Nutritional Alternative:
Explored the traditional method of preparing soya paneer as a plant-based substitute for conventional buffalo milk paneer, aiming to enhance nutritional value by increasing protein content and reducing fat levels.
Conducted comparative analysis on texture, taste, and shelf life of soya paneer versus traditional paneer, without the use of preservatives.
Highlighted the potential of soya paneer as a healthier, protein-rich, and sustainable alternative, suitable for diverse culinary applications.
Research and Publications
Title of the article: Cloud-Native Data Science for Edge Computing and IoT Applications
Journal name: IJCSRR (International Journal of Current Science Research and Review)
DOI: https://doi.org/10.47191/ijcsrr/V7-i10-61
Title of the article: Dynamic Demand Forecasting in Supply Chains Using Hybrid ARIMA-LSTM Architectures
Journal name: IJAR (International Journal of Advanced Research)
DOI: https://dx.doi.org/10.21474/IJAR01/19738
Title of the article: Enhancing Generative AI Capabilities Through Retrieval-Augmented Generation Systems and LLMs
Journal name: Library Progress International, available at BPAS Journals