Machine Learning Data Analysis

Location: Baltimore, MD
Posted: December 02, 2024

Resume:

Sai Swetha Vadrevu

Email: *******.***********@*****.*** Mobile: 667-***-****

PROFESSIONAL SUMMARY

• Data Scientist with 4+ years of experience in data analysis and engineering and broad knowledge of machine learning, advanced analytics, big data, and ETL pipeline development, specializing in ML model development, experimentation, and deployment.

• Expertise in Python, R, and SQL for predictive and statistical modeling. Skilled in deploying ML models with frameworks like TensorFlow, PyTorch, and Scikit-Learn.

• Experience building models with supervised and unsupervised machine learning, deep learning, and text classification.

• Effective data storyteller, adept at simplifying complex technical data insights for diverse, non-technical audiences, enhancing decision-making.

• Experience working in Agile teams, leveraging cloud platforms (AWS, GCP) and containerization technologies like Docker and Kubernetes to build scalable AI-driven solutions.

TECHNICAL SKILLS

• Machine Learning: PyTorch, Scikit-Learn, spaCy, Matplotlib, NLP, TensorFlow, Seaborn, LLM, Keras, Hugging Face, LangChain, Computer Vision, Deep Learning.

• Analytical Skills: MS Power BI (DAX, RLS), MS Excel (Pivot Tables, VBA, Macros), Tableau, Looker, Data Modeling, Statistical Analysis, Exploratory Data Analysis, Data Wrangling.

• ML Models: Random Forest, Decision Trees, Gradient Boosting, DBSCAN, Naïve Bayes, K-Means Clustering, Linear Regression, Logistic Regression, Principal Component Analysis, Reinforcement Learning, SVM, K-Nearest Neighbors, Time Series Forecasting.

• Languages: Python (NumPy, Pandas, statsmodels, Matplotlib, Seaborn, Folium, spaCy, Scikit-Learn, Beautiful Soup, PySpark, Flask, TensorFlow, Plotly), R (Shiny, ggplot2), MATLAB, Scala, SAS, SQL (Window Functions, CTEs, Joins), Java, Spark, C, C++, HTML, CSS, Unix, Linux, Data Structures and Algorithms, Hadoop (HDFS, Hive).

• Databases: MySQL, SQL Server, PostgreSQL, Snowflake, dbt, SSIS, SPSS, Teradata, MongoDB.

• Tools and Technologies: Power Query, Git, GitHub, Tableau, Hadoop (Hive, Spark), RStudio, Jira, Microsoft Office (Word, Excel, PowerPoint), Jupyter Notebook, Microsoft Azure Databricks, IBM DataStage, Visual Studio, AWS (S3, EC2, SQS, DynamoDB, QuickSight, SageMaker), GCP, Apache Airflow, Docker, Kubernetes, Google BigQuery, Google Sheets, Adobe Analytics.

• Version Control: GitHub, GitLab, Jenkins, CI/CD.

PROFESSIONAL EXPERIENCE

HCA Healthcare, Nashville, TN Apr 2024 to Present

Role: Data Analyst

Responsibilities:

• Worked with cross-functional teams to convert business requirements into data-driven solutions, focusing on skill estimation and experience optimization through ML models.

• Developed and deployed predictive models using Python with frameworks such as TensorFlow and PyTorch, resulting in a 20% boost in operational efficiency.

• Built and managed ML pipelines, including data preprocessing, feature engineering, and model tuning for production-level deployment.

• Conducted hypothesis testing, A/B testing, and anomaly detection to ensure high-quality data, contributing to data governance initiatives and enhancing operational decision-making.

• Communicated insights via Tableau dashboards, providing clear, actionable data to both technical and non-technical stakeholders.

Tech Mahindra, India Dec 2019 to Aug 2022

Role: Data Analyst

• Built and optimized big data pipelines and architectures by performing in-depth data mining, resulting in a 30% improvement in data retrieval times and analytics accuracy.

• Performed root cause analysis on recurring data issues, leading to strategic improvements in data quality and reduced data redundancy.

• Developed automated ETL processes using SQL and Python, reducing manual data processing time by 40% and enhancing overall productivity.

• Designed interactive Power BI dashboards that empowered executives with real-time insights, directly improving campaign effectiveness by 20%.

• Created statistical models and experimentation frameworks that refined product features, contributing to a 15% boost in customer satisfaction metrics.

Cohesity, India Jan 2019 to Nov 2019

Role: Data Engineer

Responsibilities:

• Built and automated ETL processes using SSIS and SQL to streamline data pipelines, supporting real-time analytics and enhancing product development.

• Collaborated with cross-functional teams to identify and address data requirements, establishing scalable data structures for impactful insights.

• Supported large-scale data handling and integrated Spark for data processing, optimizing performance for data-intensive applications.

ACADEMIC PROJECTS

Destination Dynamics: Prediction of Crowds and Optimizing Experiences

• Implemented predictive models such as CNN, ARIMA, XGBoost, and GRU for visitor forecasting; optimized tourist flow, improving visitor experience and decreasing peak-season congestion by 40%.

• Developed and deployed a flexible pricing strategy using linear regression, optimizing rates for economic accessibility, which resulted in a 25% surge in off-peak visits and a 15% revenue increase.

Multi-PDF Chatbot with LangChain and Streamlit

• Developed a multi-PDF RAG chatbot using LangChain and Streamlit, integrating ChatGPT and GitHub Copilot to process, retrieve, and interact with PDF content, enhancing user engagement and document accessibility.

• Optimized the AI's search efficiency through vector search using FAISS, significantly improving response accuracy and speed.

E-commerce Data Analysis using dbt and Snowflake

• Developed and implemented a star schema to transform 10,000 rows of raw e-commerce data into optimized fact and dimension tables, enhancing data warehouse solutions.

• Built and optimized dbt models for advanced analytics, including sales aggregation and customer segmentation, while leveraging Snowflake features like Tasks, Streams, and stored procedures for improved data integrity and performance.

EDUCATION

Master of Professional Studies in Data Science, University of Maryland, Baltimore County, MD – May 2024

CERTIFICATIONS

• Databricks: Generative AI Fundamentals

• Oracle: SQL

• Coursera: Power BI

• NPTEL: Python


