Data Scientist Machine Learning

Location:
Concrete, WA
Posted:
September 10, 2024

Resume:

Anushka Sarode

Austin, TX ***************@*****.*** (682) – 376 – 0808 LinkedIn

EXPERIENCE

Data Scientist Research Assistant – University of Texas at Arlington, USA Jan 2024 – Present

• Leveraged SQL to extract, clean, and transform data from IBM Db2 and SAP, resolving data quality and integrity issues with a financial impact of 2 million USD. Automated Excel spreadsheet processes with Alteryx.

• Developed ETL workflows for the data ingestion and extraction process, leading to a significant reduction in data processing time.

• Implemented advanced analytical techniques such as machine learning, predictive modeling, and statistical analysis to derive customer-centric insights and optimize business performance.

• Performed root-cause analysis to detect anomalies and provide targeted solutions to address them. Developed detailed dashboards in Power BI and provided strategic recommendations to executives.

Data Analytics Engineer – Infosys, Pune, India July 2021 – Aug 2022

• Leveraged Python for descriptive data analysis on competitor data, industry trends, and target markets, leading to a 10% increase in sales. Partnered with the product team to analyze performance metrics and identify areas of improvement.

• Built an ETL pipeline in Azure Data Factory to load on-premises SQL server data into Azure Data Lake, supporting seamless integration throughout the build, test, and release stages of development.

• Performed data transformation using PySpark in Databricks. Designed and optimized data models in Synapse Analytics to streamline data analysis.

• Increased query efficiency using advanced SQL features (stored procedures, window functions, CTEs), reducing execution time by 40%.

• Developed an interactive Tableau dashboard giving stakeholders real-time access to product performance data.

Data Analyst – Trivia Softwares, Mumbai, India Apr 2020 – May 2021

• Performed EDA on 1000 dealers using Python. Extracted data from SAP BW and Salesforce and produced reports using SQL.

• Applied multivariate analysis, time series analysis, and hypothesis testing in R to analyze customer behavior and identify patterns. Developed reusable modules in Python and generated insights through visualizations using Matplotlib.

• Utilized ETL tools like SSIS for efficient data extraction, transformation, and loading processes. Built a regression model (89% accuracy) to predict the CSI rate for every service of each dealer, enhancing customer retention.

• Automated interactive Power BI dashboards for trend and KPI tracking and utilized SSRS for detailed reporting.

PROJECTS

Ride Share Analytics (BigQuery, Mage, Looker)

• Created a comprehensive project utilizing Google Cloud services to optimize ride share and maximize driver efficiency.

• Developed an ETL pipeline using the Mage tool to clean and transform raw data. Used a Google Cloud Storage bucket for cloud storage.

• Performed data analysis using SQL in BigQuery and developed a dashboard in Looker providing insights to stakeholders.

Data Driven Marketing Ad Campaign (AWS, Spark, SQL)

• Built a dashboard in Amazon QuickSight showing the most popular YouTube videos in each category and the factors that affect popularity.

• Used Amazon S3 for data storage. Developed Lambda functions to clean unstructured data and automate the process.

• Utilized AWS Glue for ETL workflows, Spark for data transformation, and Amazon Athena for analysis using SQL.

EDUCATION

• Master of Science in Information Systems, The University of Texas at Arlington, Arlington, TX Aug 2022 - May 2024

Coursework: Python Programming, R Programming, Data Science, Cloud Computing, Advanced Methods of Statistics, Data Mining, Data Warehousing, Applied Database Management System, Enterprise Resource Planning

• Bachelor of Computer Engineering, Mumbai University, Mumbai, India July 2017 - June 2021

SKILLS

• Programming & Databases: Python, R, PySpark, SQL, NoSQL, PL/SQL, Redshift, Snowflake, BigQuery, Power BI (DAX).

• Cloud Systems & Tools: AWS, GCP, Azure, Alteryx, Hadoop, Tableau, Power BI, SAS, SAP, GA4, Excel.

• Libraries & Machine Learning: Pandas, NumPy, TensorFlow, Regression, Classification, Neural Networks, SVM.

• Certifications: HackerRank SQL (Intermediate), Google Data Analytics, Python and SQL (Govt of India).