Senior Data Quality Engineer

Location:
Boston, MA
Posted:
June 27, 2023

DARSHIT SHAH

adxycb@r.postjobfree.com 857-***-**** Boston, MA LinkedIn/darshit131998 Github/Darshit98

EDUCATION

Northeastern University, Boston GPA: 3.92/4 Expected May 2024

Master of Science in Information Systems

Courses: Data Science, Design Advance Data Architecture for Business Intelligence, Application Development, Web Design

University of Mumbai, India May 2019

Bachelor of Technology in Information Technology

Courses: Advanced Database, Data Mining, Big Data Analytics, Software Project Management, Cloud Computing

TECHNICAL SKILLS

Programming: Java, Python, R, HTML5, CSS3, JavaScript, jQuery, Selenium, Cucumber

DB & Analytics Tools: MySQL, Postgres, IRIS, MongoDB, Tableau, Power BI

Data Engineering Tools: Apache Spark, Airflow, Informatica, Talend, Alteryx, ER Studio

Libraries: NumPy, Pandas, Matplotlib, Seaborn, BeautifulSoup, Tweepy, scikit-learn

Other Technology: AWS (Glue, S3, Athena, EC2), Statistics, Excel, React, Redux, NodeJS, Postman, A/B Testing

Project Management: Agile Model, Scrum, GitHub, Microsoft Office Suite, Atlassian JIRA

Certifications: Advanced Excel, Tableau, Python

PROFESSIONAL EXPERIENCE

Senior Data Quality Engineer, Bengaluru LTIMindtree Jun 2019 – Jul 2022

• Led a team to improve data accuracy and efficiency by 70% using Kafka and Splunk for real-time message validation

• Conceptualized an annual product review to enhance delivery by 90%; supervised a Jira dashboard for the team to track testing progress in an Agile framework

• Analyzed position and balance data from AWS S3 using MS Excel to investigate outliers, and collaborated with senior management and cross-functional teams to take appropriate action

• Developed ETL pipelines with Apache Spark, launched security record dashboards, and conducted a workload assessment using SQL Server and Python, resulting in a 75% productivity increase

• Improved efficiency KPIs by 50% through Tableau visualization reports and cross-functional collaboration to identify and mitigate concerns with transaction data

• Implemented efficient and scalable ETL processes using Airflow to automate processing of 3 million rows of data, improving data processing accuracy and speed

• Identified concerns and issues with functional database requirements using complex SQL queries, worked with management to mitigate them, minimize risks, and create reports; optimized output by around 30%

PROJECTS

Flight Status Prediction Linear Regression, Random Forest, Decision Tree, XGBoost Mar 2023 – Apr 2023

• Spearheaded the development and implementation of a suite of machine learning algorithms to forecast flight delays and cancellations

• Used Principal Component Analysis (PCA) to reduce data dimensionality and eliminate noise, improving model accuracy to over 95%

• Conducted extensive exploratory data analysis (EDA) and incorporated multiple factors such as flight schedules, arrival/departure delay, weather conditions, and historical data to provide insights into flight status forecasting

Grocery Store Analysis Matplotlib, Seaborn Jan 2023 – Feb 2023

• Analyzed sales data of different products using boxenplot and stripplot to gain deeper insights, and evaluated customer preferences to discover that 59.5% of customers prefer e-wallet and cash payments over credit cards

• Devised a valuation method and a churn-analysis methodology to determine the number of customers who churned from store membership, grouped by city

• Analyzed hourly sales data from multiple branches using relplot in Python, resulting in a 99% improvement in decision-making accuracy for resource hiring

Find Your Dream University Tableau, Tweepy, Pandas Oct 2022 – Dec 2022

• Used Python with libraries such as BeautifulSoup, Tweepy, and pandas to extract data from multiple sources and transform it into CSV format for further analysis and insights

• Normalized and cleaned raw data, reduced redundancy, and improved readability by 75%, resulting in better decision making and faster analysis

• Generated a Tableau dashboard to visualize insights on the number of students who applied and received admission, with the average score for a specific program


