
Data Analyst Science

Location:
Boston, MA
Salary:
70000
Posted:
April 20, 2025


Resume:

Shreya Thakur

******.******@************.*** 857-***-**** Linkedin

Education

Northeastern University May 2025

MS in Information Systems

Relevant Courses: Application Engineering and Development, Data Management and Database Design, Designing Data Architecture Business Intelligence, Data Science Engineering Methods and Tools, Prompt Engineering.

Mumbai University May 2023

BE in Information Technology

Relevant Courses: C, Java, Python, Database Management Systems, Data Structures and Algorithms, Data Mining Warehousing, Data Science and Visualization, Human-Computer Interaction, Software Development Life Cycle.

Experience

Astra Chemtech Pvt. Ltd Data Analyst (Intern) – Mumbai, India Aug 2022 – Oct 2022

• Optimized data processing by cleaning and preprocessing raw datasets using Python, Pandas, and NumPy, enhancing accuracy and reducing data errors by 20%, ensuring stakeholder-ready insights.

• Conducted statistical analysis using R and Python (SciPy, StatsModels), deriving actionable insights and enhancing data-driven decision-making by 10%.

• Streamlined workflows by resolving data challenges with cross-functional collaboration, using SQL and ETL tools (Apache Airflow, Talend, Alteryx), reducing preparation time by 15%.

Pearl Thermoplast Pvt. Ltd Data Analyst (Intern) – Mumbai, India Feb 2022 – April 2022

• Researched and implemented emerging technologies in dataset management, analyzing 50+ datasets to enhance analytical proficiency, accuracy, and data processing efficiency by 25%.

• Collaborated with supervisors to enhance data administration workflows by automating repetitive tasks, standardizing data pipelines, and implementing best practices, resulting in a 15% increase in operational efficiency and improved data accessibility for cross-functional teams.

• Gained extensive experience with advanced analytical tools such as SQL, Tableau, and Power BI, improving data visualization and reporting efficiency by 30% through data modeling, dashboard development, and predictive analytics, ultimately accelerating data-driven decision-making.

Projects

AI-Powered Interactive Smart Board Development

• Developed and implemented a machine learning-based gesture recognition system using Python, OpenCV, and Mediapipe, enabling real-time virtual sketching and improving interaction efficiency by 30%.

• Engineered and preprocessed large datasets to enhance computer vision models, leveraging NumPy and data augmentation techniques for improved accuracy in hand-tracking algorithms.

• Led real-time image processing in data pipelines, reducing latency by 25% and ensuring seamless integration of AI-driven analytics for predictive user behavior insights.

• Migrated system performance metrics to Tableau and Power BI, visualizing gesture recognition accuracy and user engagement patterns to refine AI model predictions.

ETL Pipeline Development and Data Warehousing

• Designed and implemented an ETL pipeline using Talend, processing 3+ million traffic accident records from Austin, Chicago, and NYC, reducing data ingestion time by 20% and ensuring high data integrity across MySQL staging tables.

• Established and optimized a dimensional data model with 6+ dimension tables and 3 fact tables, enabling 60% faster reporting and trend analysis on traffic incidents, injuries, and fatalities using SQL and data warehousing techniques.

• Performed data profiling on datasets containing 2M+ records and 50+ variables, addressing 21-29% missing data, ensuring 100% uniqueness in records, and improving data completeness for predictive modeling.

• Built and deployed interactive dashboards in Power BI and Tableau, leveraging DAX and SQL to analyze 3+ million accident records, enabling real-time monitoring and data-driven policy decisions to improve urban traffic safety.

Publications

Smart Art Board using AI

Authored a research publication on AI-powered virtual smart boards utilizing Python, OpenCV, and Mediapipe, achieving real-time gesture-based interaction and enhancing creative collaboration by 30% through optimized hand-tracking algorithms.

Technologies

Languages and Tools: SQL, Python, Power BI, Tableau, JavaScript, CDMP (Aspiring), VBA, DAX, Visual Studio, Google Cloud Platform, Excel, Matplotlib, Talend, Alteryx, SQL Server Management Studio, Figma, Illustrator, SDLC

Data Analysis: Data Profiling, ETL Pipelines, Dimensional Modeling, Data Warehousing, Data Visualization, Decision Making, Statistical Analysis, Predictive Modeling, Data Architecture, Business Intelligence, Trend Analysis


