Data Engineer Analyst

Location:
New Haven, CT
Posted:
May 18, 2025

Resume:

IMRAN SHAIK

New Haven, CT *****

+1-475-***-****

**********.*****@*****.***

Experienced Data Engineer skilled in constructing and managing data pipelines, streamlining ETL workflows, and handling diverse data types. Specializes in transforming disorganized data into accurate, dependable resources for team utilization. Proficient in SQL, Python, Apache Spark, Airflow and AWS. Experienced in data warehousing, batch processing, and real-time streaming projects with a focus on crafting efficient code and scalable systems for long-term success.

Skills

• Languages: Python, C++, SQL

• IDEs: IntelliJ IDEA, PyCharm

• Databases: MySQL, MongoDB

• Web Technologies: HTML, CSS, JavaScript

• Frameworks: Apache Spark, Flask

• Technologies: Apache Airflow, REST APIs, ETL Pipelines, AWS EC2, Tableau

• Version Control: Git, GitHub

• Methodologies: Agile, DevOps, CI/CD, Data Modeling, ETL/ELT, Data Warehousing

Education

Master of Science in Computer and Information Sciences

Sacred Heart University, Fairfield, CT

Bachelor of Science in Software Engineering

Vellore Institute of Technology, Amaravathi

Work History

Data Engineer

Cognizant, Bangalore, October 2022 - November 2023

Junior Data Analyst

INFODOT SYSTEMS PRIVATE LIMITED, Bangalore, IN, September 2021 - July 2022

• Assisted in the migration of a legacy ETL application to the AWS cloud.

• Designed, developed, and implemented ETL pipelines using the Python API (PySpark) of Apache Spark on AWS EMR.

• Experienced with AWS EC2 and Lambda for data processing tasks.

• Experienced in using SQL as well as NoSQL databases such as MongoDB and Cassandra.

• Gathered requirements from business and created JIRA tickets with detailed descriptions, mapping documents, logic, and acceptance criteria.

• Delegated work to offshore teams, conducted code reviews, performed debugging, and fixed bugs logged by QA team.

• Cleaned and prepared data sets from raw data sources and engineered features for training models.

• Developed Python codebases and PostgreSQL queries to join multiple databases and retrieve relevant information required for report generation.

• Developed predictive models using Python (Scikit-learn) to forecast customer churn & sales trends, improving retention by 15%.

• Automated reporting workflows with SQL scripts, reducing manual effort by 40% and increasing accuracy.

• Built automated Excel reports integrated with SQL & Power Query, cutting reporting time by 50%.

• Performed deep-dive exploratory data analysis (EDA) to uncover patterns and influence business strategies.

• Optimized customer segmentation using K-Means clustering, enhancing marketing campaign efficiency.

• Improved data warehouse performance, reducing query execution time by 30% through collaboration with engineering teams.

• Delivered data-driven insights for leadership by providing ad-hoc analysis on business-critical decisions.

• Led training sessions on data visualization best practices, enhancing data literacy across teams.

• Migrated legacy reports to Tableau, ensuring real-time data availability & consistency.

• Maintained 99% data accuracy by monitoring and resolving data quality & integrity issues.


