
Data Engineer Engineering

Location:
Chicago, IL
Salary:
85000
Posted:
July 05, 2024


Resume:

Niharika Sanjay Pande

DATA ENGINEER

New Jersey, USA ***************@*****.*** 201-***-**** LinkedIn

SUMMARY

• Data Engineer with 5+ years of experience in designing, developing, and maintaining robust data infrastructure solutions for leading financial and healthcare organizations.

• Proficient in a wide range of tools and technologies, including but not limited to SQL, Python, Apache Spark, AWS, Azure, ELK stack, and various database management systems.

• Skilled in end-to-end ETL pipeline development using Apache Spark, Airflow, and Talend, facilitating seamless extraction, transformation, and loading of data from diverse sources.

• Adept at managing data warehouses on AWS (Amazon Redshift) and Snowflake, employing horizontal scaling with Apache HBase, and leveraging Azure Data Factory for seamless integration.

TECHNICAL SKILLS

Methodologies: SDLC, Agile (Scrum), Waterfall

Programming Languages: Python, R, SQL, Java

IDEs: PyCharm, VS Code, Jupyter Notebook

Big Data Ecosystem: Hadoop, MapReduce, Hive, Apache Spark, Pig, HDFS, Apache Kafka

ETL Tools: SSIS, Informatica PowerCenter, AWS Glue

Cloud Technologies: AWS Redshift, Microsoft Azure, Snowflake, GCP

Frameworks/Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow, PySpark, Spring MVC, Spring Boot, Hibernate, JDBC

Reporting Tools: Tableau, Power BI, SSRS

Databases: MS SQL Server, Oracle, PostgreSQL, MongoDB, Cassandra

Other Tools: Tableau, JIRA, Jenkins, Postman, Google Colab, Eclipse (STS), Shell scripting, Docker, Maven, MS Office

Version Control Tools: Git, GitHub, Bitbucket

Operating Systems: Windows, Linux

Certifications: 5-Star SQL Gold Badge (HackerRank), AWS Cloud Practitioner Essentials (Amazon)

PROFESSIONAL EXPERIENCE

Data Engineer Goldman Sachs, NJ Feb 2022 – Present

• Designed and maintained a centralized repository for large volumes of financial data, encompassing both structured and unstructured datasets.

• Implemented an efficient data model using Erwin Data Modeler to optimize storage and retrieval, reducing query response times by 30% through careful indexing and partitioning.

• Developed end-to-end ETL pipelines using Apache Spark and Airflow to extract, transform, and load data from diverse sources into the data warehouse, achieving a 25% improvement in data processing speed.

• Managed the data warehouse on AWS using Amazon Redshift and Snowflake, optimizing database performance, and ensuring reliable data storage and retrieval.

• Employed horizontal scaling with Apache HBase and optimized SQL queries to enhance scalability, resulting in a 15% reduction in infrastructure costs.

• Collaborated with data scientists and analysts using Jupyter Notebooks, fostering an environment for seamless integration of analytics insights into the data warehouse.

Data Engineer NextGen Healthcare, India Feb 2018 – Dec 2020

• Led the design and development of a Health Information Exchange (HIE) platform using Apache Kafka for real-time, secure data sharing among healthcare providers.

• Developed ETL processes using Talend to map and transform data from diverse healthcare sources into a standardized format for exchange, reducing data integration time by 25%.

• Integrated audit trail mechanisms using Apache NiFi to track and monitor data access and modifications, achieving a 30% improvement in data governance and compliance.

• Implemented Azure Data Factory for seamless data integration, orchestrating data workflows and ensuring the smooth flow of information across diverse data sources.

• Developed a comprehensive monitoring system using the ELK stack (Elasticsearch, Logstash, Kibana) that tracks system health in real time and generates alerts, resulting in a 20% reduction in downtime.

• Implemented automated data quality checks using Python and SQL scripts, reducing data discrepancies and ensuring high-quality data within the HIE platform.

EDUCATION

Master of Science in Information Technology & Analytics – Rutgers University, Newark, New Jersey, USA

Bachelor of Technology in Computer Science and Technology – SNDT University, Mumbai, Maharashtra, India


