Data Engineer Warehouse

Location:
Cleveland, OH, 44199
Salary:
80000
Posted:
October 01, 2024

Resume:

Pavan M

***********@*****.***, 551-***-****, Jersey City, NJ, LinkedIn

EXPERIENCE

DATA ENGINEER / Mitaja Corporation (Full-time) - Maryland, USA 01/2023 - Current

● Ensured data consistency across systems through configuration management. Built optimized data pipelines with Apache Spark, Kafka, and AWS Lambda, reducing processing time by 30%.

● Developed robust data architectures for efficient storage and processing. Created and maintained ETL pipelines using Snowflake.

● Successfully migrated on-premise data warehouse to AWS Redshift, achieving a 25% cost reduction. Integrated Redshift with Glue, Lambda, and S3 for seamless data flow.

● Collaborated with UI/UX teams to align wireframes with requirements. Integrated PLC data into central databases.

● Designed data visualizations with Sisense, Tableau, and Power BI. Implemented MDM/PIM solutions with Stibo Systems to improve data governance.

● Validated data consistency using SQL for transaction testing. Applied SPC techniques to monitor and control manufacturing processes, ensuring consistent quality.

DATA ENGINEER / UBER (Intern) - San Francisco, USA Mar – Dec 2023

● Built scalable data pipelines with Apache NiFi, Talend, and Kafka. Implemented data governance for privacy and compliance.

● Ensured data availability and reliability through replication, backup, and disaster recovery. Developed real-time streaming apps integrated with downstream systems.

● Optimized operations using data-driven methodologies. Managed data sourcing for OFAC/BSA/AML teams.

● Created reports with Power BI and managed relational/non-relational databases for business applications.

● Automated CDC processes and data workflows in DataMart using SSIS/SQL and DevOps tools like Bamboo and Bitbucket.

● Migrated SSIS packages to Azure Data Factory; utilized GCP services such as BigQuery and Dataflow.

● Improved data quality with automated validation and transformation. Enhanced pipeline efficiency by 40%, enabling better business decisions through real-time analytics.

DATA ANALYST / TCS (Full-time) - Hyderabad, India June/2021 – Nov/2022

● Identified relevant data and developed use cases with domain experts. Ensured accurate ERP data exchange.

● Assisted in creating predictive risk models and standardized data formats for system consistency.

● Managed big data clusters using Hadoop and Hive; analyzed data with Python libraries like Pandas and Matplotlib.

● Developed middleware solutions and data models for integration and reporting. Created ETL workflows with Informatica.

● Implemented data security, optimized SQL queries, and managed databases. Used Jira for ticket management and Teradata for database tasks.

● Enhanced query performance by 50% and integrated multiple data sources into a unified data warehouse.

TECHNICAL SKILLS

Programming: Python, MySQL, Java, R, NumPy, Pandas, CSS.

Big Data Tools: Machine Learning, AI Builder, Chatbots, LLMs, APIs, Regression Models, Kubernetes, Git, Jenkins, Apache Spark, Hadoop, Kafka.

Tools: Excel, Tableau, PostgreSQL, SAS, MongoDB, Talend, Apache NiFi, Informatica, Redshift.

Certifications: SQL, Python, AWS.

PROJECTS

Revolutionizing Conversational AI with RAG and Cohere Command R+

● Enhanced chatbot accuracy by 35% using Cohere's Command R+, Llama 3, and RAG techniques, integrated with the LlamaIndex library.

● Improved response times by 40% through optimized context retrieval and embedding utilization with LlamaIndex.

● Boosted user satisfaction by 25% with a Streamlit-based interface, enabling seamless, code-free interactions.

Industry Capstone Project – Project Portfolio BI Reporting & Analysis For Studio Lab

● Currently developing business insights by analyzing and comparing initial estimated hours with actual hours worked on Studio Lab’s

● Leveraged Jira for data access and utilized eazyBI for report generation, increasing project time-tracking accuracy by 30%.

Automatic PowerPoint Generator

● Led the development of an automated PPT tool, reducing preparation time by 30% with Python-based web scraping and NLP for topic-driven slide generation.

● Authored a thesis on the process, showcasing a 40% enhancement in content accuracy through integrated data extraction and text summarization.

Cloud Data Migration

● Led the migration of an on-premise data warehouse to AWS using tools like AWS DMS, S3, and Redshift, which improved scalability by 40% and reduced operational costs by 25%.

EDUCATION

Stevens Institute of Technology, New Jersey, USA

Master’s in Business Intelligence and Analytics. Dec 2022 – May 2024

JB Institute of Technology and Engineering, Hyderabad, India

Bachelor of Technology in Computer Science Engineering. July 2018 – May 2022


