
Data Engineer Power BI

Location:
New Delhi, Delhi, India
Posted:
October 15, 2025


ABHIPSA KALYANSHETTI

DATA ENGINEER

USA +1-945-***-**** *******************@*****.*** LinkedIn

SUMMARY

Data Engineer with around 5 years of experience designing, optimizing, and managing data pipelines under SDLC, Agile, and Waterfall methodologies. Expert in Python, SQL, and big data technologies including Hadoop, Apache Spark, Hive, and MapReduce. Skilled in enhancing data accuracy and recovery processes, and contributed to a 30% revenue increase through strategic data integration with Kafka and Power BI. Proficient in data analysis and visualization with tools such as NumPy, Pandas, Power BI, and Tableau. Manages projects effectively with Git, Jira, and Jenkins, ensuring robust version control and efficient continuous integration.

TECHNICAL SKILLS

Methodologies: SDLC, Agile, Waterfall

Programming Languages: Python, R, SQL, Scala

IDEs: PyCharm, Jupyter Notebook

Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow, ggplot2

Databases: MySQL, SQL Server, PostgreSQL, Oracle

Big Data Ecosystem: Hadoop, MapReduce, Hive, Pig, Apache Spark, Sqoop, PySpark, Snowflake, HDFS

ETL Tools: SSIS, Apache NiFi, Apache Kafka, Talend, Apache Airflow, Informatica, Snowflake

Cloud Technologies: AWS, Azure, GCP, Databricks

Reporting Tools: Tableau, Power BI, SSRS

Version Control Tools: Git, GitHub, GitLab

Other Skills: Data Cleaning, Data Wrangling, Critical Thinking, Communication & Presentation Skills, Problem-solving, Data Management

Operating Systems: Windows, Linux, Mac

PROFESSIONAL EXPERIENCE

Goldman Sachs, USA Data Engineer Mar 2024 - Present

● Developed, tested, and deployed ETL workflows using SSIS to automate data extraction, transformation, and loading processes, reducing manual effort by approximately 18%.

● Designed and implemented end-to-end data pipelines by integrating PySpark with orchestration tools like Apache Airflow, AWS Step Functions, and Azure Data Factory, improving pipeline scalability and reliability.

● Automated data workflows using Airflow DAGs for scheduling and monitoring PySpark jobs, leading to more consistent job execution and reduced failure rates (see the DAG sketch after this role's bullets).

● Improved data processing efficiency by around 30% by building and maintaining scalable data pipelines with Apache Spark and Hive.

● Designed and deployed a unified relational database structure using Snowflake and Amazon Redshift, enhancing data consistency and accuracy by about 15%.

● Created interactive Power BI dashboards for real-time data visualization, which supported faster reporting and reduced business decision-making time by nearly 15%.

● Collaborated cross-functionally using Git for version control, Jira for task tracking, and Jenkins for CI/CD automation, contributing to a 7–10% drop in deployment-related issues.

● Enhanced data quality by roughly 15% through the application of robust data cleaning and validation techniques using Pandas and NumPy.

● Developed and fine-tuned PySpark-based ETL pipelines for processing large-scale datasets in distributed systems such as Hadoop, AWS EMR, Databricks, and Azure Synapse.

● Built and maintained optimized data models to support efficient querying and analysis, reducing query times and improving data accessibility.

● Utilized Spark UI and various job monitoring tools to detect performance bottlenecks, leading to targeted optimizations.

● Applied partitioning and tuning strategies that resulted in an estimated 12–15% improvement in data processing performance.
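
A minimal sketch of the Airflow-orchestrated PySpark pattern described in the bullets above, assuming the apache-spark Airflow provider is installed. The DAG id, schedule, file path, and retry policy are illustrative assumptions, not details from this role:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# Retry policy so transient cluster failures are retried before alerting.
default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

# dag_id, schedule, and application path are hypothetical.
with DAG(
    dag_id="daily_trade_pipeline",
    start_date=datetime(2024, 3, 1),
    schedule_interval="0 2 * * *",  # run once a day at 02:00
    catchup=False,
    default_args=default_args,
) as dag:
    SparkSubmitOperator(
        task_id="run_pyspark_transform",
        application="/opt/jobs/transform_trades.py",  # hypothetical PySpark job
        conn_id="spark_default",
        conf={"spark.sql.shuffle.partitions": "200"},
    )

With the job wrapped in a DAG like this, Airflow's scheduler handles retries and surfaces failures in its UI, which is what drives the more consistent execution noted above.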

Accenture, India Data Engineer Associate Jun 2021 - Dec 2022

● Designed and optimized ETL workflows using Informatica, streamlining the integration of complex financial data from disparate systems to support high-precision forecasting, budget reconciliation, and monthly reporting cycles.

● Built robust, scalable real-time data pipelines with Hadoop, Apache Spark, and Kafka, enabling ingestion and processing of over 10 million IoT telemetry records per day, significantly enhancing vehicle health diagnostics and predictive maintenance capabilities.

● Developed end-to-end data models and transformation logic to standardize and enrich financial datasets, reducing data latency by 35% and improving data readiness for downstream analytics and visualization tools.

● Created interactive Tableau dashboards for C-level stakeholders to monitor revenue trends, expenditure patterns, and resource utilization, driving data-informed decisions that improved operational efficiency by 20%.

● Led resolution of mission-critical issues in Hyperion Financial Management (HFM), conducting impact analysis and system audits to ensure 100% SLA compliance and preserve the integrity of regulatory and internal reporting.

● Implemented data quality frameworks and validation checks within ETL pipelines, increasing trust in reporting accuracy and reducing manual error resolution efforts by 40% (a sketch of such a check follows this role's bullets).

● Collaborated cross-functionally with finance, operations, and IT teams to translate evolving business requirements into data architecture solutions, accelerating delivery timelines and aligning technical outputs with strategic goals.
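
One way the validation checks described above can look inside a PySpark ETL step; the source path, column names, and rules are assumptions for illustration:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()

# Hypothetical raw financial dataset.
df = spark.read.parquet("s3a://finance-raw/transactions/")

# Rule 1: the join key must never be null.
null_keys = df.filter(F.col("account_id").isNull()).count()
# Rule 2: posted amounts must be non-negative.
bad_amounts = df.filter(F.col("amount") < 0).count()

# Fail the pipeline run rather than propagate bad data downstream.
if null_keys or bad_amounts:
    raise ValueError(f"DQ check failed: {null_keys} null keys, {bad_amounts} negative amounts")

df.write.mode("overwrite").parquet("s3a://finance-curated/transactions/")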

JP Morgan Chase & KPMG by Nefroverse, India Data Engineer Jun 2019 - Jun 2021

● Engineered robust ETL workflows using SQL and Python to efficiently populate and update data warehouse systems, ensuring high data fidelity across business units.

● Designed and deployed scalable data ingestion pipelines using Apache Spark and Pig, reducing data processing latency by 35% through optimized memory usage and parallel computation strategies.

● Achieved 99.4% data pipeline uptime by implementing resilient streaming and transactional ingestion mechanisms across eight heterogeneous data sources.

● Refactored and optimized complex query logic in MySQL and MongoDB environments, leveraging indexing, partitioning, and logical restructuring to improve query execution speed by 25%.

● Automated end-to-end ETL processes using SSIS, streamlining integration cycles and reducing data refresh time by 20% across enterprise datasets.

● Built cloud-native data lakes and analytical warehouses utilizing AWS S3 for storage and Redshift for scalable query processing, enabling unified access to structured and semi-structured data.

● Developed dynamic Tableau dashboards and custom data reports with automated data refreshes, cutting manual reporting efforts by 30% and enhancing decision-making efficiency.

● Elevated data reliability by 20% through systematic cleaning, normalization, and validation routines implemented in Pandas and NumPy (a sketch follows this role's bullets).

● Deployed batch-driven data workflows through CI/CD pipelines, ensuring reliable and repeatable data operations in production environments.

● Utilized PySpark and Spark SQL for distributed data transformation tasks, handling large-scale datasets with improved runtime performance and code modularity.
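
A compact sketch of the kind of Pandas/NumPy cleaning-and-validation routine referenced above; the column names and rules are illustrative assumptions:

import numpy as np
import pandas as pd

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    # De-duplicate on the business key (hypothetical column names throughout).
    df = df.drop_duplicates(subset="order_id")
    # Coerce types; unparseable values become NaN/NaT instead of raising.
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    # Treat infinities as missing, then drop rows that cannot be repaired.
    df["amount"] = df["amount"].replace([np.inf, -np.inf], np.nan)
    df = df.dropna(subset=["order_id", "order_date", "amount"])
    # Validate before handing off to the warehouse load.
    assert (df["amount"] >= 0).all(), "negative amounts after cleaning"
    return df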

EDUCATION

Master of Science, Information Technology & Management

The University of Texas at Dallas

Bachelor of Engineering in Computer Science

Visvesvaraya Technological University, India

PROJECTS

AWS Website Building – S3, CloudWatch, Amplify, API Gateway, Lambda Nov 2024

Built AWS-based cloud solutions for secure, scalable static website hosting enabling high availability.

Used AWS Amplify to automate deployment and manage frontend hosting with CI/CD pipelines.

Developed serverless backend APIs with API Gateway and Lambda to handle dynamic functionality (a minimal handler sketch follows this project).

Monitored site performance and operational health using Amazon CloudWatch dashboards and alerts.
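
A minimal sketch of a Lambda handler behind an API Gateway proxy integration, the pattern this project used for dynamic functionality; the route and payload shape are assumptions:

import json

def lambda_handler(event, context):
    # API Gateway (proxy integration) passes the HTTP request in `event`.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "visitor")  # hypothetical query parameter
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }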

Scalable Data Pipeline for E-Commerce Analytics Sep 2024

Designed and implemented an end-to-end ETL pipeline using Apache Kafka, Spark Streaming, and AWS S3 to process real-time user clickstream data (see the streaming sketch after this project).

Transformed and stored structured data in AWS Redshift for querying and business reporting.

Reduced data latency by 60% and enabled near real-time dashboard updates in Tableau.
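
A sketch of the pipeline's streaming leg, Kafka into Spark Structured Streaming with Parquet output on S3; the broker, topic, event schema, and paths are illustrative assumptions:

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("clickstream_etl").getOrCreate()

# Hypothetical clickstream event schema.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("page", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "clickstream")                # hypothetical topic
    .load()
    # Kafka delivers bytes; parse the JSON payload into typed columns.
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Micro-batches land as Parquet on S3 for a downstream Redshift COPY.
(events.writeStream.format("parquet")
    .option("path", "s3a://ecommerce-analytics/clickstream/")
    .option("checkpointLocation", "s3a://ecommerce-analytics/_checkpoints/clickstream/")
    .trigger(processingTime="1 minute")
    .start())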


