PRANAY REDDY YELTI
Data Engineer
Phone: +1-469-***-****
E-Mail: **************@*****.***
LinkedIn: www.linkedin.com/in/yeltipranay
PROFILE SUMMARY:
Analytical and process-oriented Data Engineer with 8 years of expertise in data integration and data analytics in the financial services industry, with a solid understanding of data warehousing technology using Informatica PowerCenter, along with data analysis, data mining, data visualization, data pipelines, and Business Objects as a reporting tool.
●Experienced in the full SDLC: requirements gathering, root cause analysis, design, development, documentation, verification, and testing.
●Proficient in managing databases: Oracle, DB2, MySQL, SQL Server.
●Skilled in ETL and data-platform tools: Informatica, Google Cloud Dataflow, Apache Beam, and Snowflake for data pipelines, modeling, and transformations.
●Developed scalable pipelines and optimized Snowflake data processing, cutting processing costs by 50% and accelerating delivery of insights.
●Expertise in real-time data streaming with Apache Spark, Kafka, and PySpark.
●Implemented ITIL best practices in incident, problem, and change management, enhancing service quality.
●Configured ServiceNow for asset and configuration management, improving operational efficiency.
●Established data security protocols in Google Cloud using IAM and VPC Service Controls.
●Designed and deployed private cloud infrastructures with OpenStack for scalability and efficiency.
●Integrated and migrated data using Azure Data Factory, Databricks, and ADLS.
●Experienced in Python, C++, SQL, PL/SQL, and web technologies (XML, HTML, CSS).
●Created ServiceNow dashboards and reports for real-time data insights and automation.
●Proficient with build and CI/CD tooling, including Jenkins, Git, Docker, Kubernetes, and Maven.
●Skilled in Jira, Buganizer, Git, and Bitbucket for issue tracking and version control.
●Trained teams on ServiceNow, ITIL practices, and data engineering processes for improved efficiency.
●Strong knowledge of data warehousing concepts: star and snowflake schemas, slowly changing dimensions (SCDs), and surrogate keys.
●Quickly adapts to new environments and technologies, fostering effective collaboration between business and IT teams.
TECHNICAL SKILLS:
Data Integration: Azure Data Factory, Informatica PowerCenter 10.5/10.1.1 (Designer, Workflow Manager, Workflow Monitor)
Operating Systems: Windows, Linux, UNIX
Languages: PL/SQL, SQL, and Python
Databases: Oracle 11g/12c, SQL Server, MySQL, MS Access, and PostgreSQL
Methodologies: Dimensional Modeling, ER Modeling, Star Schema Modeling, Snowflake Modeling
Cloud: Azure Data Lake, Azure Data Factory (ADF), Azure Databricks, and Snowflake
Tools & Utilities: Informatica PowerCenter 10.5 (ETL tool), MDM, SQL Developer, DB Viewer
Data Visualization: Tableau, Power BI
Scripting Languages: Unix Shell Scripting, Python Scripting
Version Control & IDEs: Git, IntelliJ IDEA, Eclipse
PROFESSIONAL EXPERIENCE:
Data Engineer
Vizio, Dallas, TX. Aug 2023 – Present
●Proficient in Waterfall and Agile methodologies (Scrum, TDD); performed unit testing, integration testing, and validation.
●Skilled in gathering functional requirements and scoping projects with business stakeholders.
●Developed Python-based Spark programs for batch processing of large datasets; implemented efficient Databricks ETL pipelines with Spark SQL and DataFrames (see the PySpark sketch after this section).
●Created reusable ADF pipelines and leveraged operations like For Each, Lookup, and Switch for automated data workflows.
●Built optimized Snowflake data models and implemented performance tuning, reducing query times by 60% (see the Snowflake sketch after this section).
●Integrated Snowflake with AWS services (S3, Lambda) and configured role-based access for data governance.
●Designed Star/Snowflake schema data marts using Erwin and implemented Data Vault 2.0 for enterprise data warehousing.
●Developed ELT pipelines in Snowflake, reducing manual effort by 80%; automated deployments with Jenkins and GitLab CI.
●Designed microservices in Golang and APIs with Kafka/AWS Kinesis for scalable data ingestion and transformation.
●Worked with structured, semi-structured, and unstructured data in formats like Avro, ORC, and Parquet.
●Ingested data from RDBMS sources and exported transformed data to NoSQL databases such as Cassandra and HBase.
●Leveraged OpenStack for scalable data storage, monitoring (Ceilometer), and secure role-based access control.
●Collaborated with DevOps to integrate CI/CD pipelines into OpenStack deployments.
●Integrated cloud-based data fabrics (IBM Cloud Pak, Informatica) for seamless on-prem and cloud data governance.
●Managed Hadoop-based data lakes and processed customer requirements for semi-structured data.
●Participated in Agile sprint planning and retrospectives to deliver data engineering solutions efficiently.
Environment: Spark, Python, Shell, SQL, Azure, ADLS, Azure Data Factory (ADF).
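Illustrative sketch (not production code): a minimal PySpark batch ETL of the kind described above, reading raw Parquet from ADLS, aggregating with Spark SQL/DataFrame operations, and writing a curated Delta table in Databricks. All paths, table names, and columns are hypothetical placeholders, not actual Vizio assets.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Spark session (provided automatically in Databricks notebooks)
    spark = SparkSession.builder.appName("daily_viewership_etl").getOrCreate()

    # Extract: read raw Parquet landed in ADLS (path is a placeholder)
    raw = spark.read.parquet("abfss://raw@storageaccount.dfs.core.windows.net/events/")

    # Transform: filter, derive a date column, and aggregate with DataFrame operations
    daily = (
        raw.filter(F.col("event_type") == "view")
           .withColumn("event_date", F.to_date("event_ts"))
           .groupBy("event_date", "device_id")
           .agg(F.count("*").alias("view_count"),
                F.sum("duration_sec").alias("total_duration_sec"))
    )

    # Load: write the aggregate to a curated Delta table
    daily.write.format("delta").mode("overwrite").saveAsTable("curated.daily_viewership")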
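Similarly, a hedged sketch of the Snowflake tuning work noted above, using the snowflake-connector-python client: clustering a large fact table for partition pruning and precomputing a summary table. The account, credentials, and table names are assumptions for illustration only.

    import snowflake.connector

    # Connection parameters are placeholders, not real credentials
    conn = snowflake.connector.connect(
        account="myaccount", user="etl_user", password="changeme",
        warehouse="ETL_WH", database="ANALYTICS", schema="MARTS",
    )
    cur = conn.cursor()

    # Cluster a large fact table on its most common filter columns
    # so Snowflake can prune micro-partitions during scans
    cur.execute("ALTER TABLE fact_sales CLUSTER BY (sale_date, region_id)")

    # Replace a repeated expensive aggregation with a precomputed summary table
    cur.execute("""
        CREATE OR REPLACE TABLE daily_sales_summary AS
        SELECT sale_date, region_id, SUM(amount) AS total_amount
        FROM fact_sales
        GROUP BY sale_date, region_id
    """)

    cur.close()
    conn.close()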
Data Engineer
Artha Solutions, Chicago, IL. Jan 2022 – July 2023
●Developed interactive Tableau reports from diverse data sources.
●Built ad-hoc analysis solutions using Azure Data Lake Analytics/Store.
●Optimized SQL scripts with PySpark SQL and developed Spark applications for ETL and data aggregation in Databricks (see the join-optimization sketch after this section).
●Skilled in dimensional modeling (Star/Snowflake schemas), transactional modeling, and SCDs.
●Automated data transfers with Azure CLI commands and created Azure HDInsight clusters using PowerShell.
●Loaded real-time data into NoSQL databases like Cassandra.
●Worked with Azure services: SQL Database, Data Lake, Data Factory, SQL Data Warehouse, and Analysis Services.
●Built ETL pipelines in Databricks using Spark SQL, DataFrames, and Python scripting.
●Used Golang for RESTful APIs, concurrent data processing with goroutines, and cloud integration (AWS/GCP).
●Leveraged PolyBase for ETL processes and optimized Azure SQL Data Warehouse tables for Power BI reports.
●Migrated legacy data warehouses to Data Vault architecture for improved query performance and consistency.
●Enhanced data governance and lineage with tools like Collibra and Alation.
●Conducted performance tuning of data pipelines, reducing latency significantly.
●Trained teams in Data Vault modeling and data fabric integration.
●Participated in code reviews and implemented unit tests, deploying solutions with Docker.
●Managed source code with Git, GitHub, and GitLab, fostering collaborative development.
Environment: Spark, Hive, Python, PySpark, Hadoop, Azure Data Factory (ADF), MySQL.
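A brief, hypothetical illustration of the PySpark SQL optimization pattern referenced above: broadcasting a small dimension table so the join avoids a full shuffle, then caching the result for reuse. Table, column, and path names are placeholder assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("sales_aggregation").getOrCreate()

    # Large fact table and small dimension table (names are placeholders)
    fact = spark.table("raw.sales_transactions")
    dim = spark.table("raw.store_dim")

    # Broadcast the small dimension so the join avoids a full shuffle,
    # then cache the joined result because it feeds several aggregations
    joined = fact.join(broadcast(dim), "store_id").cache()

    # Register as a temp view so downstream logic can use Spark SQL directly
    joined.createOrReplaceTempView("sales_enriched")
    summary = spark.sql("""
        SELECT region, date_trunc('month', sale_ts) AS sale_month,
               SUM(amount) AS total_sales
        FROM sales_enriched
        GROUP BY region, date_trunc('month', sale_ts)
    """)
    summary.write.mode("overwrite").parquet("/mnt/curated/monthly_sales")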
ETL Developer
Intergraph Corporation, Hyderabad, India. Jan 2016 to July 2021
●Collaborated with Business Analysts to gather requirements for the PaymentNet4 platform.
●Designed functional and detailed mapping documents to extract data from SQL Server and load into Oracle/Teradata.
●Created and reviewed robust architecture and designs with stakeholders.
●Supported DataStage migration from v8.1 to v8.5, including UNIX directory setup and best practices.
●Configured ODBC connections and TNS entries for DataStage interaction with Oracle databases.
●Developed DataStage jobs for data extraction, profiling, standardization, and ETL processes.
●Built base fact, aggregate fact, dimension, and data mart tables for data warehouse projects.
●Used Control-M for scheduling and managing event-based/time-based jobs.
●Administered DataStage tasks: project setup, environmental variables, server management, and message handling.
●Loaded data into Teradata using the MultiLoad, TPump, and FastLoad utilities.
●Implemented Java-based ETL pipelines using Apache Camel and Spring Batch, and created APIs with Spring Boot.
●Enhanced data processing performance with Java multithreading, reducing execution time for large datasets.
●Logged job execution details in audit tables using shared containers and sub-routines.
●Optimized DataStage job performance with partitioning techniques and node configurations.
●Provided production on-call support during migrations and raised access requests (EURC, WRM).
●Utilized ClearCase and ClearQuest for version control and change management.
Environment: IBM InfoSphere DataStage 8.5, IBM InfoSphere DataStage 8.1.0, Oracle 11g/10g, DB2, AIX 6.1, SQL, PL/SQL, SQL Server 2005, TOAD, SQL Developer, Teradata V2R5, Teradata SQL Assistant, Control-M, UML, Crystal Reports, Altova MapForce, Erwin 4.1, Mainframes, Windows XP.