POREDDY SANDYA
Senior Software Engineer
***************@*****.***
SUMMARY
Certifications:
AWS Certified Solutions Architect – Associate – 2023
Snowflake SnowPro Core Certified – 2023
Microsoft Certified: Azure Data Engineer Associate (DP-203) – 2024
Core Skills
Cloud Platforms: AWS, Azure (ADF, Synapse, Blob Storage, Data Lake Storage)
Big Data & ETL Tools: Snowflake, Apache Spark (PySpark), Azure Data Factory, Airflow (DAGs, Operators, Scheduling)
Programming Languages: Python, Java, SQL
Databases: SQL Server, DB2, Azure Synapse, Oracle
Data Processing: SQL Performance Tuning, Data Modelling, Incremental Data Loads
CURRENT EXPERIENCE
TAVANT
Apr 2022 – Jan 2025
Senior Software Engineer at TAVANT TECHNOLOGIES, Hyderabad
PROJECT-NAME: Composable workspace migration
Client: Disney
Duration: Oct 2024 – Jan 2025
Responsibilities:
Migrated data from a legacy Snowflake database to a new Snowflake database, ensuring data consistency and minimal downtime.
Developed and optimized Airflow DAGs to orchestrate end-to-end data migration workflows (an illustrative sketch follows this list).
Modified and enhanced Airflow operators in Python to meet custom data processing requirements.
Implemented efficient data ingestion and transformation processes using Snowflake features like staging, COPY command, and table cloning.
Optimized Snowflake queries and warehouse usage to improve performance and reduce compute costs.
Developed Python scripts for automated data validation and reconciliation post-migration.
Ensured data quality and integrity checks, identifying and resolving discrepancies during migration.
Worked closely with cross-functional teams to align migration strategies with business objectives.
Documented technical workflows, Airflow modifications, and data migration best practices.
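Illustrative sketch of the Airflow-plus-Snowflake pattern described above; the DAG name, connection id, stage, table names, and schedule are placeholder assumptions, not the actual project configuration.

from datetime import datetime

from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="snowflake_migration_example",        # hypothetical DAG name
    start_date=datetime(2024, 10, 1),
    schedule_interval=None,                       # triggered per migration batch
    catchup=False,
) as dag:
    # Load files staged from the legacy account into the new target table.
    copy_orders = SnowflakeOperator(
        task_id="copy_orders",
        snowflake_conn_id="snowflake_target",     # assumed Airflow connection id
        sql="""
            COPY INTO analytics.orders
            FROM @analytics.migration_stage/orders/
            FILE_FORMAT = (TYPE = PARQUET)
            ON_ERROR = 'ABORT_STATEMENT';
        """,
    )

    # Simple post-load check; the real reconciliation compared counts and
    # checksums against the legacy database.
    validate_orders = SnowflakeOperator(
        task_id="validate_orders",
        snowflake_conn_id="snowflake_target",
        sql="SELECT COUNT(*) FROM analytics.orders;",
    )

    copy_orders >> validate_orders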
PROJECT-NAME: Data migration
Client: Hussmann
Duration: Sep 2023 – May 2024
Responsibilities:
Migrated data from SQL Server, DB2, and Azure Synapse to a new Azure Synapse workspace using Azure Data Factory (ADF) pipelines.
Designed and implemented ETL workflows in ADF for seamless data movement and transformation.
Developed and scheduled ADF pipelines to automate data ingestion, ensuring efficient and reliable execution.
Optimized data migration processes to handle large datasets efficiently with performance tuning techniques.
Ensured data consistency and integrity throughout the migration process with validation and reconciliation mechanisms.
Implemented incremental data load strategies to minimize processing time and reduce resource consumption (a watermark-style sketch follows this list).
Monitored and troubleshot ADF pipeline failures, improving system reliability and performance.
Worked closely with cross-functional teams to align data migration strategies with business objectives.
Documented migration processes, data flow architectures, and ADF pipeline configurations for future reference.
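The incremental-load pattern, shown here as a plain-Python watermark sketch rather than the actual ADF pipeline JSON; the DSNs, control table, and table/column names are hypothetical.

import pyodbc

SOURCE_DSN = "DSN=sqlserver_source"   # hypothetical ODBC DSNs
TARGET_DSN = "DSN=synapse_target"

def incremental_copy(table: str, watermark_col: str) -> None:
    src = pyodbc.connect(SOURCE_DSN)
    tgt = pyodbc.connect(TARGET_DSN)

    # 1. Read the last successfully loaded watermark from a control table.
    last_mark = tgt.cursor().execute(
        "SELECT last_value FROM etl.watermark WHERE table_name = ?", table
    ).fetchval()

    # 2. Pull only the rows changed since the previous run.
    rows = src.cursor().execute(
        f"SELECT * FROM {table} WHERE {watermark_col} > ?", last_mark
    ).fetchall()

    # 3. Insert the delta and advance the watermark in one transaction.
    cur = tgt.cursor()
    if rows:
        placeholders = ", ".join("?" * len(rows[0]))
        cur.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
        new_mark = max(getattr(r, watermark_col) for r in rows)
        cur.execute(
            "UPDATE etl.watermark SET last_value = ? WHERE table_name = ?",
            new_mark, table,
        )
    tgt.commit()

if __name__ == "__main__":
    incremental_copy("dbo.orders", "modified_at")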
PROJECT-NAME: ETL migration
Client: Doordash
Duration: Aug 2022 – Apr 2023
Responsibilities:
Applied SQL code changes, including adding new columns and implementing business logic transformations.
Optimized SQL queries for better performance and efficiency in data processing.
Modified and enhanced Airflow operators to support the new ETL framework (see the operator sketch after this list).
Ensured data consistency and integrity while transitioning from the old ETL system to the new one.
Conducted testing and validation to verify that the migrated ETL processes met business requirements.
Collaborated with cross-functional teams to streamline the migration process and minimize downtime.
Documented code changes, migration processes, and best practices for future reference.
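A sketch of the kind of Airflow operator customization involved; the operator name, the connection factory, and the audit behavior are illustrative assumptions rather than the client's actual framework.

from airflow.models import BaseOperator

class AuditedSqlOperator(BaseOperator):
    """Hypothetical operator variant: runs a SQL statement through a
    caller-supplied DB-API connection factory and logs the row count so
    downstream reconciliation tasks can compare source and target."""

    template_fields = ("sql",)   # let Airflow render Jinja in the SQL

    def __init__(self, *, sql: str, conn_factory, **kwargs):
        super().__init__(**kwargs)
        self.sql = sql
        self.conn_factory = conn_factory   # callable returning a DB-API connection

    def execute(self, context):
        conn = self.conn_factory()
        cur = conn.cursor()
        cur.execute(self.sql)
        self.log.info("Statement affected %s rows", cur.rowcount)
        conn.commit()
        return cur.rowcount                # pushed to XCom for validation tasks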
PREVIOUS EXPERIENCE
Company: Mridha Technology & Solutions
May 27, 2019 – Apr 14, 2022
Project Name: IPA (Ingredient Price Analysis).
Project Description: Plant data is uploaded into Blob Storage. ADF pipelines are triggered when a specific file is uploaded and move the data into Azure Data Lake Storage Gen2. Each application has its own set of plants, and the data is organized into multiple folders. Ingredient prices are converted to USD using exchange rates; these rates are critical because ingredients are purchased from different locations.
Technologies: PySpark, Databricks.
Role: Data Engineer
Responsibilities:
Extracted data from different source systems.
Created ADF pipelines and performed data transformations using ADF and PySpark on Databricks.
Implemented the business logic in Azure Databricks notebooks.
Handled the different file formats uploaded by the plants.
Wrote transformation logic in notebooks using PySpark and scheduled those notebooks for execution (a conversion sketch follows this list).
Created triggers for scheduled job activity.
Built utilities, user-defined functions, and frameworks to better support data flow patterns.
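Representative PySpark snippet for the USD conversion step described above; the ADLS paths, column names, and file layout are placeholders, not the actual schema.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ipa_usd_conversion").getOrCreate()

# Plant files landed in ADLS Gen2 by the triggered ADF pipeline.
prices = spark.read.option("header", True).csv("abfss://ipa@datalake.dfs.core.windows.net/plants/")
rates = spark.read.option("header", True).csv("abfss://ipa@datalake.dfs.core.windows.net/exchange_rates/")

# Convert local ingredient prices to USD using the matching exchange rate.
usd_prices = (
    prices.join(rates, on="currency_code", how="left")
          .withColumn(
              "price_usd",
              F.col("local_price").cast("double") * F.col("usd_rate").cast("double"),
          )
          .select("plant_id", "ingredient_id", "price_usd")
)

usd_prices.write.mode("overwrite").parquet(
    "abfss://ipa@datalake.dfs.core.windows.net/curated/ingredient_prices_usd/"
)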
Project Name: Plan Service Centre
Project Description: Plan Service Centre is a record-keeping application for plan administration, used by plan sponsors and third-party administrators to maintain customer information, reports, plan details, and loan information.
Technologies: Hadoop, Hive, Apache Spark, HBase, Sqoop, MapReduce, Shell Scripting, Teradata SQL Assistant.
Role: Data Engineer
Responsibilities:
Created Python notebooks for metadata validation, data validation, and business aggregations (a validation sketch follows this list).
Used Azure Data Factory with a self-hosted integration runtime to ingest data from an on-premises Oracle database and copy it into Azure Data Lake Storage Gen2.
Created ADF pipelines and performed data transformations using ADF and PySpark on Databricks.
Applied transformations to log data files and processed the data with PySpark DataFrames in Databricks notebooks.
Built dynamic, end-to-end processes in PySpark for validating and processing data from on-premises sources to ADLS Gen2.
Called multiple notebooks from a main notebook, passing data dynamically between them in the end-to-end process.
Separated common functions, configuration, and utility-method notebooks from the regular processing notebooks.
Processed text, CSV, JSON, and Excel files.
Identified the most cost-effective cluster configurations in Databricks.
Debugged data pipelines, investigated issues, and fixed failures.
Created triggers for scheduled job activity.
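A sketch of the notebook-level metadata and data validation pattern mentioned above; the path, expected column set, and thresholds are assumptions, and the snippet presumes a Databricks/Spark session.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

EXPECTED_COLUMNS = {"plan_id", "sponsor_id", "loan_amount", "updated_at"}   # assumed schema

def validate_dataset(path: str, min_rows: int = 1) -> dict:
    df = spark.read.parquet(path)

    # Metadata validation: every expected column must be present.
    missing = EXPECTED_COLUMNS - set(df.columns)

    # Data validation: basic volume and null-key checks.
    row_count = df.count()
    null_keys = df.filter(df["plan_id"].isNull()).count() if "plan_id" in df.columns else 0

    return {
        "path": path,
        "missing_columns": sorted(missing),
        "row_count": row_count,
        "null_plan_ids": null_keys,
        "passed": not missing and row_count >= min_rows and null_keys == 0,
    }

# The main notebook would call this per ingested dataset and fail fast on errors.
result = validate_dataset("/mnt/adls/psc/loans/")   # hypothetical mount path
assert result["passed"], f"Validation failed: {result}"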
SYSTEMS PROFICIENCY
Apache Spark : Spark Core API and Spark SQL
Programming : Core Java, Python
Analytics : Python
DBMS : Oracle, MySQL, MS SQL
IDEs : Eclipse & PyCharm
EDUCATION
Master of Technology from JNTU, Hyderabad.
Bachelor of Technology from JNTU, Hyderabad.