VARUN REDDY
Data Engineer
806-***-**** ***********@*****.***
PROFESSIONAL SUMMARY:
Results-driven Data Warehouse Engineer with over 5 years of experience designing, implementing, and optimizing large-scale data warehousing solutions and ETL processes. Strong emphasis on Linux-based processes and infrastructure, expert-level Shell scripting, and Oracle development, including hands-on experience with Oracle Exadata. Proven ability to identify and implement critical system and architecture improvements across Linux-based toolsets, scripts, jobs, and data load/extract processes. Proficient in Python for data manipulation and automation, with practical knowledge of Unix file systems and Agile methodology. Passionate about continual process improvement and automation, using orchestration tools such as Apache Airflow with Python and ETL tools such as Informatica to deliver scalable, efficient data solutions.
TECHNICAL SKILLS:
Programming Languages: Python, SQL, Scala, C++, Java
Scripting & Linux: Shell Scripting, Linux, Unix, Bash, Perl
Databases & Data Warehousing: Oracle Exadata, Oracle, MySQL, PostgreSQL, MS SQL Server, Snowflake, BigQuery, NoSQL, MongoDB
ETL & Orchestration: Apache Airflow, Informatica, Apache Spark, AWS Glue, Azure Data Factory, dbt
Cloud Platforms: AWS (S3, Glue, Lambda), Azure (Databricks, Data Lake), GCP (BigQuery, Dataflow)
Big Data Technologies: Hadoop, Hive, Kafka, Flink, PySpark
DevOps & Version Control: Git, Jenkins, Docker, Kubernetes, Terraform
Reporting & Visualization: Looker, Tableau, Power BI, Streamlit
PROFESSIONAL EXPERIENCE:
Taco Bell — San Francisco, CA Jun 2024 – Present
Senior Data Engineer
Responsibilities:
Led the implementation and management of robust Linux-based processes and infrastructure supporting enterprise data warehousing solutions.
Designed and optimized complex ETL/database load and extract processes, significantly improving data ingestion and transformation efficiency.
Developed and enhanced advanced Shell scripts and Linux-based toolsets to automate critical data operations and system tasks (illustrative orchestration sketch below).
Implemented scalable data warehousing solutions on Oracle Exadata, ensuring high performance and reliability for analytical workloads.
Contributed to system and architecture improvements, focusing on optimizing data flows and enhancing overall data pipeline stability.
Utilized Python extensively for data manipulation, scripting automation, and integrating various data sources within the warehouse environment.
Managed Unix file systems, including permissions and standard tools, to ensure secure and efficient data handling and storage.
Collaborated with cross-functional teams using Agile methodologies to deliver timely and impactful data warehouse enhancements.
Environment: Linux, Shell Scripting, Oracle Exadata, Python, SQL, Apache Airflow, AWS, GCP, PySpark, Snowflake, Docker, Kubernetes, Git, Agile
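A minimal sketch of the shell-plus-Airflow orchestration pattern described above (assuming Airflow 2.4+ and Python 3); the DAG id, schedule, script paths, and validation logic are hypothetical placeholders rather than production artifacts:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    def validate_load(**context):
        # Hypothetical post-load check: compare extract row counts against
        # warehouse row counts and raise if they drift.
        pass

    with DAG(
        dag_id="exadata_nightly_load",  # hypothetical name
        start_date=datetime(2024, 6, 1),
        schedule="0 2 * * *",  # nightly at 02:00
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
    ) as dag:
        extract = BashOperator(
            task_id="extract_sources",
            bash_command="/opt/etl/bin/extract_sources.sh daily",  # hypothetical path
        )
        load = BashOperator(
            task_id="load_exadata",
            bash_command="/opt/etl/bin/load_exadata.sh daily",  # hypothetical path
        )
        validate = PythonOperator(task_id="validate_load", python_callable=validate_load)

        extract >> load >> validate

Wrapping the existing shell toolset in BashOperator tasks keeps the proven load scripts intact while Airflow supplies scheduling, retries, and dependency ordering.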
West Bend Insurance — West Bend, WI Oct 2020 – Jul 2023
Data Engineer
Responsibilities:
Designed and developed end-to-end data warehousing solutions within an Azure environment, focusing on performance and scalability.
Implemented and maintained complex ETL pipelines using Informatica, ensuring efficient data extraction, transformation, and loading.
Created and enhanced numerous Shell scripts to automate daily operational tasks and streamline Linux-based data processes.
Managed and optimized relational databases, including Oracle, to support high-volume data warehousing and reporting requirements.
Leveraged Python for scripting and data integration, automating data quality checks and validation procedures (representative sketch below).
Performed system and architecture improvements on existing data infrastructure, reducing processing times by 20% through optimization.
Configured and managed Azure Databricks for large-scale data processing, integrating it with existing data warehousing solutions.
Utilized Apache Airflow with Python to orchestrate intricate data workflows, ensuring timely and reliable data availability.
Environment: Azure (Data Factory, Databricks, Blob Storage), Linux, Shell Scripting, Oracle, Informatica, Python, Apache Airflow, Snowflake, PostgreSQL, GitHub, Agile
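A minimal sketch of the kind of Python row-count reconciliation used for automated data quality checks (assuming the python-oracledb driver); the DSN, credentials, and table names are hypothetical:

    import oracledb  # python-oracledb, the maintained successor to cx_Oracle

    def row_count(conn, table: str) -> int:
        # Table names come from a vetted config, never from user input.
        with conn.cursor() as cur:
            cur.execute(f"SELECT COUNT(*) FROM {table}")
            return cur.fetchone()[0]

    def reconcile(dsn: str, user: str, password: str, pairs) -> list:
        """Return (staging, target, stg_count, tgt_count) for each mismatched pair."""
        mismatches = []
        with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
            for staging, target in pairs:
                s, t = row_count(conn, staging), row_count(conn, target)
                if s != t:
                    mismatches.append((staging, target, s, t))
        return mismatches

    # Hypothetical usage after each nightly load:
    # reconcile("dbhost/ORCLPDB1", "etl_user", "********",
    #           [("STG_POLICIES", "DW_POLICIES"), ("STG_CLAIMS", "DW_CLAIMS")])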
Varo Bank — San Francisco, CA Aug 2019 – Sep 2020
Data Engineer
Responsibilities:
Designed and implemented robust data solutions leveraging Azure Databricks and PySpark to process and transform large datasets efficiently.
Developed comprehensive Shell scripts to automate routine data operations and manage Linux-based environments effectively.
Implemented data validation frameworks supporting both relational and file-based systems, enhancing data quality and reliability.
Collaborated on enhancing ETL/database load and extract processes to support critical business intelligence requirements.
Utilized Azure Data Factory to orchestrate automated daily data loads into Azure Data Lake, ensuring data freshness.
Developed a framework to pull delta changes and ingest them into the Azure Data Lake, optimizing data refresh cycles (representative sketch below).
Created and managed Delta Lake tables, implementing table-level security within Azure Databricks for data governance.
Worked within an Agile methodology framework to deliver high-quality data engineering solutions and meet project timelines.
Environment: Azure Databricks, Azure Data Factory, Azure Data Lake, PySpark, Python, Shell Scripting, Azure SQL, Power BI, Azure DevOps, Agile
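A minimal PySpark sketch of the delta-change ingestion pattern described above (assuming Delta Lake on Azure Databricks); the storage account, container paths, and merge key are hypothetical:

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

    # Hypothetical ADLS Gen2 locations.
    changes_path = "abfss://raw@examplestore.dfs.core.windows.net/orders/incremental/"
    target_path = "abfss://curated@examplestore.dfs.core.windows.net/orders/"

    changes = spark.read.format("parquet").load(changes_path)

    if DeltaTable.isDeltaTable(spark, target_path):
        # Upsert the delta changes into the existing Delta Lake table.
        (DeltaTable.forPath(spark, target_path).alias("t")
            .merge(changes.alias("s"), "t.order_id = s.order_id")  # hypothetical key
            .whenMatchedUpdateAll()
            .whenNotMatchedInsertAll()
            .execute())
    else:
        # First run: bootstrap the target table from the initial extract.
        changes.write.format("delta").save(target_path)

Merging only the changed rows keeps refresh cycles short compared with full reloads, and the Delta transaction log makes the upsert atomic.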
EDUCATION:
Master of Science in Computer Science from Texas Tech University
CGPA: 3.92