Thrinadh Kunapaneni
Data Engineer
+1-405-***-**** Oklahoma, USA *********************@*****.***
www.linkedin.com/in/thrinadh-kunapaneni-381175184
Professional Summary
Experienced Data Engineer with a Master’s degree in Computer Science and hands-on experience designing, developing, and optimizing large-scale data architectures, ETL pipelines, and analytical workflows. Proficient in Python, PySpark, and SQL, with a proven record of delivering solutions that drive data-driven decision-making. Skilled in AWS and Azure cloud technologies, including serverless architectures and data warehousing. Committed to ensuring the integrity, security, and scalability of data systems.
Education
Master’s in Computer Science, Oklahoma Christian University, Oklahoma, USA, Jan 2022 - Nov 2023
Bachelor’s in Computer Science, VelTech University, Chennai, India, July 2017 - May 2021
Professional Experience
Data Engineer Intern, (Health Care Service Corporation) Tulsa, Oklahoma May 2023 - Dec 2023
• Enhanced data pipeline efficiency by 40% using Python and Scala, reducing data retrieval times for faster healthcare management decision-making.
• Implemented and optimized ETL processes with Python and T-SQL, reducing data processing time by 30% and improving clinical data analysis reporting cycles.
• Managed AWS S3-based Data-lake, integrating healthcare data from EHRs and medical imaging systems, ensuring HIPAA compliance and patient data security.
• Developed SQL queries and utilized Microsoft SQL Reporting Services for ad hoc data requests, supporting clinical teams with timely and accurate information.
• Led cross-functional teams in data integration projects, fostering collaboration and communication, and leveraging Java and C++ for data processing tasks.
• Stayed abreast of emerging technologies in healthcare analytics, contributing to research projects aimed at leveraging data for improved patient care using Python and Scala.
• Optimized data warehouses and data pipelines for scalability and performance, employing Python and Java, to handle large volumes of healthcare data efficiently.
Data Engineer, (Thermo Fisher Scientific) Hyderabad, India Jan 2020 - Nov 2021
• Enhanced data processing efficiency by 40% through optimized ETL operations to AWS Redshift using Python, improving data integration and analysis.
• Automated data loads from S3 to Redshift with AWS Data Pipeline, boosting data accessibility and processing efficiency significantly.
• Reduced manual data processing time by 30% by automating tasks with Python and Shell scripting, increasing team productivity by 15%.
• Developed and maintained data models in ERWIN, ensuring data integrity and consistency, and facilitated better decision-making with T-SQL created tables and stored procedures.
• Utilized Spark SQL in PySpark for efficient data queries, and encrypted sensitive data with hashing algorithms, enhancing data security.
• Designed a Python-based API for revenue tracking and analysis, providing actionable insights and supporting strategic financial decisions.
• Collaborated on data analysis, compiling findings for the Director of Operations, which improved decision-making processes.
• Created Tableau and Power BI reports, incorporating complex calculations for strategic planning and ad hoc reporting.
• Optimized SQL queries, reducing runtime through effective indexing and execution-plan adjustments.
• Performed ETL from source systems to Azure Data Storage services, ensuring efficient data migration with Azure Data Factory and Spark SQL.
• Ingested and processed data in Azure Databricks, improving data accessibility across Azure services.
• Migrated data seamlessly using SQL Azure, Azure Storage, and Azure Data Factory, enhancing data transition processes.
• Implemented BI solutions on Azure, leveraging Azure Data Platform services to enhance business intelligence capabilities.
• Conducted ETL testing, ensuring data integrity by running jobs and extracting data for data warehouse servers.
Big Data Intern, (Value Labs) Chennai, India July 2019 - Dec 2019
• Led data collection and processing, enhancing accessibility and analysis efficiency; optimized Apache Spark and Hadoop pipelines, boosting processing speed by 40%.
• Implemented data transformation pipelines in Apache Spark and Hadoop MapReduce, improving accuracy. Increased Hadoop cluster reliability by 30% through strategic optimizations.
• Created data visualizations and dashboards in Tableau and Matplotlib, enabling informed decision-making. Collaborated on research projects, utilizing Scala and Python for deeper insights.
• Managed big data infrastructure, ensuring optimal performance and scalability. Applied SQL and Java for data security, meeting regulatory compliance requirements through encryption and access controls.
• Documented workflows and findings, producing technical documentation that enhanced team communication. Engaged in collaborative efforts to explore emerging big data technologies and methodologies.
Online Courses & Certifications
• AWS Certified Data Engineer – Associate
• Microsoft Certified: Azure Data Engineer Associate
• Databricks Certified Associate Developer for Apache Spark 3.0 – Coursera
• Cloudera Certified Professional: Data Engineer
• Data Engineering on Google Cloud Platform Specialization – Coursera
• Advanced SQL for Data Scientists – DataCamp
Skills
• Programming Languages: Python, PySpark, Scala, Java, R, SQL, T-SQL, Shell scripting, C, C++, C#
• Web Technologies: AJAX, JavaScript, HTML, DHTML, XHTML, XML, jQuery, AngularJS
• Big Data Technologies: Hadoop, Spark, Kafka, Hive, YARN, Hortonworks, Cloudera
• Databases: AWS Redshift, Azure SQL Data Warehouse, MySQL, PostgreSQL, Oracle
• ETL Tools: Apache NiFi, Talend, Informatica, Qlik Replicate, Fivetran
• Data Warehousing: Amazon Redshift, Google BigQuery, Snowflake
• Cloud Platforms & Services: AWS (IAM, EMR), Azure, GCP, PaaS
• Data Visualization: Tableau, Power BI
• Data Processing: AWS EMR, Step Functions
• Frameworks: Django, Pyramid, AngularJS
• Testing & Debugging Tools: Selenium, IDE-based debuggers
• Tools: ERWIN, MB MDR, Git, GitHub, JIRA, Jenkins, Airflow, Terraform
Projects & Achievements
• AWS Serverless Data Solutions: Designed and implemented serverless data solutions using AWS Lambda, Step Functions, and API Gateway, enhancing data processing efficiency and scalability.
• Real-Time Analytics: Implemented real-time analytics solutions using Snowflake, enabling timely insights and decision-making.
• Cloud Data Warehouse Migration: Led the migration of legacy data warehouses to Snowflake, resulting in improved data accessibility and performance.
• Cost Management: Developed strategies to manage and optimize Snowflake costs, ensuring efficient resource utilization.
• Data Security and Compliance: Developed and implemented security measures, including IAM roles and policies, ensuring compliance with security best practices and regulatory requirements.
• Big Data Pipeline Optimization: Built and optimized big data pipelines, architectures, and data sets for efficient data processing.
• Stream Processing Systems: Implemented stream processing systems using Spark Streaming and Storm.