Hemanth Kaja
Phone: +1-813-***-**** Email: *************@*****.*** Address: Tampa, FL, 33613 LinkedIn
PROFESSIONAL SUMMARY
Data Engineer with over 5 years of experience designing, developing, and optimizing scalable data pipelines and analytics platforms across the financial and manufacturing domains. Skilled in Python, SQL, AWS, and Tableau, with a proven ability to deliver actionable insights through advanced data engineering practices and data visualization tools. Adept at collaborating with cross-functional teams to drive data-driven decision-making and ensure data integrity and performance.
WORK EXPERIENCE
Edward Jones, MO Jun 2023 – Present
Data Engineer
• Developed and optimized ETL pipelines using Python and SQL, ensuring seamless data integration from multiple financial data sources.
• Developed custom data transformation scripts in Python, automating extraction and transformation processes to enhance data processing efficiency.
• Created interactive financial dashboards using Tableau and Power BI, translating complex datasets into actionable insights for key business stakeholders.
• Led data quality initiatives by implementing data validation, cleansing, and governance processes, ensuring the consistency, integrity, and accuracy of critical financial data.
• Collaborated with cross-functional teams to integrate machine learning models within data pipelines, enabling advanced financial forecasting and risk assessment.
• Designed and implemented normalized and denormalized data models for financial datasets in Amazon Redshift, improving data accessibility for business intelligence and reporting needs.
• Analyzed and cleansed raw data in Databricks using PySpark to ensure data quality and accuracy, leading to a 30% improvement in data reliability and consistency.
• Performed in-depth data analysis to identify trends and patterns, providing actionable insights to drive informed business decision-making.
• Implemented data validation, cleansing, and governance measures using AWS Glue Data Catalog and Lake Formation, ensuring data consistency and regulatory compliance.
• Managed AWS EC2, S3, IAM, VPC, and Elastic Load Balancing configurations for secure and scalable data storage and processing environments.
• Managed and optimized relational databases including PostgreSQL and MySQL, improving query performance by 20%.
• Automated data transformation tasks using Python, reducing manual data processing time by 30%.
• Built real-time data pipelines using Apache Kafka Streams, enabling faster access to financial metrics.
• Maintained comprehensive documentation of workflows, data pipelines, and system architecture for knowledge sharing and operational support.
Infosys, India Dec 2019 – Jul 2022
Sr. Data Engineer
• Developed Python and Java scripts to streamline data processing workflows for Cummins. Led workflows using Azure Data Factory, reducing data processing times by 30%.
• Built and optimized scalable data pipelines using PySpark, Spark SQL, and Azure Synapse Analytics to handle large datasets, improving query performance and data processing speeds by 30%.
• Utilized Databricks and Apache Spark for managing large-scale data workflows, achieving a 40% improvement in processing speeds.
• Worked on data lifecycle processes, integrating data seamlessly and improving migration speeds by 25% using Data Factory and Data Lake Storage.
• Migrated data from PostgreSQL to Azure SQL Database using Azure Database Migration Service, maintaining uninterrupted services and boosting database performance by 20%.
• Monitored real-time ETL pipelines with tools like Azure Monitor, Application Insights, and Log Analytics, reducing downtime by 20% and improving resource utilization.
• Designed and implemented a data pipeline to automate data imports from multiple sources into Data Lake Storage and Synapse Analytics using Python functions, optimizing storage and analytics workflows.
EDUCATION
University of South Florida, Tampa, FL May 2024
Master’s in Business Analytics and Information Systems
GMR Institute of Technology, India Apr 2019
Bachelor’s in Electronics and Communication
CERTIFICATIONS
Google Cloud Professional Data Engineer
Microsoft Certified: Azure Data Engineer Associate
TECHNICAL SKILLS
Programming Skills: Java, Python, Scala, R, PL/SQL, SQL, JavaScript
Big Data Tools: Hadoop, Spark, Talend (ETL), Microsoft Excel (VLOOKUP, pivot tables), SQL (MySQL), Microsoft SQL
Data Visualization Tools: Tableau, Power BI, research, scikit-learn, Python (Pandas, Plotly, GGPlot, Seaborn, Matplotlib), Jupyter Notebook
Cloud Technologies: Azure, AWS, GCP, Apache Spark, Kafka, Hive
Database Technologies: MySQL, PostgreSQL
NoSQL Databases: Apache HBase, MongoDB, Cassandra
Other Skills: Data Collection, Exploratory Data Analysis, Data Cleaning, Data Processing, Data Quality Management, Data Validation, Quantitative Analysis, A/B Testing, Snowflake, Ad-hoc Analysis, Excel, PowerPoint, Microsoft Office, Agile, SAS, Project Management