SHIKHARA MALALA
St. Louis, Missouri +1-636-***-**** ********************@*****.*** LinkedIn GitHub
SUMMARY
Data Engineer with 3+ years’ experience building and maintaining scalable data pipelines, cloud solutions, and analytics platforms in healthcare, finance, and SaaS. Proficient in Python, SQL, Apache Spark, Kafka, and Airflow, with hands-on expertise in AWS (S3, Redshift, Glue, Lambda), Azure (Data Factory, Synapse, Data Lake Storage), and data warehouse and ETL tools such as Snowflake, BigQuery, and Informatica. Skilled in data modelling and creating actionable dashboards in Tableau and Power BI. Experienced in handling regulated datasets, including Medicare claims and clinical data, under strict governance and compliance requirements.
SKILLS
Programming & Scripting: Python (NumPy, Pandas, SciPy, Matplotlib, Seaborn, scikit-learn), R, SQL, HTML
Big Data & Streaming: Apache Spark, Kafka, Apache Airflow, Hadoop, Hive, HDFS, HBase, YARN, MapReduce, Docker
Cloud Platforms & Services: AWS (EC2, S3, Lambda, IAM, Glue, Athena, CloudWatch, DynamoDB, Kinesis, Redshift), Microsoft Azure (Data Factory, SQL Database, Synapse, Data Lake Storage)
Databases & Version Control: PostgreSQL, MySQL, SQL Server, RDBMS, MongoDB, NoSQL, Git, GitHub, CI/CD, Terraform
Data Engineering & ETL: ETL Development, SSIS, Informatica, Azure Data Factory, Snowflake, BigQuery, Redshift, Data Build Tool (dbt), Alteryx, Fivetran, SAPS, Jira, ServiceNow, Generative AI, Database Design
Data Modelling & Warehousing: Star & Snowflake Schemas, Kimball, Inmon, Data Vault, Data Warehousing, Data Mining
Data Analysis: A/B Testing, Regression Analysis, Predictive Modelling, Gradient Descent, Decision Trees, Sentiment Analysis, Tokenization, Fraud Detection, Supply Chain Analytics, SaaS, Machine Learning
Visualization & Reporting: Tableau, Power BI, Looker, Excel (Power Query, PivotTables, Advanced Reporting)
Domain Expertise: Medicare Claims, Clinical Data (Epic, Cerner), SaaS Metrics (ARR, Churn, Retention, Engagement Analysis)
Data Management & Governance: Data Cleaning, Data Wrangling, Data Preparation, Data Governance, Reporting, KPI Design
PROFESSIONAL EXPERIENCE
Data Engineer Humana MO, USA May 2024 – Present
• Processed Medicare claims and clinical data from Epic and Cerner for CMS compliance reporting, ensuring 100% adherence to deadlines and preventing penalties through accurate, timely submissions.
• Designed ETL pipelines using Azure Data Factory integrating claims, enrolment, and provider datasets for population health analytics, reducing manual reporting workload by 40% and improving data delivery speed.
• Consolidated provider visit data into a centralized Snowflake warehouse to enable care gap analysis, increasing preventive care outreach success rate by 18% across high-priority Medicare Advantage member groups.
• Developed Power BI dashboards tracking claims denial patterns, empowering operations teams to reduce overall denial rates by 21% within six months of implementation.
• Created Python-based anomaly detection models using scikit-learn for identifying suspicious billing activities, detecting an average of $1.2M in potential overpayments monthly.
• Collaborated with cross-functional teams to define data requirements and resolve data pipeline issues, implementing data governance practices that reduced downtime by 25% and ensured data accuracy and consistency.
Data Engineer Hexaware Technologies India Aug 2022 – Jul 2023
• Managed ETL workflows in Informatica and Azure Data Factory consolidating banking transactions, account, and loan datasets for Basel III compliance, ensuring timely and accurate regulatory reporting.
• under 90 seconds with immediate alerts to compliance teams.
• Migrated 2TB+ customer and financial records from SQL Server to AWS S3 and Redshift, achieving zero data loss and complete verification.
• Created Tableau dashboards for monitoring loan default trends, improving early-warning detection accuracy by 25% and enabling proactive credit risk interventions.
• Tuned SQL queries for monthly financial statement generation, reducing processing time by 48% and meeting strict month-end closing deadlines.
• Optimized Apache Spark workflows on AWS EMR for risk adjustment score calculations, reducing computation time from two hours to 55 minutes with 100% accuracy.
• Automated KYC data validation using Python scripts, reducing account onboarding errors by 27% and improving customer satisfaction scores from 82% to 91%.
Data Analyst Cybage Software India Mar 2021 – Jul 2022
• Engineered 15+ ETL processes in SSIS and Python to integrate SaaS product usage, churn, and revenue data from multiple transactional and marketing systems, reducing reporting turnaround from 3 days to under 12 hours.
• Aggregated subscription activity logs and billing records into AWS Redshift for ARR, churn, and retention tracking on 200K+ active accounts, enabling targeted retention strategies that reduced churn by 9%.
• Designed SQL-based automated KPI tracking for activation and engagement, driving 17% growth in feature adoption rates.
• Migrated 4TB of customer analytics data from MySQL to AWS Redshift, reducing BI query times from 10 seconds to under 3 seconds.
• Conducted A/B testing on subscription pricing changes across 50K+ customers, identifying pricing tiers that boosted ARR by 12%.
• Built Python-based data cleaning workflows, reducing duplicate records in customer datasets by 92%.
• Created predictive customer lifetime value models that improved upsell conversion rates by 15% through data-driven marketing campaigns.
EDUCATION
Master’s in Information Systems and Management
Saint Louis University, St. Louis, MO May 2025