SRIKAR V
United States
+1-682-***-**** *****************@*****.*** https://www.linkedin.com/in/srikarv12/
Professional Summary
Results-driven Data Engineer with 5+ years of experience designing and delivering enterprise-scale data solutions, including processing over $50B in daily financial transactions with 98.5% accuracy. Adept at architecting cloud-native platforms across AWS, Azure, and GCP, with proven success in reducing operational costs by up to 50% and improving data efficiency by 70%. Expert in building modern data ecosystems integrating real-time stream processing, distributed computing, and data analytics. Recognized for transforming legacy systems into scalable, secure infrastructures with robust governance frameworks. Strong leadership in cross-functional teams, with a focus on regulatory compliance, fraud detection systems, and enabling data-driven decision-making at scale.
Skills
• Programming Languages: Python (libraries: NumPy, Pandas, Matplotlib, SciPy), R (libraries: ggplot2), SQL (including experience with SSIS, SSRS, and SSAS), Java
• Data Processing & Pipelines: ETL/ELT, Apache Spark, Apache Airflow, Kafka, Terraform
• Cloud Platforms: AWS, Azure, GCP
• Databases: SQL Server, MySQL, NoSQL (MongoDB, Cassandra), Data Warehousing (Snowflake, Redshift, BigQuery)
• Big Data Technologies: Hadoop, Spark, Hive, Pig
• Machine Learning: Python libraries (Scikit-learn, TensorFlow, PyTorch), Model building, Evaluation, A/B Testing
• Data Visualization: Tableau, Power BI, Python libraries (Matplotlib, Seaborn), Advanced MS Excel, Google Data Studio
• Data Engineering Tools: Apache Kafka, Apache NiFi, Luigi, Apache Beam
• Cloud Services: AWS Glue, EMR, Redshift, Azure Data Factory, Databricks
• Data Modeling: Dimensional modeling, Star/Snowflake schemas
• Data Governance: Data quality, metadata management, data lineage
• Soft Skills: Time management, Leadership, Problem-solving, Collaboration, Decision-Making, Documentation and Presentation, Verbal communication
Work History
PNC Financials Jan 2024 - Present
Data Engineer Dallas, TX
• Architected innovative cloud-native data lake solution on AWS, processing $10B+ in daily financial transactions with advanced encryption and partitioning strategies, reducing storage costs by 50% while meticulously ensuring SOX and FINRA compliance through multiple encrypted data zones and comprehensive security controls.
• Developed sophisticated real-time transaction monitoring pipeline using Apache Kafka and Spark Streaming, processing 5M+ complex financial events per second with 98.5% fraud detection accuracy, implementing advanced machine learning algorithms and real-time anomaly detection techniques.
• Engineered comprehensive automated data quality framework using Great Expectations and dbt, reducing financial reporting anomalies by 90%, automating 95% of regulatory compliance checks, and establishing end-to-end data validation processes that proactively identify and mitigate potential data integrity issues.
• Implemented cutting-edge delta lake architecture for financial transaction processing, achieving 70% faster settlement times, enabling full ACID compliance across all banking operations, and providing real-time transactional consistency for critical financial systems.
• Created advanced ML feature store serving 300+ risk assessment models, reducing feature engineering time by 75% and improving fraud detection accuracy by 30% through standardized, reusable computational pipelines and sophisticated feature transformation techniques.
• Designed and implemented enterprise-grade data mesh architecture, improving cross-functional team data access by 90% while maintaining strict financial data governance and security standards through granular access control mechanisms.
• Optimized complex ETL workflows for regulatory reporting using Apache Airflow and dbt, reducing compliance report generation time by 65% and achieving 100% accuracy in regulatory submissions through automated validation and comprehensive error handling.
• Established advanced real-time monitoring system using Grafana and Prometheus, reducing mean time to resolve critical financial data pipeline issues by 80% through intelligent automated alerting, comprehensive observability, and proactive performance management.
MAQ Software Aug 2019 - Jul 2022
Client: Microsoft
Data Engineer Hyderabad
• Led comprehensive development of highly scalable data pipelines processing 30TB+ of complex enterprise data using Apache Spark and Azure Synapse, achieving 60% reduction in processing time and 40% cost optimization through advanced distributed computing techniques and intelligent resource allocation.
• Implemented robust automated data integration framework connecting 20+ diverse source systems, dramatically reducing manual intervention by 85%, improving data freshness by 70%, and creating a unified, consistent data ecosystem with seamless inter-system communication.
• Designed high-performance real-time analytics platform using Azure Event Hubs and Stream Analytics, processing 500K+ events per second with 99.95% accuracy, enabling near-instantaneous insights and supporting critical business decision-making processes.
• Created comprehensive data quality framework using Azure Data Factory, reducing data quality issues by 75% and automating 90% of quality checks through sophisticated validation rules, machine learning-driven anomaly detection, and comprehensive data profiling.
• Developed advanced CI/CD pipelines for data infrastructure using Azure DevOps, reducing deployment time by 80% and eliminating 95% of deployment-related issues through infrastructure-as-code, automated testing, and comprehensive release management strategies.
• Implemented enterprise-wide data governance solutions ensuring GDPR compliance, reducing audit findings by 90% through automated PII detection, advanced data masking techniques, and comprehensive data lineage tracking.
• Engineered distributed caching solution using Redis, improving query performance by 70% for frequently accessed datasets by implementing intelligent caching strategies and optimizing data retrieval patterns.
• Built sophisticated automated testing framework for data pipelines, achieving 95% test coverage and reducing production issues by 85% through comprehensive test scenarios, mock data generation, and continuous integration practices.
DataOne Jan 2019 - Jul 2019
Data Engineer Hyderabad
• Developed comprehensive ETL pipelines using Python and SQL, processing 5TB+ of daily data with 99.9% accuracy while reducing processing time by 40% through optimized data transformation techniques and parallel processing strategies.
• Implemented advanced data quality checks and real-time monitoring systems, proactively identifying and resolving 95% of potential data anomalies before production impact, ensuring high data integrity and reliability.
• Created sophisticated automated reporting system using Power BI, reducing manual report generation time by 80% and improving data visualization accuracy by 90% through interactive, dynamic dashboard design.
• Optimized complex database queries and implemented strategic indexing techniques, improving query performance by 65% across critical business workflows and reducing system resource consumption.
• Developed sophisticated Python scripts for data cleaning and transformation, reducing data preparation time by 70% while maintaining 99% accuracy through advanced data manipulation and cleansing algorithms.
• Implemented robust version control for data pipelines using Git, improving code quality by 85% through systematic review processes, collaborative development, and comprehensive change tracking.
• Assisted in strategic cloud migration projects, helping achieve 50% reduction in on-premises infrastructure costs through careful workload assessment and cloud optimization techniques.
• Built comprehensive documentation for data processes and pipelines, reducing onboarding time for new team members by 60% and establishing clear knowledge transfer mechanisms.
Microsoft Certifications (Transcript Link)
• Exam 761: Querying Data with Transact-SQL
• Exam 778: Analyzing and Visualizing Data with Microsoft Power BI
• Exam 767: Implementing a Data Warehouse
• DP-200: Implementing an Azure Data Solution
• DP-201: Designing an Azure Data Solution
• Microsoft Certified: Azure Data Engineer Associate
Education
The University of Texas at Arlington May 2024
Master of Science, Computer Science