Bhanu Prakash
ETL Developer / Data Engineer
205-***-**** ***************.***@*****.***
PROFESSIONAL SUMMARY:
Results-driven ETL Developer / Data Engineer with 5+ years of experience designing and optimizing data pipelines using SQL, Python, and PySpark in banking and financial environments. Proven expertise in building AML and financial crime data solutions that support compliance initiatives such as BSA/AML, FATCA, and KYC. Adept at developing robust, scalable ETL frameworks using Spark, Databricks, AWS, and Snowflake while ensuring data quality, governance, and regulatory adherence. Strong track record of collaborating with cross-functional AML, Risk, and Compliance teams to deliver analytics-ready data for transaction monitoring, sanctions screening, and regulatory reporting, and of migrating on-premises data lakes to cloud-native architectures using scalable ETL/ELT pipelines.
Expertise in designing and developing large-scale ETL pipelines using SQL, Python, and PySpark for financial data processing.
Strong background in banking, AML, and compliance systems, supporting BSA/AML, FATCA, and KYC/CDD initiatives.
Hands-on experience with Spark, Databricks, and AWS Glue for distributed data transformation and analytics.
Proven ability to integrate data from core banking, KYC, and transaction monitoring systems such as Actimize and Mantas.
Skilled in building data ingestion, cleansing, and validation frameworks that ensure compliance and data integrity.
Deep understanding of financial crime analytics, sanctions screening, and suspicious activity reporting (SAR) workflows.
Experience with data warehousing and modeling using Snowflake, Redshift, and Oracle (Star/Snowflake schema).
Proficient in data governance, lineage tracking, and audit readiness for regulatory data submissions.
Expertise in big data platforms (Spark, Hive, HDFS) and cloud ecosystems (AWS, GCP, Azure).
Strong in ETL performance tuning, SQL optimization, and managing high-volume financial data.
Experienced in Agile/Scrum development, collaborating closely with compliance, risk, and audit stakeholders.
Skilled in CI/CD implementation, Airflow orchestration, and production monitoring using Control-M.
Proven success in delivering data-driven AML insights and dashboards for compliance and fraud detection.
Familiar with Apache Airflow for orchestrating ETL job dependencies and execution in modern pipelines.
Migrated on-premises ETL workflows to Google Cloud Composer (Airflow) to enable scalable and reliable orchestration.
Created and published parameterized, drill-down, and subscription-based reports using SSRS to support business reporting requirements.
Used Git and GitHub for version control and collaboration with team members across environments.
Developed automation scripts using Python and Shell for ETL monitoring and file processing.
Effective communicator with the ability to translate business requirements into scalable data solutions.
TECHNICAL SKILLS:
ETL Tools: Informatica PowerCenter, AWS Glue, IBM DataStage, Talend, SSIS
Data Integration: Mapping Designer, Workflow Manager, Data Cleansing, Transformations, Data Mapping
Databases: Snowflake, Oracle, MS SQL Server, DB2
Big Data Tools: Apache Spark, Hive, HDFS, PySpark, Kafka, Databricks
Programming: SQL, PL/SQL, Python (scripting), Shell Scripting, XML
Cloud Platforms: AWS (S3, RDS, EC2, IAM, DMS, Glue Catalog), Google Cloud Platform (GCP), Azure
Data Warehousing: Star Schema, Snowflake Schema, Fact & Dimension Modeling
Scheduling Tools: Control-M, Tivoli Workload Scheduler (TWS)
Orchestration: Apache Airflow, Databricks Workflows
Version Control: Git, GitHub
Testing & Quality: Data Validation, Data Profiling, Unit Testing
DevOps: CI/CD concepts, Git workflows, Python automation scripts, Azure DevOps
Documentation & BI: MS Office Suite, TOAD, Oracle SQL Developer, Excel, PowerPoint, Power BI, Jaspersoft, SSRS
Methodologies: Agile, Scrum, Release Entry Framework (REF)
EDUCATION DETAILS:
Master of Science in Information Technology from Lindsey Wilson University, Aug 2023 – Mar 2025
Bachelor of Technology in Computer Science from KL University, June 2014 – May 2018
PROFESSIONAL EXPERIENCE
Verizon – New York, USA May 2024 – Present
ETL Developer / Data Engineer
Key Responsibilities:
Designed and developed end-to-end ETL pipelines using PySpark and Databricks to process multi-terabyte structured and semi-structured data for compliance reporting and AML analytics.
Built and optimized SQL and Spark SQL transformations for integrating transactional, customer, and KYC data from multiple source systems into AWS and Snowflake-based warehouses.
Implemented data quality frameworks with validation, exception handling, and error logging to ensure compliance-grade data integrity.
Automated data ingestion from AWS S3 and RDS using Glue and Lambda triggers for seamless AML data updates.
Developed reusable PySpark modules for AML metrics such as transaction velocity, high-risk jurisdiction exposure, and suspicious pattern detection.
Collaborated with AML and Financial Crime teams to enable regulatory and investigative reporting pipelines aligned with BSA/AML and FATCA guidelines.
Deployed orchestration workflows using Apache Airflow and Databricks Workflows, reducing manual intervention and improving SLA compliance.
Partnered with Compliance and Data Governance teams to maintain lineage and audit trail for AML data assets using Unity Catalog.
Conducted performance tuning and query optimization, reducing Spark job runtimes by over 30%.
Delivered ad-hoc analysis and AML dashboards in Power BI, integrating risk KPIs and case tracking summaries.
Built cloud-native data pipelines using GCP services including BigQuery, Cloud Storage, and Dataflow for high-volume data processing.
Implemented data lake ingestion from GCS buckets and integrated GCP with third-party APIs for seamless data integration.
Integrated big data technologies such as Apache Spark, Hive, and HDFS to support data processing at scale.
Used Kafka and AWS DMS to handle real-time and batch data ingestion from on-prem and cloud environments.
Created and maintained Airflow DAGs to manage job dependencies and automate workflows across multiple ETL pipelines.
Wrote Python and Shell scripts for file processing, data validation, and ETL automation tasks.
Developed interactive Power BI dashboards using data from Snowflake and AWS RDS to provide stakeholders with real-time insights and key performance metrics.
Engineered ETL solutions to handle structured (RDBMS) and semi-structured (JSON, XML, Parquet) data from transaction, customer, and KYC systems.
Implemented data masking and encryption strategies in compliance with internal data security and privacy regulations.
Ensured AML data pipelines adhered to enterprise data governance policies, enabling traceable and auditable transformations.
Developed Spark ETL modules using both PySpark and Scala for high-performance data transformation and financial transaction aggregation.
Integrated ETL deployment into CI/CD pipelines using Jenkins and Git, automating build, validation, and deployment processes across environments.
Automated regulatory reporting data feeds to support BSA/AML and FATCA submissions, reducing manual compliance workloads.
HSBC Bank – India July 2021 – June 2023
ETL Developer
Key Responsibilities:
Designed, developed, and maintained AML-focused ETL workflows using Spark, Python, and SQL to process transactional, sanctions, and KYC datasets supporting AML risk analytics and compliance teams.
Worked closely with AML analysts and investigators to integrate data feeds from Actimize, Mantas, and customer onboarding systems into centralized data warehouses.
Built data validation, profiling, and transformation layers to improve data accuracy and timeliness for Suspicious Activity Report (SAR) generation.
Developed data pipelines for sanctions screening, watchlist monitoring, and account linkage analysis.
Implemented error handling and exception management frameworks for regulatory data submissions under BSA/AML and FATCA.
Migrated legacy ETL workflows to Databricks and Azure Data Factory, reducing latency and enabling parallel execution of data jobs.
Participated in code reviews and compliance audits, ensuring ETL design adhered to AML policies and global standards.
Built reporting layers using SQL and Power BI to visualize AML rule hit ratios, case disposition trends, and high-risk entity summaries.
Collaborated in Agile sprints with cross-functional compliance, risk, and audit teams to deliver incremental AML data enhancements.
Designed and maintained new data ingestion processes to support warehouse and reporting requirements.
Built ETL architecture and developed Source-to-Target Mapping (STM) documents to streamline data movement.
Performed source system analysis, transformation logic implementation, data loading, and validation for data marts, ODS, and enterprise data warehouse layers.
Automated file ingestion processes and logging using SSIS Script Tasks and event handlers.
Processed structured and semi-structured AML datasets across customer due diligence (CDD), KYC, and transaction systems for analytics and risk scoring.
Designed ETL validation steps ensuring data met BSA/AML, FATCA, and global compliance standards before regulatory submission.
Collaborated with Compliance and Risk Officers to ensure data security and confidentiality in handling sensitive financial information.
Participated in governance and data stewardship initiatives, ensuring metadata accuracy and policy alignment across AML repositories.
Created detailed mapping documents outlining field-level data flow between source systems and data warehouse targets.
Implemented Spark Scala jobs for AML risk scoring models and alert rule computation, optimizing job execution and scalability.
Collaborated with DevOps teams to enable CI/CD automation using Azure DevOps for Databricks notebooks and ETL job promotions.
Partnered with Compliance Operations to automate SAR and AML regulatory report generation, ensuring timely and accurate data delivery to regulators.
Environment: Informatica PowerCenter, Oracle SQL Developer, SQL Server, VSTS (Azure DevOps), SharePoint, Jaspersoft
TATA Consultancy Services (TIAA – FSDF Project) – India Oct 2019 – June 2021
ETL DataStage Developer
Key Responsibilities:
Developed and optimized ETL pipelines using IBM DataStage and SQL for enterprise financial data mart consolidation.
Enhanced data integration frameworks supporting customer profiling, investment transactions, and reporting dashboards.
Worked with financial data modeling (Star/Snowflake schema) and implemented parameterized job designs.
Conducted data validation and reconciliation ensuring financial reporting accuracy across multiple source systems.
Automated job scheduling and monitoring using Control-M for production reliability.
Documented ETL logic, mappings, and lineage for audit and compliance purposes.
Wrote UNIX Shell scripts to wrap ETL job execution, manage logs, and automate workflows.
Created and maintained JIL scripts for scheduling and automation of ETL processes in AutoSys.
Managed metadata and configuration repositories, using Subversion (SVN) for source code version control.
Maintained and updated production support documentation for ETL processes and job dependencies.
Provided production support by investigating data issues and job failures and implementing fixes or performance optimizations.
Participated in job performance tuning and SQL query optimization for faster load times and better resource utilization.
Supported data quality initiatives by identifying and resolving data inconsistencies and mapping gaps across systems.
TATA Consultancy Services (Nordstrom Project) – India Oct 2018 – Sep 2019
ETL / Informatica Developer
Key Responsibilities:
Designed and implemented Informatica PowerCenter mappings and workflows to integrate retail sales and customer data.
Developed SQL and PL/SQL transformations and validation scripts for source-to-target consistency.
Built parameterized jobs and performance-tuned ETL workflows for faster data loads.
Supported production jobs, error resolution, and post-load validation as part of L2 support.
Performed unit testing and data validation to ensure data accuracy before loading into target systems.
Created and maintained user roles, user groups, and database connections using Informatica Administrator (Supervisor) for access control and environment setup.
Participated in code reviews, testing cycles, and deployment activities to ensure high-quality deliverables.
Supported ETL production jobs, troubleshooting failures, and performing root cause analysis for resolution.
Maintained technical documentation for all ETL processes, data flow, and job dependencies.
Environment: Informatica PowerCenter, Oracle, MS SQL Server, SQL, PL/SQL, SQL*Loader, UNIX Shell Scripting.