OBJECTIVE
Seeking a challenging position as an ETL Developer
SUMMARY
10+ years of hands-on data migration experience across multiple data source/target data warehouse environments
Strong in MS SQL Server SSIS: designing, creating and deploying Integration Services packages; developing control flow and data flow; familiar with connections, tasks, transformations, variables and package configuration
Solid work experience with Informatica PowerCenter mapping and workflow design
Experience designing and building scalable DataStage solutions for financial data marts
Extensive experience in ETL, data warehousing/data marts, OLTP (online transaction processing), relational OLAP (ROLAP) and multidimensional OLAP (MOLAP)
Extensive experience in enterprise BI report performance optimization and data organization
Design and implement BI/reporting solutions per requirements gathered from business stakeholders, collecting the required data with BI tools: Power BI, Dremio, Alteryx, Tableau and MicroStrategy
Rich database experience with Redshift, S3, Netezza, Oracle and MS SQL Server as well as PostgreSQL, plus SQL, PL/SQL and T-SQL programming
Hands-on experience with AWS S3, Lambda, EC2, CloudWatch, RDS and Redshift, as well as PostgreSQL
Utilize Python 3.6 pandas to extract data from AWS Redshift and write CSV files for downstream applications (see the sketch after this summary)
Convert SAS scripts to source from universal CSV files instead of a retiring database
Improve report performance by tuning queries, improving database design and processes, limiting final input data and applying proper automation control
10+ years of experience across the enterprise application development lifecycle, including business analysis, design, development, testing and implementation
Implement conceptual, logical and physical data models for RDBMSs, data warehouses and data marts using ERwin, Toad Data Modeler and Visio
Advanced experience with the SDLC and iterative Scrum/Agile development using Jira, Bitbucket and Jenkins
Professional documentation experience with effective categorization, naming conventions and standardized templates
Excellent verbal and written communication skills and the ability to interact professionally with diverse groups
Strong analytical and problem-solving abilities; detail oriented with strong organizational skills
Proactive, self-motivated team player; quick learner and hard worker
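A minimal sketch of the pandas extract pattern referenced above, assuming hypothetical connection details, table and column names; reading in chunks keeps memory flat and avoids one long-running cursor on large result sets:

    import pandas as pd
    from sqlalchemy import create_engine

    # Placeholder Redshift connection (Redshift speaks the PostgreSQL protocol)
    engine = create_engine("postgresql+psycopg2://etl_user:secret@redshift-host:5439/dw")

    query = "SELECT account_id, balance, as_of_date FROM mart.account_daily"
    first = True
    for chunk in pd.read_sql(query, engine, chunksize=50_000):
        # Write the first chunk with a header, then append the rest
        chunk.to_csv("downstream_extract.csv", mode="w" if first else "a",
                     header=first, index=False)
        first = False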
WORK EXPERIENCE
Sr. Data Analyst
Fannie Mae, Washington DC, May 2019 – May 2024
Industry: Finance
Developed Informatica mappings and workflows and performed upgrade validation
Sourced data from S3 files with Dremio and created reports in Tableau
Utilized S3 copy and Redshift COPY to load flat-file data into AWS Redshift (see the sketch after this section)
Extracted data from Redshift with Python 3.6 pandas to generate files for downstream applications
Solved a Redshift connection timeout issue with pandas chunked loading, and a Netezza timeout issue by creating an external file function, improving extract performance
Configured S3 bucket/key and Redshift properties for data migration processing
Developed SAS scripts that sourced data from files and created datasets for reports
Reconciled old-model data against the redesigned Enterprise Data Warehouse data to ensure reports generated correctly with the new data model
Designed data migration automation for the application with Autosys and UNIX scripts
Performed end-to-end testing with the SF3 application, calling and debugging SAS code for business UAT
Collaborated with Quality Assurance, DBA and release management teams on project implementations
Identified, documented and performed root cause analysis to prevent production issues from recurring
Utilized Jenkins to deploy code to Test and UAT environments
Tested AWS Lambda and Step Functions and validated results in CloudWatch
Created JIL contingency AutoSys jobs using the App ID, and used YBYO to manipulate files for testing
Compared data across databases and files with Alteryx and other tools
Environment: AWS Lambda, EC2, Redshift, S3, CloudWatch, Python, SAS, Autosys, Netezza, Oracle, Bitbucket, Jira, Jenkins
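A minimal sketch of the S3-to-Redshift COPY load referenced above, with hypothetical bucket, table and IAM role names; COPY runs server-side in Redshift and loads the file in parallel:

    import psycopg2

    conn = psycopg2.connect(host="redshift-host", port=5439,
                            dbname="dw", user="etl_user", password="secret")
    copy_sql = """
        COPY staging.daily_positions
        FROM 's3://example-bucket/incoming/daily_positions.csv'
        IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
        CSV IGNOREHEADER 1 TIMEFORMAT 'auto'
    """
    with conn, conn.cursor() as cur:
        cur.execute(copy_sql)  # Redshift pulls the file from S3 itself
    conn.close()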
Sr. BI Consultant
DXC Technology
Project: EDW on Azure Cloud/Hive, McLean VA, Feb 2018 – May 2019
Industry: Retail
Solved a client performance issue by creating an EDW, using Extract, Load and Transform from the current system into Big Data Hive (data lake) and presenting the data through SQL DB to Power BI
Ingested files from brands worldwide into cloud Hive (data lake) and loaded them into the data mart; organized source data time zones and dependencies to prepare the data for final report runs (see the sketch after this section)
Designed the client's data warehouse solution, covering analysis, modeling, ETL and BI reporting
Converted the client's old report database to a dimensional design model to reduce report latency
Took advantage of elastic cloud compute to add power for peak runs and improve report performance
Collected and summarized file arrival times and job dependencies to generate the job schedule for automated loading
Provided a solution for the initial load of large-volume historical data and handled data quality and file format issues
Extracted data directly from the AX2012 and D365 ERP systems and loaded it into the data warehouse for reports
Developed Power BI visualizations for business KPI needs
Environment: Azure, Cloud DB, HDInsight, Hadoop, Hive 3.0, SQL DB, Power BI, SSIS, Jira, DevOps Git, ORC files, HQL, Logic App
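A minimal sketch of the Hive ingestion pattern above, with hypothetical database, table, column and path names; files already landed in the data lake are exposed as a new partition of an ORC table rather than re-copied:

    from pyhive import hive

    cur = hive.connect(host="hdinsight-head-node", port=10000, username="etl").cursor()
    cur.execute("""
        CREATE TABLE IF NOT EXISTS edw.sales_snapshot (
            store_id STRING, sku STRING, qty INT, amount DECIMAL(18,2)
        )
        PARTITIONED BY (brand STRING, load_date STRING)
        STORED AS ORC
    """)
    # Register the day's landed ORC files as a partition of the table
    cur.execute("""
        ALTER TABLE edw.sales_snapshot
        ADD IF NOT EXISTS PARTITION (brand='acme', load_date='2019-01-31')
        LOCATION '/data/lake/acme/2019-01-31'
    """)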
Big Data Consultant
Toronto Big Data Professional Association
Project: Big Data in Finance, Toronto ON, Oct 2016 – Jan 2018
Industry: Finance
Explored data insights to support decisions that improve department and corporate performance
Designed and developed a POC for the client to evaluate a Big Data platform against the current platform
Migrated data from traditional databases into a data lake to improve report performance and reduce cost
Presented data to the client for marketing and analytical purposes
Evaluated different tools and platforms against the needs of the client's environment and budget
Environment: Hadoop, Hive, Netezza, UNIX Linux, SQL, Python, Informatica, XML, Agile, Git
Senior ETL Developer
Next Pathway
Project: Scotia RCRR, CPP, CRP, GLRAS, Toronto ON, May 2016 – Apr 2017
Industry: Bank
Organized data through layers: ingested data into the Temp Storage Zone, loaded it to the Snapshot layer, transformed it to the Semantic layer and presented it to users for report generation (see the sketch after this section)
Designed and developed the ELT processes to solve complex data problems with HQL and Java scripts
Implemented Big Data ELT with Talend 6.1 using tELTHiveInput, tELTHiveMap, tELTHiveOutput, tRunJob
Embraced the Agile process, involving the client in every step of the project and responding quickly to change
Designed and built scalable DataStage solutions for the financial data mart
Utilized table partitions to balance storage and performance
Evaluated different execution engines (MapReduce, Tez and Spark) to improve run time
Collaborated with Data Architects and BA to refine the requirements and designs
Tested code and documented unit test cases; resolved QA and UAT issues through Jira
Environment: Hadoop, Hive, Spark, Java, Python, Informatica, Talend, DataStage, Agile, Toad, Linux, Jira, Confluence, HUE, Git, Avro files, HQL, SAS
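A minimal sketch of one Snapshot-to-Semantic hop referenced above, with hypothetical layer, table and column names; the execution engine is switched to Tez and the target partition is rebuilt with INSERT OVERWRITE:

    from pyhive import hive

    cur = hive.connect(host="edge-node", port=10000, username="etl").cursor()
    cur.execute("SET hive.execution.engine=tez")
    cur.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
    # Rebuild one business date of the Semantic layer from the Snapshot layer
    cur.execute("""
        INSERT OVERWRITE TABLE semantic.position_daily PARTITION (business_date)
        SELECT account_id,
               SUM(exposure) AS total_exposure,
               business_date
        FROM snapshot.position_raw
        WHERE business_date = '2017-03-31'
        GROUP BY account_id, business_date
    """)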
Senior ETL Developer/Lead
BMO
Project: AML, MECH, OSFI, Basel III, GTSCM, Finance, North York ON, Nov 2014 – May 2016
Industry: Bank (Capital Market)
Designed and developed ELT loading processes to meet reporting timing and data requirements
Coordinated with different departments to set up SLAs for each source file and designed the seamless schedule and dependencies for daily BI reports
Improved Netezza database performance using ORGANIZE ON, DISTRIBUTE ON and GROOM (see the sketch after this section)
Developed ELT code to load data from multiple sources into a landing database and then into the data mart for reporting
Led development, applying ELT, framework and database standards
Retrieved MFA log summary data from Hive into Netezza through Fluid Query to generate part of the risk score factor, and used Spotfire to generate analytic reports
Migrated a large volume of data (5 TB) as a pre-deployment activity in production for a new release
Implemented large project releases by coordinating the development, QA, scheduling and release teams
Utilized the development framework to manage ELT metadata and customized it for special requirements
Communicated efficiently with BAs, the business, PMs and vendors to collect and clarify requirements
Performed data validation processes to control data errors and prevent bad data from loading into the database
Environment: Netezza, DB2, SQL Server 2012, Power Designer, UNIX Linux, SQL, XML, Hadoop, Hive, Spotfire, Agile
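A minimal sketch of the Netezza tuning pass referenced above, with a hypothetical DSN and table names; the fact table is redistributed on the join key, clustered on common filter columns, then groomed to apply the organization and reclaim space:

    import pyodbc

    conn = pyodbc.connect("DSN=NZSQL", autocommit=True)
    cur = conn.cursor()
    # Rebuild with an explicit distribution key to avoid data skew on joins
    cur.execute("""
        CREATE TABLE mart.trade_fact_v2 AS
        SELECT * FROM mart.trade_fact
        DISTRIBUTE ON (trade_id)
    """)
    # Cluster on the usual filter columns, then groom to reorganize the rows
    cur.execute("ALTER TABLE mart.trade_fact_v2 ORGANIZE ON (business_date, desk_id)")
    cur.execute("GROOM TABLE mart.trade_fact_v2 RECORDS ALL")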
Senior ETL Developer/Lead
Rogers Inc.
Project: CDR, CRM, Billing System, Brampton ON, Jun 2011 – Nov 2014
Industry: Telecom
Led ETL development according to coding standards, using a restartable and scalable strategy in Informatica workflows, and automated ETL processes with Control-M scheduler design
Analyzed, defined and documented requirements for data, workflow and logical processes
Designed and developed ETL code with Informatica, Control-M and Unix scripts to meet business and project requirements and help the business solve report problems
Used Hadoop to store huge volumes of CDR and unstructured data as sources for the data warehouse
Utilized the control framework to collect metadata for each workflow for data quality control (see the sketch after this section)
Handled data errors at the code, database and control framework levels based on requirement needs
Applied Teradata and Informatica functions to handle huge volumes of CDR and billing data
Performed unit test, code review, integration test and generated test documents
Created detailed design documents, unit test documents, implementation plans, ETL process documentation and run books
Assisted Business Analyst with data profiling activities
Environment: Informatica 9.5/8.6/7, Teradata v13, Control-M 7/8, Toad 10, Cognos 8, MicroStrategy; Oracle 11g/9i, SQL Server; UNIX AIX, PL/SQL, XML, Hadoop, Hive
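A minimal sketch of the audit idea behind such a control framework, with a hypothetical DSN, tables and workflow name; each run records source and target row counts so mismatches are flagged before reports consume the data:

    from datetime import datetime, timezone
    import pyodbc

    conn = pyodbc.connect("DSN=EDW", autocommit=True)
    cur = conn.cursor()
    src_rows = cur.execute("SELECT COUNT(*) FROM stg.cdr_landing").fetchone()[0]
    tgt_rows = cur.execute("SELECT COUNT(*) FROM dw.cdr_fact").fetchone()[0]
    # Persist the counts and a pass/fail status for this workflow run
    cur.execute(
        "INSERT INTO ctl.workflow_audit "
        "(workflow_name, run_ts, src_rows, tgt_rows, status) "
        "VALUES (?, ?, ?, ?, ?)",
        "wf_load_cdr", datetime.now(timezone.utc), src_rows, tgt_rows,
        "OK" if src_rows == tgt_rows else "MISMATCH",
    )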
Senior ETL Developer
CGI Inc.
Project: Atlantic Lottery Corporation Information System, Moncton NB, Nov 2010 – Jun 2011
Industry: Government
Analyzed the business requirements to design, architect, develop and implement efficient and scalable ETL flow
Reviewed, analyzed and developed fact tables, dimension tables and reference tables for Data Marts
Inspected and analyzed the documents of business and system requirements for EDW (Enterprise Data Warehouse)
Made recommendations for ETL design and performance enhancements for large data volume loads and processing
Prepared, maintained and published ETL documentation, including source-to-target mappings, business-driven transformation rules, functional design, test report, installation instruction and runbook
Developed SSIS solutions using Integration Services projects for control, extract, data cleansing, transform and load
Developed and maintained data management standards and conventions, data dictionaries, data elements naming standards and metadata standards
Created test cases for ETL; prepared test data and test scripts to test fact tables and dimension tables
Environment: SSIS, Cognos; SQL Server 2005, 2008R2, Informix, DB2; AccuRev, JIRA; ERwin, Toad, XML, Espresso
EDUCATION
Bachelor of Engineering, Computer Science
South China University of Technology
Bachelor of Economics
Guangdong University of Finance and Economics
REFERENCES
Available upon request