
Data Engineer

Location: Bangalore, Karnataka, India
Posted: November 27, 2020

Resume - Snowflake Data Engineer

Priyabrata Dash

IBM, Bengaluru, Karnataka
adh6eu@r.postjobfree.com

+919*********

Headline

Data Engineer with expertise in PySpark, Airflow, Python, Snowflake DB, AWS Redshift, Data Engineering, Business Glossary, Data Governance, Data Lineage, Data Quality and HVR.

Total Experience: 14 years 2 months

Profile

Currently working as a Data Engineering Lead, building a telecom-based Data Hub product using Lambda Architecture on CDH 6.3.x, where the Data Lake is built with PySpark, Airflow, HDFS and Hive, and the fast-data layer with Kafka, Spark Streaming and Cassandra.
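For illustration only, a minimal sketch of the kind of Kafka-to-Cassandra fast-data path described above, using PySpark Structured Streaming and the Spark Cassandra connector; the topic, keyspace, table and schema names are hypothetical placeholders, not the actual product configuration:

# Illustrative sketch only: a Kafka -> Spark -> Cassandra fast-data path.
# Topic, keyspace, table and schema names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StringType, LongType

spark = SparkSession.builder.appName("fast-data-layer").getOrCreate()

schema = (StructType()
          .add("event_id", StringType())
          .add("msisdn", StringType())
          .add("event_ts", LongType()))

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "cdr_events")                 # hypothetical topic
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

def write_to_cassandra(batch_df, batch_id):
    # Batch write via the Spark Cassandra connector (must be on the classpath).
    (batch_df.write
     .format("org.apache.spark.sql.cassandra")
     .options(table="events", keyspace="datahub")            # hypothetical keyspace/table
     .mode("append")
     .save())

query = events.writeStream.foreachBatch(write_to_cassandra).start()
query.awaitTermination()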

Worked as a Data Engineer building the Enterprise Data Lake for a leading US multinational by migrating from SAP BW to Snowflake DB: master, CRM & transaction data from SAP ECC/CRM is moved to Snowflake DB on AWS using Python, HVR for data replication, and DBT & Apache Airflow for the data pipeline & transformation.

My earlier data analytics experience includes supporting, managing and building the migration strategy and pipeline for a monthly data mart of a leading US insurance client from a Greenplum appliance to the Cloudera platform using Spark, Hive and Impala, and spearheading the Data Governance initiative for a leading US insurance client focusing on metadata analysis, Business Glossary, Data Lineage, Data Catalog, data visualization and Data Quality using IBM Information Governance Catalog & IBM Information Analyzer. Domain expertise in major industries: global securities custody, insurance & retirement, ship classification, and server management.
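As a hedged illustration of the Airflow-plus-DBT orchestration mentioned above for the Snowflake pipeline (not the client's actual configuration; the DAG id, schedule and project paths are assumptions):

# Illustrative sketch only: an Airflow DAG that runs DBT transformations
# after replicated data lands in Snowflake. Names, paths and schedule are assumptions.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator   # Airflow 2.x import path

with DAG(
    dag_id="snowflake_dbt_transform",              # hypothetical DAG name
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --profiles-dir /opt/dbt --project-dir /opt/dbt/analytics",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --profiles-dir /opt/dbt --project-dir /opt/dbt/analytics",
    )
    dbt_run >> dbt_test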

Managing the DevOps and cloud infrastructure (AWS, Ansible, Jenkins, Gerrit, GitLab, Kubernetes, Hadoop & Kafka) for the current client. Expert in data analysis & visualization: understanding the client's business and requirements, performing data lineage and analysis using SQL, and documenting visualizations built with Python, Tableau, Power BI, Mode and Metabase.

Good coordination with the onsite team in analyzing, suggesting and enhancing various processes and techniques for delivering quality products.

I have worked for clients like Hitachi, American Bureau of Shipping, Lloyd's Register, Standard Chartered Bank & Prudential Retirement.

I have worked in on-site locations with clients and business users in London, Jakarta, Singapore and Ivory Coast.

Experience working in an agile environment with an onsite-offshore global delivery model and with teams across diverse geographic locations, using tools like Slack, Teams, Jira, Trello and ServiceNow.

Career Highlight

Successfully built a unified YAML-based framework for building the data lake zones for data sets coming from various sources (JDBC, files, APIs) in different formats (CSV, flat file, JSON, ORC and Parquet), handling both snapshot and incremental data. The YAML framework also supports generating Airflow DAGs for scheduling, building data tables from SQL models, and data validation and data profiling (see the illustrative sketch below).

Successfully executed the upgrade of the current data pipeline moving data from SAP ECC/CRM to Snowflake, from Oracle Fusion, AWS S3 and Python to HVR for CDC, Airflow for pipeline scheduling, and DBT for transformation and building the analytics layer.

Executed a major Data Governance project to bring the Data Lineage, metadata & reporting assets related to the EDW, ODS & Retirement Data Mart into IBM Information Governance Catalog.
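A minimal sketch, under assumptions, of what such a YAML-driven ingestion definition and loader could look like; the dataset, paths, columns and options below are hypothetical placeholders, not the actual framework:

# Illustrative sketch only: a YAML ingestion spec and a tiny PySpark loader.
# Dataset names, paths, columns and options are hypothetical placeholders.
import yaml
from pyspark.sql import SparkSession

SPEC = """
dataset: customer_orders            # hypothetical dataset
source:
  type: jdbc                        # or: csv / json / orc / parquet
  url: jdbc:postgresql://host:5432/sales
  table: public.orders
load: incremental                   # or: snapshot
watermark_column: updated_at
target:
  zone: raw
  format: parquet
  path: /datalake/raw/customer_orders
"""

def ingest(spec_text, spark):
    spec = yaml.safe_load(spec_text)
    src = spec["source"]
    if src["type"] == "jdbc":
        # JDBC driver must be available on the Spark classpath.
        df = (spark.read.format("jdbc")
              .option("url", src["url"])
              .option("dbtable", src["table"])
              .load())
    else:
        # File-based sources: csv, json, orc, parquet, etc.
        df = spark.read.format(src["type"]).load(src["path"])
    if spec.get("load") == "incremental":
        # A real framework would read the last watermark from a control table.
        df = df.filter(df[spec["watermark_column"]] > "1970-01-01")
    (df.write.format(spec["target"]["format"])
       .mode("append" if spec["load"] == "incremental" else "overwrite")
       .save(spec["target"]["path"]))

spark = SparkSession.builder.appName("yaml-ingest").getOrCreate()
ingest(SPEC, spark)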

Successfully implemented the Enterprise Data Lake on a Spark platform, seamlessly handling complex ETL transformation strategies across relational, NoSQL and big data sources & targets.

Successfully executed data migration activities and the planning and execution of the country-wise roll-out of the Secure application, managing weekly production patch deployments and key application upgrades.

Developed a chatbot using NLP, IBM Watson Conversation API & Bluemix, now used to improve engagement and provide automated customer agents for internal users and customers.

Developed solutions for implementing agent commission automation using Hyperledger & IBM Bluemix, enabling the client to leverage blockchain capabilities in its retirement business.

Showcased the Watson Explorer & Visual Analytics capabilities of IBM Watson Analytics to Prudential clients visiting India.

Developed migration status and key performance parameter dashboards. Automated/bootstrapped repetitive and manual maintenance and migration tasks using Python. Led teams across broad technical, financial and business disciplines, taking care of end-to-end delivery of project deliverables, change management using Remedy, and production deployment support.

Understood & prepared the migration plan, supported & performed OAT & SAT testing, provided technical solutions to the team for issues, and supported master data migration, interface connectivity and DR of the NCS application for Ivory Coast. Went to Abidjan, Côte d'Ivoire to conduct user-level training for country users, support business/operational migration, and support the launch of the custody business for the Bank.

Visited client locations in Jakarta, Indonesia & Singapore to discuss and understand the environment of operational users and business stakeholders.

Developed a real-time processing engine for SWIFT messages from the custody system to the client-facing system using an internal message bus and IBM MQ. Validated the complete architecture and technical solution provided by the vendor for integration of IBM MQ into their legacy application for real-time message transfer.

Experience

Data Engineering Lead

Netcracker Technology Corp - May 2020 to Present

As a Data Engineering Lead I work in a technical lead capacity and remain hands-on, building the architecture for their Advanced Analytics product line, specifically the DataHub product. My responsibilities include building the base framework for the unified Data Lake for their BSS/OSS product offerings and supporting the integration of the DataHub product with their other Advanced Analytics offerings.

Data Specialist & Lead

IBM India Pvt Ltd- February 2015 to April 2020

Actively work with data modelers and integration architects to create high-level as well as low-level designs for the Snowflake environment. Conduct PoCs and present customers with different options, helping with the choice of tools.

Plan and drive the core team to deliver DWH redesign and migration to Snowflake & AWS Redshift where needed. Contribute to the Data Architect team in designing the data model for storage and access based on customer needs, keeping in view regulatory, archival and performance requirements. As part of my day-to-day job I do data modelling, ELT using Snowflake SQL, implementing complex stored procedures and managing the enterprise DWH. As part of the ETL workflow I help with data migration from SAP ECC/CRM and other RDBMS and NoSQL sources to the Snowflake cloud data warehouse.

I also perform Snowflake administration activities such as setting up resource monitors, RBAC controls, table clustering, virtual warehouse sizing, query performance tuning, zero-copy cloning, data sharing and events, and managing data security, data access controls and design. As part of BI pipeline support I manage data landing & extraction to AWS Redshift and AWS S3, build processes supporting data transformation, data structures, metadata, dependency and workload management, and manage the data visualization & data analytics layer. I use Snowflake utilities, SnowSQL, Snowpipe and Python for managing Snowflake.

Earlier I worked on a Big Data Analytics & Data Governance project, setting up an Enterprise Data Lake and ETL pipelines using Python, Spark, Apache NiFi, Hive and Hadoop for many of the client's emerging big data use cases. I also supported their Data Governance initiative as a Metadata Analyst on IBM Information Governance Catalog, where my day-to-day work involved the Business Glossary, Data Lineage, associating the Data Catalog with the Business Glossary, and automating Data Quality checks, metadata discovery and Data Lineage exploration.
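For illustration, a minimal sketch of the kind of Snowflake administration statements listed above, issued through the Snowflake Python connector; the account, role, warehouse and table names are placeholders, not the actual environment:

# Illustrative sketch only: typical Snowflake administration statements
# (resource monitor, warehouse sizing, clustering, zero-copy clone) issued
# via snowflake-connector-python. All object names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",            # placeholder credentials
    user="admin_user",
    password="********",
    role="ACCOUNTADMIN",             # resource monitors require account admin
)
cur = conn.cursor()

# Cap monthly credit usage with a resource monitor and attach it to a warehouse.
cur.execute("CREATE RESOURCE MONITOR IF NOT EXISTS etl_monitor WITH CREDIT_QUOTA = 100")
cur.execute("ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = etl_monitor")

# Resize a virtual warehouse for a heavy load window.
cur.execute("ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'LARGE'")

# Define a clustering key on a large fact table.
cur.execute("ALTER TABLE sales.public.orders CLUSTER BY (order_date)")

# Zero-copy clone for a point-in-time test copy.
cur.execute("CREATE TABLE sales.public.orders_clone CLONE sales.public.orders")

cur.close()
conn.close()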

Custody Services SME, Data Migration Expert and Project Manager

Scope International - June 2010 to February 2015

In addition to being a data migration expert, I was also responsible for the planning and execution of the country-wise roll-out of the Secure (centralized custody application based on BANCS) system, managing weekly patch deployments and application upgrades.

Technical Analyst in Scope International

Melstar Pvt Ltd - March 2010 to June 2010

Handled software delivery, test management & production deployment for the Bank's custody system, known as NCS.

Sr Software Engineer

L&T Infotech - June 2006 to February 2010

Worked as a key contributor to application re-architecture and reverse engineering for Lloyd's Register & ABS (American Bureau of Shipping), and earlier worked for Hitachi as a software developer on key components of their blade server management servers.

Education

Bachelor of Engineering in Information Technology from Silicon Institute of Technology with 70%

SSC from B.P College in Science, Odisha, with 62%

Skills / IT Skills

Data Analysis (hands-on experience): SQL querying on Oracle, Snowflake DB, MySQL, PostgreSQL, DB2 & basic data wrangling using Excel and Python Pandas

Data Governance: Metadata analysis and Data Lineage analysis using Information Governance Catalog, IBM Watson Explorer & IBM Watson Analytics

Python: Experience in standard libraries and in pandas, fabfile, invoke, Flask, Django (basic), Blaze, Dask, ODO, Boto, NLTK, NumPy, Scikit-Image, OpenCV

DevOps: Experience in tools like Splunk, Jenkins, Jira, ServiceNow, Docker, Kubernetes, Terraform, Ansible, GitLab, GitHub and Gerrit

Big Data: Hadoop, Hive, Scala, Spark, Impala, Sqoop, Flume, Solr, Elasticsearch

Data Science: Amazon SageMaker, R language, web scraping using requests, machine learning using scikit-learn, text analytics using spaCy and Gensim, deep learning using Keras, scikit-learn and TensorFlow, and statistics using Statsmodels

Data Visualization: Bokeh, matplotlib, Plotly, Tableau, Mode Analytics

Languages & Technologies (earlier worked, need refresh): Ruby, Excel Macros, VB Script, Linux scripting, XML, Oracle, Sybase, SQL, Java, Node JS

Integration & ETL Tools: Airflow, DBT for data transformation, HVR for data replication, DataStage, Apache NiFi, Apache Sqoop, Flume & IBM MQ

Version Control: Git, GitLab, GitHub, Bitbucket, SourceTree, SVN, ClearCase, Gerrit

Data Governance & Quality: IBM Information Governance Catalog, Enquero Data Catalog

Databases: AWS Redshift, Snowflake Cloud Database, MySQL, DB2, Sybase, MS SQL Server, Oracle 11g, SQLite, MongoDB, Neo4j, Titan DB, CouchDB, HBase, Cassandra, Redis

Scheduling: Apache Airflow, Control-M, AutoSys

Cloud & Cognitive: AWS (S3, EC2, ELB & ECR), Azure, Bluemix, IBM Watson, Docker, OpenShift, IBM Watson APIs, AWS Lambda, Kubernetes

Online Profile

https://www.linkedin.com/in/priyab-dash-21616b15


