Chittaranjan Sahoo
Cumming, GA
+1-404-***-**** ********@*****.*** LinkedIn Github Kaggle
Professional Summary
IT professional with 15 years in Database Development, ETL Development, Data Engineering, Data Warehousing, Data Migration, Data Quality, Business Intelligence & Data Analytics, including 4 years in the United States.
Expertise in database development, OLAP, OLTP, and ETL/ELT data pipelines
Expertise in ANSI SQL, PL/SQL, and T-SQL, and in RDBMSs such as Oracle, MS SQL Server, MySQL, and PostgreSQL
In-depth knowledge of Azure data services such as Azure Data Factory, Azure Data Lake Storage, Azure Databricks, and Azure Synapse Analytics.
Write Spark applications to process and analyze large datasets efficiently using Spark's core abstractions: RDDs (Resilient Distributed Datasets), DataFrames, and Spark SQL.
Experienced with the Spark ecosystem using Scala and Hive queries on data formats such as text, Parquet, Avro, and ORC.
Implemented data integration solutions, including ETL (Extract, Transform, Load) processes, using Azure Data Factory, Informatica, and Python.
Proficiency in Azure Cosmos DB and its associated APIs
Experience in data visualization using Cognos and Tableau.
Experience in Snowflake Cloud Technology.
Working knowledge of Microsoft Azure, AWS, and GCP
Understanding of data science areas such as statistical analysis, machine learning algorithms, classification, clustering, regression, and feature engineering
Worked closely with product, data science, data visualization, risk & compliance, and sourcing teams to provide required datasets.
Excellent problem-solving and debugging skills.
Strong teamwork and communication skills.
Able to adapt quickly to new technologies and challenges.
Academics
10/2011 – Master of Computer Applications, MKU, TN, India
(US equivalent of Master of Science Degree in Computer Information Systems)
Certifications & Badges
Python for Data Science - University of California, San Diego (EdX)
Microsoft Azure AZ-900
Snowflake - Hands On Essentials - Data Warehouse
IBM Certified Designer Cognos 10 BI Reports
Oracle Certified Associate- Oracle Database 10g
Work Experience
# 1 Azure Lead, Cielo Talent (RPO) (through KForce)
Feb/ 2022 – Present
Working with business partners, architects, and other groups to identify the technical and functional needs of analytical systems and determine the priority of needs.
Data ingestion into Azure: data sources included files in various formats and an on-prem data warehouse (MS SQL Server). Built more than 200 data pipelines delivering output into Azure SQL Database or Azure Cosmos DB for advanced analytics, with pipeline orchestration in Azure Data Factory, Databricks, and Synapse.
As Azure Data Factory (ADF) lead, managed and oversaw the development, deployment, and maintenance of data pipelines in the ADF environment, and led a team of data engineers and developers through the design, development, testing, and deployment of these pipelines.
Develop, test, implement, and document technical engineering solutions to assist business partners' self-service analytic needs, client reporting, and data consumption requirements.
Optimize the data integration platform to provide optimal performance under increasing data volumes.
Analyze, define, design, and document requirements for data, workflow, logical processes, and system environments.
Provide post-deployment/production support for users through Zendesk/Jira tickets
# 2 Data Engineer, Home Depot (Inventory Planning) (through KForce)
Sep/ 2021 – Feb/ 2022
Worked with the business team to understand requirements; developed Python code with Airflow to design data pipelines, and SQL for transformations.
Analyzed data and developed BigQuery pipelines, orchestrated with Airflow, for downstream EDW Advanced Service Level apps to calculate weekly safety stock quantities for different locations.
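As an illustration of the kind of transformation these pipelines performed, a minimal safety-stock calculation can be sketched in plain Python (the service level, demand variability, and lead time below are hypothetical, not figures from the project):

```python
import math

def safety_stock(demand_std: float, lead_time_weeks: float, service_z: float = 1.65) -> float:
    """Classic safety-stock formula: z * sigma_demand * sqrt(lead time)."""
    return service_z * demand_std * math.sqrt(lead_time_weeks)

# Hypothetical example: weekly demand std dev of 40 units, 4-week lead time,
# roughly 95% service level (z ≈ 1.65).
print(round(safety_stock(40, 4), 1))  # 132.0
```

In the actual pipelines, a formula of this shape would run as SQL inside BigQuery per location, scheduled by Airflow.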
# 3 Data Engineer, Verizon (DCIM Migration)
Feb/ 2021 – Sep/2021
Designed the migration of DCIM (Data Center Infrastructure Management) data from Oracle and SQL Server to Sunbird dcTrack and PostgreSQL
Designed and implemented the migration of Salesforce data for the Oracle Fusion application.
Used SQL Server Integration Services (SSIS) packages for data integration.
Defined the migration approach and a detailed migration plan
Analyzed interdependencies and constraints of SFDC reports with respect to the target schema
Defined data validation, data profiling, and data quality rules
Provided L2 support for user requests and incidents, providing technical solutions
Prepared reconciliation, DQ/DI, and data validation reports using Tableau
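The core of a reconciliation report of this kind can be sketched in plain Python (table names and row counts here are hypothetical; in practice the counts would be queried from the source and target databases):

```python
def recon_report(source_counts: dict, target_counts: dict) -> list:
    """Compare row counts per table between source and target systems."""
    report = []
    for table in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(table, 0)
        tgt = target_counts.get(table, 0)
        report.append({
            "table": table,
            "source_rows": src,
            "target_rows": tgt,
            "status": "MATCH" if src == tgt else "MISMATCH",
        })
    return report

# Hypothetical counts from Oracle (source) and PostgreSQL (target).
for row in recon_report({"assets": 1200, "racks": 300},
                        {"assets": 1200, "racks": 298}):
    print(row["table"], row["status"])
```

The same comparison extends naturally to checksums or column-level profiles for the DQ/DI checks.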
# 4 Big Data Engineer, AT&T (Customer Experience Analysis of Postpaid Users)
Jun/ 2019 – Jan/2021
Worked with the client to understand the business problem.
Identified the sources required to address the business problem of classifying customer experience.
Designed a Data Lake in Hadoop (HDP) to ingest big data from heterogeneous sources like Dynatrace, Splunk, Oracle, HTML logs and Quantum Metric.
Developed scripts to extract, transform and load data into Data Lake.
Designed and developed data processing applications using Apache Spark and Scala to streamline the data ingestion process.
Worked extensively with the Spark ecosystem using Scala and Hive queries on data formats such as text and Parquet; installed Spark on top of Hadoop and built advanced analytical applications combining Spark with Hive and SQL/Oracle.
Automated the whole data loading process from landing to staging to master tables.
Developed Data Cleansing programs to load master tables in Hive for analysis.
Implemented feature engineering and performed exploratory data analysis
Developed Python code to classify data using unsupervised machine learning.
Designed and Developed an Enterprise Data lake using Azure Data Lake Storage (ADLS) with Databricks.
Prepared 60+ aggregated tables for visualization in Cognos / QlikView
Developed a Sankey diagram to present different customer browsing behavior patterns, which was highly appreciated by the client.
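The unsupervised classification mentioned above can be illustrated with a minimal one-dimensional k-means sketch in pure Python (the session-duration values and k=2 are hypothetical; the project used richer features and standard ML libraries):

```python
import random

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Minimal 1-D k-means: assign each point to its nearest centroid,
    then recompute centroids as cluster means, repeating for `iters` rounds."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Hypothetical session durations (seconds): two clear groups near 10 and 100.
print(sorted(kmeans_1d([8, 9, 11, 12, 95, 100, 105], k=2)))  # [10.0, 100.0]
```

Once centroids stabilize, each customer session is labeled by its nearest centroid, which is the "classification via unsupervised learning" pattern.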
# 5 Business Intelligence Analyst, AT&T (Gate Keeper)
Jun/ 2018 – May/ 2019
Designed Data warehouse model for Oracle ERP using Star Schema model.
Consolidated ETL and visualization into a single platform.
Implemented reusable ETL data pipelines for complex business rules.
Created a batch process that scheduled and ran the ETL via parameterization.
Redesigned and tuned long-running queries for the nightly load.
Created health checks to compare incoming data vs. data loaded in target.
Developed Tableau dashboards and analytical reports using KPIs.
Prepared Business Requirements, GAP Analysis & ETL Design Documents.
# 6 Business Intelligence Analyst, CenturyLink (Revenue Assurance)
Jan/ 2018 – May/ 2018
Designed enterprise Data Model for Oracle and SAP ERP system.
Created ETL data pipelines to load complex datasets in data warehouse.
Implemented SCD to track history from different OLTP systems.
Developed UNIX scripts for pre- and post-loading processes.
Migrated various legacy applications to a new ETL and visualization platform
Improved performance of various SQL-based jobs
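The SCD history tracking mentioned above (assumed here to be Type 2, the most common variant) can be sketched in plain Python; in the project it was implemented in ETL tooling against the warehouse, and the keys and attributes below are hypothetical:

```python
from datetime import date

def scd2_apply(dimension: list, incoming: dict, load_date: date) -> list:
    """Type 2 slowly changing dimension: when a tracked attribute changes,
    close the current row (end_date, is_current=False) and append a new
    version; if nothing changed, leave history untouched."""
    for row in dimension:
        if row["key"] == incoming["key"] and row["is_current"]:
            if row["attrs"] == incoming["attrs"]:
                return dimension  # no change detected
            row["is_current"] = False
            row["end_date"] = load_date
            break
    dimension.append({
        "key": incoming["key"],
        "attrs": incoming["attrs"],
        "start_date": load_date,
        "end_date": None,
        "is_current": True,
    })
    return dimension

# Hypothetical customer changing plans: history grows to two versions.
dim = []
scd2_apply(dim, {"key": "C001", "attrs": {"plan": "basic"}}, date(2018, 1, 5))
scd2_apply(dim, {"key": "C001", "attrs": {"plan": "premium"}}, date(2018, 3, 1))
print(len(dim), dim[0]["is_current"], dim[1]["is_current"])  # 2 False True
```

The same close-and-append pattern is what a warehouse MERGE statement performs when loading from the OLTP change feed.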
# 7 Business Intelligence Developer, AT&T (Broadband Operations)
Apr/ 2015 – Dec/ 2017
SME for the BOBI, RDW, and WDW applications.
Reverse-engineered existing workflows and mappings to optimize code, achieving a 20% YoY reduction in job run time.
Created shell/Python scripts to generate automated alerts and avoid potential SLA misses.
Modified and configured a chatbot to generate automated alerts for job failures, long-running critical SLA jobs, and database usage.
Created one-time/ad-hoc Teradata load/pull queries to resolve skew errors and provide quick resolution for SLA jobs.
Modified existing Teradata queries to resolve skew and spool errors.
Modified Informatica mappings to accommodate business requirements via monthly maintenance change requests.
# 8 Business Intelligence Developer, General Electric (Digital Energy)
Apr/ 2011 – Mar/ 2015
Developed complex aggregated queries using joins and subqueries.
Developed complex ETL jobs using Informatica and DataStage.
Designed several indexes for faster data retrieval.
Created database metadata catalogs/data dictionary.
Provided hands-on support for all data related activities.
# 9 Business Intelligence Developer, HP
Dec/ 2008 – Mar/ 2011
Developed various database objects.
Member of a 24x7 Oracle on Demand Team supporting Oracle Database and Application.
Created data selection rules using data query languages via joins.
Set up Automatic Storage Management; performed reconciliation checks for the data migration.
Industries:
Telecom
Retail, Orders & Sales
Manufacturing
RPO
Skills:
Data Infrastructure
Data Engineering
Big Data
Data Warehouse & BI
Data Migration
Data Integration
Data wrangling, Statistics
Master Data Management
Change Data Capture
Oracle, MS SQL Server, MySQL, PostgreSQL, MongoDB
Azure Data Factory
Azure SQL Database
Azure Blob Storage
Azure Cosmos DB
Azure Data Lake Storage
Azure Synapse Analytics
Hadoop, Hive, Sqoop, Kafka
Apache Spark, Scala
GCP, BigQuery
SQL Performance Tuning
Snowflake Cloud Database
SoapUI, REST API, JSON
CI/CD (Git)
Unix Shell Script
Scrum, Agile Methodologies
Microsoft Excel