
Data Engineer Sql Server

Location:
Tampa, FL
Posted:
May 17, 2024

Resume:

NAME: Sasi kiran

Email: ad5rsi@r.postjobfree.com PH: 813-***-****

Data Engineer

Professional Summary

Professional with 9+ years of experience in the Information Technology industry, working as a Data Engineer with hands-on expertise in Database Development, ETL Development, Data Modelling, and Big Data technologies.

Experience with ETL workflow management via Apache Airflow and in writing Python scripts to implement the workflows.
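
A minimal sketch of this kind of Airflow-managed workflow is shown below; the DAG name, schedule, and extract/load callables are hypothetical placeholders, not taken from any specific project.

  # Minimal Apache Airflow DAG sketch (hypothetical task names and schedule).
  from datetime import datetime
  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def extract():
      # Placeholder: pull data from a source system.
      pass

  def load():
      # Placeholder: write transformed data to the warehouse.
      pass

  with DAG(
      dag_id="etl_daily",
      start_date=datetime(2024, 1, 1),
      schedule_interval="@daily",
      catchup=False,
  ) as dag:
      extract_task = PythonOperator(task_id="extract", python_callable=extract)
      load_task = PythonOperator(task_id="load", python_callable=load)
      extract_task >> load_task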

Experience in performance tuning of SQL queries by analyzing the code, recreating user driver tables with the right Primary Index, scheduling collection of statistics, and defining secondary and join indexes.

Around 5 years’ experience in Azure Cloud Services (PaaS & IaaS), Azure Synapse Analytics, SQL Azure, Data Factory, Azure Analysis services, Application Insights, Azure Monitoring, Key Vault, Azure Data Lake.

Knowledge and experience in dimensional modelling (star schema, snowflake schema), transactional modelling, and Slowly Changing Dimensions (SCD).

Experience in using Relational Database Management Systems (RDBMS) such as Oracle, MS SQL Server, Teradata, DB2, and MS Access, and in database development using SQL and PL/SQL, including writing, testing, and implementing triggers, stored procedures, functions, packages, and cursors in PL/SQL.

Experience in designing and developing Azure Stream Analytics jobs to process real-time data using Azure Event Hubs, Azure IoT Hub, and Service Bus queues.

Experienced with Cloudera, Hortonworks, and MapReduce.

In-depth understanding of data structures and algorithms.

Hands-on experience with the Snowflake cloud data warehouse on AWS and with AWS S3 buckets for integrating data from different source systems, including loading nested JSON-formatted data into Snowflake tables.
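
A hedged sketch of such a load using the Snowflake Python connector, assuming a pre-created external S3 stage and JSON file format; the account, table, and stage names below are hypothetical.

  # Load nested JSON from an S3 stage into a Snowflake table (hypothetical names).
  import snowflake.connector

  conn = snowflake.connector.connect(
      account="my_account", user="my_user", password="my_password",
      warehouse="my_wh", database="my_db", schema="my_schema",
  )
  try:
      cur = conn.cursor()
      # VARIANT column holds the nested JSON; it can be unpacked later with FLATTEN in SQL.
      cur.execute("CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT)")
      cur.execute("""
          COPY INTO raw_events
          FROM @my_s3_stage/events/
          FILE_FORMAT = (TYPE = 'JSON')
      """)
  finally:
      conn.close()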

Experience with data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP.

Experience in integration of Informatica Data Quality (IDQ) with Informatica PowerCenter.

Experience in Data Mining solutions to various business problems and generating data visualizations using Tableau, Power BI, Alteryx.

Experience in Data Integration and Data Warehousing using ETL tools like Informatica PowerCenter, AWS Glue, SQL Server Integration Services (SSIS), and Talend.

Experience in Business Intelligence Solutions with Microsoft SQL Server and using MS SQL Server Integration Services (SSIS), MS SQL Server Reporting Services (SSRS) and SQL Server Analysis Services (SSAS).

Experience in utilizing Informatica PowerCenter, Informatica Data Quality (IDQ) as ETL tool for extracting, transforming, loading, and cleansing data from various source data inputs to various targets, in batch and real time.

Experience in working with Amazon Web Services (AWS) cloud and services like Snowflake, SQS, S3, and EC2.

Experience in design, development, unit testing, integration, debugging, implementation, and production support, as well as client interaction and understanding of business applications, business data flows, and data relations.

Analyzed data and provided insights using Python library functions from packages such as Pandas.
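
A typical Pandas analysis of this kind might look like the sketch below; the file and column names are purely illustrative.

  # Illustrative Pandas summary (hypothetical file and column names).
  import pandas as pd

  df = pd.read_csv("sales.csv", parse_dates=["order_date"])
  # Monthly revenue per region as a quick insight table.
  summary = (
      df.groupby([df["order_date"].dt.to_period("M"), "region"])["revenue"]
        .sum()
        .reset_index()
  )
  print(summary.head())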

Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
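
AWS Data Pipeline drives such loads through its own pipeline definitions; for illustration only, the underlying S3-to-Redshift COPY step can be sketched in Python with psycopg2, with the cluster endpoint, table, bucket, and IAM role below being placeholders.

  # Redshift COPY from S3 (hypothetical cluster, table, bucket, and IAM role).
  import psycopg2

  conn = psycopg2.connect(
      host="my-cluster.example.us-east-1.redshift.amazonaws.com",
      port=5439, dbname="analytics", user="etl_user", password="secret",
  )
  with conn, conn.cursor() as cur:
      cur.execute("""
          COPY analytics.orders
          FROM 's3://my-bucket/orders/'
          IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
          FORMAT AS PARQUET
      """)
  conn.close()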

Worked on Data Migration from Teradata to AWS Snowflake Environment using Python and BI tools like Alteryx.

Experience in moving data between GCP and Azure using Azure Data Factory.

Developed Python scripts to parse flat files, CSV, XML, and JSON files, extract the data from various sources, and load it into the data warehouse.
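
A small sketch of the parsing side of such scripts, using only the Python standard library; the file names and element layout are hypothetical.

  # Parse CSV, JSON, and XML inputs into plain Python dicts (hypothetical layouts).
  import csv
  import json
  import xml.etree.ElementTree as ET

  def read_csv(path):
      with open(path, newline="") as f:
          return list(csv.DictReader(f))

  def read_json(path):
      # Assumes the JSON file holds a list of records.
      with open(path) as f:
          return json.load(f)

  def read_xml(path):
      # Assumes <record> elements whose child tags hold the values.
      root = ET.parse(path).getroot()
      return [{child.tag: child.text for child in rec} for rec in root.findall("record")]

  rows = read_csv("customers.csv") + read_json("orders.json") + read_xml("products.xml")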

Worked on JIRA for defect/issues logging & tracking and documented all my work using Confluence.

TECHNICAL SKILLS:

Operating Systems: Windows 98/2000/XP/7/8/10, Mac OS, and Linux CentOS

Programming Languages: Python, R

Web Technologies: XML, JSON

Python Libraries/Packages: Boto3, Pandas, Matplotlib, httplib2, urllib2, Beautiful Soup, NumPy

IDE: PyCharm, Jupyter, PyStudio, Sublime Text, Visual Studio Code

Automation: Terraform, Git, BitBucket

Cloud: Microsoft Azure, AWS

Databases/Servers: MySQL, Redis, SFTP, FTP, PostgreSQL, Aurora DB, DynamoDB, MongoDB

Big Data Technologies: Hadoop, MapReduce, HDFS, Sqoop, PIG, Hive, HBase, Kafka, Yarn, Apache Spark.

ETL: Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Redshift, PostgreSQL, Snowflake.

Web Services/Protocols: TCP/IP, UDP, FTP, HTTP/HTTPS, SOAP, REST, RESTful

Build and CI tools: Azure DevOps, Azure Pipelines, Jenkins, Airflow

SDLC/Testing Methodologies: Agile, Waterfall, Scrum, TDD

Statistical Analysis Skills: A/B Testing, Time Series Analysis

Machine Learning and Analytical Tools: Supervised Learning (Linear Regression, Logistic Regression, Decision Tree, Random Forest, SVM, Classification), Unsupervised Learning (Clustering, KNN, Factor Analysis, PCA), Natural Language Processing, Tableau.

Professional Experience

Berlin Packaging, Chicago, IL Jan 2023 – Till Date

Sr. Azure Data Engineer

Responsibilities:

Designed and Implemented Big Data Analytics architecture, transferring data from Oracle.

Analyze, design, and build Modern data solutions using Azure PaaS service to support visualization of data. Understand current Production state of application and determine the impact of new implementation on existing business processes.

Designed, developed, and deployed Business Intelligence solutions using SSIS, SSRS and SSAS

Experienced with API & Rest services in collecting data and publishing to downstream applications.

Worked on Google Cloud Platform (GCP) services like Compute Engine, Cloud Load Balancing, Cloud Storage, and Cloud SQL.

Designed data pipelines to migrate data from on-prem/traditional sources to the cloud platform.

Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
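
A minimal PySpark sketch of this extract-and-aggregate pattern; the paths, columns, and file layouts are hypothetical.

  # PySpark: read multiple formats, transform, and aggregate (hypothetical paths/columns).
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.appName("usage-insights").getOrCreate()

  events_json = spark.read.json("dbfs:/raw/events/*.json")
  events_csv = spark.read.option("header", True).csv("dbfs:/raw/events_csv/")

  usage = (
      events_json.unionByName(events_csv, allowMissingColumns=True)
      .withColumn("event_date", F.to_date("event_ts"))
      .groupBy("customer_id", "event_date")
      .agg(F.count("*").alias("events"), F.sum("duration_sec").alias("total_duration"))
  )
  usage.write.mode("overwrite").parquet("dbfs:/curated/customer_usage/")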

Developed business intelligence solutions using SQL Server Data Tools 2015 and 2017 and loaded data into SQL Server and Azure cloud databases.

Performed ETL using Azure Databricks. Migrated on-premises Oracle ETL processes to Azure Synapse Analytics.

Involved in creating fact and dimension tables in the OLAP database and created cubes using MS SQL Server Analysis Services (SSAS).

Experience in Google Cloud components, Google Container Builder, and GCP client libraries.

Exposure to Lambda functions and Lambda Architecture.

Created DDLs for tables and executed them to create tables in the warehouse for ETL data loads.

Implemented logical and physical relational databases and maintained database objects in the data model using Erwin.

Extracted, transformed, and loaded data from source systems to Azure Data Storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
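
The ingestion itself is orchestrated by ADF; the Databricks processing side can be sketched as below, assuming access to an ADLS Gen2 account is already configured, with the storage account, container, and table names being hypothetical.

  # Databricks/PySpark sketch: read from Azure Data Lake Gen2 and stage a curated table
  # (storage account, container, and table names are hypothetical; ADLS auth assumed configured).
  from pyspark.sql import SparkSession

  spark = SparkSession.builder.getOrCreate()

  raw_path = "abfss://raw@mystorageacct.dfs.core.windows.net/sales/2024/"
  sales = spark.read.format("parquet").load(raw_path)

  sales.createOrReplaceTempView("sales_raw")
  curated = spark.sql("""
      SELECT region, product_id, SUM(amount) AS total_amount
      FROM sales_raw
      GROUP BY region, product_id
  """)
  # Assumes a 'curated' database already exists in the metastore.
  curated.write.mode("overwrite").saveAsTable("curated.sales_summary")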

Worked with Azure Data Factory (ADF), Integration Runtime (IR), file system data ingestion, and relational data ingestion.

Created reusable SSIS packages to extract data from multi-formatted flat files and Excel files into SQL databases.

Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from different sources like Azure SQL, Blob Storage, and Azure SQL Data Warehouse, and to write data back to those sources.

Have good experience working with Azure Blob and Data Lake storage and loading data into Azure Synapse Analytics (SQL DW).

Worked on migration of data from On-prem SQL server to Cloud databases (Azure Synapse Analytics (DW) & Azure SQL DB).

Developed JSON Scripts for deploying the Pipeline in Azure Data Factory (ADF) that process the data using the SQL Activity.

Created SSIS packages to perform filtering operations and to import data on a daily basis from the OLTP system into SQL Server.

Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL Database.

Built MDX queries for Analysis Services (SSAS) & Reporting Services (SSRS).

Experienced in querying data using Spark SQL on top of the Spark engine and implementing Spark RDDs in Scala.

Worked on designing, building, deploying, and maintaining MongoDB.

Developed an ETL framework using Spark and Hive (including daily runs, error handling, and logging) to produce useful data.

Coordinated with the team and developed a framework to generate daily ad hoc reports and extracts from enterprise data, automated using Oozie.

Experienced in designing and developing data models for the OLTP database, the Operational Data Store (ODS), the data warehouse (OLAP), and federated databases to support the client's enterprise Information Management Strategy, with excellent knowledge of the Ralph Kimball and Bill Inmon approaches to data warehousing.

Responsible for maintaining and tuning existing cubes using SSAS and Power BI.

Worked on cloud deployments using Maven, Docker, and Jenkins.

Environment: Azure Data Factory, GCP, Data Storage, Data Lake, Databricks, Power BI, Hive, Spark, Jenkins, Docker, Kubernetes, Maven

Rx Savings Solutions, Kansas Jan 2021 – Dec 2022

Sr. Azure Data Engineer

Responsibilities:

Involved in business Requirement gathering, business Analysis, Design and Development, testing and implementation of business rules.

Experience in leading offshore team in assigning tasks, getting status updates & helping offshore team to provide technical solutions.

Creating pipelines, data flows and complex data transformations and manipulations using Azure Data Factory (ADF) and PySpark with Databricks.

Created and provisioned multiple Databricks clusters needed for batch and continuous streaming data processing and installed the required libraries for the clusters.

Designing and Developing Azure Data Factory (ADF) pipelines to extract data from Relational sources like Teradata, Oracle, SQL Server, DB2 and non-relational sources like Flat files, JSON files, XML files, Shared folders etc.

Developed streaming pipelines using Apache Spark with Python.
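
A small Structured Streaming sketch of this kind of pipeline; the broker address, topic, and sink paths are placeholders, and Kafka is used here purely as an illustrative source.

  # Spark Structured Streaming sketch: read from Kafka, write to a parquet sink
  # (broker, topic, and paths are hypothetical).
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.appName("stream-ingest").getOrCreate()

  stream = (
      spark.readStream.format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
  )

  parsed = stream.select(F.col("value").cast("string").alias("payload"),
                         F.col("timestamp"))

  query = (
      parsed.writeStream
      .format("parquet")
      .option("path", "/mnt/datalake/streaming/events/")
      .option("checkpointLocation", "/mnt/datalake/checkpoints/events/")
      .outputMode("append")
      .start()
  )
  query.awaitTermination()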

Develop Azure Databricks notebooks to apply business transformations and perform data cleansing operations.

Develop Databricks Python notebooks to Join, filter, pre-aggregate, and process the files stored in Azure data lake storage.

Ingested huge volume and variety of data from disparate source systems into Azure Data Lake Gen2 using Azure Data Factory V2.

Created reusable pipelines in Data Factory to extract, transform and load data into Azure SQL DB and SQL Data warehouse.

Experienced in developing audit, balance and control framework using SQL DB audit tables to control the ingestion, transformation, and load process in Azure.

Used Azure Logic Apps to develop workflows, which can send alerts/notifications on different jobs in Azure.

Used Azure DevOps to build and release different versions of code in different environments.

Automated jobs using Scheduled, Event based, tumbling window triggers in ADF.

Created pipelines, data flows in Azure Data Factory for data loading and transformation.

Used Azure Blob Storage, Azure databases, and data warehouses as sources for Power BI.

Used Power Query for the ETL process and to import data into Power BI data model.

Created calculated columns and measures using DAX expressions for use in KPIs.

Developed statistical techniques to describe data, using groups, bins, hierarchies, sorts, sets, and filters to create focused and effective visualizations.

Wrote complex SQL queries and materialized views and used them as data sources for reports.

Implemented Row Level Security in Power BI to manage data privacy across user levels.

Worked on the Product Key Life Cycle analysis project and published it to the Power BI Service.

Used Git repositories and Azure DevOps for deploying pipelines across multiple environments.

Created interactive Dashboards using Power BI for analytics Team.

Environment: Azure Data Factory, Data Storage, Data Lake, Databricks, Power BI, Hive, Spark, Jenkins, Docker, Kubernetes, Maven

Ferguson, Charlotte, NC May 2019 – Dec 2020

Data Engineer

Responsibilities:

Performed Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export.

Involved in Business Requirements Analysis, preparation of Technical Design documents, Data Analysis, Logical and Physical database design, Coding, Testing, Implementing, and deploying to business users.

Involved in the review of functional requirement specifications and supporting documents for business systems, experience in database design process and data modelling process.

Used SSIS to populate data from various data sources, creating packages for different data loading operations for applications.

Performed migration to move Data Warehouse from an Oracle platform to AWS Redshift.

Built S3 buckets and managed policies for S3 buckets and used S3 bucket and Glacier for storage and backup on AWS.

Developed complex mappings using Informatica Power Center Designer to transform and load the data from various source systems (Oracle, Teradata etc.,) into the final target database.

Analyzed source data coming from different sources (SQL Server tables, XML, flat files, etc.), transformed it according to business rules using Informatica, and loaded the data into target tables.

Involved in creating the Tables and loading the data through Alteryx for Global Audit Tracker.

Analyzed large and critical datasets using HDFS, HBase, Hive, HQL, PIG, and Sqoop.

Changed existing data models using Erwin for enhancements to existing data warehouse projects.

Used Talend connectors integrated with Redshift for BI development across multiple technical projects running in parallel.

Created an iterative macro in Alteryx to send JSON requests and download JSON responses from a web service.

Supported various business teams with data mining and reporting by writing complex SQL queries using OLAP functions such as ranking, partitioning, and windowing functions.
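
The same ranking/partitioning pattern can be illustrated with PySpark window functions (the project's actual SQL is not reproduced here; the table and column names below are hypothetical).

  # Illustrative ranking/partitioning with window functions in PySpark
  # (hypothetical table and columns; the project used equivalent SQL OLAP functions).
  from pyspark.sql import SparkSession, functions as F
  from pyspark.sql.window import Window

  spark = SparkSession.builder.getOrCreate()
  orders = spark.table("sales.orders")

  w = Window.partitionBy("region").orderBy(F.col("amount").desc())
  top_orders = (
      orders.withColumn("rank_in_region", F.rank().over(w))
            .filter(F.col("rank_in_region") <= 3)
  )
  top_orders.show()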

Worked with EMR, S3, and EC2 services in the AWS cloud and migrated servers, databases, and applications from on-premises to AWS.

Tuned SQL queries using Explain to analyze data distribution among AMPs and index usage, collecting statistics, defining indexes, revising correlated sub-queries, and using hash functions.

Developed shell scripts for job automation, to generate the log file for every job.

Wrote complex SQL queries using joins, sub-queries, and correlated sub-queries; expertise in SQL queries for cross-verification of data.

Extensively worked on performance tuning of Informatica and IDQ mappings.

Worked on Power BI solutions to migrate reports from SSRS.

Created, maintained, supported, repaired, and customized system and Splunk applications, search queries, and dashboards.

Experience in data profiling & various data quality rules development using Informatica Data Quality (IDQ).

Created new UNIX scripts to automate and handle different file processing, editing, and execution sequences with shell scripting, using basic Unix commands and the 'awk' and 'sed' editing languages.

Experience in cloud versioning technologies like GitLab and GitHub.

Integrated Collibra with Data Lake using Collibra connect API.

Worked with the team on ETL Processes in AWS Glue to migrate data from external sources (S3, Parquet, .txt Files) into AWS Redshift.
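
A hedged sketch of such a Glue job in Python; the Glue connection name, S3 paths, and Redshift table are placeholders rather than the project's actual configuration.

  # AWS Glue PySpark job sketch: S3 (Parquet) -> Redshift (hypothetical names and paths).
  import sys
  from awsglue.context import GlueContext
  from awsglue.job import Job
  from awsglue.utils import getResolvedOptions
  from pyspark.context import SparkContext

  args = getResolvedOptions(sys.argv, ["JOB_NAME"])
  glue_context = GlueContext(SparkContext.getOrCreate())
  job = Job(glue_context)
  job.init(args["JOB_NAME"], args)

  source = glue_context.create_dynamic_frame.from_options(
      connection_type="s3",
      connection_options={"paths": ["s3://my-bucket/exports/"]},
      format="parquet",
  )

  glue_context.write_dynamic_frame.from_jdbc_conf(
      frame=source,
      catalog_connection="redshift-connection",  # hypothetical Glue connection name
      connection_options={"dbtable": "public.exports", "database": "analytics"},
      redshift_tmp_dir="s3://my-bucket/temp/",
  )
  job.commit()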

Created firewall rules to access Google Dataproc from other machines.

Environment: AWS, Informatica Power Center, Talend, PostgreSQL Server, Python, Oracle, Teradata, CRON, Unix Shell Scripting, SQL, Erwin, AWS Redshift, GitLab, Power BI

Solix Technologies, India Jan 2016 – Dec 2018

Data Engineer

Responsibilities:

Collaborate with Business users to gather requirements and define/implement analytical tools, reports, metrics, and dashboards.

Developed analytical dashboards to provide relevant information to business users and senior management.

Responsible for extracting data from multiple systems and databases including SQL Server and Oracle to consolidate into a centralized system.

Performed data migration between different environments and third-party databases.

Interpreted client's data to identify key metrics, strategic insights, and conclusions.

Created drill-down analysis reports using SQL Server Reporting Services.

Performed integration and ETL tasks by extracting, transforming, and loading data to and from different RDBMS.

Wrote SQL queries and scripts to extract, aggregate, analyze and validate data accuracy and created Excel summary reports.
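
For illustration, this kind of validation query plus Excel summary can be sketched with pandas and SQLAlchemy; the connection string, table, and columns are hypothetical, not the project's actual objects.

  # Illustrative validation query plus Excel summary (hypothetical DSN and table).
  import pandas as pd
  import sqlalchemy

  engine = sqlalchemy.create_engine("mssql+pyodbc://user:password@my_dsn")
  df = pd.read_sql(
      "SELECT department, COUNT(*) AS rows_loaded, SUM(amount) AS total_amount "
      "FROM staging.transactions GROUP BY department",
      engine,
  )
  # Writing .xlsx output requires an Excel engine such as openpyxl.
  df.to_excel("load_summary.xlsx", index=False)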

Used ETL (Extract, Transform, and Load) to collect data from different departments and make it available to management for quick response. Also used Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) on a day-to-day basis for operations and analysis.

Performed routine functions in Informatica to load the data from a flat source file into the target by applying transformations to the data according to the project requirements.

Identified data elements from the source systems, performed data analysis to come up with data cleansing and integration rules for the ETL process.

Performed risk analysis with technical team to identify the key business risks areas for the project and prioritized the application development and testing.

Engaged in user facing and communication tasks while interacting with support teams, Business users and Super Users, vendors, and quality assurance team.

Documented monthly status reports for enhancement and modification requests for the development team to assist them in efficient tracking and monitoring of open issues of the project.

Created Entity/Relationship Diagrams, grouped and created the tables, validated the data, identified PKs for lookup tables.

Analyzed data and provided insights using Python library functions.

Worked with the team to design the data marts in dimensional data modelling using star and snowflake schemas.

Redefined attributes and relationships in the models and cleansed unwanted tables/columns as part of data analysis responsibilities.

Involved in loading the data from Source Tables to Operational Data Source tables using Transformation and Cleansing Logic.

Developed dashboards using Tableau to provide the management with an overall understanding of resource optimization, attaining incremental revenue worth $1.15 million.

Worked on stored procedures for processing business logic in the database.

Performed query tuning to improve performance, along with index maintenance.

Worked on the reporting requirements for the data warehouse.

Created support documentation and worked closely with production support and testing team.

Developed UNIX shell scripts to run batch jobs and load into production and used Unix Shell Scripts for adding the header to the flat file targets.

Prepared test cases and was involved in unit testing and system integration testing.

Performed deep analysis of SQL execution plans and recommended hints, restructuring, or the introduction of indexes or materialized views for better performance.

Utilized Power Query in Power BI to Pivot and Un-pivot the data model for data cleansing.

Environment: SQL, ETL, Informatica, UNIX, MS Access, MS Excel, HP ALM, Agile, Tableau, Python

Hudda InfoTech Private Limited, India July 2014 – Dec 2015

Data Analyst

Responsibilities:

Gathered requirements, analyzed, and wrote the design documents.

Performed data profiling in the source systems that are required for Dual Medicare Medicaid DataMart.

Documented the complete process flow to describe program development, logic, testing, implementation, application integration, and coding.

Worked with project team representatives to ensure that logical and physical ER/Studio data models were developed in line with corporate standards and guidelines.

Involved in defining the source-to-target data mappings, business rules, and business and data definitions.

Responsible for defining the key identifiers for each mapping/interface.

Responsible for defining the functional requirement documents for each source to target interface.

Documented, clarified, and communicated change requests with the requestor and coordinated with the development and testing teams.

Worked on identifying Facts from Existing Relational Models and sub-Models.

Worked on identifying Dimension tables from the existing models of data marts.

Designed the data mart in ER/Studio using Ralph Kimball's dimensional data mart modeling methodology.

Designed and implemented data integration modules for Extract/Transform/Load (ETL) functions.

Involved in data warehouse and Data mart design.

Experience with various ETL, data warehousing tools and concepts.

Performed daily data queries and prepared reports on a daily, weekly, monthly, and quarterly basis.

Used advanced Excel functions to generate spreadsheets and pivot tables.

Worked with the users to do the User Acceptance Testing (UAT).

Created Technical specifications documents based on the functional design document for the ETL coding to build the data mart.

Extensively involved in Data Extraction, Transformation and Loading (ETL process) from Source to target systems.

Wrote SQL queries for custom reports.

Worked on a daily basis with lead Data Warehouse developers to evaluate the impact on the current implementation and the redesign of all ETL logic.

Environment: ER Studio, Power Designer, SQL Server, Oracle 10g & 11g, MS Office, ETL (Extract, Transform, Load), XML, PowerPoint, MS Visio, and MS Outlook.


