Data Engineer (SQL Server)

Location:
Sacramento, CA
Salary:
45/hr
Posted:
December 04, 2023

Data Engineer

Email: ad1own@r.postjobfree.com | Contact: 916-***-****

LinkedIn: https://www.linkedin.com/in/mithula-p-183b91287/

Professional Summary:

• 4+ years of experience in Data Engineering, Data Pipeline Design, Development, and Implementation as a Sr. Data Engineer/Data Developer and Data Modeler.

• Strong experience in the Software Development Life Cycle (SDLC), including Requirements Analysis, Design Specification, and Testing, in both Waterfall and Agile methodologies.

• Strong experience in writing scripts with the Python, PySpark, and Spark APIs for data analysis.

• Extensively used Python libraries including PySpark, pytest, PyMongo, cx_Oracle, pyexcel, Boto3, Psycopg, embedPy, NumPy, and Beautiful Soup.

• Experienced in building automated regression scripts for validating ETL processes between multiple databases such as Oracle, SQL Server, Hive, and MongoDB using Python.

• Proficient in SQL across several dialects, including MySQL, PostgreSQL, Redshift, SQL Server, and Oracle.

• Experienced in big data analysis and developing data models using Hive, Pig, MapReduce, and SQL, with strong data architecture skills for designing data-centric solutions.

• Experience working with data modeling tools like Erwin and ER/Studio.

• Experience in implementing Azure data solutions: provisioning storage accounts, Azure Data Factory, SQL Server, SQL Databases, SQL Data Warehouse, Azure Databricks, and Azure Cosmos DB.

• Experience in designing Star and Snowflake schemas for Data Warehouse and ODS architectures.

• Expertise in the Amazon Web Services (AWS) cloud platform, including EC2, S3, VPC, ELB, IAM, DynamoDB, CloudFront, CloudWatch, Route 53, Elastic Beanstalk, EBS, Auto Scaling, Security Groups, EC2 Container Service (ECS), CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Redshift, CloudFormation, CloudTrail, OpsWorks, Kinesis, SQS, SNS, and SES.

• Good knowledge of Data Marts, OLAP, and Dimensional Data Modeling with the Ralph Kimball methodology (Star Schema and Snowflake modeling for Fact and Dimension tables) using Analysis Services.

• Skilled in performing data transfers between SAS, various databases, and data file formats such as XLS and CSV.

• Expertise in Python and Scala, including user-defined functions (UDFs) for Hive and Pig written in Python (a minimal sketch follows this summary).

• Experienced in developing and supporting Oracle SQL, PL/SQL, and T-SQL queries.

• Experience in designing and implementing data structures and using common business intelligence tools for data analysis.

• Expert in building Enterprise Data Warehouses and data warehouse appliances from scratch using both the Kimball and Inmon approaches.

• Experience working with Excel Pivot tables and VBA macros for various business scenarios.

• Expertise in SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS).
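
The following is a minimal sketch of one common pattern for the Python "UDFs" mentioned above: Hive's TRANSFORM clause pipes tab-separated rows to a script's stdin and reads the transformed rows back from stdout. The script name, columns, and normalization logic are hypothetical placeholders, not code from an actual project.

```python
#!/usr/bin/env python3
# clean_emails.py -- hypothetical streaming script used as a Hive "UDF".
# Hive's TRANSFORM clause sends each input row to stdin as tab-separated
# fields and reads one transformed, tab-separated row back from stdout.
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        continue  # skip malformed rows
    user_id, raw_email = fields[0], fields[1]
    email = raw_email.strip().lower()  # the normalization this "UDF" performs
    print("\t".join([user_id, email]))
```

It would typically be invoked from Hive with something like `ADD FILE clean_emails.py;` followed by `SELECT TRANSFORM(user_id, email) USING 'python3 clean_emails.py' AS (user_id, email) FROM raw_users;` (table and column names again hypothetical).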

Technical Skills:

Big Data Tools: Hadoop Ecosystem, MapReduce, Spark, Airflow, NiFi, HBase, Hive, Pig, Sqoop, Kafka, Oozie, Snowflake, Databricks

Methodologies: RAD, JAD, System Development Life Cycle (SDLC), Agile

Cloud Platforms: AWS (Amazon Web Services), Microsoft Azure

Data Modeling Tools: Erwin Data Modeler, ER/Studio v17

Programming Languages: SQL, PL/SQL, Python, Scala, UNIX shell

OLAP Tools: Tableau, SSAS, Business Objects

Databases: Oracle 12c/11g, MS SQL Server, MySQL, Teradata R15/R14

ETL/Data Warehouse Tools: Informatica 9.6/9.1, Tableau

Operating Systems: Windows, UNIX, Sun Solaris

Development Methods: Agile/Scrum, Waterfall

Stanford Health Care, CA Sep 2022 – Present

Role: Data Engineer

Responsibilities:

• Designed AWS architecture for cloud migration, AWS EMR, DynamoDB, Redshift, and event processing using Lambda functions.

• Used Amazon EMR for processing Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).

• Utilized AWS services with a focus on big data analytics, enterprise data warehouse, and business intelligence solutions to ensure optimal architecture, scalability, and flexibility.

• Imported and exported data into HDFS and Hive using Sqoop.

• Participated in the development and implementation of a Cloudera Hadoop environment.

• Ran queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.

• Used Bash and Python (including Boto3) to supplement automation provided by Ansible and Terraform for tasks such as encrypting EBS volumes backing AMIs.

• Used Terraform to migrate legacy and monolithic systems to Amazon Web Services.

• Wrote Lambda function code and set a CloudWatch Event as the trigger with a cron expression (a minimal Python sketch appears after this list).

• Validated Sqoop jobs and shell scripts and performed data validation to confirm data loaded correctly without discrepancies; performed migration and testing of static and transactional data from one core system to another.

• Created and ran Docker images with multiple microservices and orchestrated Docker containers using ECS, ALB, and Lambda.

• Developed Spark scripts in Scala, writing custom RDD transformations and actions.

• Created metric tables and end-user views in Snowflake to feed data for Tableau refreshes.

• Generated custom SQL to verify dependencies for the daily, weekly, and monthly jobs.

• Implemented Kafka producers with custom partitioning, configured brokers, and implemented high-level consumers for the data platform (also sketched after this list).
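
As a rough illustration of the scheduled Lambda pattern mentioned in the list above, the handler below simply logs the scheduled event and returns a status; the actual work and all names are hypothetical, and the cron schedule itself lives on the CloudWatch Events/EventBridge rule, not in the code.

```python
# scheduled_handler.py -- minimal sketch of a Lambda invoked by a CloudWatch
# Events / EventBridge rule whose schedule expression is a cron string,
# e.g. cron(0 6 * * ? *) for 06:00 UTC daily. Names and logic are placeholders.
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    # Scheduled events carry metadata such as the triggering rule ARN and time.
    logger.info("Triggered by %s at %s", event.get("resources"), event.get("time"))

    # Placeholder for the real work, e.g. kicking off a validation query
    # or refreshing a metric table.
    return {"statusCode": 200, "body": json.dumps({"status": "ok"})}
```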
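
The Kafka bullet above mentions producers with custom partitioning; the sketch below shows one way to express that with the kafka-python client, where a partitioner callable maps a record key to a partition. Broker address, topic, and key scheme are hypothetical placeholders.

```python
# partitioned_producer.py -- hedged sketch of a Kafka producer with a custom
# partitioner using kafka-python. All names here are illustrative only.
import json
import zlib

from kafka import KafkaProducer

def keyed_partitioner(key_bytes, all_partitions, available_partitions):
    """Send records with the same key to the same partition via a stable CRC32 hash."""
    if key_bytes is None:
        return available_partitions[0]
    return all_partitions[zlib.crc32(key_bytes) % len(all_partitions)]

producer = KafkaProducer(
    bootstrap_servers=["broker1:9092"],          # placeholder broker
    partitioner=keyed_partitioner,
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send("events-topic", key="source-a", value={"event": "load_complete"})
producer.flush()
```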

Skills: JavaScript, Database Administration, Software Development Life Cycle (SDLC), Good Clinical Practice (GCP), Hive, Computerized System Validation (CSV), Snowflake, HTML, Databases, Data Visualization, Jira, Scrum, Business Analysis, Microsoft Azure

Ojas Innovative Technologies, Hyderabad Jul 2020 – Aug 2021

Role: Data Analyst

Responsibilities:

• Developed ETL data pipelines using Spark, Spark Streaming, and Scala.

• Loaded data from RDBMS to Hadoop using Sqoop.

• Worked collaboratively to manage build-outs of large data clusters and real-time streaming with Spark.

• Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for huge volumes of data.

• Developed batch scripts to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework (an illustrative PySpark sketch appears after this list).

• Implemented Spark using Scala and Spark SQL for faster testing and processing of data.

• Processed data using MapReduce and YARN; worked on Kafka as a proof of concept for log processing.

• Designed and developed Apache NiFi jobs to move files from transaction systems into the data lake raw zone.

• Monitored the Hive Metastore and the cluster nodes with the help of Hue.

• Handled data integrity checks using Hive queries, Hadoop, and Spark.

• Performed transformations and actions on RDDs and Spark Streaming data with Scala.

• Defined job flows and developed simple to complex MapReduce jobs as per requirements.

• Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.

• Created Hive tables, loaded data into them, and wrote Hive UDFs.

• Used Sqoop to import data into HDFS and Hive from other data sources.
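
The batch work described above was done in Scala; purely as an illustrative analogue, the PySpark sketch below shows the same shape of job: read raw files from S3, apply basic transformations, and write curated output back. Bucket names, paths, and columns are hypothetical.

```python
# s3_batch_transform.py -- illustrative PySpark analogue of the Scala batch job
# described above: read raw files from S3, transform them, write results back.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("s3-batch-transform")
         .getOrCreate())

# Read raw CSV files from the landing zone (assumes S3 credentials and the
# hadoop-aws connector are configured on the cluster).
raw = (spark.read
       .option("header", "true")
       .csv("s3a://example-landing-bucket/transactions/"))

# Required transformations: type casting, filtering bad rows, derived column.
cleaned = (raw
           .withColumn("amount", F.col("amount").cast("double"))
           .filter(F.col("amount").isNotNull())
           .withColumn("txn_date", F.to_date("txn_ts")))

# Write partitioned Parquet to the curated zone.
(cleaned.write
 .mode("overwrite")
 .partitionBy("txn_date")
 .parquet("s3a://example-curated-bucket/transactions/"))

spark.stop()
```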

Skills: C#

Pennant Technologies, Hyderabad June 2018 – Aug 2021

Role: Data Analyst/Data Engineer

Responsibilities:

• Responsible for systems study, analysis, understanding of systems design, and database design.

• Created SQL databases, tables, indexes, and views based on user requirements.

• Worked with the application developers and provided the necessary SQL scripts using T-SQL.

• Performed data mapping using business rules and data transformation logic for ETL purposes.

• Migrated the data model from one database to an Oracle database and prepared an Oracle staging model.

• Created complex ETL packages using SSIS to extract data from staging tables into partitioned tables with incremental loads (the incremental-load pattern is sketched after this list).

• Created SSIS packages using SSIS Designer for heterogeneous data exports from OLE DB sources (Oracle) and Excel spreadsheets to SQL Server 2010.

• Worked on SSIS packages and DTS Import/Export for transferring Oracle and text-format data to SQL Server.

• Created SSRS reports using report parameters, drop-down parameters, and multi-valued parameters.

• Created performance dashboards in Tableau, Excel, and PowerPoint for key stakeholders.
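
The incremental loads referenced above were built as SSIS packages in SSIS Designer; purely to illustrate the underlying watermark pattern, here is a hedged Python/pyodbc sketch of the same logic. Server, schema, table, and column names are all hypothetical.

```python
# incremental_load_sketch.py -- illustrative Python/pyodbc version of a
# watermark-based incremental load (the real packages were built in SSIS).
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=DemoDW;Trusted_Connection=yes;"
)
cur = conn.cursor()

# 1. Read the last successfully loaded watermark for this table.
cur.execute("SELECT LastLoadedAt FROM etl.Watermark WHERE TableName = ?", "FactSales")
last_loaded = cur.fetchone()[0]

# 2. Pull only rows changed since the watermark from the staging table.
cur.execute(
    "SELECT SaleId, Amount, ModifiedAt FROM stg.Sales WHERE ModifiedAt > ?",
    last_loaded,
)
rows = cur.fetchall()

# 3. Insert the delta into the partitioned target and advance the watermark.
cur.executemany(
    "INSERT INTO dbo.FactSales (SaleId, Amount, ModifiedAt) VALUES (?, ?, ?)",
    [(r.SaleId, r.Amount, r.ModifiedAt) for r in rows],
)
cur.execute(
    "UPDATE etl.Watermark SET LastLoadedAt = ? WHERE TableName = ?",
    max((r.ModifiedAt for r in rows), default=last_loaded), "FactSales",
)
conn.commit()
```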

Skills: SQL Server 2010, SQL Query Analyzer, MS Access, MS Excel, Visual Studio 2010, Erwin.


