Data Engineer - SQL Server

Priyanka Ganta

MOBILE NO: +1-469-***-**** EMAIL ID: ad39it@r.postjobfree.com Dallas, Texas.

Professional Summary:

Data Engineer with 5+ years of experience in analysis, design, development, and implementation.

Experienced with Business Intelligence tools such as Business Objects and data visualization tools such as Tableau and Power BI.

Solid understanding of data modeling, data source evaluation, ETL, and client/server applications.

Skilled in designing ETL solutions adaptable to varied business models.

Experience creating on-premises batch and real-time data processing workflows.

Experience in AWS services, including EC2, RDS, DynamoDB, SNS, IAM, and Athena.

Strong understanding of and experience with NoSQL databases such as MongoDB and Cassandra.

Strong experience in data extraction, transformation, and loading (ETL) to and from multiple sources such as Excel, MS Access, XML, Oracle, and Salesforce using ETL tools.

Excellent programming skills with PL/SQL, SQL, Oracle.

Knowledge of developing ETL pipelines in and out of data warehouses using Python and SQL.

Experience working with Python ORM libraries, including the Django ORM.

Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data processing.

Quick to learn and adapt to modern technologies in demanding environments, taking the initiative to resolve challenges promptly.

Strong experience in generating drill-down, drill-through, and cascaded parameterized reports on top of relational databases and cubes.

Created stored procedures for generating reports using SQL Server Reporting Services (SSRS).

TECHNICAL SKILLS:

Programming Languages: Python, C, C++, Java, SQL

Hadoop/Big Data Stack: Hadoop, HDFS, Hive, Pig, Cassandra, Spark SQL, Snowflake, Apache Spark

Cloud Technologies: AWS (S3, EC2, EMR, DynamoDB, Redshift), GCP

ETL Tools: SQL Server Integration Services (SSIS), Informatica

Reporting Tools: Power BI, Tableau, R, SAS, RapidMiner, SPSS Statistics

Query Languages: HiveQL, SQL, Pig

Databases: IBM DB2, Oracle, SQL Server, Teradata, Cassandra

Operating Systems: Windows, Linux, Unix

WORK EXPERIENCE:

Adroit Innovative Solutions Inc, Dallas, TX    Aug 2022 - Present

Role: Data Engineer

Responsibilities:

Architected and implemented ETL and data movement solutions using Azure Data Factory and SSIS, creating and running SSIS packages.

Converted Hive/SQL queries into Spark transformations using Spark RDDs and Python, and used Java 8 Streams and lambda expressions to store and process data, improving service performance.

Implemented Java 8 features to optimize services and wrote Terraform templates (infrastructure as code) to build Azure staging and production environments.

Used Spark SQL on top of the PySpark engine to query data, perform cleansing and validation, and apply transformations through the Python API.
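
A minimal PySpark sketch of this kind of Spark SQL cleansing and transformation step; the source path and the table/column names (raw_events, event_id, amount, event_ts) are illustrative assumptions, not the project's actual schema.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("cleansing-example").getOrCreate()

    # Assumed source path and schema, for illustration only.
    raw = spark.read.parquet("hdfs:///data/raw_events")
    cleaned = (raw
        .dropDuplicates(["event_id"])                      # drop duplicate records
        .filter(F.col("amount").isNotNull())               # basic validation
        .withColumn("event_date", F.to_date("event_ts")))  # normalise the timestamp

    cleaned.createOrReplaceTempView("events")
    spark.sql("SELECT event_date, COUNT(*) AS n FROM events GROUP BY event_date").show()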

Integrated Apache Kafka to ingest real-time network log data into HDFS and processed it in real time with Spark on a YARN cluster, coordinating data pipelines with Kafka and Spark Streaming behind an API Gateway REST service.
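
A hedged sketch of a Kafka-to-HDFS stream in the same spirit, written with Spark Structured Streaming rather than the original Spark Streaming job; the broker address, topic name, and paths are assumptions.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

    logs = (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker address
        .option("subscribe", "network-logs")                # assumed topic name
        .load()
        .selectExpr("CAST(value AS STRING) AS log_line"))

    query = (logs.writeStream
        .format("parquet")
        .option("path", "hdfs:///data/network_logs")                  # assumed HDFS sink
        .option("checkpointLocation", "hdfs:///checkpoints/network_logs")
        .start())
    query.awaitTermination()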

Implemented a CI/CD process using Jenkins and Git, deploying an Azure Kubernetes cluster and writing YAML manifests for pods, deployments, auto-scaling, load balancers, labels, health checks, namespaces, and ConfigMaps.

Imported and exported data from Snowflake, Oracle, and MySQL DB into HDFS and Hive using Sqoop for analysis, visualization, and report generation.

Deployed and tested developed code using Azure DevOps for continuous integration and continuous deployment.

Developed Python scripts for vulnerability analysis of SQL queries, including SQL injection checks.
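
An illustrative sketch (not the original script) of the SQL injection check idea: string-built SQL is treated as a red flag, and user input is bound through query parameters instead. It uses only sqlite3 so the example is self-contained; the table and payload are made up.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES (?)", ("alice",))

    user_input = "alice' OR '1'='1"   # classic injection payload

    # Unsafe pattern a review script would flag: SQL assembled by string formatting.
    unsafe_query = "SELECT * FROM users WHERE name = '%s'" % user_input

    # Safe pattern: the driver binds the value, so the payload is treated as plain data.
    rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
    print(rows)   # [] -- no rows match the literal payload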

Worked with the Snowflake environment to remove redundancy and loaded real-time data from various sources into HDFS using Kafka.

Environment: Azure (HDInsight, Databricks, Data Lake, Blob Storage, Data Factory, SQL DB, SQL DWH, AD), AWS, Hadoop 2.x (HDFS, MapReduce, YARN), Hive v2.3.1, Spark v2.1.3, Python, SQL, Sqoop v1.4.6, Kafka v2.1.0, Airflow v1.9.0, HBase, Cassandra, Oracle, Teradata, MS SQL Server, Agile, Unix, Informatica, Talend, Tableau.

Accenture

Client: McCormick & Company, Maryland, United States    April 2020 - July 2021

Role: Data Analyst

Responsibilities:

Used AWS services including EC2, RDS, DynamoDB, SNS, IAM, and Athena, with an emphasis on high availability, fault tolerance, and auto-scaling via AWS CloudFormation.

Created Hive queries to pre-process the data needed for the business process.

Developed and deployed AWS Lambda functions to build a serverless data pipeline whose outputs are registered in the Glue Data Catalog and queried from Athena for ETL migration services.
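
A hedged sketch of a Lambda handler in this spirit: it kicks off an Athena query against a Glue Catalog table. The database, table, and S3 result bucket names are assumptions for illustration.

    import boto3

    athena = boto3.client("athena")

    def lambda_handler(event, context):
        # Run an ad-hoc query against an assumed Glue Catalog database and table.
        response = athena.start_query_execution(
            QueryString="SELECT COUNT(*) FROM sales_raw",
            QueryExecutionContext={"Database": "etl_catalog_db"},
            ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
        )
        return {"queryExecutionId": response["QueryExecutionId"]}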

Developed an ETL pipeline to fetch datasets and transmit the derived ratio data from AWS to a data mart (SQL Server) and on to the Credit Edge server.

Experience configuring and working with relational databases (e.g., Microsoft SQL Server, Oracle, MySQL) and columnar databases (e.g., Microsoft SQL Data Warehouse).

Worked in Hive on partitioning, bucketing, join optimizations, and query optimizations.

Collaborated closely with business teams to translate business needs into technical requirements.

Developed HBase tables to load enormous amounts of structured, semi-structured, and unstructured data from UNIX, NoSQL, and several portfolios.

Hands-on experience with data analytics services such as Athena, Glue Data Catalog, and QuickSight.

Knowledge of how to install, configure, maintain, and manage Hadoop clusters using HDP and other distributions.

Extracted, converted, and loaded data from various formats such as JSON and relational databases, then exposed it for ad-hoc/interactive queries using Spark SQL.
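
A minimal sketch of that JSON-plus-database pattern: load both sources into Spark, register temporary views, and query them with Spark SQL. The file path, JDBC URL, credentials, and table names are illustrative assumptions.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("adhoc-sql").getOrCreate()

    events = spark.read.json("s3://example-bucket/events/*.json")   # assumed JSON source
    orders = (spark.read.format("jdbc")
        .option("url", "jdbc:mysql://db-host:3306/sales")            # assumed JDBC source
        .option("dbtable", "orders")
        .option("user", "reader")
        .option("password", "********")
        .load())

    events.createOrReplaceTempView("events")
    orders.createOrReplaceTempView("orders")
    spark.sql("""
        SELECT o.order_id, COUNT(e.event_id) AS touches
        FROM orders o JOIN events e ON o.customer_id = e.customer_id
        GROUP BY o.order_id
    """).show()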

Techno Soft Solutions, Hyderabad, India    Jan 2018 - March 2020

Role: Data Engineer

Responsibilities:

Analyzed business processes and coordinated with business analysts and stakeholders.

Created source-to-target mapping documents for IT development teams.

Designed and managed schema objects such as tables, views, indexes, stored procedures, and triggers, and maintained referential integrity using SQL Server Management Studio.

Built ETL processes to transfer data from remote data centers to local data centers using SSIS; cleansing and massaging of the data were done on the local database.

Developed data ingestion modules (both real-time and batch loads) to load data into various layers in S3, Redshift, and Snowflake using AWS Kinesis, AWS Glue, AWS Lambda, and AWS Step Functions.
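
A hedged sketch of one piece of such an ingestion path: a Lambda function consuming Kinesis records and landing them in S3. The bucket name and key prefix are assumptions, not the original configuration.

    import base64
    import boto3

    s3 = boto3.client("s3")

    def lambda_handler(event, context):
        # Decode the batch of Kinesis records delivered to this invocation.
        lines = [
            base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
            for record in event.get("Records", [])
        ]
        if lines:
            key = "landing/%s.json" % context.aws_request_id   # one object per invocation
            s3.put_object(Bucket="example-data-lake", Key=key,
                          Body="\n".join(lines).encode("utf-8"))
        return {"records_written": len(lines)}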

Developed Hive queries to pre-process the data required for running the business process.

Performed root cause analysis for all issues occurring in production or batch runs and provided resolutions.

Worked on ETL migration services by developing and deploying AWS Lambda functions to generate a serverless data pipeline whose outputs are written to the Glue Catalog and queried from Athena.

Used the Spark SQL API in PySpark to extract and load data and run SQL queries.

Designed an ETL strategy to move data from source through landing and staging to the destination data warehouse using SSIS and DTS (Data Transformation Services).

Developed SSIS packages to export data from Excel/Access to SQL Server, automated all the SSIS packages, and monitored errors using SQL Server Agent jobs.

Day-to-day responsibilities included developing ETL pipelines in and out of the data warehouse and building major regulatory and financial reports using advanced SQL queries in Snowflake.

Implemented a one-time migration of multi-state-level data from SQL Server to Snowflake using Python and SnowSQL.
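
A hedged sketch of such a one-time SQL Server-to-Snowflake copy in Python; the connection strings, table names, and the use of pandas with write_pandas are illustrative assumptions rather than the original implementation.

    import pandas as pd
    import pyodbc
    import snowflake.connector
    from snowflake.connector.pandas_tools import write_pandas

    # Assumed SQL Server source (driver, host, and credentials are placeholders).
    src = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                         "SERVER=sql-host;DATABASE=statedata;UID=etl;PWD=********")
    frame = pd.read_sql("SELECT * FROM dbo.state_metrics", src)

    # Assumed Snowflake target; write_pandas bulk-loads the DataFrame.
    snow = snowflake.connector.connect(account="example_account", user="etl",
                                       password="********", warehouse="LOAD_WH",
                                       database="ANALYTICS", schema="PUBLIC")
    write_pandas(snow, frame, table_name="STATE_METRICS")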

Worked with subject matter experts to understand business logic and implemented complex business requirements in the backend using efficient test scripts and flexible functions.

Designed scalable ETL pipelines in the data warehousing platform by automating data loads from multiple SFTP servers into Teradata using Informatica to develop quantifiable metrics and deliver actionable outputs.

Created CloudFormation templates (CFT) and Terraform scripts to automate the deployment and configuration of data lake components on AWS, such as Cassandra clusters.

Migrated various data sources to AWS S3 and scheduled ETL jobs using AWS Glue to build tables in AWS.

Experienced in using Python libraries such as NumPy, SciPy, Matplotlib, and python-twitter.

Experienced in requirements gathering, use case development, business process flows, and business process modeling.

Technologies: AWS, Spark, PySpark, Scala, Python, Pig, HBase, SQL, Teradata, Tableau, Cassandra.

Education Details:

Bachelor's in Electrical Engineering, Osmania University (CGPA 3.4/4)

Master's in Data Science, University of North Texas (CGPA 3.83/4)

University of North Texas, Denton.

Grader Assistant: Jan 2022 - May 2022

Worked as a Teaching Assistant for the courses Data Visualization Tools & Techniques and Knowledge Management Tools & Techniques.

Participated in the assessment process using a variety of methods and techniques and provided effective, timely, and appropriate feedback to students to support their learning.

Helped the professor grade students' papers.

ACADEMIC PROJECTS:

World University Rankings, University of North Texas, Denton    Fall 2022

The main aim was to build dashboards to identify the best universities.

The objective was to develop charts such as bar graphs, pie charts, bullet charts, scatter plots, line charts, box-and-whisker plots, and area charts for identifying the best universities.

The conclusion identified which countries have the top universities based on the dashboard statistics.

Technologies and Tools: Tableau, Excel, SQL, Power BI.

Osmania University

Worked on a project on DC motor speed and direction control.

The main aim was to control the direction and speed of the motor through a touch screen.

Code for the touch-screen interface was developed and verified using simulation software.


