Post Job Free

Data Engineer with Python, AWS and SQL

Location:
Bellevue, WA
Posted:
January 20, 2021


Resume:

Sravani Korupolu

Bellevue, WA +1-480-***-**** adjknh@r.postjobfree.com

Summary:

●7+ years of experience in Analysis, Design, Development, Management, and Implementation of various ETL projects and complex ETL pipelines.

●Experienced in Python programming to set up and orchestrate ETL pipelines.

●Experienced in handling slowly changing dimensions (SCD Types 1 and 2).

●Experienced with normalization and denormalization of datasets.

●Experienced in production support and contributed to KTLO (keep-the-lights-on) work in the team.

●Experienced in job monitoring: setting up alarms on job failures, troubleshooting, and restarting jobs based on the error.

●Experienced in troubleshooting data quality issues, ETL job failures, and backfills.

●Experienced in Data Governance practices & processes.

●Experienced in code deployments to Production Systems and automating daily operational tasks.

●Experienced in relational databases such as Oracle, SQLite, PostgreSQL, Redshift, and MySQL.

●Experienced in project deployment using Jenkins CI/CD.

●Experienced with event-driven and scheduled AWS Lambda functions to trigger various AWS resources.

●Developed and maintained CloudFormation scripts and automated the provisioning of AWS resources, including EC2, S3, and RDS.

●Experience in building frameworks and automating complex workflows using Python for Test Automation.

●Good experience in shell scripting, SQL Server, UNIX and Linux, and OpenStack; expertise in Python scripting with a focus on DevOps tools, CI/CD, and AWS cloud architecture.

●Developed Python automation scripts to facilitate quality testing.

●Experienced in full SDLC starting from Design and Development, Testing, and documenting the entire life cycle using various methodologies.

●Good working experience in Agile (Scrum) and Waterfall methodologies, with high-quality deliverables delivered on time.

●Wrote Python modules to extract/load asset data from the MySQL source database.

●Good working experience using version control systems like Git and GitHub.

●Responsible for user validations on both the client side and the server side.

●Exceptional problem-solving and sound decision-making capabilities, recognized by associates for data quality, alternative solutions, and confident, accurate decision making.

●Experienced in working with various Python IDEs such as PyCharm, VS Code, and Spyder.

Technical Skills:

Programming Languages: Python, C, Shell Scripting

Operating Systems: Windows, macOS, UNIX, Linux

Web Technologies: HTML/HTML5, CSS/CSS3, XML, JSON, Bootstrap

Cloud Services: AWS

Databases: PostgreSQL, Redshift, MySQL

Deployment Tools: Amazon EC2, Jenkins

Version Control Systems: SVN, Git

Education: Bachelor’s in Computer Science, 2013.

Professional Experience:

Mitsubishi Financials, Rockville, MD Nov 2019 – Present

Data Engineer/Python/AWS

Responsibilities:

●Work on maintaining a framework written in Python to process data on AWS EMR

●Work on writing quality checks for the processed data.

●Write SQL and PySpark queries to verify that the retrieved data adheres to the schema and has no discrepancies.

●Build numerous Lambda functions using Python and automate processing through event triggers.

●Work on building a framework for data processing on AWS Glue to increase speed and efficiency and decrease costs.

●Used Selenium Grid to execute Selenium automation suites on different platforms.

●Use AWS Glue to run ETL jobs, both Spark and non-Spark, and crawlers to create datasets.

●Write Shell and Python scripts to automate loading data and kicking off some parts of the data pipeline.

●Write SQL and PySpark queries to perform checks on the retrieved data.

●Work on AWS Athena database to write SQL queries and generate reports for business customers.

●Run AWS Step Functions, troubleshoot and resolve issues with the workflow.

●Process data using AWS EMR and perform troubleshooting using PySpark.

●Generate dataset reports with HTML, CSS to review the quality of the processed data.

Environment: Python 3.6+, HTML, CSS, AWS

Wells Fargo, San Francisco, CA Mar 2019 – Oct 2019

Data Engineer/Python

Responsibilities:

●Developed, implemented, deployed, and maintained cloud-based ETL pipelines.

●Used Python, REST, Flask, and Spark to apply data transformations.

●Wrote well-documented code and packaged it. Followed OOP concepts and created reusable chunks of code.

●Developed a CI/CD system with Jenkins on Google's Kubernetes container environment, utilizing Kubernetes and Docker as the runtime environment for the CI/CD system to build, test, and deploy.

●Also used Python scripts for security purposes, AWS IAM, and AWS Lambda functions, and deployed applications through Elastic Beanstalk.

●Worked on Python's unittest framework for applications and tools, and pytest plugins for API/integration testing.

●Used relational databases like SQLite, AWS Redshift, PostgreSQL.

●Developed and maintained CloudFormation scripts and automated the provisioning of AWS resources, including EC2, S3, ECS, CloudFront, IAM, CloudFormation, and Gateway.

●Used Amazon SQS as a managed message-queuing service that enabled us to decouple and scale microservices and serverless applications.

●Deployed Docker-based applications written in Python/Flask to AWS ECS, a container orchestration service.

●Developed Python automation scripts to facilitate quality testing.

●Built user dashboards in Looker to provide data analyses and insights using charts and graphs.

●Performed data transformations with Apache Spark (PySpark) on Databricks and wrote the results to AWS S3.

●Used AWS Glue and Redshift Spectrum for ETL and querying semi-structured data from S3 respectively.

●Developed Django apps on AWS services including EC2, S3, ELB, EBS, IAM, AMI, Lambda functions, Security Groups, and Boto3.

●Monitored database performance with AWS CloudWatch and communicated with users to consume resources optimally.

●Followed Agile (SCRUM) methodology with high quality deliverables delivered on-time.

Environment: Python 3.5+, HTML, CSS, GitHub, AWS.

Volkswagen Group of America, Auburn Hills, MI Jun 2017 – Feb 2019

Data Engineer/Python

Responsibilities:

●Responsible for gathering requirements, system analysis, design, development, testing, and deploying ETL Pipelines.

●Participated in the complete SDLC process.

●Wrote Python modules to extract/load asset data from the MySQL source database.

●Designed and implemented a dedicated MySQL database server to drive the web apps and report on daily progress.

●Used Django framework for application development.

●Performed testing and deployment automation with Docker and Jenkins.

●Developed the user interface using CSS, HTML, and JavaScript.

●Created key business rules aligned with the project scope and customer needs.

●Worked on creating the Docker containers and Docker consoles for managing the application life cycle.

●Developed a CI/CD pipeline.

●Improved performance by using a more modularized approach and more built-in methods.

●Hands-on experience with SCM tools like Git and containers like Docker; deployed the project to Jenkins using the Git version control system.

●Developed a fully automated continuous integration system using Git, Jenkins, MySQL, and custom tools developed in Python.

●Wrote scripts to automate operational tasks.

●Wrote unit test cases for testing tools.

●Designed and configured database and back end applications and programs.

●Performed research to explore and identify new technological platforms.

●Collaborated with internal teams to convert end user feedback into meaningful and improved solutions.

Environment: Python 2.7, 3.4, Jenkins, MySQL, HTML, CSS, Linux, Git, Docker.

Bitsoft Systems, India May 2013 – Oct 2016

Software Engineer

Responsibilities:

●Developed entire frontend and backend modules using Python.

●Designed and developed data management system using MySQL.

●Involved in writing application level code to interact with APIs, Web Services using JSON.

●Involved in AJAX driven application by invoking web services/ API and parsing the JSON response.

●Responsible for supporting Linux servers for production, development, and testing.

●Installed, configured, and maintained DHCP, DNS, NFS, NIS, and sendmail servers.

●Automated jobs through crontab and AutoSys.

●Used Git for version control.

●Performance tuning and preventive maintenance. Performed daily backup.

●Performed administrative tasks such as system start-up/shutdown, backups, documentation, user management, security, network management, and configuration of dumb terminals.

●Troubleshot backup and restore problems and performed day-to-day troubleshooting for end users on Linux-based servers.

●Configured and maintained NIS and NFS servers on Linux.

●Set up Oracle 10g server in a Linux/UNIX environment.

Environment: Python, CSS, HTML, Oracle 10g, Linux.
