N RAMYA
Sr. Data Engineer & ETL Developer PH: +1-512-***-**** Email: *********@*****.***
Certified professional data engineer with 9+ years of overall IT experience across clients in different domains and technology spectrums. Experienced in building highly scalable, large-scale applications using Cloud, Big Data, DevOps and ETL technologies. Also experienced in working in both Agile and Waterfall environments, and on multi-cloud, migration and scalable application projects.
Professional Summary
BigData
Experience working with various Hadoop distributions such as Cloudera, Hortonworks and MapR.
Expert in ingesting batch data for incremental loads from various RDBMS sources using Apache Sqoop.
Developed scalable applications for real-time ingestion into various databases using Apache Kafka.
Developed Pig Latin scripts and MapReduce jobs for large data transformations and loads.
Experience in using optimized data formats like ORC, Parquet and Avro.
Experience in building optimized ETL data pipelines using Apache Hive and Spark.
Implemented various optimization techniques in Hive scripts for data crunching and transformations.
Experience in building ETL scripts in Impala for faster access from the reporting layer.
Built Spark data pipelines with various optimization techniques using Python and Scala.
Experience in loading transactional and delta loads into NoSQL databases like HBase.
Developed various automation flows using Apache Oozie, Azkaban and Airflow.
Experience in working with NoSQL Databases like HBase, Cassandra and MongoDB.
Experience with integration tools like Talend and NiFi for ingesting batch and streaming data.
Cloud
Experience in working with various cloud platforms like AWS, Azure and GCP.
Developed various ETL applications using Databricks Spark distributions and Notebooks.
Implemented streaming applications to consume data from Event Hub and Pub/Sub.
Developed various scalable big data applications in Azure HDInsight for ETL services.
Developed scalable applications using AWS tools like Redshift and DynamoDB.
Worked on building pipelines using Snowflake for extensive data aggregations.
Working knowledge of GCP tools like BigQuery, Pub/Sub, Cloud SQL and Cloud Functions.
Experience in visualizing reporting data using tools like Power BI and Google Analytics.
ETL
Worked with the Informatica business team on a prototype for Informatica Cloud Application Integration.
Worked on scheduling job-dependent MFT jobs in Autosys.
Designed and automated secure file transfers using Managed File Transfer (MFT).
Worked on Informatica Life Cycle Management (ILM) jobs and supported the ILM process.
Experience with data profiling and data quality using the Informatica Developer, BDM and MDM toolsets.
Experienced in writing Unix shell scripts for the automation of ETL processes and using schedulers like Autosys and Control-M.
DevOps
Experience in building continuous integration and deployments using Jenkins, Drone, Travis CI.
Expert in building containerized apps using tools like Docker, Kubernetes and Terraform.
Developed reusable application libraries using Docker containers.
Experience in building metrics dashboards and alerts using Grafana and Kibana.
Expert in Java and Scala build tools like Maven (POM) and SBT for application development.
Experience in working with tools like GitHub, GitLab and SVN for source code repositories.
Expert in writing various YAML scripts for automation purposes.
Experience Summary
Client
Etrade Financial Services
Location
Atlanta, Georgia
Designation
Sr. Data Engineer
Duration
Nov 2021 – Present
Responsibilities
Developed code for importing and exporting data from DB2 into S3 buckets and vice versa.
Handled the ETL framework in Spark with Python for data transformations.
Implemented various optimization techniques for Spark applications for improving performance.
Worked on Spark Streaming for real-time computations, processing JSON files from Kafka.
Integrated AWS Glue jobs with supporting AWS services including DynamoDB and SQS.
Developed SQL logic in various dialects, including Spark SQL, Snowflake, DB2 and Oracle.
Developed scripts in shell, Bash and Python for various operations.
Pushed application logs and data stream logs to the GEM server for monitoring and alerting purposes.
Developed Jenkins pipelines for continuous integration and deployment.
Implemented integrations with cloud environments like AWS, Azure and GCP for file exchange with external vendors.
Implemented secure transfer routes for external clients using microservices to integrate external storage locations like AWS S3.
Developed an automated file transfer mechanism in Python from MFT and SFTP to HDFS.
Gathered requirements from business users for the Genome project and translated them into technical specifications for new features or enhancements.
Developed detailed user stories based on the requirements and worked closely with business users.
Developed implementation plans to streamline processes to reduce costs.
Designed and developed Informatica Cloud mappings and sessions based on business user requirements and business rules to load data from sources such as flat files and Snowflake tables.
Developed Informatica Cloud mappings to load data from a NAS location to a Snowflake internal stage.
Worked on various transformations like Expression, Aggregator, SQL, Lookup, Filter, Router and Rank.
Environment: Apache Hadoop 2.0, Kafka, Spark, Linux, MySQL, Oozie, SFTP, GEM, DB2, Oracle, Snowflake, IICS
Client
Toyota Financial Services
Location
Dallas, Texas
Designation
Sr. Data Engineer
Duration
Feb 2020 – Oct 2021
Responsibilities
Developed code for importing and exporting data from RDBMS into HDFS using Sqoop and vice versa.
Implemented partitions and buckets based on state for further processing with bucket-based Hive joins.
Developed custom UDFs in Java and used them where necessary to reduce code in Hive queries.
Handled the ETL framework in Spark with Python and Scala for data transformations.
Implemented various optimization techniques for Spark applications for improving performance.
Worked on Spark Streaming for real-time computations, processing JSON files from Kafka.
Developed APIs for quick real-time lookup on top of HBase tables for transactional data.
Built optimized dynamic schema tables using Avro and columnar tables using Parquet.
Built various Oozie actions, workflows and coordinators for automation purposes.
Developed scripts in shell, Bash and Python for various operations.
Pushed application logs and data stream logs to the Grafana server for monitoring and alerting purposes.
Developed Jenkins and Drone pipelines for continuous integration and deployment.
Worked on building various pipelines and integrations using NiFi for ingestion and exports.
Built custom endpoints and libraries in NiFi for ingesting data from traditional legacy systems.
Implemented integrations with cloud environments like AWS, Azure and GCP for file exchange with external vendors.
Implemented secure transfer routes for external clients using microservices to integrate external storage locations like AWS S3 and Google Cloud Storage (GCS) buckets.
Built SFTP integrations using various VMware solutions for onboarding external vendors.
Developed an automated file transfer mechanism in Python from MFT and SFTP to HDFS.
Environment: Apache Hadoop 2.0, Cloudera, HDFS, MapReduce, Hive, Impala, HBase, Sqoop, Kafka, Spark, Linux, MySQL, NiFi, Oozie, SFTP
Client
Charter Communications
Location
Hartford, Connecticut
Designation
Sr. Data Engineer
Duration
Oct 2018 – Feb 2020
Responsibilities
Developed Hive ETL logic for cleansing and transforming data coming from RDBMS sources.
Implemented complex data types in Hive and used multiple data formats such as ORC and Parquet.
Worked in different parts of data lake implementations and maintenance for ETL processing.
Developed Spark Streaming applications using Scala and Python for processing data from Kafka.
Implemented various optimization techniques in Spark Streaming applications written in Python.
Imported batch data using Sqoop to load data from MySQL to HDFS at regular intervals.
Extracted data from various APIs and performed data cleansing and processing using Java and Scala.
Converted Hive queries into Spark SQL integrated with the Spark environment for optimized runs.
Developed migration data pipelines from an on-prem HDFS cluster to Azure HDInsight.
Developed complex queries and ETL processes in Jupyter notebooks using Databricks Spark.
Developed different microservice modules to collect application statistics for visualization.
Used Docker and Kubernetes to containerize and deploy applications.
Implemented NiFi pipelines to export data from HDFS to cloud locations like AWS and Azure.
Ingested data from Azure Event Hub for real-time ingestion into various applications.
Designed solutions with Azure tools like Azure Data Factory, Azure Data Lake, Azure SQL, Azure SQL Data Warehouse and Azure Functions.
Implemented data lake migration from on-prem clusters to Azure for highly scalable solutions.
Worked on implementing various Airflow automations for building integrations between clusters.
Environment: Hive, Sqoop, Linux, Cloudera CDH 5, Scala, Kafka, HBase, Avro, Spark, ZooKeeper, MySQL, Azure, Databricks, Python, Airflow
Client
Amadeus
Location
Portsmouth, NH
Designation
Sr. Data Engineer
Duration
Aug 2016 – Sep 2018
Responsibilities
Worked on building and developing ETL pipelines using Spark-based applications.
Worked on migration of RDBMS data into data lake applications.
Built optimized Hive and Spark jobs for data cleansing and transformations.
Developed optimized Spark applications in Scala so that jobs completed on time.
Worked on various optimization techniques in Hive for data transformations and loading.
Expert in working with formats that support dynamic schema evolution, such as Avro.
Built an API on top of HBase data to give external teams quick lookups.
Experience in building Impala scripts for quick retrieval of data exposed through Tableau.
Experience in developing various Oozie actions for automation purposes.
Developed a monitoring platform for our jobs in Kibana and Grafana.
Developed real-time log aggregations on Kibana for analyzing data.
Developed NiFi pipelines for extracting data from external sources.
Developed Jenkins pipelines for data pipeline deployments.
Worked on building different modules in scalable Spring Boot applications.
Developed Docker containers for automating runtime environments for various applications.
Expert in building ingestion pipelines for reading real-time data from Kafka.
Worked on a PoC to set up Talend environments and custom libraries for different pipelines.
Developed Python and shell scripts for various operations.
Worked in an Agile environment with various teams and projects in fast-paced settings.
Environment: Hadoop, Sqoop, Pig, HBase, Hive, Flume, Java 6, Eclipse, Apache Tomcat 7.0, Oracle, Java, J2EE, Talend, NiFi, Scala, Python
Client
Cube It Innovations
Location
Hyderabad, TG
Designation
ETL Developer
Duration
Aug 2013 – Oct 2015
Responsibilities
Converted logical data models to physical models.
Created source views for Consumer CCAR.
Developed ETL jobs for Base-to-Main loads.
Worked with the data modelling team to resolve physical model issues.
Wrote UNIX shell scripts for file validation.
Designed metadata for the ETL process.
Developed loading scripts using Teradata utilities.
Automated the loading process using UNIX shell scripts.
Created edit check views.
Created business access views.
Technical Summary
Big Data
Hadoop, Sqoop, Flume, Hive, Spark, Pig, Kafka, Talend, HBase, Impala
ETL Tools
Informatica, Talend, Microsoft SSIS, Confidential DataStage
Database
Oracle, MongoDB, SQL Server 2016, Teradata, Netezza, MS Access
Reporting
Microsoft Power BI, Tableau, QlikView, SSRS, Business Objects (Crystal)
Business Intelligence
MDM, Change Data Capture (CDC), Metadata, Data Cleansing, OLAP, OLTP, SCD, SOA, REST, Web Services
Tools
Ambari, SQL Developer, TOAD, Erwin, Visio, Tortoise SVN
Operating Systems
Windows Server, UNIX (Red Hat, Linux, Solaris, AIX)
Web Technologies
J2EE, JMS, Web Service
Languages
UNIX shell scripting, Scala, SQL, PL/SQL, T-SQL
Scripting
HTML, JavaScript, CSS, XML, Shell Script, Perl, and Ajax
Cloud
Azure, AWS, GCP
Version control
Git, SVN, CVS, GitLab
Tools
FileZilla, PuTTY, PL/SQL Developer, JUnit
IDE
Eclipse, Microsoft Visual Studio 2008, 2012, Flex Builder