Professional Summary
Over *+ years of experience in SDLC Design, Development, Analytics, Quality Assurance, Build and Release, Integration, Support, and Maintenance of large-scale enterprise applications.
Excellent experience in data ingestion, analysis, and governance using Big Data tools and technologies - Hadoop HDFS, MapReduce, Pig, Hive, Sqoop, Kafka, Airflow, Flume, HBase, ZooKeeper, REST, YARN, Scala, Cassandra, CockroachDB, Spark, RDDs, Spark Streaming, Dataiku, Data Warehouse, Impala, Oozie, WebLech, Talend
Excellent experience with cloud environments - AWS, Cloudera/Hortonworks, Google Cloud, and Azure for Big Data.
Extensive experience writing scripts and applications using Scala, Flink, Shell script, Python, SQL, MapR, and Pig Latin.
Experience working with ETL tools - Talend, Informatica, Snowflake (Snowpark), Data Lakes, Data Warehouses and Data Marts, PySpark, Scala, Java
Experience working with version control systems and tools like CVS, Subversion (SVN), Git, Git Bash, GitHub, and PuTTY.
Excellent experience with multicloud (AWS, GCloud, Azure) using Terraform, HCL, Ansible playbooks, Docker, Kubernetes, kubectl, and pods to deploy environment configurations
Deep experience with data schema architecture and ETL design, building and scaling existing cloud infrastructure, and performance tuning
Modern data science methodologies including, but not limited to, classification algorithms, natural language processing, and assessment of biases and uncertainty in those methods
Experience on AWS (EC2, S3, RDS, CloudWatch, DMS, Security Services, Lambda, ECS, Glue, CLI/SDK, EBS, EMR, Glacier, IAM, CDK, Redshift, VPC, NAT, ACL, DNS, Proxies, Firewalls, VPC Endpoints, OpenSearch, Direct Connect, VPN, etc.), Azure (Data Factory/ADF, Functions, Synapse, Notebooks, Databricks, Delta, Cosmos), and GCP (BigQuery, Cloud Storage, databases, Functions, gsutil, gcloud, CLI/SDK, Console, Dataproc)
Expertise in multicloud environments using Terraform, HCL, Docker, Airflow, Kubernetes, kubectl, and pods to deploy environment configurations and install software.
Good experience with caching, object and block storage, scaling, load balancing, CDNs, and networking features
Work experience supporting multiple platforms like Ubuntu, Fedora, and Windows for production, test, and development servers.
Excellent experience with Python (OOP, APIs, text processing) and its libraries - Pandas, NumPy, SciPy, Matplotlib, Seaborn, NLP, TensorFlow, PyTorch, etc. (a minimal analysis sketch follows this summary)
Good understanding of modern data science methodologies, including classification algorithms and natural language processing
Good understanding of ML models, statistical methods, and research design such as linear, logistic, and conditional regression modeling, parametric and non-parametric statistics, and cross-sectional and longitudinal studies
Experience creating XML, ANT, Shell, Ruby, Go, YAML, Perl, PowerShell, HTML, DHTML, JSP, and JavaScript, as well as real-time table formats Iceberg and Hudi.
Good exposure to application servers - Tomcat, Apache, Nginx, WebLogic, IBM WebSphere, JBoss, Jetty, IIS.
Experience writing PL/SQL packages, T-SQL, stored procedures, data modeling, functions, and complex SQL queries for databases - Oracle, MySQL, Sybase, DB2, Teradata, PostgreSQL.
Extensive experience with J2EE, Java, Web services, Spring Boot, Microservices, SOAP, REST, EJB, JMS, Spring Cloud, and security.
Application integration with IDEs like Eclipse, RAD, WSAD, WTX, and JDeveloper, and deployment to application servers
Experience in the Finance, Insurance, Banking, Healthcare, Transport, Telecom, and Accounting domains
Extensive knowledge of the Construction domain
Experience with methodologies and architectures - SOA, OOP, and Agile (Scrum, Kanban, Lean).
Excellent problem-solving, communication, and interpersonal skills
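As an illustration of the Python analysis scripting listed above, here is a minimal sketch using Pandas and Matplotlib; the file name and column names are hypothetical placeholders, not details from any specific project.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load a hypothetical transactions extract and clean obvious issues.
df = pd.read_csv("transactions.csv", parse_dates=["txn_date"])
df = df.dropna(subset=["amount"]).drop_duplicates()

# Aggregate to monthly totals and add a 3-month rolling mean.
monthly = df.set_index("txn_date")["amount"].resample("M").sum()
rolling = monthly.rolling(window=3).mean()

# Plot the trend for a quick visual check and save it for a report/dashboard.
fig, ax = plt.subplots()
monthly.plot(ax=ax, label="monthly total")
rolling.plot(ax=ax, label="3-month rolling mean")
ax.legend()
fig.savefig("monthly_totals.png")
```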
SKILL SET:
Operating systems: Linux (Ubuntu, RedHat, Fedora, CentOS), Unix, Windows, MS-DOS
Cloud environment: Cloudera, AWS (EC2, ECS, RDS, Lambda, DynamoDB, CloudWatch), Azure (Data Factory, Synapse, Cosmos, Data Lake), GCP (BigQuery, GC Stack)
Big Data/ETL: Java, Python, R, MongoDB, Hadoop, MapReduce, Pig, Sqoop, Flume, HDFS, Hive, HBase, Storm, Spark MLlib, Spark Streaming, WebLech, Kafka, Cassandra, Talend Studio, Power BI, Hudi, Iceberg, Airflow, MS Office (Excel, Word, PowerPoint, Access); Python (OOP, APIs, text processing) and libraries - Pandas, NumPy, SciPy, Matplotlib, Seaborn, TensorFlow, PyTorch
J2EE: Spring Boot, Microservices, React, Struts, Spring, Hibernate, EJB, Node.js
Automation tools for Continuous Integration/Delivery/Deployment: Terraform, Puppet, Chef, Ansible, Docker, Kubernetes, Jenkins, GitHub
Application servers: WebLogic, Tomcat, WebSphere, OC4J, JBoss, WESB
Databases: Oracle, DB2/UDB, Sybase, SQL Server, MariaDB
Design tools: OOAD, Design Patterns, UML, Talend Studio, MySQL, Teradata
Scripts: JSON, YAML, JavaScript, XML, HTML5, CSS3, Perl, Ajax, PHP, Python, XSL, XSLT, DOM, SAX; Python - sqlite3, Pandas, NTDP, OOP, TensorFlow, Datasets, Matplotlib
Professional Details:
Project: Repave
Client: JPMC, Plano, TX
Period: Jan 2021 - present
Role: Sr. Technical Consultant
Responsibilities:
Collaborated with stakeholders, dependency teams, and team members on requirements, issues, and project dependencies
Worked on relational and non-relational database architecture, design, and modeling; migrated SQL and NoSQL data from on-prem to the cloud
Worked on AWS services - CloudFormation templates, EMR, Bitbucket, Artifactory, Jenkins, CloudWatch, EC2, DynamoDB, NoSQL, Airflow, MongoDB, RDS, AWS Glue, Athena, Step Functions, serverless, ECS, S3 (Standard, Glacier), Aurora DB, Lambda, SQS, SNS, API Gateway, Route 53, Kinesis, CloudFront, VPC, NAT, ACL, AWS CDK/CLI, EKS
Worked on Java, Python, Scala, Sqoop, Spark, Git, Shell script, YAML, JSON, Hive, HDFS, Teradata, Kafka, and the ETL tools Informatica IICS, Talend, and Snowflake
Worked on services like caching, object and block storage, scaling, load balancing, and aggregation
Worked on scripting with Python (OOP, APIs, text processing) and libraries - Pandas, Gen2, NumPy, SciPy, Matplotlib - data partitioning, building data pipelines, machine learning models, dashboards, and Spark with text, CSV, and JSON files.
Worked on ETL/ELT architecture, ETL operations, and query optimizer/performance techniques - FastLoad, MultiLoad, transformations, cleansing, archiving.
Worked on data modeling, DB performance tuning, data security and governance, cloud integration and management, cloud infrastructure, automation and DevOps, data warehousing and analytics, and security and compliance
Developed data transformation processes, including data cleansing, normalization, aggregation, and enrichment, to prepare data for analytics and reporting
Worked on environment creation, maintenance, and deployment using Terraform, Docker, HashiCorp scripts, Teradata utilities, pipelines, registries, and images
Created and maintained data models, defining data structures, relationships, and data storage requirements using techniques like entity-relationship and data flow diagrams
Identified and resolved performance bottlenecks in data processing and storage systems, optimizing query performance and improving overall data pipeline efficiency
Monitored data pipelines, diagnosed and troubleshot issues, and performed system upgrades and maintenance tasks to ensure data reliability and availability
Built and automated data pipelines, maintained pipelines, identified existing data gaps, and provided automated solutions to deliver analytical capabilities and enriched data to applications (a minimal pipeline sketch follows this list)
Tracked and effectively communicated sprint/release progress to all affected teams and management
Worked on multicloud (AWS, GCloud, Azure) using Terraform, HCL, Ansible playbooks, Docker, Kubernetes, kubectl, and pods to deploy environment configurations, install software and patches, create clusters, instances, subnets, and databases, and load data into the target database.
Worked on Teradata utilities, enhancing Teradata performance by using S3 for staging data and integrating with Redshift for analytics
Worked on Tableau to analyze performance and forecasting - biweekly reports, bar charts, and monthly performance
Developed several complex Teradata and relational SQL queries, PL/SQL stored procedures, jobs, packages, procedures, and functions
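A minimal sketch of the kind of PySpark batch pipeline described above; the S3 bucket names, paths, and column names are hypothetical placeholders, not actual project details.

```python
from pyspark.sql import SparkSession, functions as F

# Minimal PySpark batch-transformation sketch; bucket names, paths, and
# column names below are hypothetical placeholders.
spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Ingest raw CSV landed in S3, standardize types, and drop bad records.
raw = (spark.read.option("header", "true")
       .csv("s3://example-raw-bucket/accounts/"))
clean = (raw.withColumn("balance", F.col("balance").cast("double"))
            .filter(F.col("account_id").isNotNull())
            .dropDuplicates(["account_id"]))

# Simple aggregation step before publishing to the curated zone.
daily = (clean.groupBy("account_id", "as_of_date")
              .agg(F.sum("balance").alias("total_balance")))

# Write partitioned Parquet for downstream Athena/Glue consumption.
(daily.write.mode("overwrite")
      .partitionBy("as_of_date")
      .parquet("s3://example-curated-bucket/accounts_daily/"))
```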
Environment: Shell script, Jira, Excel, CloudFormation, Lambda, Step Functions, Teradata, YAML, EMR, Bitbucket, Jenkins, CloudWatch, EC2, RDS, Terraform, Docker, EKS, DynamoDB, MongoDB, Talend, GitLab, AWS Glue, Athena, Avro, Parquet, Data pipelines, Hive, HDFS, PostgreSQL, Devshell, Python, Kafka, Tableau, Oracle, Tomcat, Java 11, SQL, ServiceNow, Spark, PL/SQL
Project: LIBOR Transition
Client: Citigroup, Tampa, FL
Period: Jan 2020 to Nov 2020
Role: Big Data Engineer
Responsibilities:
Worked on data requirements analysis, ingestion, creation, manipulation, transformation, and deployments
Worked on Spark scripts, shell scripts, creating DDL, DDF, schemas, SQL scripts, and aggregations.
Worked on Databricks - Delta tables, workflows, analytics, Notebooks, PySpark, pipelines, deployment, caching
Worked on data requirements - data ingestion, data analysis, data processing, developing code, deploying to pipelines, and monitoring pipeline performance
Created the VPC, configured the subnets, attached the gateway and routing tables to the subnets, and deployed EC2 instances into those subnets.
Worked on AWS - CloudFormation templates, YAML, Data Vault, Bitbucket, Artifactory, Jenkins, CloudWatch, EC2, RDS, Terraform, Docker, K8s, Ansible, AWS Glue, Athena, Step Functions, serverless, and deployments
Worked on GCloud - Dataproc, Cloud Storage, BigQuery, messaging, gsutil, gcloud, Linux platforms, storage, IAM, security, roles, and policies
Worked on the services - Caching, Object and Block Storage, Scaling, Load Balancing, CDNs, Networking
Worked on infrastructure asset management initiatives, including web servers and database servers
Created and maintained data models, defining data structures, relationships, and data storage requirements, using techniques like entity-relationship diagrams and data flow diagrams
Developed data transformation processes, including data cleansing, normalization, aggregation, and enrichment, to prepare data for analytics and reporting
Implemented data quality assurance processes, validating data pipelines and resolving data quality issues
Worked on multiple warehouses, data lakes, databases, and file formats, especially Snowflake, BigQuery, PostgreSQL, Parquet, and Avro
Worked on Python, Scala, Spark real-time processing, MapReduce, Spark transformations, Spark RDDs, Spark Streaming, and Spark SQL using Python - memory management, performance management, parallelism (a minimal streaming sketch follows this list)
Worked on Citi-specific tools like AutoSys and DTS - aggregations, joins, and testing.
Deployed code to Dev, Staging, SIT, and Test environments using shell scripts and Spark scripts.
Worked on creating Sqoop jobs, Hive external and managed tables, DDL, DML, and aggregations, and deployed to Bitbucket.
Created Avro schemas and Spark code, maintained Bitbucket, and triggered Jenkins jobs.
Used Talend to design, build, and test data integration jobs for large-scale, data-intensive applications
Worked on Spark jobs, improving performance by changing code and spark-submit options as needed.
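A minimal sketch of the Kafka-to-Spark Structured Streaming pattern referenced above; the broker address, topic name, schema, and output paths are hypothetical placeholders, and the spark-sql-kafka connector must be on the classpath.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# Minimal Spark Structured Streaming sketch; broker, topic, schema, and
# paths below are hypothetical placeholders.
spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

schema = StructType([
    StructField("trade_id", StringType()),
    StructField("rate", DoubleType()),
    StructField("currency", StringType()),
])

# Consume JSON events from Kafka and parse them into typed columns.
events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "rates-topic")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Persist the parsed stream as Parquet with checkpointing for recovery.
query = (events.writeStream.format("parquet")
         .option("path", "/data/curated/rates/")
         .option("checkpointLocation", "/data/checkpoints/rates/")
         .start())
query.awaitTermination()
```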
Environment: Python, Spark, Shell script, Azure Synapse, AWS, SNS, SQS, Lambda, EC2, S3, Route 53, Snowflake, RDS, REST API, Java 11, Spring Boot API, Terraform, Docker, HashiCorp, Maven, Hadoop 2, Sqoop, Hive, DMS, PyCharm, Oozie, HBase, Impala, Cloudera, Spring Boot, Arcadia, SQL, Linux, RedHat, MySQL, YARN, NoSQL, Oracle, Bitbucket, Eclipse 4.5, Git, Tectia, Shell, ETL, Jenkins.
Project: Loadplan Builder
Client: FedEx, Pittsburgh, PA
Period: Feb 2019 - Dec 2019
Role: Sr. Software Consultant/Hadoop Consultant
Description: This application provides information about aircraft maintenance - aircraft regions, airport stations and gates, departments, workers, employee maintenance, employee timings and shifts, available employee hours and the hours required for particular work, plus payroll for employees, aircraft maintenance, regions, gates, and stations. Customer activity data is mined, analyzed, and filtered, moved to data pipelines, refined, and moved to permanent storage. My brief responsibilities were:
Responsibilities:
Worked on implementation of Avro, ORC, Parquet, and text data formats for computations to handle custom business requirements.
Worked on retrieving transaction data from RDBMS to HDFS using MapReduce jobs and saving the output in Hive tables per user.
Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
Worked on Hive to create Hive tables, load data, and write Hive queries - buckets, maps, search, caching (a minimal Hive sketch follows this list)
Worked on Oozie operational services for batch processing and scheduling workflows dynamically
Worked on Kafka, Spark Streaming, consumers and producers, and source XML-type columns to dynamically extract, integrate, and load data into the target schema using Talend
Worked on Spark - memory management, performance management, parallelism - and SQL - aggregation, explain plans, joins, sort-merge, nested loops
Worked on Terraform, Docker, JSON, YAML, and K8s to create containers, create and maintain pipelines, and manage registries
Worked on Python libraries - APIs, Matplotlib, text processing, Pandas, NumPy - methods, functions, data partitioning, multithreading, transformations
Worked on Azure services - Data Lake Storage, Data Factory, SQL Data Warehouse, Synapse Analytics
Worked on REST web services, microservices development and deployments, and testing deployments.
Worked on performance tuning, monitoring and testing configurations and code changes.
Worked on Talend for data models and data formats - XML, JSON, CSV - and batch jobs using MapReduce and Spark
Worked on ELK configurations and writing JSON scripts - facets, documents, update, create, index, search, queries.
Worked on creating auto-deployment configurations of the cluster using CloudFormation templates for AWS components.
Worked on creating cluster environments and testing scale-in and scale-out, load balancing, monitoring, and troubleshooting.
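A minimal sketch of the Hive table work referenced above, run from PySpark with Hive support; the database name, table, columns, and HDFS location are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# Minimal sketch of creating and querying a partitioned Hive table from
# PySpark; database, table, columns, and HDFS path are hypothetical.
spark = (SparkSession.builder.appName("hive-sketch")
         .enableHiveSupport().getOrCreate())

# External table over ORC files already landed in HDFS.
spark.sql("CREATE DATABASE IF NOT EXISTS maintenance")
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS maintenance.employee_shifts (
        employee_id STRING,
        station     STRING,
        hours       DOUBLE
    )
    PARTITIONED BY (shift_date STRING)
    STORED AS ORC
    LOCATION '/data/maintenance/employee_shifts'
""")

# Pick up newly added partitions, then run a simple rollup query.
spark.sql("MSCK REPAIR TABLE maintenance.employee_shifts")
rollup = spark.sql("""
    SELECT station, shift_date, SUM(hours) AS total_hours
    FROM maintenance.employee_shifts
    GROUP BY station, shift_date
""")
rollup.show(20)
```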
Environment: Java, AWS, EC2, EMR, S3, Route 53, SNS, SQS, Lambda, RDS, Redshift, version control, Microservices, REST WS, Java 8, Maven, Hadoop 2, Pig, Sqoop, ELK, Synapse, Elasticsearch, Hive, Spark Streaming, PySpark, Oozie, HBase, Python, SQL, Linux, RedHat, Talend, MySQL, YARN, NoSQL, Git, Shell, ETL, Oracle, DB2, WebSphere AS, Jenkins.
Project: Image Processing
Client: Microsoft, Redmond
Period: Feb 2018 to Oct 2018
Role: Sr Technical Consultant
Description: The project was developed for Microsoft Corporation's internal security. The aim is to capture single images or streams of human images and their movements - face, hand movements, legs, and body gestures - store them in a file system, and support other functions such as image comparison, grabbing things, and directions.
Responsibilities:
Worked on creating software scripts to automate test, staging, and production service deployments.
Worked on Jenkins for automating builds and deployments, maintained with shell scripts.
Performed performance tuning and debugging of Java code and database environments to ensure acceptable database performance in production.
Worked on CloudFormation stack templates, Auto Scaling, and load balancing to automate and deploy AWS resources and configuration changes (a minimal deployment sketch follows this list)
Worked on configuration management tool Ansible for continuous delivery.
Created playbooks for new environments and modified existing plays for provisioning.
Worked on testing environment, debugging, monitoring, performance tuning, security vulnerabilities.
Worked on TypeScript, HTML, JavaScript, CSS, JSON, JSP pages.
Worked on configuring continuous integration and the build process using Jenkins as the continuous integration tool.
Worked on Docker images, compose, containerization.
Leveraged AWS cloud services such as EC2, Auto Scaling, and VPC to build secure, scalable infrastructure
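A minimal sketch of the kind of deployment automation described above, using boto3 to create a CloudFormation stack; the stack name, template file, region, and parameters are hypothetical placeholders.

```python
import boto3

# Minimal boto3 sketch for deploying a CloudFormation stack from an
# automation script; stack name, template file, region, and parameters
# below are hypothetical placeholders.
cfn = boto3.client("cloudformation", region_name="us-west-2")

with open("image-processing-stack.yaml") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="image-processing-dev",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "InstanceType", "ParameterValue": "t3.medium"},
        {"ParameterKey": "Environment", "ParameterValue": "dev"},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# Block until the stack finishes creating before running post-deploy checks.
waiter = cfn.get_waiter("stack_create_complete")
waiter.wait(StackName="image-processing-dev")
```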
Environment: Java, Spring Boot, REST, Bitbucket, AWS, Kubernetes, Cassandra, Java 1.8, MongoDB, NoSQL, S3, RDS, Git, Linux, Redshift, Ubuntu, EC2, Lambda, Python, Oracle 10g, LDAP, Shell Script, Maven, Tomcat WS