
Senior Software Developer

Location:
San Diego, CA
Posted:
May 13, 2024


Resume:

ASHISH KUMAR

USA | +1-248-***-**** | ad5oc5@r.postjobfree.com | www.linkedin.com/in/ashishkrs2008

Visa Status: H-1B

Technical Lead Consultant | Senior Big Data Consultant | Senior Software Developer

More than 11 years of experience developing applications using Python, Spark (PySpark, Scala, Spark Streaming, Spark SQL), Kafka, Databricks, SQL, Shell/Bash scripting, and other core Big Data technologies. An expert in CI/CD tools and skilled in the design, analysis, and development phases of software development projects. Deft with emerging digital technologies and keen on improving productivity by introducing new technologies, systems, methods, and controls. Able to influence stakeholders in cross-departmental initiatives to achieve organizational milestones and to provide tactical, cost-effective solutions for service delivery. Possess excellent communication and interpersonal skills with strong analytical, team-building, and organizational abilities. Select and integrate the Big Data tools and frameworks required to deliver business capabilities while keeping hardware, software, and financial constraints in mind. Build scalable, repeatable, and secure data pipelines that can serve multiple purposes. Eager to join a strategically focused team to further enhance customer experience through automation.

Technical Core Skills:
Big Data Ecosystems – Spark, Spark Streaming, Spark SQL, Kafka, Scala, AWS, HDFS, Hive, Hadoop (Cloudera), MapReduce, Impala, Zookeeper, Sqoop, Oozie
Operating Systems – Linux, Unix, Windows
Programming Languages – Python, Scala, Shell scripting, PySpark
Databases – SQL Server, DB2, ClickHouse, MySQL, Oracle, Hive, Teradata, and other RDBMS
IDEs – Visual Studio Code, PyCharm, Vim, IntelliJ, Eclipse
Methodologies – Waterfall and Agile
Tools – Git, TFS (Microsoft Team Foundation Server), CI/CD pipelines using Jenkins, DevOps tools, Bitbucket, SVN, REST API services
Scheduling Tools – Splunk, Autosys, Redwood, Oozie
File Formats – JSON, CSV, XML, AVRO, Parquet
ETL Tools – DataStage

Professional Experience

Teksoft Systems, Troy, MI April 2024 to Present (on H-1B)

SEB, Warsaw, Poland Jan 2024 to March 2024

Senior Data Engineer

Project: Financial Crime Prevention (Dow Jones)

Project Description: This project extracts data from XML files, loads it into the Hadoop data lake, and then identifies transactions within the bank made by persons or entities subject to sanctions or involved in criminal activity. Government agencies can then use this data to investigate and take necessary action against the identified parties.

Technology Stack: PySpark, Python, Apache Airflow, GitHub, AWS, Databricks

Achievements –

Designed and implemented a solution in PySpark and Python to read and extract XML tags from files of ~6 GB, and scheduled the job using Apache Airflow (a sketch follows after this list).

Worked with various business teams to identify the correct data sources for transactions made by a given person or entity.

Created an automated process to load the data on a monthly basis for business consumption.
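A minimal sketch of how such an extraction might look, assuming the spark-xml package is available on the cluster; the row tag, field names, and paths are hypothetical placeholders, not the actual project code:

from pyspark.sql import SparkSession

# spark-xml must be on the classpath,
# e.g. spark-submit --packages com.databricks:spark-xml_2.12:0.17.0
spark = SparkSession.builder.appName("xml-extract").getOrCreate()

# "Transaction" as the row tag is an assumption for illustration only.
transactions = (
    spark.read.format("xml")
    .option("rowTag", "Transaction")
    .load("s3://bucket/feeds/dowjones.xml")  # hypothetical source path
)

# Keep only the tags of interest and persist for downstream screening jobs.
(
    transactions.select("EntityName", "SanctionList", "TxnAmount")
    .write.mode("overwrite")
    .parquet("s3://bucket/extracted/dowjones/")  # hypothetical target path
)

A job like this can then be wrapped in an Airflow DAG (for example via SparkSubmitOperator) and triggered on the monthly schedule described above.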

HCL Technologies, Krakow, Poland Nov 2021 to Dec 2023

Technical Lead Consultant

Project - Data Monetization (Client - Euroclear Bank, which works to strengthen the world's capital markets and connect its participants with leading and secure post-trade services)

Project Description - This project aims to create multiple reports for the client, providing a 360-degree view of their data to improve banking performance and customer experience. Data from different sources such as DB2, Kafka, SQL Server, and raw data files is consumed and processed using Spark and Scala applications to enrich and transform it into internal data models.

Technology Stack - Spark, Scala, Kafka, AWS, Data Warehouse, Shell Scripting, Hadoop/Bigdata, SQLs, Splunk, Agile, Databricks

Achievements –

Implemented and deployed 15+ programs using Spark APIs in Scala, cutting data-processing time from 2 hours to 30 minutes.

Utilized Kafka connectors to pull data from an MQ source into Kafka topics and process near-real-time and real-time data streams (a sketch follows after this list).

Created 10+ reports for the clients and automated the process in the production environment for better data visualization, achieving a 360-degree view of the data.

Proactively identified service improvement needs and implemented new technology solutions.

Actively responded to users' data and functional queries, increasing customer satisfaction.
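A rough sketch of the near-real-time leg, shown here in PySpark for consistency with the rest of this document (the project itself used Scala); the broker address, topic, schema, and paths are illustrative assumptions:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructType

# Requires the spark-sql-kafka connector on the classpath.
spark = SparkSession.builder.appName("kafka-enrich").getOrCreate()

# Hypothetical message schema for a trade event.
schema = StructType().add("tradeId", StringType()).add("amount", DoubleType())

# Read from a topic that a Kafka Connect MQ source feeds.
stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "mq.trades")
    .load()
)

# Kafka values arrive as bytes; decode and parse them into columns.
parsed = (
    stream.select(from_json(col("value").cast("string"), schema).alias("t"))
    .select("t.*")
)

# Land the enriched stream for reporting, with a checkpoint for fault tolerance.
(
    parsed.writeStream.format("parquet")
    .option("path", "/data/enriched/trades")
    .option("checkpointLocation", "/chk/trades")
    .start()
    .awaitTermination()
)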

Applied Materials, Bengaluru, India May 2019 to October 2021

Senior Big Data Consultant

Projects - Hadoop Transformation & Spark 2.0 Framework; WaferMap Analysis Dashboard

Project 1 Description - In this project, we migrated tables and views from SQL Server to the Hadoop data lake. The Microsoft Analytics Platform System could not handle the daily growth in data volume and was slow to execute even simple queries. The aim was to convert the existing Hive framework into a Spark 2.0 framework.

Project 2 Description - Substrate mapping (or wafer mapping) is a process in which the performance of semiconductor devices on a substrate is represented by a map showing the performance as a color-coded grid. The map is a convenient representation of the variation in performance across the substrate.

Technology Stack – Python, Spark, Shell scripting, Redwood Scheduler, SQL, PySpark, Tornado, ClickHouse, PyCharm, React JS, Big Data; data sources: SAP, DataStage, SharePoint

Achievements –

Designed and implemented scalable infrastructure for large-scale data ingestion and transformation in Hadoop.

Implemented and deployed the Spark 2.0 framework, resulting in data processing 10 times faster than the earlier Hadoop MapReduce framework.

Utilized PySpark and Spark SQL to rewrite a SQL Server workload, reducing its runtime from 6 hours to under 2 hours, a runtime reduction of almost 80%.

Successfully migrated 100+ Hadoop data, table, and view programs into the Spark framework.

Created REST APIs using Python to help the client analyze wafer maps (a sketch follows after this list).

The dashboard was deployed successfully across Applied Materials projects globally and received multiple team and individual awards.
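A minimal sketch of what one such REST endpoint could look like in Tornado (listed in the stack above); the route, handler name, and response payload are hypothetical:

import tornado.ioloop
import tornado.web

class WaferMapHandler(tornado.web.RequestHandler):
    # Hypothetical endpoint returning summary statistics for one wafer map.
    def get(self, wafer_id):
        # The real service would query the backing store (e.g. ClickHouse) here.
        self.write({"waferId": wafer_id, "yieldPct": 97.0})  # placeholder payload

def make_app():
    return tornado.web.Application([
        (r"/wafermaps/([0-9]+)", WaferMapHandler),
    ])

if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()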

Accenture Services, Bengaluru, India August 2012 to October 2018

Software Engineer | Senior Software Developer | Application Development Team Lead

Projects - Client Feedback Transformation; Data Mover Framework; File to Hadoop Load Utility

Project Description - The Client Feedback Transformation (CFT-Medallia) project was developed to provide a centralized platform for Customer Experience Management, helping the organization build innovation and loyalty through customer feedback. CFT Experience keeps every organization member – from service associates to bankers to executives – connected to real-time customer feedback. The aim was to build a 360-degree view of customer data. To achieve this, we developed the in-house frameworks "Data Mover Utility" and "File to Hadoop Utility" to move data from various RDBMS systems into Hadoop, as well as to move raw data files (CSV, JSON, XML, etc.) into Hadoop. Our proprietary product/service primarily combines Extract, Transform, and Load (ETL) and Business Intelligence (BI) tools: it starts with extracting data from custom data sources (internal and external), adds appropriate business logic to interpret the data, and displays the results in a graphical format. This entire cycle completes in a matter of minutes.

Technology Stack – Python, Shell scripting, HiveQL, Impala, Sqoop, Oozie, Autosys, Perl, Teradata, DB2, Oracle, SQL, XML, JSON, etc.

Achievements –

Designed and implemented Big Data ingestion utilities from scratch.

Created 150+ HiveQL scripts to support the data decision engine and build a 360-degree view of the data.

The data ingestion utilities helped ingest ~3 petabytes of data from various sources into the Hadoop cluster.

Designed and implemented parallel execution of multiple file loads, making extensive use of Python and Sqoop modules (a sketch follows after this list).

As part of the HaaS (Hadoop as a Service) platform migration, migrated 100+ applications to the new platform clusters.

Delivered and supported multiple data pipelines across the client delivery unit, earning the individual "ACE Award" three times in a row, along with Accenture excellence and team awards.
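A minimal sketch of how parallel Sqoop imports can be driven from Python, in the spirit of the load utilities described above; the table list, JDBC URL, and paths are hypothetical:

import subprocess
from concurrent.futures import ThreadPoolExecutor

TABLES = ["customers", "accounts", "feedback"]  # hypothetical table list

def sqoop_import(table):
    # Each table lands in its own HDFS directory; 4 mappers per import.
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop_pw",
        "--table", table.upper(),
        "--target-dir", f"/data/raw/{table}",
        "--num-mappers", "4",
    ]
    subprocess.run(cmd, check=True)

# Run several imports concurrently, as the utility would.
with ThreadPoolExecutor(max_workers=3) as pool:
    list(pool.map(sqoop_import, TABLES))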

Education

SRM University, India 2012

Master of Computer Applications (MCA), CGPA-8.9

Sikkim Manipal University, India 2008

Bachelor of Computer Applications (BCA), First Class with Distinction


