
Staff Software Engineer

Location: Morgan Hill, CA

Salary: $240,000

Posted: April 17, 2024


Resume:

Contact

ad42qs@r.postjobfree.com

www.linkedin.com/in/sukesh-nagaraja-516418a6 (LinkedIn)

Top Skills

Data Structures

Real-time Data

Technical Architecture

Languages

English (Full Professional)

Hindi (Elementary)

Kannada (Native or Bilingual)

Certifications

Goal Setting: Objectives and Key Results (OKRs)

Sukesh Nagaraja

Staff Software Engineer at Samsung Semiconductors (R&D, Memory Solutions Lab)

San Jose, California, United States

Summary

• Software engineer with 10+ years of hands-on experience solving real business needs at large scale by applying software engineering and analytical problem-solving skills.

• Experience in architecting and building a robust, scalable, and highly available distributed infrastructure.

• Proficiency in data modeling, data design, SQL, and NoSQL databases

• Intake prioritization, cost/benefit analysis, and deciding what to pursue across a wide base of users and stakeholders, and across products, databases, and services.

• Experience leading cross-functional initiatives and collaborating with engineers, product managers, and TPMs across teams.

Skills:

Programming languages - Python, Scala, Java, C++

Big Data tools & technologies - Spark, Hadoop/HDFS/Hive, AWS S3, Cassandra, Kafka, HBase, Snowflake, VMware GemFire, Elasticsearch, MongoDB, MinIO, AWS EMR, Databricks, Flume, NiFi, Sqoop, Oozie, Airflow, K8s, Docker, REST/gRPC APIs

Benchmarking suites: TPC-H, TPC-DS, YCSB

AI/ML - TensorFlow, PyTorch

Experience

Samsung Semiconductor

Staff Software Engineer

November 2022 - Present (1 year 4 months)

San Jose, California, United States

Customer: Moody's - a risk-assessment firm that empowers organizations to make better decisions by providing analytical solutions.

Roles & Responsibilities:


• Identified compute- and memory-heavy operations in the client's Spark application.

Solution 1:

• Created a JNI interface in Scala to access the S3 Select C APIs (getObjectBuffer, putObjectBuffer, ...).

• Created chunked serialized data for compute- and memory-bound operations.

• Persisted the chunked serialized data by offloading it to a disaggregated storage solution (MinIO); see the sketch after this list.

• Reduced application latency by reusing the same allocated buffer at the task/core level per Spark executor instance.

• Compared application latency by running the application on other distributed platforms such as AWS EMR and Databricks.

• The solution was showcased at the Samsung Memory Tech Day event.
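The Scala/JNI implementation above is proprietary; as a rough Python sketch of the chunked-offload idea, the snippet below serializes a dataset in fixed-size chunks and pushes each chunk to an S3-compatible MinIO endpoint via boto3, reusing one allocated buffer across chunks. The endpoint, credentials, bucket, and chunk size are hypothetical.

    import pickle
    import boto3

    CHUNK_BYTES = 8 * 1024 * 1024  # hypothetical 8 MiB chunk size

    # MinIO speaks the S3 API, so boto3 can target it via endpoint_url.
    s3 = boto3.client(
        "s3",
        endpoint_url="http://minio.local:9000",   # hypothetical endpoint
        aws_access_key_id="minioadmin",           # hypothetical credentials
        aws_secret_access_key="minioadmin",
    )

    def offload_chunks(obj, bucket="spark-offload", key_prefix="stage/part-0"):
        """Serialize obj and upload it in fixed-size chunks, reusing one buffer."""
        payload = pickle.dumps(obj)
        buf = bytearray(CHUNK_BYTES)  # single reusable buffer, as in the JNI path
        for i in range(0, len(payload), CHUNK_BYTES):
            chunk = payload[i:i + CHUNK_BYTES]
            buf[:len(chunk)] = chunk
            s3.put_object(
                Bucket=bucket,
                Key=f"{key_prefix}/chunk-{i // CHUNK_BYTES:05d}",
                Body=bytes(buf[:len(chunk)]),
            )

    offload_chunks({"rows": list(range(1_000_000))})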

Solution 2 (Zero-ETL):

• Integrated the Spark application with the compute- and memory-heavy operations offloaded to the target system (compute and storage layers).

Skills - Spark, Scala, C APIs, DSS storage, MinIO, Rclone, AWS EMR, Databricks, Ganglia, Prometheus

VMware GemFire with Samsung CXL devices (memory box)

Roles & Responsibilities:

• Analyzed cache overflow to the CXL device to achieve faster read performance.

• Benchmarked read/write operations using YCSB (Yahoo! Cloud Serving Benchmark); see the sketch below.

Skills - VMware GemFire, Java, Prometheus, Pulse
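YCSB itself is a Java tool driven from the shell; purely to illustrate what the benchmark measures, here is a small Python sketch of a YCSB-style read/write loop that records per-operation latencies. The in-memory dict stands in for the GemFire region under test, and the 95/5 read/update mix mirrors YCSB's Workload B.

    import random
    import statistics
    import time

    store = {}  # in-memory stand-in for the GemFire region under test

    def timed(op, *args):
        start = time.perf_counter()
        op(*args)
        return (time.perf_counter() - start) * 1e6  # microseconds

    def load(n):
        for i in range(n):
            store[f"user{i}"] = {"field0": "x" * 100}

    def run(n, read_ratio=0.95):  # Workload B-style mix: 95% reads, 5% updates
        reads, writes = [], []
        for _ in range(n):
            key = f"user{random.randrange(len(store))}"
            if random.random() < read_ratio:
                reads.append(timed(store.get, key))
            else:
                writes.append(timed(store.__setitem__, key, {"field0": "y" * 100}))
        print("p50 read us:", statistics.median(reads))
        print("p50 write us:", statistics.median(writes))

    load(10_000)
    run(100_000)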

Freddie Mac

Quantitative Analytics Sr

March 2019 - October 2022 (3 years 8 months)

McLean, Virginia

Mortgage Data Enablement and Profiling

Roles & Responsibilities:

• Created efficient data pipelines from source to target.

• Parsed data (XML/JSON) from sources such as Hive, MongoDB, and Snowflake per business requirements.


• Created concurrent, parallel processing modules for data parsing.

• Developed custom user-defined functions.

• Created a key lookup for critical (PII) data elements using hashlib (data masking); sketched below.
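A minimal sketch of the hashlib-based masking and key lookup described above, assuming a salted SHA-256 digest as the masking function; the salt and field names are illustrative.

    import hashlib

    SALT = b"replace-with-secret-salt"  # hypothetical; keep out of source control

    def mask(value: str) -> str:
        """Deterministically mask a PII value with a salted SHA-256 digest."""
        return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

    # Key lookup: masked token -> original value, kept in a restricted store
    # so analysts see only tokens while joins on the token remain possible.
    lookup = {}

    def mask_record(record, pii_fields=("ssn", "borrower_name")):
        masked = dict(record)
        for field in pii_fields:
            token = mask(str(record[field]))
            lookup[token] = record[field]
            masked[field] = token
        return masked

    print(mask_record({"loan_id": 42, "ssn": "123-45-6789", "borrower_name": "J. Doe"}))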

PDF & Image Analysis

Roles & Responsibilities:

• Parsed text and images from PDFs using Tesseract (OCR); a sketch follows this list.

• Developed custom code to reduce noise and blurriness in images to improve image quality.

• Developed pattern-recognition code for data selection per business requirements.

• Parsed PDFs for images using Python's pdfminer and labeled the images with text available in the PDF by finding nearby coordinates.

• Performed image processing and manipulation, such as removing image metadata and geospatial info.
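As a sketch of the OCR and denoising steps, assuming OpenCV and pytesseract are installed; the file name and filter parameters are illustrative.

    import cv2
    import pytesseract

    # Load a scanned page, reduce noise, and binarize before OCR;
    # cleaner input markedly improves Tesseract's accuracy.
    img = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)
    denoised = cv2.medianBlur(img, 3)  # remove salt-and-pepper noise
    _, binary = cv2.threshold(denoised, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    text = pytesseract.image_to_string(binary)
    print(text)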

Text classification for Loan closing fees

Roles & Responsibilities:

• Created a data preprocessing and data augmentation layer for the training set using Python.

• Developed a text classification model using a TensorFlow bidirectional RNN (sketched below).

• Developed an inference/prediction module using a gRPC API to predict labels for unlabeled data residing in Vertica/Snowflake columnar databases.

• Deployed the trained text classification model on a Docker-containerized TensorFlow Serving instance to serve prediction requests.

• Evaluated prediction results with a confusion matrix module.
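A minimal Keras sketch of the bidirectional-RNN text classifier described above; the vocabulary size, embedding width, and label count are placeholders.

    import tensorflow as tf

    VOCAB, NUM_CLASSES = 20_000, 12  # hypothetical sizes

    # Embedding -> bidirectional LSTM -> dense softmax over fee labels.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, 64),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()
    # model.fit(train_ids, train_labels, validation_split=0.1, epochs=5)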

Ampcus Inc

Big Data Engineer

February 2017 - February 2019 (2 years 1 month)

Chantilly, VA

Enticer (Enterprise Network Threat Isolation Engagement & Remediation):

Enticer is a cyber-security threat-intelligence model that handles both external and insider threats (user and entity behavior analytics). It is an Ampcus Inc product, still in the proof-of-concept phase.

Roles & Responsibilities:

Insider Threat:

• Streamed log data from servers to a data lake using rsyslog/Apache NiFi.

• Developed cleansing, pattern-recognition, and transformation models for unstructured data using PySpark.

• Developed unsupervised and supervised machine-learning models to identify insider threats (see the sketch after this list).

• Indexed data into Elasticsearch and dashboarded the results using Kibana.
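The production models are not public; as an illustration of the unsupervised side, here is a small scikit-learn sketch that flags anomalous user sessions with an Isolation Forest over hypothetical log-derived features.

    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Hypothetical per-session features parsed from logs:
    # [login_hour, bytes_uploaded_mb, distinct_hosts_touched]
    sessions = np.array([
        [9, 2.0, 3], [10, 1.5, 2], [11, 3.0, 4], [14, 2.5, 3],
        [3, 850.0, 40],   # a 3 a.m. bulk upload across many hosts
    ])

    clf = IsolationForest(contamination=0.2, random_state=0).fit(sessions)
    flags = clf.predict(sessions)  # -1 = anomaly, 1 = normal
    for row, flag in zip(sessions, flags):
        if flag == -1:
            print("possible insider threat:", row)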

External Threat:

• Developed pattern-recognition models for Palo Alto, Juniper, and Cisco IDS logs; shipped the data using Filebeat and Logstash, indexed it into Elasticsearch, and visualized it in Kibana.

Extract, Transform & Load using the ELK Stack

Roles & Responsibilities:

• Shipped data from NAS using Filebeat.

• Processed gigabytes of complex, nested JSON data using Logstash and Python.

• Developed programs in Python/Ruby to decode JSON attributes and save the data as PDFs.

• Validated data between the PDFs and JSON on the fly using Python while shipping data from Logstash.

• Loaded data from Logstash into Elasticsearch for data exploration.

• Used the WebHDFS REST API to load data into HDFS from Logstash.

• Developed a Python REST API to pull data from Elasticsearch for further analytics (sketched below).

• Created dashboards in Kibana for nested JSON and CSV data.

• Created customized dashboards in Kibana using Vega.
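A minimal sketch of the Python pull from Elasticsearch, assuming the official 8.x-style elasticsearch client; the cluster address, index name, and query are illustrative.

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # hypothetical cluster address

    # Pull recent error events from a (hypothetical) logs index for analysis.
    resp = es.search(
        index="app-logs",
        query={"match": {"level": "ERROR"}},
        size=100,
    )
    for hit in resp["hits"]["hits"]:
        print(hit["_source"].get("message"))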

Woolworths

Big Data Developer

October 2013 - August 2015 (1 year 11 months)

Cape Town Area, South Africa

Project: Retail WiFi Log Analysis


Project Description

This project was built for Woolworths Holdings Limited (WHL), a South Africa-based retail group, to track users in both Woolworths' online apps and retail stores when visitors' WiFi-enabled mobile phones connect.

Roles & Responsibilities:

• Ingested data from WiFi access points, using Flume to collect, aggregate, and move log data to HDFS.

• Designed a model using Pentaho Data Integration's graphical designer for data transformation, parsing, filtering, and loading into Hive.

• Created external Hive schemas for HDFS data and generated reports using Apache Zeppelin (see the sketch after this list).

• Imported and exported data into HDFS, Pig, Hive, and HBase using data-ingestion tools such as Sqoop and Flume.

• Created visualizations and reports for the business intelligence team using Tableau.

Environment: Apache Hadoop 2.7.2, MapReduce, HDFS, Hive, Sqoop, Flume, Java 1.6, Oracle 10g, Eclipse, Jenkins.
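As a sketch of the external-schema step, a Spark session with Hive support can register an external table over the HDFS log directory; the path and columns are hypothetical.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("wifi-logs")
             .enableHiveSupport()
             .getOrCreate())

    # External table: Hive tracks only metadata; the files stay in place on HDFS.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS wifi_logs (
            mac_address STRING,
            ap_id       STRING,
            event_time  TIMESTAMP
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION 'hdfs:///data/wifi/logs'
    """)

    spark.sql("SELECT ap_id, COUNT(*) AS visits FROM wifi_logs GROUP BY ap_id").show()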

Finnair

Big Data Developer

February 2012 - September 2013 (1 year 8 months)

Bengaluru Area, India

Roles & Responsibilities:

• Developed proofs of concept such as sentiment analysis and recommendation systems using Apache Mahout for ITC products and Finnair services, analyzing customer reviews to gauge quality of service.

• Used machine-learning techniques such as the Naive Bayes classifier and support vector machines (a modern sketch follows this list).

• Developed efficient Java programs in Apache Mahout for filtering out unstructured data.

• Loaded and transformed large sets of structured, semi-structured, and unstructured data into HDFS.

• Analyzed large data sets to determine the optimal way to aggregate and report on them.
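The original pipeline used Apache Mahout in Java; as a compact modern analogue, here is a scikit-learn Naive Bayes sentiment classifier over an invented toy review set.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    reviews = [
        "great flight, friendly crew", "lost my luggage, terrible service",
        "smooth check-in and on time", "delayed for hours, no updates",
    ]
    labels = ["pos", "neg", "pos", "neg"]

    # TF-IDF features feeding a multinomial Naive Bayes classifier,
    # the same model family Mahout's Bayes trainer provides.
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(reviews, labels)
    print(model.predict(["crew was rude and flight was late"]))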

Walmart

Java Developer


July 2011 - January 2012 (7 months)

Bengaluru Area, India

• Enhanced the existing functionality per customer requirements.

• Effectively communicated with customers on providing technical solutions.

• Generated the business reports for the projects.

• Created technical documentation.

ISRO - Indian Space Research Organisation

Intern

January 2011 - May 2011 (5 months)

Bengaluru Area, India

The MIL-STD-1553B bus has been used effectively in various ISRO spacecraft applications such as GPS receivers, antenna control, and the on-board computer. Designed and implemented a bus controller card that handles point-to-point communication between terminals using MIL-STD-1553, a military specification defining a digital, time-division, command/response multiplexed data bus.
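For a feel of the protocol, here is a small Python sketch that packs a MIL-STD-1553 command word: a 5-bit remote-terminal address, a transmit/receive bit, a 5-bit subaddress, and a 5-bit word count. The field values are arbitrary examples.

    def command_word(rt_addr: int, transmit: bool, subaddr: int, word_count: int) -> int:
        """Pack a 16-bit MIL-STD-1553 command word.

        Layout (MSB to LSB): RT address (5) | T/R (1) | subaddress (5) | word count (5).
        A word count of 0 encodes 32 data words.
        """
        assert 0 <= rt_addr < 32 and 0 <= subaddr < 32 and 0 <= word_count < 32
        return (rt_addr << 11) | (int(transmit) << 10) | (subaddr << 5) | word_count

    # Bus controller tells RT 5 to receive 4 data words on subaddress 2.
    word = command_word(rt_addr=5, transmit=False, subaddr=2, word_count=4)
    print(f"{word:016b}")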

Education

Thomas J. Watson School of Engineering and Applied Science, Binghamton University

Master's degree, Computer Science · (2015 - 2016)

The National Institute Of Engineering, Mysore

Bachelor's degree, Electrical, Electronics and Communications Engineering · (2007 - 2011)
