AJITH K J
Big Data analyst
*.* years of IT experience developing Big Data and Data Science applications using the Hadoop ecosystem and AWS cloud services. Experience in designing and developing data pipelines (ETLs) using PySpark.
****.*****@*****.***
Bangalore
linkedin.com/in/ajith-kj-ajith-899a52136
github.com/ajithkjajith
SKILLS
Python
Java, RESTful Web Services
Hive, Sqoop, Oozie
Hadoop, MapReduce
Spark SQL, Spark Streaming
Presto
AWS, EMR, RDS
TinyML
NLP, ML models
LANGUAGES
English
Professional Working Proficiency
Kannada
Professional Working Proficiency
Hindi
Limited Working Proficiency
INTERESTS
Writing exams as a scribe for blind people
Playing keyboard
Working with new technologies to find optimal solutions to problems
EDUCATION
M.E in Big Data and Data Analytics 9.75/10
Manipal School of Information Science, Manipal
07/2019 - 05/2021
B.E in Computer Science 8.7/10
The National Institute of Engineering, Mysuru
08/2013 - 07/2017
WORK EXPERIENCE
Big Data and Data Analyst Intern (8 months)
Beckman Coulter
07/2020 - Present
Responsible for ingesting data from traditional RDBMS databases into AWS S3 using Sqoop.
Responsible for writing ETLs using Hive and Hive Query Language.
Developed a data ingestion and data processing platform in Spark.
Worked on developing RESTful APIs.
Responsible for monitoring and maintaining data quality.
Associate Application Developer (1.8 years)
Accenture Solutions Private Limited
11/2017 - 06/2019
Developed an ingestion framework to Sqoop data from mainframes and extract information about policies.
Created RESTful web services using Jersey and Spring MVC in a Maven project.
PERSONAL PROJECTS
Clinical prediction for dysphagia disorder using edge computing (TinyML)
(12/2020 - 03/2021)
Predicted early-stage dysphagia disorder in patients using accelerometer and microphone data recorded while swallowing different types of liquid.
Built a TensorFlow Lite model for deployment on an Arduino board.
Insights on COVID-19 cases
(07/2020 - 10/2020)
Scraped COVID-19 case data from different official websites using NLP.
Performed data exploration and pre-processing using various statistical analyses.
Created various analytical reports to showcase the different dimensions of the increase in cases.
Predicting Sepsis using Correlation Based Clustering of Patient Features
(08/2019 - 01/2020)
The main aim of this project is to derive an optimal set of patient features using correlation-based clustering for sepsis prediction in adults.
Appropriate correlation measures (Pearson, correlation ratio, Cramér's V) are used to correlate mixed data type features.
Worked on a highly imbalanced data set to reach an optimal solution and improve model accuracy.
Assessing Road Quality Using Unsupervised Machine Learning Techniques
(01/2021 - 04/2021)
Assessed road quality on segments that are unevenly paved or have potholes and speed bumps. Unsupervised clustering lets the data speak for itself, paving the way for more sophisticated supervised learning approaches to aid policy decisions.