Software Developer Data

Bloomington, IN
October 10, 2019

Charith Reddy Musku | +1-812-***-**** | LinkedIn | GitHub


EDUCATION

Indiana University Bloomington, Bloomington, Indiana

Master of Science in Data Science; GPA: 3.9/4, Aug 2018 - May 2020

Relevant Coursework: Machine Learning, Natural Language Processing, Deep Learning, Text Mining, Big Data, Information Retrieval

Dhirubhai Ambani Institute of Information Technology, Gujarat, India

Bachelor of Technology in Computer Science; GPA: 3.8/4, Aug 2012 - Jan 2016

EXPERIENCE

Data Scientist, Intern Palo Alto, CA

SAP - Leonardo Machine Learning Jun 2019 - Aug 2019

Procurement Fraud: A fraud-monitoring workflow that predicts the risk score of a procurement request using Random Forests. End-to-end process covering data collection, cleaning, analysis, training, and an explanation module to interpret and visualize the results. Also experimented with a deep learning autoencoder model to identify fraud patterns.

Software Developer Bangalore, India

SAP Labs India Feb 2016 - Jun 2018

SAP Analytics - Developer: [Tech: Java, JavaScript, UI5, Spring Boot, Kafka, Elasticsearch, Logstash, Kibana, DevOps]

Revamped the UI for SAP BI 4.2 using UI5 (a JavaScript library) and d3.js. Implemented RESTful web services using Spring Boot in Java.

Integrated Kafka with ELK Stack (Elasticsearch, Logstash, Kibana) for real-time log analytics (performance monitoring & alerting).

Service Ticket Intelligence - R&D: [Tech: Python, SpaCy, Gensim, NLP, Docker, Flask, REST, Text Classification, NER]

Developed & exposed ML models as RESTful API microservices, hosted from within Docker containers.

Experimented with then state-of-the-art deep learning models, such as a char-level CNN, for automatic classification of new support tickets.

Recommended KB articles to users based on solved-ticket history, using NLP techniques (LSA, LDA, Doc2Vec, cosine similarity).

De-identification (hiding personal information such as name/ID) of support tickets using a custom Named Entity Recognition model.

ACADEMIC PROJECTS

Question Answering over Bio-Medical Text: [Tech: TensorFlow, Python, QA, CNN, GloVe, Word Embeddings, Bi-LSTM]

Developed an automatic question answering system using the deep learning architecture Bi-Directional Attention Flow (BiDAF). Used a combination of GloVe word embeddings, a char-level CNN for special words, and contextual embeddings with an attention layer.

Trained on BioASQ data with 30k QA pairs & achieved an F1 score of 60.34. Also tried BioBERT (BERT pretrained on biomedical data).

Automatic Speech Recognition: [Tech: TensorFlow, Python, ASR, CNN, MFCC, Speech, RNN, Speech-to-text]

An end-to-end neural network approach to an ASR system that converts speech to text. Used features extracted from a Mel filter bank (MFCC) with a Recurrent Neural Network, using CTC as the loss function to deal with silence/blank/repeated characters.

Trained on the TIMIT corpus, with 630 speakers covering 8 different dialects. Achieved a word error rate of 38%.

Hybrid Restaurant Recommendation System: [Tech: AWS, S3, RDS, EC2, Kafka, PySpark, Flask, Python, Java, ZooKeeper]

A personalized restaurant recommendation system using a hybrid of collaborative filtering (Matrix Factorization), content-based matching using NLP (Word2Vec similarity), Social Network Analysis (Friends’ opinion), and location-based recommendations for the cold-start problem.

A real-time recommendation generator (Databricks + Spark + AWS), Kafka as stream processor & a Flask-powered web page.

Stock Price Prediction using Time Series data: [Tech: Finance, Time-series, Forecasting, NLP, Sentiment Analysis, Python]

A deep learning approach for stock price prediction using time-series data. Used Stacked Autoencoders for feature extraction and LSTM for prediction. Integrated a text mining approach to boost the model by performing sentiment analysis of the company’s news headlines.

Trained on 13 years of data downloaded from Yahoo Finance. Predicted stock prices with a mean squared error of 0.006.

Distracted Driver Detection: [Tech: Computer Vision, Image classification, CNN, Transfer learning, VGG16]

Used transfer learning with the VGG-16 Convolutional Neural Network as the pre-trained model to detect and classify driver behavior in the given images into 10 different classes, such as operating a mobile phone, drinking, and talking.

Trained the network on 24k driver images curated by the State Farm Insurance company and classified them with a log-loss of 0.22.

SKILLS

Languages: Python, R, Java, JavaScript, SQL, C, C++, PHP

Libraries: TensorFlow, PyTorch, Keras, Scikit-Learn, NumPy, Pandas, SpaCy, Gensim, fastText, CoreNLP, NLTK, Matplotlib, Seaborn, Lucene

Database & Big Data: Hadoop, Apache Kafka, Spark, Elasticsearch, MongoDB, AWS S3, RDS, MySQL, PostgreSQL, Neo4j, HANA

Tools & Frameworks: GitHub, Docker, AWS, SCP, EC2, Linux, Jira, Spring Boot, MVC, Flask, REST, Logstash, Kibana, d3.js, Tableau
