Dilip Molugu
Fairfax, VA, ***** Home: +1-929-***-**** E-mail: ac8q7c@r.postjobfree.com LinkedIn - www.linkedin.com/in/dilip-molugu GitHub - https://github.com/diliptechno
EDUCATION
Master of Science in Data Analytics Engineering
George Mason University, Fairfax, VA
2017 - May 2019 (expected)
GPA: 3.96
Bachelor of Technology in Computer Science
Vellore Institute of Technology, Vellore, India
2013 - 2017
GPA: 3.75
COURSEWORK
•Data Mining, Statistics, Visualization for Analytics, Predictive Analytics, Machine Learning, Natural Language Processing, Optimization
EXPERIENCE
CISCO (MAY 2018 – AUGUST 2018), San Jose, CA – Intern
•Worked in the Data Engineering team to set up a high-level architecture for querying large datasets within a few seconds.
•Accomplished this by using a Hadoop cluster to run MapReduce jobs that sped up querying and indexing of the data.
•Implemented Druid to answer queries within 3 seconds that previously took more than a minute with traditional methods.
Electronics Corporation of India Limited (DECEMBER 2015 – JANUARY 2016), Hyderabad, India – Research Intern
•Worked as a data analyst and used R to preprocess and prepare the data for the models.
•Analyzed the company's inventory, visualized the data, and reported the resulting insights.
•Developed an optimization model to reduce inventory and transportation costs.
ACADEMIC PROJECTS
Textual Scoring Project, Principal Group - Sponsored Capstone Project (Ongoing)
•Developing classifiers to categorize documents using natural language processing and deep learning techniques.
Fake news stance detection using NLP, GMU, Tools: Python, TensorFlow
•Trained neural network classifiers to detect the relation between an article's headline and its body.
•Implemented deep learning models, including convolutional neural networks and multi-layer perceptrons.
•Improved prediction accuracy by 8% over the baseline.
Ontology creation on unstructured data (knowledge mining), GMU, Tools: Python, NLTK, TensorFlow
• Developed a pipeline to create an ontology from unstructured data.
• Applied NLP techniques such as lemmatization, stemming, n-grams (unigrams, bigrams, trigrams), and POS tagging.
• Generalized the ontology generation approach by identifying recurring patterns in the data.
House sale price prediction, GMU, Kaggle Competition, Tools: R, Python, Tableau, NumPy, pandas
• Predicted house sale prices from 79 variables describing every aspect of residential homes in Ames, Iowa.
• Implemented lasso and ridge regression along with XGBoost, SVM, and random forest to predict house prices.
• Built an ensemble model that ranked in the top 10% of the Kaggle competition.
Visualization of deaths in USA, GMU, Tools: R, Plotly, Micromaps, Shiny, ggplot
• Analyzed the number of deaths due to major diseases in the USA between 2000 and 2016.
• Used the ggplot package for EDA and static visualizations, and Plotly for interactive visualizations.
• Developed an interactive map using micromaps and dashboards using Shiny.
SKILLS
Programming languages: Python, Java, C, C++, R
Database: SQL, Microsoft Access, MySQL, NoSQL
Web Languages: HTML, CSS, PHP
OS: Linux, Windows, Unix
Related skills: Linear Regression, Classification, Categorical Data Analysis, Non-Parametric Analysis, Multivariate Analysis, Time Series Analysis, Forecasting, Optimization, Simulation Models, Decision Trees, Statistics, Operations Research, Modeling, Git/GitHub, Weka, Tableau, Power BI, Docker, Ansible, Druid, TensorFlow, Keras, Plotly, Hadoop, Spark, NumPy, pandas, Jupyter, scikit-learn, NLTK, Analytical Solver, Gurobi, shell scripting.
PUBLICATION: “Security breach in internet of things” in International Journal of Engineering Associated (ISSN: 2320-0804, Volume 5, Issue 4, April 2016).