CHARU SAXENA
*******@****.***.*** | 312-***-**** | linkedin.com/in/csaxena | github.com/dummyGetUsernames
OBJECTIVE
Computer Science graduate skilled in descriptive and predictive analytics and machine learning algorithms, using visualizations to share insights that guide decision-making. Complemented by strong mathematical and logical aptitude. Seeking an opportunity to apply fundamental ideas to real-world problems.
Key Skills: Exploratory data analysis, Statistics, Probability, Data Mining, Machine learning, Data visualization, Neural Networks.
EDUCATION
Illinois Institute of Technology, Chicago, USA (Master's in Computer Science) Aug’16 - May’18
Completed courses: Data Mining, Advanced Data Mining, Interactive and Transparent Machine Learning, Machine Learning, Parallel Distributed Computing, Computer Vision, Big Data
SRM University, Chennai, India (Bachelor’s in Computer Science) Jul’12 - Mar’16
TECHNICAL SKILLS AND PUBLICATIONS
Programming Languages: Python, R, MATLAB, SQL, Java
Big Data: Apache Spark (PySpark), Hive, Pig, MapReduce
Databases: MySQL, MS SQL Server, Oracle 11g
Visualization Tools: Tableau, Microsoft Excel, PolyAnalyst (text-mining tool)
Libraries: Matplotlib, Seaborn, Pandas, NumPy, SciPy, scikit-learn, OpenCV, BeautifulSoup, NLTK
Machine Learning: Regression, Classification (Decision Trees, SVM, KNN, Ensemble methods), Clustering (K-means, Spectral, Hierarchical), Text Analytics, Dimensionality Reduction (PCA), Time Series, Boosting, Word2Vec, GloVe
Publication: Simulated annealing-based optimization for viral marketing. Research paper presented at the International Journal of Emerging Technology in Computer Science and Electronics [IJETCSE], organized by Sastra University, Chennai, TN, India. April 2016 edition, Volume 21/Issue
http://www.ijetcse.com/wp-content/plugins/ijetcse/file/upload/docx/214ICERT-188-pdf
PROFESSIONAL EXPERIENCE
Illinois Institute of Technology
• Research Assistant Nov’17 – May’19
• Worked in the healthcare domain with the Rush Alzheimer's Disease Center (RADC) dataset.
• Analyzed the data to maximize insights, uncover underlying structure, and extract key features for 200+ patients.
• Handled missing data using imputation and elimination for each dimension, based on a correlation metric for the features with missing values.
• Extracted key features through feature engineering and dimensionality reduction, resulting in a 27% improvement in accuracy.
• Performed predictive modeling using boosted trees, SVM, and parameter tuning to obtain an accuracy of 75%.
• Worked closely with researchers on analytics and supporting analysis, as well as on planning, monitoring, follow-up, and implementation.
• Student Employment Ambassador (Career Services) Aug’17 - May’18
• Consulted with students and gave presentations on job search.
• Made data-driven recommendations to increase hiring at IIT by 15% using Pivot Charts (MS Excel) and Tableau.
HireSphere (Machine Learning Intern) Aug’17 - Dec’17
• Implemented high-performance shell scripts to fetch statistical data via the GitHub API, increasing efficiency by 40%.
• Performed predictive analysis of employee turnover for prospective employees using HR analytics data.
• Analyzed employees' demographics, education, and other experience-related information through visualizations.
• Engineered features to develop vital hiring metrics.
• Built models with Random Forest and Logistic Regression, tuning parameters with scikit-learn to find the optimal model.
• Delivered high-quality analysis to reduce employee turnover and improve hiring quality through reporting and presentations.
IBIZ Consulting Services, Chennai, India (Summer Internship - BI Tools) May’14 - Jun’14
• Compared the BI tools QlikView and Tableau in a detailed report covering 10+ aspects, including cost and technical factors, to inform the company's future projects.
• Participated in requirements meetings and data mapping sessions to interpret business strategy needs.
• Presented findings to the team to improve strategies and operations.
ACADEMIC PROJECTS
Learning with Rationales for Text Classification (Python)
• Performed sentiment analysis of Amazon product reviews using rationales provided by a simulated reviewer and fed into the model before training. This method outperformed traditional Logistic Regression and Naive Bayes approaches, as measured by the AUC-ROC curve.
Content-Based Image Retrieval (Python, MySQL)
• Developed an image retrieval search application that analyzes the color and texture features of queried images to retrieve similar images from the database, using OpenCV functions, Python, and MySQL.
Essay-Scoring Project (Python, R, Weka)
• Built an automated scoring model for school essays in association with the company Whooo’s Reading, identifying features such as the essay's organization level, sentence length, answer length, parts of speech, vocabulary score, predicted grade (from word content), and similarity to the essay prompt. Used SVM and NLTK in Python and R, achieving 72% accuracy on short answers.
Customer Segmentation for Senior Life (Placed second in the First Annual Joint Chicago ML–DePaul University Machine Learning Hackathon)
• Predicted how many days prospects would take to move in using regression, classified prospects as hot, warm, or cold using a CHAID tree classifier, and used visualizations to determine the sales activities needed to increase prospects.
Predicting Character Encoding (Top-2 Kaggle Kernel)
• Built a tool using Python and Chardet to automatically detect when a text file is read in with the wrong encoding.
Predictive Policing: Using Data Science to Predict High-Crime Areas (Python)
• Designed a supervised learning model in Python to predict high-crime locations in the USA from demographic features of the area community, using Random Forest, SVM, and Decision Tree; Random Forest achieved an accuracy of 81%.
Twitter Trends Analysis using Spark Streaming (Spark, Python)
• Designed an application that plots the popularity of tags associated with incoming tweets streamed live from Twitter.
Pre-processing and Analysis of Datasets (R, Weka)
• Applied preprocessing and data mining techniques, such as LSA and LDA clustering, to find hidden concepts in the following datasets: Iris, Pima, Wine, 20 Newsgroups, and Yelp.