MS STUDENT AT UNIVERSITY OF CALIFORNIA, MERCED
https://github.com/sapphiresinha
Objective
Seeking a role at an organization where I can solve challenging problems using my knowledge of data science and machine learning.
University of California, Merced, Aug. 2018 to May 2020
MS, Electrical Engineering and Computer Science, GPA: 3.54/4.0
Courses: Distributed Computing, Human-Computer Interaction, Digital Image Processing, and High-Performance Computing.
Kurukshetra University, Sept. 2014 to June 2014
B. Tech, Computer Science and Engineering, GPA: 7.5/10.0
Courses: Data Structures, Object-Oriented Programming, DBMS, Mathematics, Software Engineering, and Operating Systems.
Employment
National Cancer Institute - National Institutes of Health, Bethesda (USA)
Cancer Data Science Intern, June 2019 to Current
Predicting drug response in cancer patients using supervised learning.
Jindal Stainless Steel, Hisar (India)
Software Developer (PHP) Intern, Mar. 2018 to June 2018
Built a complaint portal where company employees can submit and update safety-related complaints.
Indian Institute of Management (IIM), Lucknow, Lucknow (India)
Data Analyst Intern, Dec. 2017 to Jan. 2018
Analyzed data to identify the factors that affect hotel room prices.
Skills
Programming Skills: Python, R, PHP, SQL, C, C++, HTML, C#, Unix and Linux.
Data Science Skills: Data Analysis, Statistics, Machine Learning, TensorFlow, Keras.
Statistical Tools: Tableau, MS Excel.
Frameworks: Anaconda, Jupyter Notebook, RStudio, XAMPP, MS Office 2016, Visual Studio.
Projects
Predicting drug response in cancer patients using supervised learning, June 2019 to Current
We built a supervised model that predicts response to cancer drugs in cancer cell lines by identifying and learning from the genetic interactions of a drug's targets; our current best AUC is ~0.80. We are now working to extend these predictions to patients' responses in the clinic.
Creating customer segments based on customer buying habits, Apr. 2019 to May 2019
Using the weekly product buying habits of a wholesale distributor's customers, I identified the customer segments that would be affected by a change in the "days of delivery per week" policy. After feature preprocessing (data imputation, outlier removal, and scaling), I reduced the feature count with PCA and applied k-means clustering to identify the affected segments.
Finding donors for charity using Machine Learning (ML), Mar. 2019 to Apr. 2019
Investigated the factors that affect the likelihood of charity donations, based on real census data. Trained, optimized, and tested supervised ML models such as AdaBoost, DecisionTree, and GaussianNB to predict the likelihood of donations with an accuracy of 0.86.
Predicting Boston Housing Prices, Feb. 2019 to Mar. 2019
Employed supervised ML basics (cross-validation, a decision tree model, and grid search for optimization) to predict house prices on a given dataset with a coefficient of determination, R^2, of 0.923.
Creating fast neural networks for mobile devices, Aug. 2018 to Dec. 2018
Used computer vision and DNN compression techniques to reduce model size and computational complexity with minimal loss in accuracy.
Technology and platform: Python and TX2 (GPU). Model: TensorFlow, MobileNet V2.
Publication
Sinha, Neelam, "Higher prevalence of homologous recombination-deficiency in lung squamous carcinoma from African Americans." bioRxiv (2019): 651794. May 2019.