POOJA VIJAYSINGH BAIS
Boston, MA 214-***-**** ************@*****.***
https://www.linkedin.com/in/pooja-vijaysingh-bais-a80662117/
SUMMARY
Aspiring data scientist with a passion for exploring data and uncovering valuable insights, seeking an opportunity to apply analytical and statistical skills to inform business decisions.
EDUCATION
Northeastern University, Boston, MA [expected Mar 2021]
M.P.S. in Analytics with concentration in Statistical Modelling
Relevant Courses: Probability and Statistics, Enterprise Analytics, Data Mining Applications, Predictive Analytics, Communication & Visual Data Analysis, Data Management & Big Data
University of Pune, Pune, India 2018
Bachelor of Computer Engineering
TECHNICAL SKILLS
Programming Languages: R, Java, Python (NumPy, Pandas, Matplotlib)
Machine Learning: Linear and Logistic Regression, Clustering, Random Forest, Decision Trees, Neural Networks, Naïve Bayes
Databases: MySQL, MongoDB
Software and Tools: MS Excel, MS PowerPoint, Eclipse, Android Studio, RStudio, Microsoft Azure, Databricks, Spark
Visualization Tools: Tableau, RShiny
ACADEMIC PROJECTS
Feature Importance of US Accidents (Databricks, SparkR), Northeastern University Jan 2020 - Feb 2020
• Created a Databricks cluster to process a dataset of 3 million records
• Visualized the occurrence of accidents by state, hour, month, and year
• Identified stops and bumps as important features using the Random Forest algorithm from the SparkR library
Visualization of Video Game Sales (Tableau Dashboard, RShiny), Northeastern University Oct 2019 - Nov 2019
• Created visualizations of sales by genre, top publishers, games, and platforms
• Identified gaming sales patterns around the globe
NYC Airbnb Data Analysis (R, RStudio), Northeastern University Sep 2019
• Prepared the dataset for analysis by replacing null review values with 0 and converting dates into months
• Built a linear regression model after log-transforming the price variable to show its distribution
• Selected the most accurate model by comparing Decision Tree, Random Forest, and Gradient Boosted Tree models using R-squared and testing
Analysis of Apple App Store Dataset (R, RStudio), Northeastern University Jun 2019
• Sourced a dataset from Kaggle and prepared it for analysis
• Forecasted user ratings using a Decision Tree and a two-sample t-test
• Implemented linear regression and a Decision Tree to demonstrate that user rating depends on other variables such as price and size
PUBLICATION
“An Android Application for Driver Assistance and Event Alert System Using Ultrasonic Sensor and Heart Rate Sensor.” IEEE 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). Retrieved from https://ieeexplore.ieee.org/document/8697627