Data Analyst

Location:
Richardson, TX
Posted:
February 05, 2020

Resume:

SANJANA PATIL

+1-469-***-**** adbmjk@r.postjobfree.com www.linkedin.com/in/sanjana-nitin-patil https://github.com/SanjanaPatil96

EDUCATION

The University of Texas at Dallas May 2020

Master of Science, Business Analytics 3.20 GPA

Fr. Conceicao Rodrigues College of Engineering, University of Mumbai, India June 2018

Bachelor of Engineering, Information Technology 3.3 GPA

TECHNICAL SKILLS

Analysis Tools: - Advanced Excel (VBA, VLOOKUP, Macros), Power BI, Tableau, Adobe Analytics (Omniture), Google Analytics

Programming/Database: - C, C++, HTML, Java, R, Python, SAS, Shiny, SQL, MySQL, PostgreSQL, Microsoft Access

Libraries: - TensorFlow, Keras, Scikit-Learn, NumPy, Pandas, Spark, Jupyter

Core Skills: - Data Warehousing, Data Mining, Data Visualization, Reporting, Requirement Gathering

Software Skills: - NetBeans, Android Studio, RStudio, Hadoop, STATA

Certifications: Google Analytics, Google Analytics IQ, Data Analysis with Python by IBM

CAPSTONE PROJECT

Child Poverty Action Lab (CPAL) – Dallas, TX [Community Resource Index Project] January 2020-Present

Provide the frontline partners of CPAL with information that can highlight the community needs in relation to economic mobility and community strength by calculating the Community Resource Index (CRI).

Identify data sources, determine the type of facility to analyze, and establish the neighborhoods around it.

Evaluate each sub-index indicator against one of CPAL’s leading indicators, using variable reduction and factor analysis to reduce overlap between sub-index indicators.

Visualize the results in Tableau to support the recommendations.

INTERNSHIP

Anchor Mark Pvt. Ltd. - Data Analyst Intern June 2016-May 2017

Assisted in analyzing the company’s historical data on filled weights, based on the tapped and untapped density of powder, after cleaning the data with SQL.

Converted data into actionable insights by modeling and predicting future outcomes using classification and regression in Python, achieving an accuracy of 89.93%.

Predicted customer preferences using logistic regression and helped identify users’ underlying needs by analyzing the product functions and processes built around those needs.

Supported business planning by using data aggregation and mining to identify inefficiencies in existing operations.

Used MS Excel process automation and database data extraction to increase operational efficiency by 8%.

Prepared a final report using pivot tables in advanced Microsoft Excel and visualizations in Tableau.

PROJECTS

Advanced Business Analytics with R [GitHub] [Predict number of comments on a Facebook post] August 2019-December 2019

Pre-processed 40,949 Facebook posts with 23 feature variables to build models in R that predict the number of comments from these features.

Applied various regression models, random forests, and neural networks to make predictions on the test dataset.

Applied Machine Learning Project [GitHub] [Analyzed audit data to predict fraudulent firms] February 2019-March 2019

Preprocessed the dataset and applied regression models including KNN, Lasso, Ridge, linear and kernelized support vector regression, and polynomial regression; the best model reached 92.49 percent accuracy.

Classified the data using KNN classification, logistic regression, linear support vector machine, kernelized support vector machine, and decision tree, reaching 99.20 percent accuracy.

Applied ensemble techniques such as voting classifiers, bagging, pasting, AdaBoost, and gradient boosting, as well as deep learning models.

Business Analytics with R [Analyzed Expedia’s booking dataset to predict bookings] August 2018-December 2018

Analyzed a booking dataset of 395,054 sampled records from Expedia using machine learning algorithms such as logistic regression, LDA, and Naive Bayes, with the help of ggplot2, dplyr, and other tidyverse libraries.

Predicted which customers are likely to book a hotel based on the variables in the dataset.

ORGANIZATIONS

Chief Creative Director - Big Data Club, University of Texas at Dallas August 2018 - Present

Volunteer - Infinity Lion’s Club, University of Texas at Dallas August 2018 - Present