VIDHYA SIVASANKARAN
Mobile Phone: 503-***-**** E-mail: ******.**************@*****.***
SUMMARY:
6+ years’ experience as a System Validation Engineer at Intel Corporation
Programming experience in R and Python (NumPy, pandas, SciPy, scikit-learn), including analysis of large datasets (gigabytes of data)
Data visualization using Python libraries (pandas, NumPy), Tableau and R
Knowledge of descriptive and inferential statistical concepts such as the binomial distribution, normal distribution, Bayes' rule, sampling distributions and hypothesis testing (A/B)
Experience in exploratory data analysis using R to explore relationships between one or more variables
Experience in data wrangling: gathering data from multiple sources and assessing its tidiness and quality
Data storytelling using Tableau to visualize datasets and highlight trends or patterns in the data
Knowledge of Supervised Learning Algorithms – Decision Tree, Naïve Bayes, SVM, Ensemble Methods (Bagging, Boosting) and Unsupervised Learning Algorithms – Clustering
Hard-working, dedicated and responsible team player with strong technical, analytical and debugging skills
Self-motivated, with a time- and target-oriented approach and good interpersonal and communication skills
SKILL SET:
Languages
C++, Java, C#, VBScript, JavaScript, JSON, REST API, Python (pandas, NumPy, Matplotlib, SciPy, scikit-learn, Tweepy), R, MySQL, SQL Server
Tools/Framework
Anaconda, Jupyter Notebook, Tableau, Android Studio, Eclipse, JSON Validator, Firebase
EDUCATION:
Master’s in Computer Science, Portland State University
Bachelor of Engineering in Computer Science, Madurai Kamaraj University
CERTIFICATIONS:
Data Analyst / Machine Learning (continuing) Nanodegree from Udacity
PROJECTS:
Data Visualization of Flight dataset with Tableau [Project Link]
Objective: visualize flight delays, cancellations and diversions based on 1 million data records from RITA, covering various carriers at airports across the continental United States
Created worksheets, dashboards and a story to analyze average arrival/departure delays for major airlines and their trends over 12 months, considering delay causes such as weather, late flights and security and their impact on overall delays
Computed granular delay data for each airport and supported airline, and used maps to visualize the data by airport and state
Obtained average delays for airlines and airports, which helps travelers identify airlines to avoid at specific airports
Calculated the highest cancellation and diversion rates for each airline and visualized them as 12-month trends
A/B hypothesis testing to determine whether users are willing to pay for a new e-commerce website [Project Link]
An e-commerce company developed a new web page to try to increase the number of users who "convert," meaning users who decide to pay for the company’s new website
The main goal of this project is to help the company decide whether to implement the new page or keep the old page
Formulated null and alternative hypotheses and tested them in two ways: confidence intervals (sampling distribution and bootstrapping) and simulation under the null hypothesis (calculating the p-value)
The results showed that users preferred the old page over the new pay page
Confirmed the above results using a regression approach
Data analysis of medical records from Kaggle to determine scheduled patient appointment show-up [Project Link]
The dataset contains information from 100k medical appointments in Brazil
Analysis is focused on predicting whether patients will show up for their scheduled appointments
Analysis found that patient show-up is correlated with wait time: the longer the wait, the more no-shows for appointments
Exploratory Data Analysis with R to determine quality of red wine [Project Link]
Used R to explore more than 10 chemical properties of red wine to analyze its quality using univariate, bivariate and multivariate analysis
In univariate analysis, used histograms to explore individual variables. Most were approximately normally distributed, except three that were long-tailed. Used a log transformation to make those distributions look more normal and to deal with outliers
In bivariate analysis, used boxplots and frequency polygons to view the relationship between wine quality and its chemical properties. Used correlation coefficients (r) to assess how the variables correlate with one another
In multivariate analysis, explored more than two variables at a time by grouping on the quality variable, relating it to other variables, and analyzing how multiple variables together affect the quality of the wine
Data Wrangling, Analysis and Visualization of Twitter dataset from WeRateDogs [Project Link]
Used a Twitter dataset of 5,000+ tweets from the user WeRateDogs to wrangle, analyze and visualize data
Extracted “retweets”, “favorites” and “retweets count” data by querying the Twitter API by tweet ID, using the Python package Tweepy
Assessed the resulting JSON data and the WeRateDogs Twitter archive visually and programmatically for quality and tidiness
Cleaned the assessed data and stored it in high-quality, tidy pandas DataFrames
Analyzed and visualized the above wrangled data obtained from different sources
Data Exploration of Bicycle Sharing System using Python 3
Explored data (~1 GB) from bike share systems in three major United States cities: Chicago, New York City and Washington. Computed various bike share metrics using NumPy/pandas in Jupyter Notebook
Features explored were popular travel times, popular stations and trips, trip duration and user information
Initial implementation was in plain Python 3, without packages such as NumPy or pandas
Improved execution time for each metric computation by an order of magnitude using NumPy/pandas
EXPERIENCE:
Intel Corporation, Hillsboro, OR 2011-2016
System Validation Engineer
Hands-on experience in post-silicon validation of multiple platforms, including desktops, mobile devices and SoCs (System on Chip), for multimedia and communication domains including audio, graphics, media, camera, content protection, Wi-Fi, Bluetooth, NFC, 4G and WiDi, and protected content such as Netflix and Blu-ray
Experience in pre-silicon validation using emulated FPGA systems for multimedia and communication domains including audio, media and graphics
Experience using lab tools such as Graphics Performance Analyzer and GPU Caps Viewer (OGL/OCL), and benchmark tools such as Unigine Heaven, Unigine Valley, Stone Giant (NVIDIA), PassMark and Futuremark
ColumbiaSoft Corporation, Portland, OR 2010-2011
QA Developer
Implemented automated tests that simulate a variety of business functions, including standard and non-standard usage scenarios
Performed various levels of testing including acceptance, integration, functional, regression and stress testing
Implemented black-box, white-box and performance/load testing on DocumentLocator® application
Performed detailed application testing before each product launch
References: Available upon request
Visa Sponsorship: Not required. Can work for any employer