Data Scientist

Location:

Santa Clara, CA

Posted:

March 12, 2018


Resume:

SREEKAR A

510-***-****

********@*****.***

SUMMARY:

Certified Data Scientist with more than one year of experience in Data Science, including Artificial Intelligence, Machine Learning, Deep Learning, Data Mining, Data Analytics, Data Visualization, Data Governance & Operations.

Experience designing, modeling, validating, and testing statistical algorithms using Python and R against various real-world data sets, including behavioral data.

Develop, build, and test analytics applications using iterative, agile-like development practices such as test-driven development and continuous integration.

Working experience in Machine Learning algorithms such as Linear Regression, Logistic Regression, Decision Trees, K-Means Clustering and Association Rules.

Experience with analyzing online user behavior, Conversion Data (A/B Testing) and customer journeys.

Experience working efficiently with datasets using scripting, data cleansing tools, and statistical software packages.

Knowledge of writing Packages, Stored Procedures, Functions, and Views using SQL.

Working experience with statistical analysis using R, MATLAB, and Excel.

Ability to work with large transactional databases across multiple platforms (Oracle, HDFS).

Proficient in the integration of various data sources with multiple relational databases like Oracle, MS SQL Server, DB2, Teradata and Flat Files into the staging area, ODS, Data Warehouse and DataMart.

Implemented deep learning models and numerical computation using data flow graphs in TensorFlow.

Good experience in text mining, converting words and phrases in unstructured data into numerical values.

Experience developing Logical Data Architecture in adherence to Enterprise Architecture.

Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables and OLAP reporting.

Good knowledge in statistics, mathematics, machine learning, recommendation algorithms and analytics with excellent understanding of business operations and analytics tools for effective analysis of data.

TECHNICAL SKILLS:

Programming & Scripting Languages: R, Python.

Database: SQL, MS Access, Oracle.

Development Tools: RStudio, Notepad++.

Packages: dplyr, ggplot2.

Techniques: Machine learning, Regression, Clustering, Data mining.

Machine Learning: Decision trees, Regression models, Random Forests, Time-series, K-means.

EXPERIENCE:

Client: Fiserv Feb 2017 – Present

Location: Sunnyvale, CA Role: Data Scientist

Description:

Fiserv, Inc., is a US provider of financial services technology. The company's clients include banks, thrifts, credit unions, securities broker dealers, leasing and finance companies, and retailers, among others.

Responsibilities:

Developed applications of Machine Learning, Statistical Analysis and Data Visualization for challenging data processing problems in the sustainability and biomedical domains.

Compiled data from various public and private databases to perform complex analysis and data manipulation for actionable results.

Built predictive models using tools in R and Python.

Developed visualizations and dashboards using ggplot2.

Worked on the development of data warehouse and ETL systems using relational and non-relational tools, including SQL.

Built and analyzed datasets using R, MATLAB and Python.

Applied linear regression in Python to understand the relationships, including potential causal relationships, between attributes of the dataset.

Applied concepts of probability, distributions, and statistical inference to the dataset to unearth interesting findings using comparisons, t-tests, F-tests, R-squared, p-values, etc.

Validated macroeconomic data and predictive analysis of world markets using key indicators in Python and machine learning concepts such as regression, bootstrapping, and random forests.

Interfaced with a large-scale database system through an ETL server for data extraction and preparation.

Worked in large-scale data environments such as Hadoop and MapReduce, with working knowledge of Hadoop clusters, nodes, and the Hadoop Distributed File System (HDFS).

Applied clustering algorithms such as K-means and hierarchical clustering with the help of scikit-learn and SciPy.

Identified patterns, data quality issues, and opportunities and leveraged insights by communicating opportunities with business partners.

Environment: Machine learning, HDFS, Linux, Python (Scikit-Learn/Scipy/NumPy/Pandas), R, SQL connector.

Client: Activision Blizzard Aug 2016 – Jan 2017

Location: Santa Monica, CA Role: Jr. Data Scientist

Description:

The Company develops and distributes content and services across various gaming platforms, including video game consoles, personal computers (PC) and mobile devices. The Company's segments include Activision Publishing, Inc. and Blizzard Entertainment, Inc.

Responsibilities:

Performed statistical modeling to derive value from customer data and reduce churn.

Prepared regular data reports by collecting samples of data sets using Excel spreadsheets.

Cleaned data by identifying and eliminating duplicates, inaccurate records, and outliers using R.

Converted various SQL statements into stored procedures thereby reducing the number of database accesses.

Evaluated models using cross-validation, log loss, and ROC curves, and used AUC for feature selection.

Ensured that the model had a low false positive rate.

Recommended and evaluated marketing approaches based on analytics of customer consumption behavior.

Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.

Generated comprehensive analytical reports by running SQL queries against current databases to conduct data analysis.

Performed analyses such as regression analysis, logistic regression, discriminant analysis, and cluster analysis using R and Python.

Evaluated and optimized performance of models, tuned parameters with K-Fold Cross Validation.

Analyzed transaction data to cluster users into segments and developed different marketing strategies for each cluster.

Environment: R, Python 3.5.2, regression, logistic regression, OLTP, random forest, OLAP, HDFS, Linux, SQL Server, Microsoft Excel.

Education:

California State University, Fullerton

Master of Science, Electrical Engineering – 2017

GPA – 3.5

JNTUH

Bachelor of Technology – 2015

Electronics and Communication Engineering

GPA – 3.6
