Data Analyst

Location:
Austin, TX
Posted:
April 04, 2019

Resume:

Rosy Sahu

+1-512-***-****

********@*****.***

Team-oriented, motivated data science professional with more than six years of experience in data analysis and artificial intelligence, seeking a challenging role where I can apply my problem-solving skills and my experience with large data sets to extract meaningful insights.

Professional Summary

Strong analytical skills, with the ability to organize and disseminate significant amounts of information with attention to detail and accuracy; experienced in estimation, strategic decision making and team management.

Good knowledge of Random Forest and SVM.

Proficient in statistical modelling and machine learning techniques: linear and logistic regression, decision trees, clustering (K-means, hierarchical), K-Nearest Neighbours, Naive Bayes, forecasting/predictive analytics, regression-based models, hypothesis testing and ensembles.

Worked extensively on data analysis using Python, RStudio, SQL and other tools.

Good understanding of a wide variety of ML techniques and algorithms: statistical NLP, regression- and classification-based models, and deep learning frameworks and tools such as TensorFlow.

Strong Agile Experience.

Knowledge of database architecture and Oracle 10g, 11g and 12c.

Strong skills in statistical methodologies such as hypothesis testing, data mining, ANOVA and chi-square tests, implemented in Python and R.

Knowledge of SAS for statistical data analysis.

Strong knowledge of Excel, including VLOOKUP, sorting and filtering.

Knowledge of dimensionality reduction techniques (PCA, LDA) and regularization techniques (Ridge and Lasso).

Strong experience across the Software Development Life Cycle (SDLC), including requirements analysis, design specification and testing.

Experience in extracting, transforming and loading (ETL) data from database tables and other sources using Microsoft SSIS.

Worked with large, complex data sets that include structured, semi-structured and unstructured data to discover meaningful business insights.

Experience with data visualization tools like Tableau and Python libraries like Bokeh and Folium.

Strong data modeling experience using ER diagrams, dimensional data modeling, star schema modeling and snowflake modeling with tools like ER Studio.

Education

Master's in Data Science (2017-2018)

University of Denver, Colorado, USA

BTech in Electronics & Telecommunications (2006-2010)

Biju Patnaik University of Technology

Skills

Statistical Analysis

Regression-based models: linear, Poisson, multi-class logistic regression.

Principal component analysis (PCA), LDA, Bayesian decision theory, confusion matrix and ROC.

Deep learning frameworks and tools such as TensorFlow.

Expertise in Ridge and Lasso regression.

Database & Scripting Languages

Oracle 10g/11g/12c (SQL, PL/SQL), SQL Server 2008/2005, PostgreSQL.

NoSQL: MongoDB, HBase, CouchDB, Neo4j.

Scripting languages: HTML, UNIX/Linux shell scripting.

Tools and Utilities

TOAD, Oracle APEX, Erwin Data Modeler, SQL*Loader, SAS

EDM tool: Markit EDM

ETL tools: Informatica

Spark, Hadoop, MapReduce

GitHub

Python libraries and tools: pandas, NumPy, Jupyter notebooks

Data visualization: Tableau, Matplotlib, Bokeh, GeoViews

Programming Languages

SQL, PL/SQL, Python, R, Oracle Express / Forms / Reports / Workflow, C, Java, C++

Professional Experience

Texas Department of Transportation (TxDOT) Austin, TX

Senior Data Modeler/Data Analyst October 2018 to present

Acquired data from multiple sources, explored high-level concepts and developed clear procedures for storing and retrieving data efficiently.

Applied data cleansing/data scrubbing techniques to ensure consistency among data sets.

Created logical, physical and dimensional data models, compared different versions of data models and generated reports using Erwin Data Modeler.

Extensive experience in relational data modelling, dimensional data modelling, logical/physical design, ER diagrams, and forward and reverse engineering of ER diagrams.

Worked extensively with Informatica tools such as Designer, Workflow Monitor and Workflow Manager to load data from DCIS to TxDOT.

Presented detailed reports on the meaning of the gathered data to management and helped them identify scenarios based on modifications in the data.

Used performance tuning techniques to improve session performance.

DU Human Trafficking (University of Denver) Denver, CO

Data Science Professional June 2018 to August 2018

Involved in the entire data science project life cycle, including data extraction, cleansing, transformation and preparation ahead of the analysis and data visualization phases.

Created a dynamic, interactive web page showing human trafficking flows from source to destination using the Python libraries Bokeh, GeoViews and Matplotlib.

Built machine learning models such as Random Forest and logistic regression on correlated data sets, with emphasis on advanced algorithms like neural networks and SVM.

Evaluated models using recall, precision, cross-validation and ROC. The final model achieved an accuracy of 83.4%.

GitHub Link : https://github.com/rose0037/DU-Human-Trafficking

Jury Study Master Deck (University of Denver) Denver, CO

Data Science Professional January 2018 to June 2018

Worked with the client to understand the domain and the attributes they had specified. Performed data analysis and outlier detection, and handled missing values and categorical variables in the data using R.

Evaluated the effect of the different response variables on plaintiffs' win rates, focused on feature selection using the LASSO method, and then examined the effectiveness of the limiting jury instructions using logistic regression.

Worked extensively with Python's seaborn library to provide better statistical data visualizations to end users.

Tested multiple regression techniques; robust regression was selected based on the feasibility and accuracy of its results.

Validated the models using measures such as AUC, ROC and the confusion matrix.

Analyzed the comments each juror left at the end to interpret their sentiment. Implemented an RNN model to determine whether a juror was biased.

GitHub link : https://github.com/rose0037/Jury_Project_Q2

IM Platform Evolution (Cognizant Technology Solutions) Kolkata, India

Enterprise data management developer (Markit EDM) March 2014 - September 2015

Responsible for understanding requirements, translating them into technical details and building high-quality software using the Markit EDM toolkit.

Automated data flow between any number of sources and destinations on an ad-hoc or scheduled basis using Data Porter.

Enabled users to select and combine source data to their exact requirements via a rules-based hierarchy using Data Constructor. Worked with Data Inspector and Data Matcher to analyze data validity and quality.

Gained experience in analyzing, developing, testing and running fully automated ETL, matching, enriching, mastering and exporting quality data to data consumers (downstream systems) for clients using Markit EDM (CADIS).

Managed and released applications to the training, UAT and production environments.

Command Console (Cognizant Technology Solutions) Kolkata, India

SQL/PLSQL Developer January 2012 - February 2014

Designed and Normalized Databases, created Oracle Tables, Views, Constraints, Synonyms and sequences.

Wrote complex SQL queries with sub-queries, analytical functions and inline views.

Developed a strong understanding of data modelling in data warehouse environments, including star schema and snowflake schema.

Developed and maintained large, complex logical and physical models using Erwin.

Created the BRD, FRD and low-level design document.

Developed a complex web application using Oracle Application Express (Oracle APEX), which was used by Cognizant's resource allocation team.

Actively participated in application diagnosis, impact analysis and implementation of critical business rules, such as calculating the opening-day balance of funds depending on whether the preceding day was a weekday or a weekend/holiday, and wrote the corresponding code in Oracle 11g.

CARS Enhancement (Cognizant Technology Solutions) Kolkata, India

Data Analyst/Database Developer November 2010 - December 2011

Designed and Normalized Databases, created Oracle Tables, Views, Constraints, Synonyms and sequences.

Extracted meaningful facts from the data to help the client make better business decisions.

Developed and modified many complex stored procedures, functions, triggers, indexes and views of the application using Toad for Oracle as the editor interface in Oracle 10g and Oracle 11g, and used SQL*Loader to load data from external files into Oracle database tables.

Worked extensively with SQL Server Management Studio 2008 to identify circular indexes, delete redundant indexes, and modify and create indexes on temporary and permanent tables for the CARS module.

Training and Certifications

Completed the Markit EDM Introduction exam authorized by Markit (2015).

Cognizant Certified Professional – Level 0 Oracle 10g SQL and PL/SQL by Cognizant (2013).

OCA Level 1 Certified by Oracle (2012).

Completed 30 days of training on Oracle at NICE (2007).

Volunteer History

Volunteered for many social and cultural events, such as the annual function, annual sports activities and a blood donation camp, at Biju Patnaik University of Technology, 2009.

Active member of “Care a Child”, Austin.

Active member of the "Cognizant Outreach" group, involved in gift distribution, health checkups and teaching underprivileged children, 2012-2014.

Awards:

Star Award from Cognizant for successfully delivering the CARS project without any post-production issues, 2013.

Certificate of Appreciation from Cognizant for contribution to the Cash Availability project, 2011.

Member of the group that won first prize for a technical seminar on "NoSQL" at Cognizant, 2011.

Completed a diploma in painting.


