
Data Manager

Location:
Minneapolis, MN
Posted:
March 31, 2016


Resume:

Bharathi Kasaba Venkatagiri

*** ******** ****** ** ***# 105 Mobile: 224-***-****

Minneapolis MN 55414 Email:act6ab@r.postjobfree.com

Objective

Seeking an opportunity as a Data Scientist / Advanced Analytics / Java Hadoop developer to develop new capabilities in Big Data analytics.

Over 3 years of experience in the Telecom, Healthcare, Insurance, Finance, and Retail domains. Creative and innovative; proficient in verbal and written communication; able to initiate work and play a pivotal role in a team; quick learner, always eager to learn new technologies.

Technical Skills

Data Science Skills: Data ingestion, Machine learning, Predictive Modeling, Data Mining and Visualization, Data cleaning, Data Cleansing, Text Analytics & Social Media Analytics, Big Data Analytics using Apache Hadoop / Apache Spark.

Programming Skills: C, C++, Java, MapReduce, Scala, SQL, NoSQL

Big Data Tools: Apache Hadoop, Apache Spark, Hive, HBase, Pig, Sqoop, Cloudera Impala, Cloudera Manager 5.5.1

Analytical Tools: R, MATLAB

Software: Eclipse 2.4, Oracle SQL, MS Office, Windows, Ubuntu, CentOS.

Algorithms: Neural Networks, Support Vector Machines, Linear Regression, Gradient descent, Logistic Regression, Association Rule, Linear Discriminant Analysis.

Professional Summary

Skilled in Big Data/Hadoop projects using HDFS, MapReduce, HBase, MongoDB, graph databases, Spark, Hive, and Pig.

Experience in performing descriptive analytics using Hadoop and Spark.

Experience in processing network graphs using Spark GraphX and visualizing them using Neo4j.

Hadoop cluster installation and configuration on Azure and AWS EC2.

Skilled in working with Cloudera Manager CDM 4 & 5.

Experienced in HBase data modeling and rowkey design to accommodate heavy reads and writes and avoid region hot spotting.

In-depth knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the YARN / MapReduce programming paradigm, for working with Big Data and analyzing large data sets efficiently.

Experience working with different file formats such as TEXTFILE, Avro, and Parquet.

Experience using data visualization / BI tools such as Tableau.

Analysis using various tools such as SAP HANA, R, Statistica, and RapidMiner.

Proficient in data analytics and predictive modeling using MATLAB and R.

Expertise in Databases (RDBMS and NoSQL) – Database & Star / Snowflake Schema Design, SQL

Involved in various projects related to Data Modeling, System/Data Analysis, Design and Development for data warehousing environments. Strong knowledge of ETL methods and processes using SSIS and SSRS. Developed ETL mapping spreadsheets with source-to-target data mapping, including physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.

Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.

Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, data manipulation, performance tuning.

Optimization, Database Administration and Replication for OLAP, NoSQL and Data-warehousing.

Proficient in project implementation (SDLC), specifically in the integration of business intelligence strategy, requirement gathering, requirement analysis, data modeling, information processing, system design, testing, and training.

Possess experience in conducting current state (as-is) system analysis, defining future state (to-be), eliciting requirements, developing functional or technical requirements, mapping business processes to application capabilities, conducting fit gap analysis, developing/configuring/prototyping solutions, implementing to-be processes / solutions based on application capability and industry best practices.

Repeatedly demonstrated success under aggressive project schedules and deadlines; flexible, results-oriented, and adaptable to the environment to meet the goals of the product and the organization.

Excellent work ethic, self-motivated, quick learner, and team oriented. Continually provided value-added services to clients through thoughtful experience and excellent communication skills.

WORK EXPERIENCE

Medtronic, Inc. May 2015 – Present

Intern (Software Engineering)

Projects:

Physician Network Analysis using Spark GraphX

The project aims to study how to prioritize physicians for sales and marketing activities based on physician referrals, and to categorize them by their influence and their network with other specialized physicians.

Implemented an algorithm to calculate each physician's influence based on the patients referred to other physicians.

Used Spark for cleaning and cleansing the data and GraphX for processing the referral network on AWS.

Conducted a performance study using different types of clusters based on memory.

Performed descriptive analysis of the network graph with influence factor using Spotfire and SPSS.

Visualized the network using Neo4j and built NoSQL queries for further analysis.

Handled all issues related to cluster networking and data migration to the cluster.

Exploration of Patients Data using Cloudera Impala

The purpose of this project is to analyze the frequency of different types of diagnoses and procedures, as well as the patterns in patients' treatment paths for any kind of diagnosis or procedure.

Analyzed the data-usage requirements with stakeholders and migrated the data, which was in SAS file format, to the Hadoop cluster.

Loaded and partitioned the data using Cloudera Impala based on the requirements.

Handled all issues related to cluster networking.

Monitored Hadoop cluster environments using Cloudera Manager.

Built Impala queries to perform descriptive analysis, and compared performance across file storage formats (text file vs. Parquet) and across different partitioning schemes.

Tata Communications Pvt Ltd May 2010 – Jan 2012

Network Engineer

Worked on SDH/SONET Provisioning, Solution Implementation, Network Planning, Circuit Writing & Transmission- Built.

Provisioned circuits such as National Leased Private circuits, IP-VPN QoS circuits, and Ethernet Wired Line circuits, involving both low and high bandwidth capacities.

Planned and designed different kinds of circuits such as E1, E2, DS3, STM-1 to STM-64, and Ethernet over SDH on the PRESIDE, EMOS, MV36, and MV38 platforms, with cross connections in Marconi, Alcatel, and Nortel equipment.

Working knowledge of Multi Service Provisioning Platform (MSPP) using CRAMER with different capacity requirements.

Hands-on experience in Element Management Systems (EMS) / Network Management Systems (NMS) technologies.

EDUCATION

University of St Thomas, St Paul-MN Sep 2014 – May 2016

Graduate Program in Software, Master of Science in Data Science GPA 3.8/4.0

Projects:

Human Activity Prediction using MATLAB

The aim of this project was to understand human activities in the healthcare domain and use the sensor data to predict abnormal behavior.

Used smartphones as sensors to identify human activities in a group of 30 volunteers between the ages of 19 and 48.

Target was to predict activities composed of static, dynamic and postural transitions.

Captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50 Hz and transformed them into 561 variables using various signal functions.

Conducted a comparison study of algorithms such as logistic regression, linear discriminant analysis, support vector machines, and neural networks based on accuracy, F1 score, and classification percentage.

Because the data was skewed, performed micro-clustering so that each target class had an equal number of instances in the training set.

Compared SVM results on the micro-clustered data against the skewed data across training functions (linear, Radial Basis Function (RBF), and polynomial) and across box constraints for each training function.

Built neural network models with varying numbers of neurons in the hidden layer, and compared results from training on the skewed data and on the micro-clustered data.

Analysis of Movie Ratings using Tableau

The interest of this project was to analyze when audiences & critics agreed or disagreed on how good movies are based on the genre and story line.

Cleaned and cleansed the data, and created a bias factor between audience and critic, which is the weighted average of both scores.

Analyzed the data based on the opinions of the audience and critics by visualizing it in different forms of interactive charts.

Visualized the data by creating interactive charts and analyzed the opinion similarity between the audience and critic.

Finding frequent patterns and predicting the possible crime in the city of Chicago

The intention of this project was to find frequent patterns of crime and to predict possible crime at a particular location and time.

Performed cleaning, cleansing, and descriptive analysis on the crime data set using MapReduce and generated reports.

Used a SAP HANA instance on Amazon Web Services (AWS) to find frequent patterns using association rules, and conducted a comparison study on crime prediction based on F1 score, accuracy, and classification percentage.

Visualized the most frequent patterns for defined cases using a word cloud.

Using Naive Bayes and multiclass logistic regression, estimated the probability of a crime occurring in a particular block of a community area on a particular day of the week and time.

Also conducted a comparison study on predicting arrest or no arrest for a particular set of crimes.

Avinashilingam University for Women, Faculty of Engineering

Bachelor of Engineering in Electrical and Electronics Jul 2006 – May 2010

Project:

Neural Networks Based Electric Load Forecaster

This project was envisioned to assist the power grids in predicting the variation in power consumptions that occur due to seasonal climatic conditions.

Cleaned and cleansed data using C and normalized the data using MATLAB.

Used a backpropagation network with Bayesian regularization for predicting the electric load.

Performed a comparison study of training functions such as batch gradient descent with momentum and Levenberg-Marquardt backpropagation, based on accuracy and prediction classification.

VOLUNTEER

Volunteer: Blood Donation camp, Coimbatore, Tamil Nadu, India, October 2009.

Volunteer: National Service Scheme, public service program conducted by the Department of Youth Affairs and Sports of the Government of India, June 2006 – May 2008.

Member: IEEE club, Avinashilingam University for Women, Coimbatore, Tamil Nadu, India, June 2006 – Jan 2010.

Member: Spark and Hadoop User Group Twin Cities.


