
Data Manager

Location:
Arlington, TX
Posted:
June 20, 2016


Resume:

CHIRAG KHETANI

Dallas, TX 682-***-**** *************@*****.***

SUMMARY

--- 3+ years of experience with Python and SQL for Data Analytics and Data Mining.

--- 1+ year of experience with Hadoop, Pig, Hive, Sqoop, and Java.

--- Strong knowledge of Big Data frameworks: MapReduce, HDFS, Spark, Flume, Kafka, Cloudera CDH, HBase, Cassandra, MongoDB.

--- Fair knowledge of Java, RESTful APIs, Scala, Tableau, and Amazon Web Services.

--- Familiar with Business Intelligence (BI), Data Warehousing, and Data Marts.

--- Fair knowledge of SDLC methodologies: Agile, Scrum, Waterfall.

TECHNICAL SKILLS

Programming languages : Java, Python (pandas, NumPy, scikit-learn), Scala, R, JavaScript, HTML, CSS

Frameworks/others : RESTful APIs, Spring, JSP, JSON, XML

Big Data Frameworks : Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Impala, Spark, Flume, Kafka

Amazon Web Services : Amazon EC2, Amazon S3, Amazon RDS, IAM, AMI

Database Systems : Oracle, MySQL, NoSQL (MongoDB, Cassandra, HBase)

Operating System : Linux RHEL, Windows

Tools : Cloudera CDH, Git, Informatica PowerCenter (ETL), Tomcat, Eclipse, Anaconda, VMware

Methodologies : Agile, SDLC, Scrum

PROFESSIONAL EXPERIENCE

TATA Motors Ltd, India - Assistant Manager, Data Analytics (Aug-2011 to Aug-2014)

Tools: SQL, Python, Hadoop, Pig, Hive, MS Access, SAP BW, Oracle BI, MS Excel

Worked with OLTP and OLAP data, DW modeling, ETL, and extraction from Hadoop HDFS using Java MapReduce, Pig, Hive, and UDFs.

Worked with regression, classification, and clustering techniques such as linear regression, logistic regression, decision trees, random forest, and Naive Bayes using Python.

Developed models to predict warranty cost and warranty complaints and to identify dissatisfied customers, which boosted the company’s Customer Satisfaction Index (CSI) to 830 in the J.D. Power survey.

Analyzed vehicle warranty failure data, customer complaint data, and emergency vehicle failure data using Python, SQL, and Excel.

Experienced with writing complex SQL queries to extract data from an Oracle database.

Mahindra & Mahindra Ltd, India - Assistant Manager (Oct-2009 to Aug-2011)

Analyzed vehicle repair quality for dealers across the country.

Experienced with MS Excel: VLOOKUP, pivot tables, writing functions, and data visualization.

ACADEMIC PROJECTS

Java - Big Data - Weather Data Analysis Using Hadoop (Java, MySQL, Hadoop, Cloudera CDH, Pig, Hive, Spark, Scala)

Worked with a 15-node cluster.

Used Java (MapReduce) and Scala (Spark) to find the yearly mean, maximum, and minimum of temperature, wind, tornado, and rain in Texas from the “TexasWeatherfor50yearBig” file in HDFS.

Data Science - Text Analytics - Amazon customer reviews (Python, Tableau, and Excel)

Converted unstructured review text into structured data for Product Lifecycle Management; predicted product life cycle from sentiment score and number of reviews using Python and Tableau.

Data Science - Sentiment analysis of Twitter tweets (Python, JSON, Tableau, NumPy, pandas, ANEW)

Collected tweets and preprocessed the data: extracted tweet text, state, and time zone; removed punctuation and stop words. Carried out sentiment analysis using (1) AFINN, (2) the Hu & Liu lexicon model, and (3) ANEW.

Analyzed the results with Tableau.

Java - Data mining for classification problem (Java, Weka, MS Excel, Data Preprocessing)

Carried out data cleansing, attribute selection, and class balancing using SMOTE.

Used Java for Decision Tree and OneR, and Weka for Naïve Bayes classification of income range.

Python - Data Science - Movie sales forecasting from Twitter tweets, movie rating & budget (R, Python, Excel)

Collected data from Twitter, BoxMojo.com, and IMDb and preprocessed it.

Carried out sentiment analysis of the tweets. Experimented with different regression models in R to find the best fit, and predicted movie sales revenue within a standard significance level.

ACADEMICS

MS, Information Systems with Business Analytics (Aug-2014 - May-2016)

GPA: 3.82/4, University of Texas at Arlington, TX

Big Data – Hadoop, Spark

Data Science with Python

Data Warehousing

Advanced Business Statistics

Business Data Mining

Advanced Database Management

BS, Mechanical Engineering, Nirma University, India, GPA: 3.8/10 (2005 – 2009)


