Data Microsoft Office

Location:

Harrison, NJ

Salary:

75000

Posted:

January 24, 2018


Resume:

JIGYASA KOHLI

EDUCATION

MS, Information Systems, New Jersey Institute of Technology, Newark, NJ, Sept 2016 – Dec 2017

B. Tech, Computer Science, Guru Gobind Singh Indraprastha University, India, Aug 2011 – July 2015

TECHNOLOGY

Programming: Python, Java, HTML, CSS, JavaScript, HTML 5, XML, Bootstrap

Database: MySQL, SQL Server

Analytics Tools: Tableau, IPython, PowerBI, Advanced Excel, R, RapidMiner, MATLAB, Minitab, Bloomberg

Distributed programming: Apache Hadoop, Apache and Elastic MapReduce, Amazon Web Services, Apache Kafka, Apache Spark Streaming (basics), PySpark, Azure ML Studio, Apache Oozie, Apache Pig, Apache Hive, Apache HBase

Advanced visualization skills: Shiny (R), Flask (Python)

Version Control: Git, Docker

Others: Python (numpy, pandas, sklearn), Linux, Selenium, Machine Learning, Statistics, UML Modeling, Corporate Finance, Visio, Microsoft Office, Requirement Analysis, Unit Testing, Test-Driven Development

PROJECTS

Predicting Backorder Risk for Products

Resampled the imbalanced classification dataset using SMOTE, improving accuracy by 20% and reducing training time by 50% across various machine learning algorithms in scikit-learn, streaming over 2 TB of data.
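For readers unfamiliar with SMOTE, the resampling step can be illustrated with a minimal sketch. This is a simplified, standard-library-only version of the idea (interpolating between a minority sample and one of its nearest minority neighbours), not the code used in the project. The function name `smote`, its parameters, and the sample points are all hypothetical.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified illustration; assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical minority-class samples with two features each
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

Each synthetic point lies on a segment between two existing minority samples, so the minority class grows without duplicating rows exactly.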

Twitter Sentiment Analytics using Apache Spark Streaming APIs and Python

Used Apache Kafka to buffer live tweet data fetched through the Twitter API.

Used the Spark Streaming API to convert the live data into DStreams, then performed sentiment analysis on it and visualized the results.

Working with EDGAR datasets: wrangling, pre-processing, and exploratory data analysis

Extracted all statistical tables from 10-K and 10-Q filings using Python.

Developed a Python pipeline that generates the URL for retrieving the first day of the month's data from the EDGAR Log File Dataset.

Handled missing data and computed summary metrics and performed anomaly detection.

Logged all the operations in a log file with summaries of 12 files in one file and uploaded it to Amazon S3.

Zillow Kaggle Dataset

Data Ingestion and Wrangling.

Compared different prediction models for log error using RMSE and MAPE; Random Forest gave the best result.

Used Azure ML Studio to deploy the model and invoked it through its JSON API.

Created a REST API that, given a latitude and longitude, returns the 10 closest homes.

Big Data Analysis of Wikipedia dataset

Processed big data and performed predictive analysis on the Wikispecies dataset in Hadoop's fully distributed mode.

Identified the most popular species in Wikipedia by parsing the XML and applying Google’s PageRank algorithm using MapReduce.
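As a rough illustration of the ranking step, PageRank can be sketched as a power iteration over a link graph. The version below runs in a single process on a tiny hypothetical graph. It is not the project's MapReduce job, though the inner loop that sends each page's share to its targets plays the role of the map phase and the per-page summation plays the role of the reduce phase. The function name `pagerank`, the damping value, and the page names are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for t in targets:
                # Each page passes an equal share of its rank to every target
                new_rank[t] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical link graph; in the project the links would come from parsed XML
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
top = max(ranks, key=ranks.get)
```

Here `top` is the page with the highest score, which corresponds to the "most popular" entry in the sense used above.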

PROFESSION

Associate System Engineer, IBM

Carried out automated testing of provisioned data and pre-delivery sanity checks using Selenium.

Analyzed issues related to data loading and the conversion of files into different formats.

Identified defects and errors in data prior to data processing. Collaborated with back-end and database testers.

Participated in design calls to understand customer requirements and provide suggestions on them.

Developed SQL procedures for loading data into the database.

Prepared data for exploratory analysis, intelligent data products, and dashboards.

Designed Dashboards and Data Visualizations to communicate meaningful metrics to different customers according to their requirements.

September 2016

September 2017

May 2017

October 2016

December 2016

848-***-****

*****@****.***

https://www.linkedin.com/in/jigyasakohli/

https://github.com/jmsjigyasa

August 2015 – July 2016
