
Data Mining Analyst

New York, New York, 10001, United States
March 17, 2011

I am currently working on a key data mining project in Bangalore, India, for Mobily, Saudi Arabia’s largest telecom operator.

I am a permanent resident of the USA and am looking for a job in the USA.

PH: +91-809*******


1908 Cherry Avenue ~ Easton, PA, 18040


Data Mining Specialist with over seven years of experience in data mining, text mining, business intelligence, and machine learning algorithm development and implementation. Worked on high-visibility projects for clients in the banking, telecommunications, and retail industries, including Best Buy, Lyrica, ICICI Bank, Spice Telecom, and Sri Lanka Telecom. Expert knowledge of the CRISP-DM and SEMMA data mining methodologies. Excellent knowledge of machine learning algorithms and their internal logic. Authored white papers about data mining tools and techniques; developed and conducted data mining training.


Software: SAS Enterprise Miner, IBM Intelligent Miner, SPSS Clementine, WiziRex, MS Analysis Manager, Poly Analyst, Microsoft Excel, PowerPoint

Database: SQL Server 2000, DB2, Microsoft Access, flat files such as CSV and ARFF

Languages: Java, Visual Basic


Data Mining Technical Analyst

MOBILY INFOTECH, Bangalore, India May 2010–Present

Primary Project: PrePaid & PostPaid Churn Analysis

• Mobily is Saudi Arabia’s second leading telecommunications provider and has approximately 15 million customers. Mobily has identified the churn analysis model as representing the greatest value to the business. The churn model is based on ‘known’ churners: prepaid customers with 8 weeks of inactivity, and post-paid customers who have been in Fully Barred status for more than 60 days. Twelve months of call data prior to the exclusion window is used to model churners. This project is being conducted within the overall framework of CRISP-DM.

My Role:

• Built separate decision tree models for prepaid and post-paid customers; also created separate models by customer type: Early Churners and Normal Churners (High ARPU and Non-High ARPU).

• Identified 800 attributes and created 150 new ratio attributes.

• Undertook data preparation and data extraction.

• Used the rank-ordering technique for model evaluation.

• Assigned each customer a churn score between 0 and 1 that reflects the customer's propensity to churn, where 0 indicates the customer is unlikely to churn and 1 indicates a high probability of churn.
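As an illustration of the rank-ordering evaluation mentioned above, the following is a minimal Python sketch, not the project's actual code. It assumes that each customer has a churn score between 0 and 1 and a known churn flag, sorts customers by score, splits them into equal-sized bins, and checks that the observed churn rate does not increase from the highest-scored bin to the lowest. The function names `rank_order_table` and `is_rank_ordered` and the sample data are hypothetical.

```python
from typing import List, Tuple


def rank_order_table(
    scores: List[float], churned: List[int], bins: int = 10
) -> List[Tuple[int, int, float]]:
    """Return (bin_number, customers, churn_rate) rows, highest scores first."""
    if len(scores) != len(churned):
        raise ValueError("scores and churned must have the same length")
    if not scores:
        raise ValueError("at least one customer is required")

    # Sort customers from the highest churn score to the lowest.
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)

    table = []
    for b in range(bins):
        group = ranked[b * n // bins:(b + 1) * n // bins]
        if not group:
            continue  # more bins than customers
        rate = sum(flag for _, flag in group) / len(group)
        table.append((b + 1, len(group), rate))
    return table


def is_rank_ordered(table: List[Tuple[int, int, float]]) -> bool:
    """True when the churn rate never increases from one bin to the next."""
    rates = [rate for _, _, rate in table]
    return all(a >= b for a, b in zip(rates, rates[1:]))


# Hypothetical sample: eight customers split into four bins.
sample_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
sample_churned = [1, 1, 1, 0, 1, 0, 0, 0]
sample_table = rank_order_table(sample_scores, sample_churned, bins=4)
```

A model that rank-orders well concentrates actual churners in the top bins, which is what makes the score usable for targeting retention campaigns.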

Data Analyst


Primary Project: Sleep Study Data Analysis

• The pulmonary medicine center has a sleep study center that diagnoses and treats sleep-related disorders. The center has a large set of historic sleep study data. The project was mainly concerned with analyzing patients' historic sleep data and deriving the Sleep Disease Rule.

My Role: Involved in

• Building the data mining database: collected data from different data sources, eliminated irrelevant and unneeded data, and imputed missing values using k-nearest neighbor.

• Preparing data for modeling by selecting variables and selecting rows using systematic sampling.

• Building and evaluating different decision tree models on training datasets and validating them on test datasets.
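The k-nearest neighbor imputation mentioned above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the code used in the project: all columns are numeric, missing values are represented by `None`, distance is a root-mean-square difference over the columns both rows have observed, and a missing value is filled with the mean of that column over the k nearest rows. The function name `knn_impute` and the sample rows are hypothetical.

```python
import math
from typing import List, Optional

Record = List[Optional[float]]


def knn_impute(rows: List[Record], k: int = 3) -> List[List[Optional[float]]]:
    """Fill None entries with the column mean over the k nearest rows."""
    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for col, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows where this column is observed.
            candidates = []
            for j, other in enumerate(rows):
                if j == i or other[col] is None:
                    continue
                # Compare only the columns observed in both rows.
                shared = [
                    (a, b) for a, b in zip(row, other)
                    if a is not None and b is not None
                ]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[col]))
            if not candidates:
                raise ValueError(f"no neighbours available for row {i}, column {col}")
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            new_row[col] = sum(v for _, v in nearest) / len(nearest)
        filled.append(new_row)
    return filled


# Hypothetical sample: the last row is missing its second value.
sample = [[1.0, 2.0], [1.1, 2.2], [9.0, 20.0], [1.05, None]]
imputed = knn_impute(sample, k=2)
```

In the sample, the two rows closest to the incomplete one supply the fill value, so the distant third row does not distort it.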

Data Mining Specialist

AZTECSOFT, Pune, India Nov. 2006–Jan. 2008

Primary Project: Lyrica Text Mining (Client: Lyrica)

• The ultimate goal of this project was to mine relations between the drug Lyrica and disease entities from inquiry text, where each inquiry expressed a treatment relation between Lyrica and a disease. The main goal was to identify these drug-disease patterns from 25,000 drug-disease inquiry texts.

• Environment - SAS Enterprise Miner, SPSS Clementine, SQL Server 2000, Microsoft Excel.

My Role:

• File processing – Converted text sources into a database after completing a spell check.

• Text parsing – Handled multiword terms, equivalent terms, and entities.

• Applying lists – Created and applied lists such as start, stop, and synonym lists.

• Built the model using text link analysis and constructed 182 drug-disease patterns.

System Analyst

SATYAM COMPUTERS, Pune, India Jul. 2006–Nov. 2006

Primary Project: Customer Churn Management (Client: Sri Lanka Telecom)

• Customer churn was a serious problem for Sri Lanka Telecom. If a churner is retained and converted into one of its best customers, the company can increase overall profitability. The main purpose of this project was therefore to identify the customers likely to leave the company's services, along with their characteristics.

• Environment - SAS Enterprise Miner, CSV Flat file.

My Role:

• Data Selection: Identified the most important predictors.

• Data Cleaning: Imputed missing values and assessed data quality.

• Data Construction: Constructed new and transformed columns.

• Model Building: Built the predictive model using a decision tree.

• Result: Evaluated the model and identified the business rules.

Project Engineer

WIPRO TECHNOLOGIES, Bangalore, India Aug. 2004–Jun. 2006

Primary Projects:

Customer Segmentation (Client: Best Buy)

• The retailer observed over time that response rates for its mail and e-mail marketing campaigns were very low, which reduced campaign returns and led to excessive campaign costs and revenue losses. The goal of this project was to improve the response rate of mail and e-mail marketing campaigns, reduce campaign costs, and increase revenue and profit.

• Environment - SAS Enterprise Miner, Microsoft Excel.

My Role:

• Analyzed historical campaign data and built predictive models to target relevant customer segments for effective campaign execution.

• Implemented clustering and decision tree techniques to identify different customer segments.

• Obtained 20 to 22 customer segments with certain customer characteristics. Of these, the top 5 segments had unique characteristics.

• Identified different rules for different customer segments.

Early Warning System (Client: ICICI Bank)

• This project was part of Credit Risk Analysis. Data mining techniques were applied to historical data to develop an Early Warning System that identifies potentially bad credit risks in advance.

• Environment - IBM Intelligent Miner, Java, CSV Flat file.

My Role:

• Developed the data structure of the Early Warning System.

• Implemented data mining techniques including Decision Tree, Naïve Bayes, Logistic Regression, and K-Nearest Neighbor.

• Developed an ensemble classification technique.

• Constructed new attributes such as credit limit utilization by days and transactions.

• Achieved accuracy of 18-20% (50% higher than the client's earlier accuracy).
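For readers unfamiliar with ensemble classification, here is a minimal Python sketch of one common approach, majority voting. The resume does not state which ensemble method was used on the project, so majority voting is an assumption made purely for illustration, and the function name `majority_vote` and the sample predictions are hypothetical.

```python
from collections import Counter
from typing import List


def majority_vote(predictions: List[List[int]]) -> List[int]:
    """Combine per-model labels; predictions[m][i] is model m's label for record i."""
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all models must score the same number of records")

    combined = []
    # zip(*predictions) yields, for each record, the votes of all models.
    for votes in zip(*predictions):
        label, _ = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical labels from three base models (1 = bad credit risk).
tree_labels = [1, 0, 1, 0]
bayes_labels = [1, 1, 0, 0]
knn_labels = [0, 1, 1, 0]
ensemble_labels = majority_vote([tree_labels, bayes_labels, knn_labels])
```

Using an odd number of base models avoids tied votes in the two-class case.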

Senior Software Engineer

DATUM TECHNOLOGIES, Bangalore, India Sep. 2003–Jul. 2004

Primary Project: Call Data Recorder (Client: Spice Telecom)

• Customer churn was a significant problem at Spice Telecom. The project involved identifying in advance the customers most likely to leave the organization. Dataset size was 9 GB.

• Environment - IBM Intelligent Miner, SQL Server 2000.

My Role: Involved in the basic steps of data mining for knowledge discovery, such as:

• Built the data mining database: data collection, data selection, data cleaning, and solving data quality problems.

• Prepared data for the decision tree by selecting variables, constructing new variables, transforming variables, and selecting rows using random and systematic sampling.

• Built & evaluated the best decision tree model based on accuracy, confusion matrix and lift.
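The evaluation measures named above (accuracy, confusion matrix, and lift) can be computed as in the following minimal Python sketch. It assumes binary labels where 1 marks a churner, and it defines lift as the churn rate among customers the model flags divided by the overall churn rate. The function names and sample labels are hypothetical and are not taken from the project.

```python
from typing import Dict, List


def confusion_matrix(actual: List[int], predicted: List[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = churner)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(cm: Dict[str, int]) -> float:
    """Share of all customers classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def lift(cm: Dict[str, int]) -> float:
    """Churn rate among flagged customers divided by the overall churn rate."""
    flagged = cm["tp"] + cm["fp"]
    churners = cm["tp"] + cm["fn"]
    if flagged == 0 or churners == 0:
        raise ValueError("lift is undefined without flagged customers and churners")
    return (cm["tp"] / flagged) / (churners / sum(cm.values()))


# Hypothetical labels for eight customers.
actual_labels = [1, 1, 0, 0, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 0, 1, 0, 1, 0]
cm = confusion_matrix(actual_labels, predicted_labels)
```

A lift above 1 means the customers the model flags churn more often than the customer base as a whole.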

Software Engineer

BIGANTS, Bangalore, India Jun. 2001–Aug. 2003

Primary Project: VIS Miner

• The purpose of this project was to develop the VIS Miner software, the flagship product of Bigants and one of the few data exploration tools. It combines visualization techniques with a variety of interactive and semi-automatic data mining techniques, which are used to cluster data and induce rules.

• Environment - Java.

My Role:

• Developed data structure of VIS Miner.

• Developed the main classes in Java: (1) Dataset, a collection of rows; (2) Row, a double array; (3) Attribute, metadata for a column.

• Developed visualization techniques such as parallel coordinates and constellation graphs in Java.

• Developed data mining (machine learning) algorithms such as decision tree, k-means clustering, neural network, and k-nearest neighbor in Java.

• Involved in Module Integration and Dry Testing.
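The Dataset, Row, and Attribute structure described above can be sketched as follows. The original classes were written in Java and are not shown in this resume, so this Python version is only an illustration: the field names, the `kind` attribute, and the `add_row` and `column` helpers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """Metadata for one column (the fields here are hypothetical)."""
    name: str
    kind: str = "numeric"


@dataclass
class Row:
    """One record, stored as a list of floats (a double array in Java)."""
    values: List[float]


@dataclass
class Dataset:
    """A collection of rows plus the attributes that describe their columns."""
    attributes: List[Attribute]
    rows: List[Row] = field(default_factory=list)

    def add_row(self, values: List[float]) -> None:
        # Every row must supply exactly one value per attribute.
        if len(values) != len(self.attributes):
            raise ValueError("row length must match the number of attributes")
        self.rows.append(Row([float(v) for v in values]))

    def column(self, name: str) -> List[float]:
        # Look up the column position by attribute name.
        index = [a.name for a in self.attributes].index(name)
        return [row.values[index] for row in self.rows]


# Hypothetical usage with two numeric columns.
dataset = Dataset([Attribute("minutes"), Attribute("calls")])
dataset.add_row([10, 2])
dataset.add_row([5.5, 1])
```

Keeping column metadata separate from the numeric rows lets algorithms such as k-means or k-nearest neighbor work directly on plain arrays.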


Bachelor of Engineering, Computer Science, 2001

