HARSHA RACHUMALLU
******.*.*.*@*****.*** +1-419-***-**** https://www.linkedin.com/in/sriharsharachumallu/
PROFESSIONAL SUMMARY
• Data science and application development professional with 5+ years of experience addressing business problems using machine learning and web applications.
• Extensive exposure to the CRISP-DM (Cross Industry Standard Process for Data Mining) analytics project life cycle and to web application development using Scrum.
• Worked across all CRISP-DM phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
• Experienced in understanding the data points, systems, and relationships involved in addressing a business problem.
• Involved in Data Collection, Data Engineering, Data Cleansing and Solution Design stages of multiple projects.
• Knowledge of developing end-to-end web applications and incorporating visual analytics dashboards.
• Experience supporting project teams with budget planning in line with the SOW and maintaining project plans in MPP.
EDUCATIONAL QUALIFICATION
• Master of Science in Analytics; Bowling Green State University; Ohio; Jun 2016.
• Bachelor of Technology in Information Technology; Bapatla Engineering College; India; May 2011.
TECHNICAL SKILLS
• Machine Learning Algorithms: Logistic Regression, Linear Regression, Decision Tree, Random Forest, Gradient Boosting, SMOTE, TOMEK, SMOTE ENN, Lasso and Ridge Regression, Nearest Neighbor Classifier, Weight of Evidence & Information Value (WOE & IV), K-means clustering, RFM Analysis, DBSCAN, Affinity Propagation, Principal Component Analysis, Support Vector Machines, Naïve Bayes, Auto Regression & Moving Averages.
• Programming Skills: Python, R Programming, Spark, Pig, Hive, JAVA, C#, jQuery, SQL, SAS.
• Frameworks: Hadoop, JSF, Spring, SOAP/REST Web Services, Hibernate, LINQ, MVC.
• Database: Oracle 9i, Oracle 10g, PostgreSQL, SQL Server.
• Visualization Technologies: Power BI, Tableau, Chart.js, D3.js, Matplotlib, Seaborn, ggplot2.
• Methodologies: Agile, Scrum, Kanban, Waterfall.
• Others: Maven, Ant, Jenkins, Nexus, Pentaho data modeler, JIRA, SVN, TFS, SSIS, Alteryx, Azure Cloud Services.
WORK EXPERIENCE
Employer: Ernst & Young LLP, USA September 2016 – Present
Data Scientist
Email Marketing Modelling
• Designed a model to predict whether a customer will respond to a marketing campaign based on customer information.
• Handled class imbalance using the Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links. Handled missing data using KNN imputation.
• Developed random forest and logistic regression models for this classification. Tuned the models to favor recall over accuracy, weighing the tradeoff between false positives and false negatives.
• Evaluated models using recall, F1 score, KS statistic, cross-validation, and ROC.
Digital Grid Systems - Smart Meters as a Service
• Developed a ridge regression model to predict customers' energy consumption. Evaluated the model using MAPE.
• Developed a model to estimate consumption of various meter types in a substation at an hourly rate. Designed visual analytics for the same
• Performed streaming analytics using Microsoft PaaS components on 15-minute interval data collected by Azure IoT Hub.
• Engineered visual analytics for Meter Deployment, Meter Operations, Meter Security, and Customer use cases.
Advisory Americas Quality
• Developed a multiple linear regression model to identify the impact of various quality parameters on service quality across EY’s engagements.
• Designed a classification model to identify risk type from client survey metrics for all of EY’s engagements.
• Re-engineered user-role-based and time-based visualizations for all of EY’s Americas engagements using embedded Power BI and Chart.js.
Employer: Tata Consultancy Services, India November 2011 – July 2015
Data Scientist
Global Reporting Integrated Portal (GRIP)
• Developed a K-means clustering model to identify different customer classes in OTC FX market trading.
• Involved in developing visual analytics on intraday trades, which are reported every 15 minutes.
• Worked with different data science teams and provided respective data as required on an ad-hoc request basis
• Assisted both application engineering and data scientist teams in mutual agreements/provisions of data, deployment of production models etc.
Systems Engineer
SIVEX Report Distribution Engine
• Involved in analysis, requirements gathering, solution design, and development to replace the derivatives market report distribution engine for ABN AMRO.
• Devised Single Sign-On (SSO) using the firm’s Active Directory (AD), adhering to the organization’s information security standards.
• Collaborated with stakeholders to address critical issues and implemented process improvements.
• Worked with different teams to maintain application environments and deployments.
• Presented project progress reports and issues to senior management.
Global Reporting Integrated Portal (GRIP)
• Involved in development and enhancement of global and intraday reporting engines for ABN AMRO.
• Coded and enhanced features according to the client’s requirements.
• Assisted the client with multi-application deployment on the respective servers/systems.
Assistant Systems Engineer
EMIR Implementation
• Involved in implementation of European Market Infrastructure Regulation (EMIR) regulatory reporting for ABN AMRO to the Regis-TR trade repository.
• Involved in enabling Holland Clearing House to report OTC transactions to Regis-TR through ABN AMRO Generic Report Archival System
• Associated with implementation of generic archive system for all reports of ABN AMRO brokerage department
• Associated with developing distribution/redistribution of archived reports to registered email/SFTP destinations using registered encryption algorithms.
PROJECTS
Retail Analytics: Rossmann Store Sales Prediction
• Analyzed sales data from Rossmann stores in Germany to design a functional prototype that supports improved decision making. Used XGBoost, ggplot2, and RMSPE.
Big Data: Analysis of Climatic and Temperature data from NCDC
• Performed analysis on 40 years of climate data from https://www1.ncdc.noaa.gov/pub/data/noaa/
• Hadoop was used for data storage. Power BI, Spark, Hive, and Pig were used to perform analysis on this data.
Regression on identifying accident leading factors
• Applied data science to ten years of crash data from The Transportation Safety Board of Nebraska (TSBN) to determine factors leading to accidents using SAS Enterprise Miner.
CERTIFICATIONS
• IBM Hadoop Foundations
• IBM Big Data Foundations
• MTA: Database Fundamentals
• Oracle Certified Java Programming