Kondal **************@*****.*** 469-***-****
Data Scientist
Over 8 years of experience in the design, administration, analysis, and management of Business Intelligence, data warehousing, web-based applications, and databases, with experience across industries such as Retail, Finance, Accounting, Distribution, Logistics, Inventory, Manufacturing, Marketing, Services, Networking, and Engineering.
Experience with the latest BI tools, including Tableau, QlikView dashboard design, and SAS.
Analyze and extract relevant information from large volumes of data to help automate self-monitoring, self-diagnosing, and self-correcting solutions and to optimize key processes.
Experience in data architecture design, development, and maintenance for Windows and Android device applications.
Experience with advanced SAS programming techniques such as PROC SQL (JOIN/UNION), PROC APPEND, PROC DATASETS, and PROC TRANSPOSE.
Highly skilled in using visualization tools like Tableau, ggplot2 and d3.js for creating dashboards.
Experience in foundational machine learning models and concepts: regression, random forest, boosting, GBM, NNs, HMMs, CRFs, MRFs, deep learning.
Proficient in statistical and general-purpose tools/languages: R, Python, C, C++, Java, SQL, UNIX, the QlikView data visualization tool, and the Anaplan forecasting tool.
Strong data warehousing/ETL experience using Informatica PowerCenter 9.1/8.6.1/8.5/8.1/7.1 client tools (Mapping Designer, Repository Manager, Workflow Manager/Monitor) and server tools (Informatica Server, Repository Server Manager).
Proficient in the integration of various data sources with multiple relational databases such as Oracle 11g/10g/9i, MS SQL Server, DB2, Teradata, and flat files into the staging area, ODS, Data Warehouse, and Data Mart.
Experience in applying predictive modeling and machine learning algorithms to analytical projects.
Developing Logical Data Architecture with adherence to Enterprise Architecture.
Experience designing visualizations in Tableau and publishing and presenting dashboards and storylines on web and desktop platforms.
Proficient in predictive modeling, data mining methods, factor analysis, ANOVA, hypothesis testing, normal distribution, and other advanced statistical and econometric techniques.
Developed predictive models using Decision Tree, Random Forest, Naïve Bayes, Logistic Regression, Cluster Analysis, and Neural Networks (see the illustrative sketch following this summary).
Experienced in the full software development life cycle (SDLC) using Agile and Scrum methodologies.
Skilled in Advanced Regression Modeling, Correlation, Multivariate Analysis, Model Building, Business Intelligence tools and application of Statistical Concepts.
Excellent knowledge in Normalization (1NF, 2NF, 3NF and BCNF) and De-normalization techniques for improved database performance in OLTP, OLAP and Data Warehouse/Data Mart environments.
2+ years of Agile experience in software/data design, development, and deployment, building services and customer support for enterprise applications using Object-Oriented Analysis and Design (OOAD).
Worked on gigabytes of text and image files (2-D and 3-D) to solve real-world problems, visualizing the data and generating reports with Google Data Studio for customer usability.
Good track record of working with complex data sets and translating data into insights to drive key business and product decisions.
Experience with Azure, SQL and Oracle PL/SQL.
Experience working with Amazon Web Services (AWS) products such as S3.
Worked in start-up mode at Aveva, contributing to projects that used Amazon Web Services (AWS) to develop and deploy applications supporting both device and cloud.
Hands-on experience with scripting languages such as Perl, Bash shell, and PHP for automation.
Good understanding of scalable data processing to discover hidden patterns and conduct error analysis in data for financial and statistical modeling.
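As an illustration of the predictive modeling listed above, the following is a minimal sketch, not project code: it uses synthetic data and assumed parameter values to show how two of the listed classifiers could be trained and compared with scikit-learn.

# Minimal illustrative sketch (synthetic data, assumed settings).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

for model in (RandomForestClassifier(n_estimators=200, random_state=42),
              LogisticRegression(max_iter=1000)):
    model.fit(X_train, y_train)                        # fit on the training split
    acc = accuracy_score(y_test, model.predict(X_test))
    print(type(model).__name__, round(acc, 3))         # compare held-out accuracy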
Machine Learning
Regression, Classification, Clustering, Association, Simple Linear Regression, Multiple Linear Regression, Polynomial Regression, Decision Trees, Random Forest, Logistic Regression, K-Nearest Neighbors (K-NN), Kernel SVM
R Language skills
Data Preprocessing, Web Scraping, Data Extraction, dplyr, ggplot2, apply functions, Statistical Analysis, Predictive Analysis, ggplotly, rvest, Data Visualization.
Frameworks
Shogun, Accord Framework/AForge.net, Scala, Spark, Cassandra, DL4J, ND4J, Scikit-learn
Development Tools
Cassandra, DL4J, ND4J, scikit-learn, Shogun, Accord Framework/AForge.NET, Mahout, MLlib, H2O, Cloudera Oryx, GoLearn, Apache Singa.
Modelling Tools
CA Erwin Data Modeler 7.1/4, Microsoft Visio 6.0, Sybase PowerDesigner 16.5.
Version Control
TFS, Microsoft Visual SourceSafe, Git, NUnit, MSUnit
Software Packages
MS-Office 2003/ 07/10/13, MS Access, Messaging Architectures.
OLAP/BI/ETL Tools
Business Objects 6.1/XI, MS SQL Server 2008/2005 Analysis Services (MS OLAP, SSAS), Integration Services (SSIS), Reporting Services (SSRS), Performance Point Server (PPS), Oracle 9i OLAP, MS Office Web Components (OWC11), DTS, MDX, Crystal Reports 10, Crystal Enterprise 10 (CMC)
Web Technologies
Windows API, Web Services, Web API (RESTful), HTML5, XHTML, CSS3, AJAX, XML, XAML, MSMQ, Silverlight, Kendo UI.
Web Servers
IIS 5.0, IIS 6.0, IIS 7.5, IIS ADMIN.
Operating Systems
Windows 8/XP/NT/95/98/2000/2008/2012, Android SDK.
Databases
SQL Server 2014/2012/2008/2005/2000, MS Access, Oracle 11g/10g/9i, Teradata; Big Data: Hadoop, Mahout, MLlib, H2O, Cloudera Oryx, GoLearn.
Database Tools
SQL Server Query Analyzer.
Wells Fargo, Charlotte, NC (Data Scientist) Oct ‘16 to present
Responsibilities:
Responsible for applying machine learning techniques (regression/classification) to predict outcomes.
Responsible for the design and development of advanced R/Python programs to prepare, transform, and harmonize data sets for modeling.
Identifying and executing process improvements, hands-on in various technologies such as Oracle, Informatica, and Business Objects.
Designed the prototype of the data mart and documented possible outcomes from it for end users.
Involved in business process modeling using UML
Developed and maintained the data dictionary to create metadata reports for technical and business purposes.
Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS (see the illustrative sketch at the end of this list).
Interaction with Business Analyst, SMEs and other Data Architects to understand Business needs and functionality for various project solutions.
Created SQL tables with referential integrity and developed queries using SQL, SQL*PLUS and PL/SQL.
Involved in data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
Performed database performance tuning, including indexing, optimizing SQL statements, and monitoring the server.
Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
Collaborated on the source-to-target data mapping document and the data quality assessments for the source data.
Created PL/SQL packages and Database Triggers and developed user procedures and prepared user manuals for the new programs.
Participated in Business meetings to understand the business needs & requirements.
Prepared the ETL architecture and design document covering ETL architecture, SSIS design, and the extraction, transformation, and loading of Duck Creek data into the dimensional model.
Provided technical and requirements guidance to team members for ETL/SSIS design.
Designed the ETL framework and led its development.
Designed logical and physical data models using the MS Visio 2003 data modeling tool.
Participated in stakeholder meetings to understand the business needs and requirements.
Participated in solution architecture meetings and provided guidance on dimensional data modeling design.
Coordinate and communicate with technical teams for any data requirements.
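The Hive/MapReduce ingestion bullet above is illustrated by the minimal sketch below; it is not project code, it uses PySpark rather than raw MapReduce, and the paths, table names, and column names are assumptions.

# Minimal illustrative sketch (hypothetical paths, table and column names):
# read raw source files, apply simple transformations, and persist the result
# as a Hive table backed by HDFS.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("source_ingest")
         .enableHiveSupport()
         .getOrCreate())

raw = spark.read.csv("hdfs:///landing/transactions/*.csv",
                     header=True, inferSchema=True)
cleaned = (raw
           .withColumn("txn_date", F.to_date("txn_date", "yyyy-MM-dd"))
           .filter(F.col("amount").isNotNull()))              # drop unusable rows
cleaned.write.mode("overwrite").saveAsTable("ods.transactions")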
Environment: Machine learning, AWS, MS Azure, Cassandra, Spark, HDFS, Hive, Pig, Linux, Python (scikit-learn/SciPy/NumPy/pandas), R, SAS, SPSS, MySQL, Eclipse, PL/SQL, SQL connector, Tableau.
Automation Anywhere, San Jose, CA (Data Scientist) Aug ‘15 to Oct ‘16
Responsibilities:
Developed, tested and productionized a machine learning system for UI optimization, boosting CTR from 18% to 24% for the company’s website
Performed data preprocessing on large data sets containing millions of rows, including missing-data imputation, noise and error tagging/removal, and data consolidation (see the illustrative sketch at the end of this list)
Generalized feature extraction in the machine learning pipeline which improved efficiency throughout the system
Extracted customer time series data from millions of web logs using Apache Spark
Used predictive modeling with tools in SPSS, Python
Applied concepts of probability, distributions, and statistical inference to customer data to uncover findings using comparisons, t-tests, F-tests, R-squared, p-values, etc.
Developed SQL scripts for creating tables, sequences, triggers, views, and materialized views
Designed several high-performance prediction models using Python packages such as pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, pandas-datareader, and statsmodels
Developed several ready-to-use machine learning model templates based on given specifications, with clear descriptions of their purpose and the variables to supply as model inputs
Performed Hadoop ETL using Hive on data at different stages of the pipeline
Developed models for information retrieval from financial trade chats using natural language processing and machine learning
Collaborated with technologists and business stakeholders to drive innovation from conception to production
Developed MapReduce/Spark Python modules for machine learning and predictive analytics in Hadoop on AWS
Responsible for creating Hive tables, loading structured data produced by MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns
Developed architecture around models for multi-task learning, distributed training on multiple machines, and integration into a consumer-facing API
Involved in creating a monthly retention marketing campaign that improved the customer retention rate by 15%
Prepared reports and presentations using Tableau, MS Office, and ggplot2 that accurately convey data trends and associated analysis
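The preprocessing and statistical-inference bullets above are illustrated by the minimal sketch below; it is not project code, and the file, column, and variant names are hypothetical.

# Minimal illustrative sketch (hypothetical file and column names):
# missing-value imputation followed by a two-sample t-test comparing
# click behavior between two UI variants.
import pandas as pd
from scipy import stats

df = pd.read_csv("weblogs_sample.csv")                       # assumed extract of the web logs
df["session_length"] = df["session_length"].fillna(df["session_length"].median())
df = df.dropna(subset=["variant", "clicked"])                # drop rows unusable for the test

a = df.loc[df["variant"] == "A", "clicked"]
b = df.loc[df["variant"] == "B", "clicked"]
t_stat, p_value = stats.ttest_ind(a, b, equal_var=False)     # Welch's t-test
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")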
Environment: Hadoop HDFS, MapReduce/YARN, HiveQL, Apache Spark, R, SPSS, Python, Google Analytics, Data Mining, Seaborn, SQL, Regression, Cluster analysis, GitHub, Tableau, Amazon EC2, Amazon RDS, Windows/Linux platform
CVS Health, Woonsocket, RI (Data/Business Analyst) Jan ‘14 to Aug ‘15
Responsibilities:
Involved in requirements collection, gap analysis, reporting, and document creation
Documented the complete process flow to describe program development, logic, testing, implementation, application integration, and coding
Assessed completeness, consistency, and validity of customer data and created models and simulations
Explored and analyzed historical customer billing information to build a predictive model forecasting increasing or declining product use
Participated in the Agile planning process and daily scrums, provided details to create stories based on technical solutions and estimates, and worked with internal architects to assist in the development of current- and target-state data architectures
Analyzed sales and performance records, and interpreted results.
Evaluated data profiling, cleansing, combination, and extraction tools
Developed complex SQL queries to bring data together from various systems (see the illustrative sketch at the end of this list).
Performed data alignment and data cleansing
Involved in Data Migration between Teradata and MS SQL server
Sourced and analyzed data from a variety of sources such as MS Access, MS Excel, CSV, and flat files
Used Visual Studio Report Builder to design reports of varying complexity and maintain system design documents
Used ETL processes to extract, transform, and load data into the staging area and data warehouse
Used Tableau, MS PowerPoint, and MS Excel to produce reports
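The data-sourcing and SQL-consolidation bullets above are illustrated by the minimal sketch below; it is not project code, it consolidates in pandas rather than SQL, and the file and column names are hypothetical.

# Minimal illustrative sketch (hypothetical file and column names): combine
# Excel, CSV, and pipe-delimited flat-file extracts into one reporting data set.
import pandas as pd

billing = pd.read_excel("billing_history.xlsx")              # requires openpyxl
usage = pd.read_csv("product_usage.csv")
accounts = pd.read_csv("accounts.txt", sep="|")              # flat-file extract

report = (billing
          .merge(usage, on="customer_id", how="left")
          .merge(accounts, on="customer_id", how="left"))
report.to_csv("customer_usage_report.csv", index=False)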
Environment: SQL Server 2005 Enterprise, MS Visio, MS Project, MS-Office, MS Excel, MS PowerPoint, MS Word, Macros, Teradata, Tableau, ETL, ER Studio, XML and Business Objects
Flexera Software, Chicago, IL (Data/Business Analyst) Oct ‘12 to Dec ‘13
Responsibilities:
Involved in various activities of the project, like information gathering, analyzing the information, documenting the functional and non-functional requirements.
Worked with data warehousing methodologies and dimensional data modeling techniques such as Star/Snowflake schemas using Erwin 9.1.
Extensively used Aginity Netezza Workbench to perform various DDL and DML operations on the Netezza database.
Designed the Data Warehouse and MDM hub Conceptual, Logical and Physical data models.
Performed daily monitoring of Oracle instances using Oracle Enterprise Manager, ADDM, and TOAD, monitoring users, tablespaces, memory structures, rollback segments, logs, and alerts.
Used ER/Studio Data Modeler for data modeling (data requirements analysis, database design, etc.) of custom-developed information systems, including databases for transactional systems and data marts.
Involved in Teradata SQL development, unit testing, and performance tuning, ensuring testing issues were resolved using defect reports.
Involved in customized reports using SAS/MACRO facility, PROC REPORT, PROC TABULATE and PROC.
Used Normalization methods up to 3NF and De-normalization techniques for effective performance in OLTP and OLAP systems.
Generated DDL scripts using Forward Engineering technique to create objects and deploy them into the databases.
Involved in database testing, writing complex SQL queries to verify transactions and business logic, such as identifying duplicate rows, using SQL Developer and PL/SQL Developer.
Used Teradata SQL Assistant, Teradata Administrator, PMON, and data load/export utilities such as BTEQ, FastLoad, MultiLoad, FastExport, and TPump on UNIX/Windows environments, and ran batch processes for Teradata.
Worked on data profiling and data validation to ensure the accuracy of the data between the warehouse and source systems.
Worked on Data warehouse concepts like Data warehouse Architecture, Star schema, Snowflake schema, and Data Marts, Dimension and Fact tables.
Developed SQL queries to fetch complex data from different tables in remote databases using joins, database links, and bulk collects (see the illustrative sketch below).
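The duplicate-row checks and join queries described above are illustrated by the minimal sketch below; it is not project code, it runs the query through pandas/SQLAlchemy rather than SQL Developer, and the connection string, table, and column names are hypothetical.

# Minimal illustrative sketch (hypothetical connection string, table, and
# column names): a join-based duplicate-row check executed from Python.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("oracle+cx_oracle://user:pwd@dbhost:1521/?service_name=DWH")

query = """
    SELECT t.account_id, t.txn_id, COUNT(*) AS dup_count
    FROM   fact_transactions t
    JOIN   dim_account a ON a.account_id = t.account_id
    GROUP  BY t.account_id, t.txn_id
    HAVING COUNT(*) > 1
"""
duplicates = pd.read_sql(query, engine)      # rows that appear more than once
print(duplicates.head())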
Environment: Windows XP, SQL Developer, MS SQL 2008 R2, MS Access, MS Excel, SQL*Plus, Java, SSRS, SSIS.
Polaris, Pune (Data/Business Analyst) – Client: Cisco, Aug ‘09 to Sept ‘12
Responsibilities:
Developed Apex Classes, Controller Classes and Apex Triggers for various functional needs in the application.
Migrated data from external sources and performed Insert, Delete, Upsert & Export operations on millions of records. Designed and developed Service cloud and Integration.
Wrote and executed customized SQL code for ad hoc reporting duties and used other tools for routine tasks
Developed stored procedures and complex packages extensively using PL/SQL and shell programs
Involved in customized reports using SAS/MACRO facility, PROC REPORT, PROC TABULATE and PROC
Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from legacy SQL Server database systems
Used existing UNIX shell scripts and modified them as needed to process SAS jobs, search strings, execute permissions over directories etc.
Extensively used Star Schema methodologies in building and designing the logical data model into Dimensional Models
Involved in designing context flow diagrams, structure charts, and ER diagrams
Worked on database features and objects such as partitioning, change data capture, indexes, views, and indexed views to develop an optimal physical data model
Worked with SQL Server Integration Services in extracting data from several source systems and transforming the data and loading it into ODS
Involved in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
Environment: Windows XP, SQL Developer, MS SQL 2008 R2, MS Access, MS Excel, SQL*Plus, Java.
Education Details:
Bachelor of Science in Computer Science from JNTU, Hyderabad.