
Python developer

Location:
San Jose, CA, 95132
Posted:
March 02, 2023

Resume:

Sharath D

Email Id: ***********@*****.***

Contact: 732-***-****

PROFESSIONAL SUMMARY:

Over 7 years of experience in machine learning and data mining with large datasets of structured and unstructured data, including data acquisition, data validation, predictive modeling, and data visualization.

Experience in business analysis, data warehouse design and architecture, dimensional data modeling, ETL, OLAP cubes, reporting, and other BI tools.

Experience as a technical architect: Sr. DW/BI Architect and Data/Information Architect.

Good knowledge of DW/BI concepts; experience in logical and physical data modeling using the Erwin tool.

Experience in ETL and cube design, in architecture design/mapping and capacity-planning documentation, and with DW design and development best practices.

Experience with Microsoft Azure big data: HDInsight, Hadoop, Hive, Power BI, Azure SQL Data Warehouse. Knowledge of Azure Machine Learning (R language) and predictive analysis, Pig, HBase, MapReduce, MongoDB, Spotfire, Tableau.

Experience in machine learning using Python: NLP text classification and churn prediction.
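
As an illustration of NLP-style churn prediction, here is a minimal sketch in pure Python: a multinomial Naive Bayes classifier over bag-of-words features. The documents and labels below are toy, hypothetical examples, not actual project data.

```python
from collections import Counter, defaultdict
import math

def train_nb(docs):
    """Train multinomial Naive Bayes; docs is a list of (text, label)."""
    labels = Counter(label for _, label in docs)
    counts = defaultdict(Counter)   # per-label word counts
    vocab = set()
    for text, label in docs:
        words = text.lower().split()
        counts[label].update(words)
        vocab.update(words)
    priors = {l: n / len(docs) for l, n in labels.items()}
    return priors, counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-posterior."""
    priors, counts, vocab = model
    words = text.lower().split()
    best, best_lp = None, float("-inf")
    for label, prior in priors.items():
        total = sum(counts[label].values())
        lp = math.log(prior)
        for w in words:
            # Laplace smoothing over the shared vocabulary
            lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Toy labeled support messages (hypothetical)
docs = [
    ("great service fast support", "stay"),
    ("love the product will renew", "stay"),
    ("terrible support cancel my plan", "churn"),
    ("too expensive cancel subscription", "churn"),
]
model = train_nb(docs)
print(predict_nb(model, "cancel my subscription"))  # churn
```

In practice this is usually done with a library vectorizer and estimator, but the scoring rule is the same log-prior plus smoothed log-likelihood shown here.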

Experience leading an MS BI Centre of Excellence (COE) and competency, including delivering training on MS BI tools (DW, SSIS, SSAS, SSRS), providing architecture solutions, and supporting the MS BI practice.

Managerial experience as a Technical Project Manager, handling a team of 20.

Hands-on experience in managing all project stages including business development, estimation, requirements determination, gap analysis, business process reengineering, issue resolution, configuration, training, go-live assistance, vendor management and post implementation support.

Experience in the Oil & Gas, Banking, Telecom, Insurance, Finance, Healthcare, and Manufacturing/Supply Chain domains.

Knowledge of machine learning algorithms such as classification, clustering, decision trees, and time-series methods.

Hands-on experience applying SVM, Random Forest, K-means clustering, and K-nearest neighbors.
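
For reference, K-means (one of the algorithms listed above) can be sketched in a few lines of pure Python as Lloyd's algorithm; the points and cluster count below are toy values chosen for illustration.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Basic Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for each point
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids, labels

pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
cents, labs = kmeans(pts, k=2)
# The two tight groups end up in separate clusters
```

Production code would use a library implementation with k-means++ initialization and multiple restarts; this sketch shows only the core assign/update loop.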

Design, build, validate and maintain machine learning based prediction models.

Proficient in statistical and other tools/languages: R, Python, C, C++, Java, SQL, UNIX, the QlikView data visualization tool, and the Anaplan forecasting tool.

Strong experience in the Analysis, design, development, testing and Implementation of Business Intelligence solutions using Data Warehouse/Data Mart Design, ETL, OLAP, BI, Client/Server applications.

Strong data warehousing ETL experience using Informatica 9.1/8.6.1/8.5/8.1/7.1 PowerCenter client tools (Mapping Designer, Repository Manager, Workflow Manager/Monitor) and server tools (Informatica Server, Repository Server Manager).

Expertise in Data Warehouse/Data mart, ODS, OLTP and OLAP implementations teamed with project scope, Analysis, requirements gathering, data modeling, Effort Estimation, ETL Design, development, System testing, Implementation and production support.

Proficient in integrating various data sources, including relational databases such as Oracle 11g/10g/9i, MS SQL Server, DB2, and Teradata, plus flat files, into the staging area, ODS, data warehouse, and data marts.

Experience in all phases of data warehouse development, from requirements gathering through code development, unit testing, and documentation.

Experience using automation scheduling tools such as Tidal and Control-M.

Expertise in software development with major contributions to data warehousing and database applications using Informatica, Oracle PL/SQL, and UNIX shell programming.

Experience in migrating code across different environments.

Experience in Creating and Validating Statistical Models using R and SAS.

Good experience in text mining: cleaning and manipulating text and performing sentiment analysis on the resulting data.

Good experience in validating statistical models using cross-validation methods.

Good knowledge of building statistical models: Random Forest, logistic and linear regression, ARIMA, decision trees, clustering, XGBoost, and Naive Bayes classifiers.
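
The cross-validation workflow referenced above can be sketched in pure Python. The "majority class" model, accuracy metric, and data below are toy stand-ins for a real estimator and are illustrative only.

```python
def kfold_indices(n, k):
    """Split range(n) into k contiguous folds, as in basic k-fold CV."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_score(fit, score, X, y, k=5):
    """Fit on k-1 folds, score on the held-out fold, k times."""
    scores = []
    for test_idx in kfold_indices(len(X), k):
        held_out = set(test_idx)
        X_train = [x for i, x in enumerate(X) if i not in held_out]
        y_train = [v for i, v in enumerate(y) if i not in held_out]
        X_test = [X[i] for i in test_idx]
        y_test = [y[i] for i in test_idx]
        model = fit(X_train, y_train)
        scores.append(score(model, X_test, y_test))
    return scores

# Toy "model": always predict the majority training class
def fit_majority(X, y):
    return max(set(y), key=y.count)

def accuracy(model, X, y):
    return sum(1 for v in y if v == model) / len(y)

X = list(range(10))
y = [0] * 7 + [1] * 3
print(cross_val_score(fit_majority, accuracy, X, y, k=5))
# [1.0, 1.0, 1.0, 0.5, 0.0]
```

Real workflows would shuffle or stratify the folds; the contiguous split here keeps the mechanics visible.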

EDUCATION:

MS Electrical Engineering - Northwestern Polytechnic University

B.Tech in Electrical and Electronics Engineering - JNTUH India

TOOLS AND TECHNOLOGIES:

Machine Learning

Regression, Classification, Clustering, Association, Simple Linear Regression, Multiple Linear Regression, Polynomial Regression, Decision Trees, Random Forest, Logistic Regression, K-Nearest Neighbors (K-NN), Kernel SVM

R Language skills

Data Preprocessing, Web Scraping, Data Extraction, dplyr, ggplot2, apply functions, Statistical Analysis, Predictive Analysis, ggplotly, rvest, Data Visualization.

Frameworks

Shogun, Accord Framework/AForge.net, Scala, Spark, Cassandra, DL4J, ND4J, Scikit-learn

Development Tools

Cassandra, DL4J, ND4J, Scikit-learn, Shogun, Accord Framework/AForge.net, Mahout, MLlib, H2O, Cloudera Oryx, GoLearn, Apache Singa.

Modelling Tools

CA Erwin Data Modeler 7.1/4, Microsoft Visio 6.0, Sybase PowerDesigner 16.5.

Version Controller

TFS, Microsoft Visual SourceSafe, GIT, NUNIT, MSUNIT

Software Packages

MS Office 2003/07/10/13, MS Access, Messaging Architectures.

OLAP/ BI / ETL Tool

Business Objects 6.1/XI, MS SQL Server 2008/2005 Analysis Services (MS OLAP, SSAS), Integration Services (SSIS), Reporting Services (SSRS), Performance Point Server (PPS), Oracle 9i/10g/11g/12c OLAP, MS Office Web Components (OWC11), DTS, MDX, Crystal Reports 10, Crystal Enterprise 10 (CMC)

Web Technologies

Windows API, Web Services, Web API (RESTful), HTML5, XHTML, CSS3, AJAX, XML, XAML, MSMQ, Silverlight, Kendo UI.

Web Servers

IIS 5.0, IIS 6.0, IIS 7.5, IIS ADMIN.

Operating Systems

Windows 8/XP/NT/95/98/2000/2008/2012, Android SDK.

Databases

SQL Server 2014/2012/2008/2005/2000, MS Access, Oracle 11g/10g/9i, Teradata; big data: Hadoop, Mahout, MLlib, H2O, Cloudera Oryx, GoLearn.

Database Tools

SQL Server Query Analyzer.

PROFESSIONAL EXPERIENCE:

Client: ADP Astoria, NY Jun 2020 – Present

Role: Sr. Python Developer

Responsibilities:

Design and develop state-of-the-art deep-learning / machine-learning algorithms for analyzing image and video data, among others.

Develop and implement innovative AI and machine learning tools that will be used in risk management.

Experience with TensorFlow, Caffe, and other deep-learning frameworks.

Applied effective software development processes to customize and extend computer vision and image processing techniques to solve new problems for Automation Anywhere.

Develop and implement innovative data quality improvement tools.

Interact cross-functionally with other teams and resources to accomplish project goals.

Involved in Peer Reviews, Functional and Requirement Reviews.

Develop project requirements and deliverable timelines; execute efficiently to meet the plan timelines.

Create and support a data management workflow spanning data collection, storage, and analysis through training and validation.

Analyze requirements and the significance of weld-point data and energy efficiency using large datasets.

Develop necessary connectors to plug ML software into wider data pipeline architectures.

Experience in running Spark streaming applications in cluster mode and Spark log debugging.

Identify and assess available machine learning and statistical analysis libraries (including regressors, classifiers, statistical tests, and clustering algorithms).

Design and build scalable software architecture to enable real-time / big-data processing.

Acquire business knowledge in the Firm’s risk management processes.

Very passionate about quality, with a strong sense of ownership of the work accomplished.

Quick to learn new technologies and to deliver on them in short order.

Taking responsibility for technical problem solving, creatively meeting product objectives and developing best practices.

Worked on requirements gathering for multiple functionality enhancements by engaging with business users and ascertaining their demands.

Involved in maintaining and uploading the Test Scripts.

Maintain a high sense of urgency to deliver projects and to troubleshoot and fix data queries and issues.

Work independently with R&D partners to understand requirements.

Understanding business process and Business problems thoroughly and forecasting the business using data science techniques.

Deeply involved in writing complex Spark/Scala scripts using the Spark context and Cassandra SQL context, working with multiple APIs and methods that support DataFrames, RDDs, and Cassandra table joins, and finally writing/saving the DataFrames/RDDs to the Cassandra DB.

Gathering required data from business users to achieve accurate training data for analysis.

Coordinate and communicate with technical teams for any data requirements.

Keep the project manager and business team up to date on project status.

Understanding the Business requirements based on Functional specification to design the ETL methodology in technical specifications.

Performed consolidation, standardization, and matching with Trillium for the unstructured flat-file data.

Responsible for development, support, and maintenance of ETL (Extract, Transform, Load) processes using Informatica PowerCenter 8.5.

Environment: Microsoft Azure HDInsight (Hadoop, Hive, HBase), Azure SQL Data Warehouse, SQL Server 2012, Integration Services (SSIS), Analysis Services (SSAS), Reporting Services (SSRS), Power BI 2.3, SharePoint 2010, Naveego, Telerik, MongoDB, Spotfire, Tableau 10.

Client: Ceva Logistics Houston, TX Mar 2019 - May 2020

Role: Sr. Python Developer

Responsibilities:

Participated in stakeholders' meetings to understand the business needs and requirements.

Involved in preparation & design of technical documents like Bus Matrix Document, PPDM Model, and LDM & PDM.

Designed the framework for sales requirements and led a team of 5.

Provided technical solutions on MS Azure HDInsight, Hive, HBase, Mongo DB, Telerik, Power BI, Spot Fire, Tableau, Azure SQL Data Warehouse Data Migration Techniques using BCP, Azure Data Factory, and Fraud prediction using Azure Machine Learning.

Extensively used SQL, Numpy, Pandas, Scikit-learn, Spark, Hive for Data Analysis and Model building.

Participated in Business meetings to understand the business needs & requirements.

Prepared and designed technical documents such as the OLAP design document, conceptual model, and LDM & PDM.

Upskilled/trained the team on SQL Server 2012 for incoming new requirements.

Provided technical solutions for OLAP design and reporting requirements.

Prepared the ETL architecture and design document covering the ETL architecture, SSIS design, and the extraction, transformation, and loading of Duck Creek data into the dimensional model.

Provided technical and requirements guidance to team members for the ETL/SSIS design.

Designed the ETL framework and led its development.

Designed logical and physical data models using the MS Visio 2003 data modeling tool.

Worked on Sqoop, Hadoop, Hive, Spark, Cassandra to build ETL and Data Processing systems having various data sources, data targets and data formats.

Prepared High-Level Design (HLD) and Low-Level Design (LLD) documents.

Provided technical and requirements guidance to the team members.

Managed development and enhancement of project change requests from the client.

Participated in Architect solution meetings & guidance in Dimensional Data Modeling design.

Environment: Apache Spark MLlib, TensorFlow, Oryx 2, Accord.NET, Amazon Machine Learning (AML), Python, Django, Flask, ORM, Jinja2, Mako, Naive Bayes, SVM, K-means, ANN, Regression.

Client: Abbvie, Chicago, IL Jan 2018 – Feb 2019

Role: Python Developer

Responsibilities:

Designed dimensional data models using the Erwin tool.

Designed the ETL architecture for SSIS and the cube architecture for SSAS.

Estimated work/tasks using MS Project and allocated work among team members.

Coordinated the onsite team with daily/weekly status reports and meetings.

Responsible for business analysis, requirements collection, and understanding of business problems.

Interact regularly with Business leaders to set and manage expectations aligned to group capabilities.

Understanding business process and Business problems thoroughly and forecasting the business using data science techniques.

Gathering required data from business users to achieve accurate training data for analysis.

Coordinate and communicate with technical teams for any data requirements.

Guided team to implement EDA part for given sales data and analyze the results. Involved in Gathering, exploring and cleaning the data.

Use of cutting-edge data mining, machine learning techniques for building advanced customer solutions.

Assigning tasks to analytics team and reporting team and gathering inputs from them on regular basis.

Implemented techniques from artificial intelligence/machine learning to solve supervised and unsupervised learning problems.

Present results of analysis and prediction model evaluation to business executives.

Designed, developed, maintained, and communicated visual dashboards, reports, and analyses based on business requirements.

Kept the project manager and business team up to date on project status.

Understood the business requirements based on functional specifications to design the ETL methodology in technical specifications.

Environment: Erwin r9.0, Informatica 9.0, ODS, OLTP, Oracle 10g, Hive, OLAP, DB2, Metadata, MS Excel, Mainframes, MS Visio, Rational Rose, Requisite Pro, Hadoop, PL/SQL, etc.

Client: Citi Bank, New York City, NY Sept 2016 – Dec 2017

Role: Python Developer

Responsibilities:

Developed propensity models for retail liability products to drive proactive campaigns.

Extracted and tabulated data from multiple data sources using R and SAS.

Performed data cleansing and transformation and created new variables using R.

Built predictive scorecards for cross-selling car loans, life insurance, TD, and RD.
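
A cross-sell scorecard of this kind is typically a logistic regression at its core. The following pure-Python sketch fits one by stochastic gradient descent; the single feature (a scaled balance) and the accept/decline labels are hypothetical toy data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg(X, y, lr=0.5, epochs=500):
    """One-feature logistic regression via stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = sigmoid(w * x + b)      # predicted propensity
            w -= lr * (p - t) * x       # log-loss gradient w.r.t. w
            b -= lr * (p - t)           # log-loss gradient w.r.t. b
    return w, b

# Toy data: customers with higher balances accepted the cross-sell offer
X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logreg(X, y)
scores = [sigmoid(w * x + b) for x in X]   # propensity scores in (0, 1)
```

A production scorecard would use many features, regularization, and a library solver; the score binning and PSI monitoring sit on top of probabilities like these.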

Scored predictive models per regulatory requirements, ensuring deliverables with PSI.

Data modeling and formulation of statistical equations using advanced statistical forecasting techniques.

Provided guidance and mentoring to team members.

Established scalable, efficient, automated processes for large-scale data analyses, model development, model validation, and model implementation.

Formulated and tested hypotheses and extracted signals from petabyte-scale unstructured data sets, ensuring that the display advertising business delivered the highest standards of performance.

Led a multi-functional project team.

Developed necessary connectors to plug ML software into wider data pipeline architectures.

Applied association rule mining and a CHAID model to identify hidden patterns and rules in remedy-ticket analysis, aiding decision making.
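
For context, the core of association rule mining is computing support and confidence for candidate rules. A minimal sketch over hypothetical remedy-ticket tag sets (the tags and rule below are illustrative, not actual ticket data):

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence for the rule: antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= set(t))         # antecedent present
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))  # both present
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    return support, confidence

# Hypothetical tag sets extracted from remedy tickets
tickets = [
    {"login_failure", "password_reset"},
    {"login_failure", "password_reset", "vpn"},
    {"vpn", "timeout"},
    {"login_failure", "timeout"},
]
s, conf = rule_metrics(tickets, {"login_failure"}, {"password_reset"})
# support = 2/4, confidence = 2/3
```

Algorithms like Apriori then search the space of candidate rules efficiently, but each rule is still judged by these two metrics (plus lift).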

Understood the client's business problems and analyzed the data using appropriate statistical models to generate insights.

Integrated Teradata with R for the BI platform and implemented corporate business rules.

Participated in business meetings to understand the business needs and requirements.

Arranged and chaired data workshops with SMEs and related stakeholders for requirements and data-catalogue understanding.

Designed a logical data model to fit and adopt the Teradata Financial Services Logical Data Model (FSLDM 11) using the Erwin data modeler tool.

Presented the designed logical data model to the Data Model Governance Committee (DMGC) for approval.

Environment: RStudio, machine learning, Informatica 9.0, Scala, Amazon Machine Learning (AML), Python, Django, SaaS.

Client: Bank of America, Pittsburgh, PA Jun 2015 – Aug 2016

Role: Data Architect/Data Modeler

Responsibilities:

Deployed GUI pages using JSP, JSTL, HTML, DHTML, XHTML, CSS, JavaScript, and AJAX.

Configured the project on WebSphere 6.1 application servers

Implemented the online application using Core Java, JDBC, JSP, Servlets, EJB 1.1, Web Services, SOAP, and WSDL.

Communicated with other healthcare systems using Web Services via SOAP, WSDL, and JAX-RPC.

Used the Singleton, Factory, and DAO design patterns based on the application requirements.

Used SAX and DOM parsers to parse the raw XML documents

Used RAD as Development IDE for web applications.

Prepared and executed unit test cases.

Used Log4J logging framework to write Log messages with various levels.

Involved in fixing bugs and minor enhancements for the front-end modules.

Used Microsoft Visio and Rational Rose to design the Use Case diagrams, Class model, Sequence diagrams, and Activity diagrams for the SDLC process of the application.

Supported the testing team in system testing, integration testing, and UAT.

Conducted Design reviews and Technical reviews with other project stakeholders.

Created test plan documents for all back-end database modules

Implemented the project in Linux environment.

Environment: R 3.0, Erwin 9.5, Tableau 8.0, MDM, QlikView, MLlib, PL/SQL, HDFS, Teradata 14.1, JSON, Hadoop (HDFS), MapReduce, Pig, Spark, RStudio, Mahout, Java, Hive, AWS.


