Data Analyst Python

Location:
Cypress, TX
Posted:
April 12, 2021

Resume:

SAINEE BORAH

***** ******* ***** *****, *******, TX- 77433, +1-682-***-****, *****.******@*****.*** http://www.linkedin.com/in/sainee

SUMMARY

Highly analytical and process-oriented data analyst with excellent understanding of business processes and operations. Expertise in data analysis, business intelligence and in-depth knowledge of database styles and research methodologies. Advanced proficiency in statistical and analytical tools for interpreting and analysing data for driving business solutions.

EDUCATION

• Master of Science in Information Systems 08/27/2015 - 05/13/2017 University of Texas at Arlington, Dallas, Texas, USA

• Bachelor of Engineering in Electronics and Communication 05/16/2008 - 08/17/2012 Visvesvaraya Technological University, India

TECHNICAL SKILLS AND ACADEMIC ACHIEVEMENTS

• Tools: PyCharm, Spyder, Anaconda3, Advanced Excel, Microsoft Office, Power BI, Minitab, Tableau, Spotfire, Kapow Katalyst (RPA), Eclipse, Sublime, MySQL, ETL, SQL Server Management Studio, SSIS, SSRS, SSAS, STATA, WEKA, R Studio, IBM Watson, SAP NetWeaver, SAP Business Object Explorer, SQLite Studio, Visual Paradigm, Git, Bitbucket, GIS, Google Earth, QTP, Quality Center (QC), Application Lifecycle Management (ALM), PuTTY, Microsoft Teams.

• Core Competencies: Advanced Statistics, Data Analysis with Python (libraries like pandas, numpy, seaborn, sklearn, matplotlib, NLTK, Beautiful Soup etc.), Data Mining with Python/WEKA, Supervised and Unsupervised Machine Learning Algorithms with Python, Data Warehouse and Business Intelligence, Database Management, Recommender Systems, Text Analysis, Sentiment Analysis, Market Basket Analysis using R/Python, Programming in Java, SQL, SAP Configuration cases, SAP Business Warehouse, Automation Testing tools. Also well versed in documenting project-related data and information, organizing project materials and writing synopses.

• Certificates: IBM certified Data Science and AI using Python, IBM certified Machine Learning using Python, Text Analytics using SAP HANA (openSAP), R programming (Coursera), SQL boot camp (Udemy), Power BI (Udemy).

• Software Languages: Python, R, SQL, PL/SQL, Java, C++

• Awards: Certificate for academic excellence at undergraduate level; Employee Recognition award at Wood Mackenzie.

WORK EXPERIENCE

Data Analyst Lead- Carollo Engineers, Houston, TX, USA 07/08/2019- 12/31/2020

(Address: Millennium Tower, 10375 Richmond Ave #1625, Houston, TX 77042)

• Apply knowledge and experience with practical application of data science to Carollo internal organization efforts and client projects for the water industry. Collaborate across Carollo client service and project teams to conduct data analyses, develop data models, and create accurate data presentations.

o Currently working on developing an Advanced Infrastructure Analytics Platform (AIAP) with one of Carollo’s clients (Houston Public Works, City of Houston).

o Practical experience working with large structured and unstructured datasets, building data models and ETL processes to collect, extract, and clean the data for subsequent reporting and analysis.

o Developed Python code for analysis of the City of Sugar Land’s water and wastewater assets. Used Power BI as the visualization tool to create reports for their water and wastewater asset management needs.

o Research and develop statistical learning models for data analysis from various sources including GIS, SCADA, CMMS, LIMS, financial, hydraulic models, project management, and other information systems.

o Implement new statistical, learning, or other mathematical methodologies as needed for specific models or analysis.

o Communicate results and ideas to key decision makers within Carollo management team and to Carollo clients using various data analysis, modelling, and visualization tools as appropriate.

o Optimize joint development efforts with project teams through appropriate database use and model design.

Data Analyst- Wood Mackenzie, Houston, TX, USA 05/15/2017- 07/05/2019

(Address: 5847 SAN FELIPE 10TH FLOOR SUITE 1000, Houston TX 77057 USA)

• Anchored process building, governance and process innovation across a variety of data-related tasks and responsibilities. Performed advanced data analysis, data visualization, data cleansing and wrangling of Oil and Gas datasets. Carried out research and project management using Python, Excel, SQL, SSIS and Spotfire.

o Sourced and scraped data from various web sources using Python and various libraries in PyCharm. Explored the data, cleaned it and transformed it into more machine-friendly inputs. Worked on transferring existing processes to the Woodmac 2.0 framework and creating new ones in it.

o Hands-on experience with the Spotfire tool and the data structure used in the NAWAT project. Perform data analysis with exceptional speed and accuracy. Browse through the NAWAT, understanding what is available and how clients are using it.

o Integrate with the team, knowing how to approach different data concerns. Complete the entire process (production/completion) of at least 3 states every quarter.

o Create and modify processes using SSIS and SQL. Prepare analyses, charts, and tables for internal customers’ review and further analysis.

o Import, export and manipulate large data sets under tight deadlines.

o QC and source missing data points and well attributes weekly, and ensure all well attribute files assigned to me are treated correctly. Recheck and determine metrics that violate their expected ranges, and ensure all error-generated files are processed accordingly.

o Have a very good understanding of AWS architecture and how different components like S3 Bucket, ECR, ECS, Fargate, persist, emit, CloudWatch etc. work. Have a good understanding of the event-based architecture of Kafka and various concepts: Kafka clusters, ZooKeeper, nodes, topics, partitions and target systems.

o Support/clarify client requests in a timely manner, digging deep into the root cause and providing solutions.

Data Analyst Intern- Wood Mackenzie, Houston, TX, USA 02/27/2017-05/12/2017

(Address: 5847 SAN FELIPE 10TH FLOOR SUITE 1000, Houston TX 77057 USA)

• Assist the Senior Manager, Governance and Process Innovation in a variety of data-related tasks and responsibilities, including data analysis, data visualization, data cleansing and wrangling, research, and project management with the use of Excel, SQL, SSIS and Spotfire.

o Efficiently search documents and databases to gather information. Collect and compile data using feeds from both external and internal Wood Mackenzie sources. Manually review and input missing data points from electronic data scrubbing. Search websites for missing well data that our data scrapers didn’t pull electronically, and input that data directly into Wood Mackenzie databases.

o Perform data analysis with exceptional speed and accuracy. Ensure integrity of our own sourced data through identification and implementation of creative quality control methods. Produce reports and data queries as required by the team.

o Prepare analyses, charts, and tables for internal customers’ review and further analysis. Develop relationships with wider Data Analyst community.

Systems Engineer- Infosys Limited, Bangalore, India 08/27/2012-05/29/2015

(Address: INFOSYS LTD. CORPORATE HEADQUARTERS, ELECTRONICS CITY, HOSUR ROAD, BENGALURU KARNATAKA 560100 INDIA)

• Extracted, compiled and analysed complex data and used PL/SQL blocks to export the data from the old database to the new database. Used advanced Excel to generate spreadsheets and pivot tables.

• Performed data warehousing for Anglian Water Services client application.

o Performed OLAP (online analytical processing) of large volumes of data.

o Verified data integrity, record counts, null values and duplicate values by writing queries in SQL Server Management Studio and ETL.

o Used SSIS, SSAS packages for Integration and Analysis. Created reports utilizing SSRS.

o Performed daily data queries and prepared reports on daily, weekly, monthly and quarterly basis.

• Executed regression and functional testing for PricewaterhouseCoopers and LexisNexis client applications.

o Work consisted of running automation scripts on QTP, checking the results, occasionally testing bugs and failed results manually through a third-party database platform, logging the results and updating them on ALM.

• Worked on workflow design, email notifications and update sets in the Bombardier ServiceNow project.

o Worked with the client on functional requirements for ServiceNow, and on design and implementation of new functionality using Business Rule, UI policy and client scripts, SCCM integration. Worked on Service Catalogue, which involves UI macro and workflow.

PROJECT IN ACADEMICS

• Data mining project on Census Data

Performed data cleaning and data mining on a data set of 50000 instances and 14 attributes.

We wanted to predict whether a person earns below or above $50K a year based on demographic characteristics.

This was achieved using Excel, a combination of a variety of classification algorithms, selection of factors and WEKA (Java-based mining software).

• Machine Learning Projects

Performed regression, classification and clustering using Python modules like scikit-learn and SciPy.

Applied supervised and unsupervised learning, model evaluation, and Machine Learning algorithms using Python on real-life examples like cancer detection, predicting economic trends, predicting customer churn, recommendation engines, and many more.

• Business Intelligence project

Existing data from diverse data sources were consolidated into a uniform SQL Server database. ETL capabilities of SAP NetWeaver Business Warehouse were used to define data imports, build scheduling packages, and define integration flows and workflows.

Created a number of cubes, dimensions and business critical KPIs using SAP Business Object Analysis representing aggregations in several ways.

Developed several detail and summary reports including line and pie charts, trend analysis reports and sub reports according to business requirements using SAP BW Reporting Tool.

• Analysis on World Development Indicators dataset:

Used statistical techniques for hypothesis testing to validate data. Performed linear and multiple regression analysis using R and STATA.

Analysed how well different factors indicate a country’s economic development as measured by GDP.

Presented findings and proposed solutions for improvement in GDP.

• System Analysis and Design Project:

Created System Request, Feasibility Analysis, Effort Estimation, Requirement Definition, Activity diagram, Use Case diagram, Class and Sequence diagrams for an apartment-based information system. Used Excel and Visual Paradigm.

• Data Science using Python

Calculated sentiments for each candidate in the dataset using: 1) AFINN-111.txt; 2) the Hu & Liu lexicon of positive and negative words; 3) ANEW module. Also determined who has the most positive comments and who has the most negative. For each candidate, displayed the top five states by: 1) positive sentiments and 2) negative sentiments.

Performed topic analysis on an entire set of tweets. Extracted 25 topics using topic modelling with NMF and LDA under Gensim. Used Tableau to visualize results.

Performed feature engineering (selecting important features) using the following classifiers: a) Support Vector Machine with “linear” and “rbf” kernels, b) Decision Tree, c) RandomForest, d) ExtraTrees. Used the following metrics to evaluate them: precision, recall, area under the curve (AUC), accuracy score.

• Transformer Fault Detection using GSM and ZigBee (Undergrad final year project)

This system is installed at the distribution transformer site and measures parameters like temperature, voltage, oil level status etc. using sensors.

It helps utilities make optimal use of transformers and identify problems before any catastrophic failure.

AREAS OF INTEREST

• Data Science and Machine Learning

• Data Warehousing and Business Intelligence

• DBMS

• Tools to analyse businesses

LEADERSHIP EXPERIENCE

• An active participant in UTA’s Multicultural Affairs and Mavs Go Green Organizations.

• Volunteered at the ASUG (Americas’ SAP Users’ Group) Conference in Dallas.

• Trained new employees to be competent at testing applications via knowledge transfer.

• Selected to educate clients on various approaches to understanding data, its significance, and how it fits into our application.


