Data Analyst Python

Clifton, NJ
October 15, 2019



Peiying Peng

Data Analyst

*+ years of IT experience in the field of Data Analysis.

Experience working in all phases of the Software Development Life Cycle (SDLC) using methodologies such as Waterfall, Agile/Scrum and Rational Unified Process (RUP).

Experience in machine learning and data mining with large sets of structured and unstructured data, data acquisition, data validation, predictive modeling, data visualization and web scraping. Adept in statistical programming languages such as R and Python, as well as Big Data technologies such as Hadoop and Hive.

Good knowledge of Python collections and multithreading.

Experience using various R and Python packages such as ggplot2, caret, dplyr, NLP, reshape2, pandas, NumPy, seaborn, SciPy, Matplotlib, scikit-learn and Beautiful Soup.

Practical understanding of data modeling (dimensional and relational) concepts such as star schema modeling, snowflake schema modeling, and fact and dimension tables.

Highly experienced in performance tuning and optimization of scripts on large databases for faster data access and report generation.

Experience in text analytics, generating data visualizations using R and Python, and creating dashboards using tools such as Tableau.

Successful implementation of CA Data Maker and IBM Optim for test data management and data privacy in testing environments for large clients.

Served as CA Data Maker and IBM Optim consultant, data analyst, ETL test analyst, quality assurance analyst and production support.

Experience working with data analysis, including driving needs gathering, documentation creation and reporting.

Salesforce dashboards and reporting expertise.

Proven experience working with Big Data and able to provide customized BI solutions.

Design and implementation of data extraction, transformation and loading (using SQL*Loader, Informatica and other ETL tools); analysis and migration of Oracle and SQL Server databases.

Experience in creating scripts to fetch data from NoSQL databases such as MongoDB.

Experience in creating ETL packages using SSIS and Visual Studio.

Good knowledge of data warehousing concepts, including metadata and data marts.

Excellent analytical and problem-solving skills; able to work independently as well as being a valuable, contributing team player.


SDLC, Agile, Waterfall, Scrum, Jira, Python, R, SQL, MongoDB, Oracle, SQL Server, Teradata, MySQL, Informatica, Power BI, SSRS, SSIS, Hadoop, Tableau, Pandas, MS Office, Visual Studio, Windows, Linux


State University of New York Plattsburgh, Plattsburgh, New York

Master of Science in Data Analytics (May 2019)

Bachelor of Business and Management (Dec 2017)

Major: Supply Chain Management, Business Administration


UnitedHealth Group, NY Jan 2019 – current

Role: Data Analyst


Involved in migrating/assembling data as per business process management requirements.

Used SQL queries to extract data from different databases including testing and production for data validation and data analysis.

Performed statistical data analysis and data visualization using Python and R.

Performed data loading and SQL tuning/optimization using SQL*Loader, import and export, Explain Plan and hints.

Performed data preprocessing using Python/SAS based on the nature of the source system.

Extracted data from the database using SAS/ACCESS and SAS SQL procedures, and created SAS data sets.

Imported customer data into Python using the pandas library and performed various data analyses, finding patterns in the data that informed key company decisions.

Worked on MongoDB database concepts such as locking, transactions, indexes, sharding, replication and schema design.

Built and published customized interactive reports and dashboards, and scheduled reports using Tableau Server.

Extracted data from multiple sources (Oracle, SQL Server, DB2 and flat files) using Informatica and loaded it into a single data warehouse repository.

Developed and maintained a data dictionary to create metadata reports for technical and business purposes.

Huntington, NY March 2017 – Aug 2018

Role: Data Analyst


Maintained large data sets, combining data from various sources using Excel, Access and SQL queries.

Built factor analysis and cluster analysis models using Python SciPy to classify customers into different target groups.

Worked on improving the performance of existing Hive queries.

Ran SQL scripts and created indexes and stored procedures for data analysis.

Prepared scripts in Python and shell for automation of administration tasks.

Participated in all phases of data mining: data collection, data cleaning, model development, validation and visualization; also performed gap analysis.

Used pandas, NumPy, SciPy, Matplotlib, scikit-learn and NLTK in Python to develop various machine learning algorithms.

Wrote scripts in R to implement predictive modeling algorithms.

Wrote several Teradata SQL queries using Teradata SQL Assistant for ad hoc data pull requests.

Utilized the Informatica toolset (Informatica Data Explorer and Informatica Data Quality) to analyze legacy data for data profiling.

Used reverse engineering to connect to the existing database and create a graphical representation (E-R diagram) of it.

Created filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.

Developed normalized logical and physical database models for designing an OLTP application.

Freelancing Project Jan 2016 – Dec 2016


Role: Data Analyst

Performed data profiling in the source systems.

Involved in defining the source-to-target data mappings, business rules and data definitions.

Evaluated data mining request requirements and helped develop the queries for the requests.

Involved in designing the star schema, creating fact and dimension tables, and defining the relationships between them.

Verified and maintained data quality, integrity and completeness, along with ETL rules and business logic.

Performed data validation with data profiling; involved in data extraction from Teradata and flat files using Teradata SQL Assistant.

