Summary
Knowledgeable and results-oriented professional with 6+ years of experience in data analytics and data science.
Good experience across the Software Development Life Cycle (analysis, design, development, testing, deployment, and support) in Agile and Waterfall methodologies.
Experience in using various packages in Python like Pandas, NumPy, SciPy, Matplotlib, and Scikit-Learn.
Experience in text analytics, generating data visualizations, and manipulating data for loads, extracts, statistical analysis, modeling, and data munging using Python; proficient R user with knowledge of the statistical programming language SAS.
Experience in implementing procedures for extracting Excel sheet data into the mainframe environment by connecting to the database using SQL.
Good experience and knowledge of Amazon Web Services (AWS): EC2, S3 and IAM.
Good working experience with data analysis platforms including Jupyter Notebook and Spyder.
Good knowledge of implementing decision trees, linear and logistic regression, and supervised and unsupervised learning techniques, including Principal Component Analysis.
Experience in building and publishing customized interactive reports and dashboards with customized parameters and user filters using Tableau and Power BI.
Experience in data integration, validation, and data quality controls for ETL processes and data warehousing using SSIS.
Experience in creating and documenting Metadata for OLTP and OLAP when designing a system.
Good knowledge of transforming business resources and requirements into manageable data formats and analytical models, designing algorithms, building models, and reporting solutions that scale across a massive volume of structured and unstructured data.
Experience in Normalization and De-Normalization techniques for optimum performance in relational and dimensional database environments.
Working knowledge of relational databases such as MySQL, SQL Server, Teradata, and MS Access.
Good knowledge of data warehousing principles, including fact tables, dimension tables, and dimensional data modeling (Star Schema and Snowflake Schema).
Experience with project management tools such as Jira and with maintaining version control of code using tools such as Git.
Comfortable working in Linux and Windows operating system environments.
Effective team player with strong communication and interpersonal skills and a strong ability to adapt and learn new technologies and business lines rapidly.
Skills
Methodologies:
SDLC, Agile, Waterfall
Programming Language:
Python, R, SQL
IDEs:
Jupyter Notebook, Spyder
Machine Learning:
Linear regression, logistic regression, decision trees, supervised learning, unsupervised learning
Cloud:
AWS
Python Packages:
NumPy, Pandas, Matplotlib, SciPy, Scikit-Learn
Visualization Tools:
Tableau, Power BI, SAS
ETL Tools:
SSIS, SSRS
Database:
SQL Server, MySQL, Teradata
Other Tools:
JIRA, Git, OLAP, KPI, MS Office
Operating System:
Windows, Linux
Education
Master’s in Computer Science
California State University, Fullerton, CA
Experience
Goldman Sachs Group, MA Mar 2020 - Current
Roles: Data Scientist
Responsibilities:
Followed Agile Methodology for application Implementation and Testing.
Worked in R and Python to improve the models. Upgraded all models to improve the product.
Worked with the open-source tools Jupyter Notebook and Spyder for statistical analysis and building machine learning models.
Built regression and classification models including linear regression, logistic regression, and decision trees.
Involved in writing T-SQL, working on SSIS, SSRS, Data Cleansing, Data Scrubbing and Data Migration.
Improved sales and logistics data quality through data cleaning using NumPy, SciPy, Pandas, and Scikit-Learn in Python.
Created SQL reports, data extraction and data loading scripts for different databases and schemas.
Developed Tableau data visualizations using scatter plots, geographic maps, pie charts, bar charts, and density charts.
Involved in performing data conversions from flat files into a normalized database structure using SSIS and other tools.
Generated data extracts in Tableau by connecting to the view using the Tableau MySQL connector.
Created new EC2 instances in AWS, allocated volumes, and granted permissions using IAM.
Wrote complex SQL queries to extract large volumes of data.
Created a Tableau Server landing page to give users quick access to their most recently accessed reports.
Created multiset, temporary, derived, and volatile tables in the Teradata database.
Prepared detailed statistical reports in SQL, SSRS, R, and SPSS to track weekly, monthly, and quarterly student trends and enrollment.
Performed statistical analysis of data using SAS procedures and R packages (such as mixed models, linear, logistic, and nonlinear regression, multivariate regression, simulation modeling, best model selection, and survival data analysis).
Used JIRA and other internal issue trackers for project development.
Gained expertise in high-performance data integration solutions using Microsoft SQL Server Integration Services (SSIS).
Worked with version control tools such as Git.
Environment: SDLC, Agile, R, Python, Jupyter Notebook, Spyder, AWS, EC2, IAM, NumPy, SciPy, Pandas, Scikit-Learn, Teradata, SAS, SQL Server, Tableau, OLAP, OLTP, JIRA, GIT.
Deloitte, India Jan 2017 – Jan 2019
Roles: Business Data Analyst
Responsibilities:
Involved in requirements gathering, analysis, design, development, testing, and production of an application using the Agile model.
Used pandas, NumPy, SciPy, matplotlib, Scikit-Learn in Python for developing various machine learning algorithms.
Extracted patterns from structured and unstructured data sets and displayed them with interactive charts using R and Python.
Worked on data analysis with various analytic tools, such as Jupyter Notebook and Spyder.
Worked on customer segmentation using unsupervised learning (clustering) and supervised regression techniques to build a model.
Analyzed old information architectures and contributed to the design and development of the new ones.
Created SQL scripts for testing and validating data on various reports, dashboards, and scorecards and handled performance issues effectively in Tableau.
Designed various analytical reports and dashboards on sales performance and campaign response data in Tableau.
Used Amazon IAM to grant fine-grained access to AWS resources to users.
Built reports against single and multiple tables and built formulas in Tableau for various business calculations.
Created ETL packages to validate, extract, transform, and load data into the data warehouse using SSIS.
Designed and developed data ingestion, aggregation, and advanced analytics on data from MySQL.
Developed SAS coding and table templates for preparing, processing, and analyzing clinical data.
Applied analysis methods such as hypothesis testing and analysis of variance (ANOVA) to validate existing models against observed data.
Communicated confidently with clients to ensure their needs were met in a manner consistent with their specifications and requirements.
Extracted data from Access to MS Excel and analyzed the database for duplicate and missing data while maintaining data integrity.
Involved in OLAP processing for changing and maintaining the warehouse, optimizing dimensions and hierarchies, and adding aggregations to the cube.
Developed the necessary stored procedures and created complex views using joins for robust and fast retrieval of data in SQL Server.
Used JIRA for defect tracking and project management and developed the project in a Linux environment.
Environment: Agile, R, Python, Jupyter Notebook, Spyder, Pandas, AWS, IAM, NumPy, SciPy, Matplotlib, Scikit-Learn, SSIS, SQL Server, MS Excel, MySQL, SAS, JIRA, Git.
IBM, India Apr 2013 – Dec 2016
Roles: Consultant
Responsibilities:
Involved in requirements gathering, analysis, design, development, testing, and production of an application using the Waterfall model.
Performed data exploration in Python to analyze patterns and select features.
Embedded Power BI reports in an internal portal to manage role-based access to reports and data for individual users.
Developed SQL scripts using OLAP functions such as RANK() OVER to improve query performance when pulling data from large tables.
Used Power BI to create analytical dashboards depicting critical KPIs.
Imported data from multiple sources and created automated reports in Power BI after cleaning and preprocessing the data.
Highly experienced in SAP HANA administration and Solution Manager configurations.
Worked collaboratively in a team environment.
Utilized strong interpersonal and communication skills to serve customers.
Assigned as the point of contact for new graduate hires; conducted weekly technical knowledge-transfer (KT) sessions, including project-specific tool access.
Performed Oracle administration using BR*Tools, including tablespace extensions and table reorganization.
Deep understanding of business processes and experience working with people of different backgrounds, priorities, and responsibilities.
Created an analysis tool using Excel Power Pivot, loading millions of rows of data into Excel from SQL.
Developed normalized Logical and Physical database models for designing an OLTP application.
Used pandas, NumPy, SciPy, and Matplotlib in Python for development, testing, and data analysis.
Developed CSV files and reported offshore progress to management with the use of Excel Templates, Excel macros, Pivot tables and functions.
Used Git for version control with the data engineering team and data scientist colleagues.
Reviewed and analyzed user stories in Jira to provide level-of-effort estimates for testing.
Environment: Waterfall, Python, Power BI, OLAP, KPIs, SQL, MySQL, OLTP, pandas, NumPy, SciPy, Matplotlib, Excel, Git, Jira.
Phone: 401-***-**** Email: **********.*@***********.*** Location: MA
Prathyusha Rao
Data Scientist