Sai Rakesh
Senior Data Analyst
Email: ************@*****.*** Phone: 707-***-****
PROFESSIONAL SUMMARY:
• 9+ years of experience in data analysis using Python, SQL, Alteryx, Power BI, and Tableau.
• Expertise in Tableau, Tableau Desktop, Tableau Server, Tableau Online, Tableau Public, Tableau Reader, and MS SQL Server.
• Expertise in Python, SQL, and ETL tools (Apache Airflow) for efficient data extraction, transformation, and loading.
• Proficient in Hadoop and Apache Spark for processing and analyzing large-scale datasets efficiently and in real-time.
• Extensive use of AWS to connect to cloud data sources via Amazon Redshift.
• Designed several business dashboards using Power BI, Tableau Desktop, and Tableau Server, and hosted them on client servers.
• Responsible for creating SQL datasets for Tableau and Ad-hoc reports.
• Medical claims experience in process documentation, Medicare, analysis, and implementation of 835/837/834/270/271 (EDI HIPAA X12) transactions on the payer side.
• Analyzed EDI transaction data to extract and interpret critical business information, ensuring accurate data flow between systems.
• Performed data analysis and data profiling using complex SQL on various source systems including Oracle and Teradata.
• Strong understanding of FACETS and the Facets data model, working on data modeling and data extracts.
• Robust MS Access and Excel skills, including Power Pivot, Pivot Tables, Pivot Charts, Power View, Power BI, and VLOOKUP functions and formulas.
• Extensive experience as production support personnel on multiple data warehouse projects, working extensively with offshore teams.
• Experience in designing and populating dimensional models (star and snowflake schemas) for data warehouses and data marts.
• Extensive experience in designing, developing, and delivering business intelligence solutions using Power BI and Reporting Services (SSRS).
• Expertise in Relational Database Management Systems, Tableau, Looker, AWS cloud computing, Hadoop, and Spark.
• Used SSRS to create basic reports containing tables and graphs and more complex data visualizations, using charts, maps, and sparklines.
• Combined the data blending and advanced analytics of Alteryx Analytics with the rich visualizations of Tableau.
• Worked on connecting Tableau to other data sources such as Hive, AWS Redshift, Teradata, and Oracle, with experience establishing the respective driver connections.
• Performed maintenance activities like cleaning logs and archiving log files at prescribed intervals.
• Set up SSO between the Tableau portal and Oracle, and between Snowflake and Tableau.
• Used Alteryx and Tableau to let business users easily consume deep spatial, location-based, and predictive analytics.
TECHNICAL SKILLS:
Analysis Tools: MS Excel, Alteryx, SQL, SAP, Minitab, Pandas, JDA, JIRA, SSMS
Programming: Python, Visual Basic for Applications (VBA)
BI/Visualization: Tableau, Power BI, Seaborn, Plotly
EDI Standards: ANSI X12, EDIFACT, HIPAA, NCPDP, HL7.
Big Data: Hadoop, Spark
Cloud: AWS, Azure
Data Warehousing: Redshift, Snowflake
Data Orchestration: Apache Airflow
Databases: MS SQL Server, MySQL, Teradata, Oracle
PROFESSIONAL EXPERIENCE:
Client: UHG, Hartford, CT February 2024 to Present
Designation: Senior Data Analyst
Job Responsibilities:
• Developing Python scripts to automate internal projects, reducing manual effort by 99%.
• Extracting data from various sources to segregate data into the required format, utilizing SQL queries and Python scripting.
• Participated in Facets table data modeling and in planning, designing, and implementing the data warehouse; conducted testing by developing complex SQL queries.
• Performing as a Supply Chain Business Analyst in the healthcare industry, focusing on process improvements and coordinating across Supply Chain, Marketing, Finance, and Sales teams on the project.
• Creating dashboards in Tableau with calculated fields and parameters, and implementing real-time monitoring dashboards, resulting in a 15% increase in efficiency for Tableau Visualization users.
• Worked on EDI transactions in the claims management cycle based on HIPAA guidelines.
• Developed 276, 277, 270, 271, 835, and 837 transactions to HIPAA standards.
• Maintaining constant communication with the Finance and Marketing teams to update daily, weekly, or monthly data changes, leading to timely report generation and improved decision-making processes.
• Conducting performance tuning and capacity planning for the Snowflake data warehouse, ensuring it meets growing business demands and optimizing costs by approximately $60,000 annually.
• Developed comprehensive data mapping documents detailing source-to-target mappings for ETL processes and data integration projects.
• Worked on Facets Output generation, Interface development and Facets Migration Projects.
• Collecting and analyzing supply chain data, planning inventory, and calculating quotes.
• Created A/B and MVT (multivariate) testing roadmaps for IR 500 clients of Confidential to optimize websites for Confidential elements.
• Generating a comprehensive sales analysis using SQL queries, which involved joining multiple tables to track product performance, customer demographics, and sales trends, resulting in improved decision-making processes and a 30% increase in sales revenue.
• Good knowledge of SQL/PL-SQL, Python, and R statistical programming, as well as data visualization tools (Tableau, Smartsheet, MS Excel, and Power BI).
• Constructed tailored data narratives to communicate insights clearly to stakeholders.
• Worked on data flow diagrams, sequence diagrams, and business process models describing how the EDI 837 transaction set is used to submit billing and encounter information.
• Worked closely with Strategic Growth Managers, technical teams, and account managers, using A/B and MVT (multivariate) testing to address client objections.
• Experience in developing, implementing, and testing EDI ANSI X12, EcMap (EMS, ECS, MMS files), EDIFECS SpecBuilder, UNIX, and XML/XSLT/XSD applications.
• Contacted primary guarantors to verify delivery and acceptance of equipment under finance lease agreements before finalizing funding.
• Extensive involvement in data mapping using Facets data model.
• Designed metrics for A/B testing, creating dashboards in Tableau to monitor test processes, analyzing test results, and interpreting and delivering recommendations to stakeholders.
• Effectively implemented A/B testing to improve UI/UX and KPIs, increasing the revenue by 15%.
• Leading the development and implementation of PySpark data processing pipelines, resulting in improved data processing speed.
• Developing and maintaining ETL pipelines and handling large datasets using Databricks, leading to increased data processing efficiency.
• Conducted in-depth analysis of large datasets stored in Snowflake using SQL queries and analytical functions to extract meaningful insights and identify trends.
• Leveraged S3 for scalable storage and Redshift's analytical power to conduct comprehensive analysis, resulting in actionable insights that facilitated strategic decision-making and contributed to $90,000 increase in revenue.
• Analyzed trading partner specifications and created EDI mapping guidelines
• Conducted data validation and testing to verify the accuracy of data mappings and transformations.
• Utilizing Jira extensively for tracking project progress and assisting in data collection, cleaning, and organization using Jira and Confluence.
• Responsible for identifying data for government and national account contracts and equipment leases.
• Working on generating various Tableau dashboards using multiple data sources such as the AWS Redshift data warehouse (DWH), Smartsheet via web data connector, Oracle DB, text files, and Excel files.
• Developed metadata documentation using MS Word (printable) and Lucid Chart for online demo.
• Managing Hadoop cluster platforms for big data processing and analysis on Amazon EMR, optimizing costs by $30,000 annually.
• Created interactive dashboards using advanced Excel, QlikView, and SAP to provide complete end-to-end visibility of the supply chain process.
• Designed and implemented data models within Snowflake, including schema design, table creation, and optimization for efficient data storage and retrieval.
• Preparing technical specifications and documentation for Alteryx workflows supporting BI reports, ensuring clear communication, and understanding among team members, resulting in a 15% reduction in errors.
• Actively analyzing statistics from various reports extracted from Oracle, SAP HANA, and SQL Server resulting in improved insights and decision-making processes.
Environment: Python, Tableau, Apache Airflow, PySpark, SharePoint, Snowflake, Hadoop, Databricks, AWS, Informatica PowerCenter.
Client: Ford Motors, Dearborn, MI August 2022 - January 2024
Designation: Senior Data Analyst
Job Responsibilities:
• Identified gaps in existing data and worked with engineering and business teams to implement data tracking; built ML models and achieved accurate results.
• Utilized analytical applications to identify trends and relationships between different pieces of data, draw appropriate conclusions and translate analytical findings into risk management and marketing strategies that drive value. Created Python Scripts to automate pulling data from different Data sources for data analysis.
• Creating and modifying SQL queries to aggregate data.
• Extracting factors that lead to higher sales for individual stores based on past-sales data.
• Implemented Machine Learning models and application that predicts and recommends levels of safety-stock and top store based on the percentage chance of an order being canceled.
• Extensive use of Facets back-end tables and front-end application system for data validation purposes.
• Utilized ETL tools and SQL to map data fields between disparate systems, ensuring consistency and accuracy.
• Developed a customer analytics platform using Google Cloud Platform (GCP) and Azure, integrating data from various sources and automating workflows with Apache Airflow.
• Developed Docker images and containers for deploying TensorFlow and Keras models, enhancing portability and scalability.
• Reviewed equipment finance lease agreements, applications, and other supporting documents for accuracy and classification of applicants according to company guidelines.
• Developed custom dashboards in QlikView, Qlik Sense, Power BI and Tableau to monitor project progress, enhancing project management efficiency.
• Utilized GCP services such as BigQuery, Cloud Storage, and Dataflow to manage and process large datasets efficiently.
• Worked with FACETS Team for HIPAA Claims Validation and Verification Process.
• Conducted training sessions for team members on NoSQL, Hadoop, and SSIS technologies, promoting knowledge sharing.
• Employed SAS procedures (PROC SQL, PROC MEANS, PROC FREQ) to perform data manipulation and summary statistics.
• Utilized SAS to import, clean, and transform large datasets, ensuring data integrity and accuracy for subsequent analysis.
• Built various Tableau and Smartsheet reports/dashboards for executive-level review of key business drivers, including variable cost productivity, operational KPIs, and operational planning for labor and country warehouse usage.
• Collated data sources and built databases accessible to end users via SQL, based on their intent; performed data mining using state-of-the-art methods in SQL and MySQL.
• Developed and maintained ETL processes using Informatica PowerCenter to ensure the seamless integration of data from multiple sources into the data warehouse.
• Analyzed complex datasets stored in PostgreSQL, Cassandra, MongoDB, Redshift, Snowflake, and Google BigQuery to extract valuable insights.
• Utilized Python libraries including pandas, NumPy, SciPy, scikit-learn, Matplotlib, and Seaborn for data manipulation, statistical analysis, and visualization.
• Led cross-functional projects involving data migration and integration across AWS, Azure, and GCP, ensuring smooth transitions and minimal disruptions.
• Collaborated with DevOps teams to integrate Docker containers into CI/CD pipelines, improving deployment efficiency.
• Performed A/B testing of the recommender system to find the best-fit model, analyzing click-through rate and conversion rate using Google Analytics.
• Implemented A/B testing and UAT to achieve operational objectives, utilizing process mining in Celonis and managing the analytics process and recommendations for strategic plans and reviews.
• Leveraged PySpark for efficient data processing and analysis, handling terabytes of data daily.
• Managed and optimized relational databases in PostgreSQL and cloud-based data warehouses like Redshift and Snowflake to ensure data integrity and performance.
• Conducted training sessions for team members on Informatica, Collibra, and Docker technologies, promoting knowledge sharing.
• Performed in-depth data analysis using SQL and Python across AWS, Azure, and GCP to identify trends, patterns, and actionable insights.
• Designed and implemented ETL processes to integrate data from multiple sources into analytical databases and data warehouses.
• Developed and optimized ETL pipelines using Snowflake and PySpark to handle large-scale data from various sources, ensuring data quality and consistency
• Managed and maintained Cassandra and MongoDB databases to handle large volumes of unstructured data, enabling seamless integration with analytical workflows.
• Built time series and statistical models in Python to forecast inventory and procurement cycles.
Environment: Python, Django, Flask, C#, Linux, SAS, FastAPI, Hadoop, NoSQL, VMware vRealize Orchestrator, AWS Lambda, EC2, S3, RDS, Spark, Shell Scripting, Perl, SSIS, Informatica PowerCenter, TensorFlow, Keras, Collibra, Docker, cURL, Tableau, Power BI, Airflow, Google Cloud Platform (GCP), QlikView, Qlik Sense, ReactJS, Hive, PySpark, Snowflake, JavaScript, JSON, Agile, Azure, HTML, CSS, PostgreSQL, MongoDB, NumPy, Pandas, Matplotlib, and Git
Client: Pactiv LLC, Lake Forest, IL October 2021 - July 2022
Designation: Data Analyst
Job Responsibilities:
• Collaborated with cross-functional teams, including product owners and technical managers, to gather requirements and deliver actionable insights.
• Implemented a robust data warehouse system utilizing sequential files from diverse source systems such as SAP HANA, Oracle, and SQL Server, enhancing data integration efficiency by 30%.
• Leveraged Python for advanced data manipulation tasks, including data cleaning, transformation, merging, and handling large datasets, resulting in reduced data processing time.
• Identified, diagnosed, and resolved errors in EDI transactions, ensuring smooth and compliant data exchanges
• Designed and executed ETL jobs using DataStage, facilitating efficient loading of data warehouse and Data Mart systems.
• Worked on developing the business requirement and use cases for FACETS batch process, automating the billing entities and commission process.
• Utilized Snowflake's features such as clustering, partitioning, and materialized views to optimize query performance and enhance data warehouse scalability.
• Implemented SQL queries to analyze historical sales data, leading to a 12% improvement in forecasting accuracy.
• Engineered scalable data processing solutions using PySpark and Apache Spark, resulting in a 40% increase in data processing speed and scalability.
• Proficiently utilized Tableau and Snowflake, employing SQL for data extraction and analysis, leading to streamlined dashboard development and improvement in data visualization accuracy.
• Implemented end-to-end data pipelines using Apache Airflow, automating ETL tasks and ensuring data integrity throughout the pipeline, resulting in enhanced operational efficiency and cost savings of $75,000 annually.
• Provided documentation and training to end-users on data mapping processes and tools.
• Collaborated with ETL developers to design and optimize data integration pipelines using Snowflake's built-in ETL capabilities or third-party tools like Apache Airflow or Talend.
• Developed Tableau data visualizations incorporating various chart types, including heat maps, scatter plots, geographic maps, pie and bar charts, symbol maps, horizontal bars, and histograms, enhancing data presentation and interpretation for stakeholders.
• Demonstrated extensive Data Warehousing expertise using Informatica as an ETL tool across multiple databases, optimizing data integration processes and reducing errors.
• Led the migration of legacy applications to AWS clouds and SaaS solutions, enhancing scalability and reducing infrastructure costs by 25%.
• Stayed updated with industry best practices and emerging technologies related to data mapping and integration.
• Utilized Alteryx for data blending, creating ETL workflows, and integrating seamlessly with Tableau, resulting in improved data integration efficiency and operational agility.
Environment: Python, Tableau, SQL Server, Snowflake, PySpark, Spark SQL, Azure, Databricks, AWS, Oracle, Teradata, MS Access.
Client: Amazon.in, Hyderabad, Telangana, India January 2020 – September 2021
Designation: Data Analyst
Job Responsibilities:
• Designed, developed, and tested various Tableau visualizations for the dashboard and ad-hoc reports.
• Worked on Git on a high level to merge the changes & push back to master, worked on Microsoft Azure CI/CD process for deployment of code.
• Involved in Designing the specifications, Design Documents, Data modeling, and data warehouse design.
• Developed and maintained ETL workflows to extract data from various source systems, transform it according to business requirements, and load it into Snowflake data warehouses.
• Implemented Azure data solutions for seamless data integration, analysis, and visualization and driving actionable insights.
• Developed Python scripts to extract data from various systems into the data warehousing system, facilitating streamlined data consolidation and storage, resulting in a cost saving of $30,000 annually.
• Mapped EDI transaction data to internal data models, ensuring seamless integration with enterprise systems and applications
• Developed detailed data mapping specifications documenting source-to-target mappings for ETL processes and data integration projects.
• Implemented data quality checks and validation rules within Snowflake to ensure data accuracy, completeness, and consistency across the data warehouse.
• Implemented version control and change management processes to track and manage data mapping revisions.
• Extracted, transformed, and loaded (ETL) data from spreadsheets, database tables, and other sources using Microsoft SSIS and Informatica.
• Conducted meetings with business stakeholders, clients, and director-level members for requirement gatherings and presented various solutions.
• Developed 4+ sales dashboards in Tableau, enhancing user experience and interactivity, resulting in a 15% increase in data-driven decision-making.
• Created and tested various SQL stored procedures, views, and triggers during data preparation stages.
• Monitored, scheduled, extracted, and debugged issues with data extracts, ensuring data reliability and consistency for reporting and analysis.
Environment: Tableau, SQL server, Azure, Hive, SSIS, Python, Hadoop, Spark, Spark SQL.
Client: Wells Fargo, Hyderabad, Telangana, India January 2018 - December 2019
Designation: Data Analyst
Job Responsibilities:
• Responsible for building scalable distributed data solution using Hadoop Cluster environment with Hortonworks distribution.
• Used SSIS ETL for data transformation and data integration; loaded data into the data warehouse and performed data mining and data cleansing using SSIS, extracting data from flat files, XML files, and SQL databases.
• Designed and implemented Power BI reports and dashboards to visualize key performance indicators, facilitating data-driven decision-making and resulting in a 20% increase in operational efficiency.
• Utilized Python libraries like Pandas, NumPy, and Matplotlib for statistical analysis and visualization and enhanced decision-making processes.
• Designed and developed Power BI reports and dashboards to visualize key performance indicators, resulting in improved decision-making.
• Evaluated and recommended data mapping tools and technologies to improve efficiency and effectiveness in data integration projects.
• Worked on normalization and de-normalization techniques for optimum performance in relational and dimensional database environments.
• Implemented MapReduce jobs in Hive for efficient data processing, resulting in a 25% reduction in data processing time and improved scalability.
• Utilized Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, and DataFrames.
• Developed custom ETL workflows using Spark/Hive to streamline data cleaning and mapping processes, improving data quality and integrity.
Environment: Python, PySpark, Hadoop, Spark, Spark SQL, Power BI, Hive, Azure, Databricks, SSIS.
Client: TEKsystems Inc, Hyderabad, Telangana, India July 2015 - December 2017
Designation: Software Engineer
Job Responsibilities:
• Designed and developed Power BI visualizations tailored to business requirements, resulting in a 20% increase in data interpretation efficiency and $20,000 cost savings annually.
• Created SQL queries to optimize database interaction, ensuring alignment with end application requirements and enhancing data retrieval efficiency by 25%.
• Performed and automated SQL Server version upgrades, path installs and maintained relational databases.
• Implemented SSIS-based ETL workflows resulting in a 30% reduction in data processing time for comprehensive data analytics.
• Used MS Excel to organize large amounts of data into tables and spreadsheets, making it easier to clean and analyze the data.
• Used Excel's collaboration features to let multiple users access and work on the same data, making it an effective tool for team collaboration and communication.
• Involved in providing maintenance and development of bug fixes for the existing and new Power BI reports.
• Identified the products with high risk of loss and updated inventory using MS Excel and Access.
• Designed dashboards utilizing custom filters and DAX expressions with Power BI.
• Presented data through dashboards and scorecards with different visuals such as funnel charts, donut charts, and scatter plots in Power BI.
• Performed source data analysis and captured metadata, reviewed results with business. Corrected data anomalies as per business recommendation.
Environment: Power BI, DAX, SQL Server, Oracle, SSIS, Python, MS Excel.
EDUCATION
Bachelor of Technology in Computer Science Engineering at Guru Nanak Dev Engineering College, Gujarat, India.