Meghashyam
Cell: +1-804-***-**** Email id: ******************@*****.***
ALL ABOUT ME:
An experienced Data Analyst proficient in data visualization, analysis and manipulation, and data warehousing and modeling. Skilled in a variety of analytical tools, cloud platforms, and operating systems. Recognized for creating insightful visualizations and excelling in project management, scripting, and automation. A collaborative team player, known for working effectively with peers, senior management, and stakeholders to devise and implement the most suitable solutions.
PROFESSIONAL SUMMARY
Highly analytical and process-oriented data analyst with over 9 years of experience in data analysis and data management, with a proven ability to work efficiently both independently and in team environments. Experience working across the Healthcare, Financial, and Energy domains.
Strong background in creating visualizations, interactive dashboards, reports, and data stories using Tableau and Power BI.
Proficient in troubleshooting SQL queries, ETL jobs, and data warehouse/data mart/data store models.
Extensive knowledge and experience in Agile and Waterfall Methodologies of Software Development Life Cycle.
Experienced in data modeling, with expertise in creating Star and Snowflake schemas, fact and dimension tables, and physical and logical data models.
Extensively experienced in end-to-end data analysis, encompassing data cleaning, data manipulation, database (Oracle) testing, and control development in R and Python.
Well-versed in working with Python IDEs such as PyCharm and Google Colaboratory.
Hands-on experience creating tables and applying normalization techniques on database tables using T-SQL and SQL Server.
Experience in SAS, SQL, SSRS, Tableau, Python, and MS Excel (VLOOKUP, Pivot charts, Macros).
Expertise in data manipulation using SAS procedures and constructs such as TABULATE, UNIVARIATE, APPEND, arrays, DO loops, macros, and MERGE.
Proficient in data mining languages like base SAS and SAS SQL.
Good knowledge of various packages in R and Python such as ggplot2, pandas, NumPy, Seaborn, SciPy, and Matplotlib.
Performed text analytics, generated data visualizations using R and Python, and created dashboards using tools such as Tableau and Power BI.
Knowledge of using R for predictive modeling and data visualization, with MySQL databases and text files as sources.
Proficient in all phases of data mining, including data collection and data cleaning; strong critical thinking, communication, presentation, and problem-solving skills.
Experience in various phases of the Software Development life cycle (Analysis, Requirements gathering, Designing) with expertise in documenting various requirement specifications.
Worked with Relational Database Design, Data Warehouse/OLAP concepts and methodologies.
Experienced in creating technical metadata for the underlying physical data model of the Source systems as well as the Data warehouse.
Implemented optimization techniques for better performance on both the ETL and database sides. Experience developing statistical and text analytics solutions to various business problems and generating data visualizations and reports using Python and Tableau.
Good knowledge of data visualization using Tableau and Power BI, including publishing and presenting dashboards.
Worked on several Python packages like NumPy, pandas, and matplotlib.
Capable of using Python for predictive modeling and data visualization, with MS SQL Server databases and text files as sources.
Knowledge of Excel functions and of performing analysis using Pivot charts, VLOOKUP, and HLOOKUP.
Hands-on experience with joins, functions, and writing complex queries in SQL Server.
Worked with tracking systems such as JIRA and version control tools such as Git.
TECHNICAL SKILLS
Statistical Analysis and Machine Learning
Scikit-learn, TensorFlow, Pandas, NumPy, Dataiku
Statistical analysis tools
Excel with add-ins, SciPy, Scikit-learn, R, SAS, Stata, SPSS, JMP, GA4, and Minitab
Cleaning and preparing data tools
Excel, SQL, OpenRefine (formerly Google Refine), Trifacta Wrangler, Data Wrangler, pandas, Apache Spark
Visualization Tools
Tableau, Power BI (DAX), QlikView, Looker, Google Data Studio
Presentation Software
Microsoft PowerPoint, Google Slides, Keynote.
Specialized Tools
Data validation tools such as Melissa Data and Precise Data; fuzzy matching tools such as FuzzyWuzzy and string matching in Python; CRM systems
Cloud
AWS (EC2, S3, RDS, Redshift, EMR), Google Cloud (BigQuery, Kubernetes), Azure
Databases
MySQL, PostgreSQL, MS SQL Server, AWS (RDS, Redshift, Redis), Google BigQuery
Data modeling Tools
Erwin, Power Designer
Project Management
Microsoft Project, Jira, Asana
Methodologies
SDLC, Agile methodologies, Ralph Kimball dimensional modeling, APIs, ETL and Data Warehousing; analytical techniques such as clustering and linear regression
Reporting packages
Microsoft SSRS, SAP Crystal Reports, CRM.
SOFT SKILLS:
Familiarity with Agile practices, Scrum, and Kanban.
Strong verbal and written communication skills.
Excellent interpersonal and collaboration skills.
Customer service-oriented with a sense of ownership.
Strong work ethic, time management skills, and positive attitude.
Ability to work independently or as part of a team.
Effective coordination and multitasking abilities.
Capability to leverage trusted sources for current automation and DA practices.
Experience managing vendor relationships and leading projects.
Work Experience
Sr Data Analyst/Power BI Developer
UBS Nov 2023 – Present
Responsibilities:
Designed and deployed robust ETL pipelines using Python, Spark, and SQL to ingest and transform high-volume transactional data from core banking systems into a centralized data lake, improving reporting latency by 45%.
Conducted thorough analysis of Business Requirements Specification Documents and Source to Target Mapping Documents to identify test requirements, ensuring alignment with key performance indicators (KPIs) such as data accuracy and completeness.
Developed SQL queries to extract data from disparate databases, aligning with KPIs related to data retrieval efficiency and accuracy.
Implemented data quality frameworks and automated validation rules to ensure accuracy and integrity of regulatory datasets used for Basel III, AML, and KYC compliance reporting.
Developed Python-based data analysis scripts to automate reconciliation of daily transaction data across multiple banking systems, reducing manual processing time by 60% and improving accuracy in financial reporting.
Participated in all phases of data preparation, achieving a 95% reduction in data errors through comprehensive data cleaning and validation processes.
Implemented end-to-end systems for Data Analytics and Data Automation, resulting in a 90% reduction in manual data processing time and improved data accuracy.
Migrated legacy on-prem data workflows to multi-cloud-based architecture (AWS, Azure, GCP + Snowflake), improving scalability and enabling real-time analytics for the credit risk and finance teams.
Integrated external data sources into Splunk via APIs and scripted imports using Python.
Gathered data from multiple sources and created datasets for analysis, resulting in a 90% increase in data availability and accessibility.
Built and maintained Splunk dashboards for IT performance, network latency, and infrastructure utilization.
Designed and implemented scalable data pipelines using Snowflake to support regulatory and financial reporting across global banking divisions, improving data accessibility by 40%.
Optimized performance of large-scale data processing jobs by tuning Spark configurations and partitioning strategies, resulting in 60% faster data load times for daily batch jobs supporting treasury and liquidity reporting.
Utilized Python libraries (pandas, NumPy, matplotlib, SQLAlchemy) to analyze customer behavior and credit risk trends, providing actionable insights that supported regulatory compliance and strategic decision-making for the bank’s retail division.
Developed predictive models on large-scale datasets, achieving an 85% accuracy rate in predicting business outcomes through advanced statistical modeling and machine learning techniques.
Developed Python scripts to match data with the Azure Cloud Search database, resulting in an 80% increase in data classification accuracy.
Integrated Snowflake with BI tools like Tableau and Power BI, enabling senior leadership to make data-driven decisions on trading risk and portfolio performance.
Conducted rigorous testing of data pipelines and analytical workflows, employing techniques such as unit testing, integration testing, and regression testing, while leveraging tools such as Pytest and Selenium for automated testing and continuous integration/continuous deployment (CI/CD) pipelines.
Developed complex Snowflake SQL scripts and stored procedures to support risk and compliance analytics, ensuring full alignment with FINMA and MiFID II regulations.
Architected scalable data processing solutions using distributed computing frameworks such as Apache Spark and Hadoop, while leveraging cloud-native services such as AWS Glue and Google Dataflow for serverless data processing and orchestration.
Implemented role-based access controls (RBAC) and data masking in Snowflake to protect sensitive client and trading data, adhering to internal and external audit standards.
Drafted comprehensive Business Requirement Documents (BRDs) and Functional Requirement Specifications (FRS), employing techniques such as user stories and use case diagrams to capture business needs and system requirements, while aligning with principles of requirements engineering and traceability.
Generated advanced analytical reports and dashboards using SAS and R, incorporating key performance indicators (KPIs) and metrics such as Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) to measure business performance and drive data-driven decision-making processes.
Sr. Data Analyst
Department of Health and Human Services, State of Iowa Dec 2022 – Nov 2023
Responsibilities:
Designed and deployed data pipelines to feed machine learning models for public health prediction tasks (e.g., disease outbreak forecasting, hospital readmission risk), enabling evidence-based policy interventions across HHS.
Spearheaded daily operations of data analytics, reporting, and epidemiological teams within the Department of Health and Human Services, State of Iowa, utilizing Microsoft Excel, Tableau, and SAS for data management and analysis.
Engineered scalable data infrastructure to support AI/ML workloads using cloud platforms (e.g., AWS/GCP) and tools like Spark, Snowflake, and Sagemaker, accelerating time-to-insight for health research teams.
Directed cross-functional efforts to align work activities with agency and division goals in behavioral health, employing project management software like Asana and Trello for project planning and organization.
Coordinated collaborative initiatives across departments using communication tools such as Microsoft Teams and Zoom to enhance stakeholder engagement and streamline workflow.
Designed and deployed scalable Snowflake data warehouses to centralize public health data from CMS, EHRs, and CDC datasets, improving query performance and interagency reporting efficiency by 50%.
Collaborated with data scientists to productionize AI models, implementing MLOps practices and building feature stores for reuse across HHS agencies, improving model accuracy and consistency across deployments.
Optimized workflow and resource allocation using project management software like Microsoft Project, enhancing productivity and project efficiency.
Assisted in resource allocation and evaluation procedures using financial and technological tools like QuickBooks and SAP, ensuring efficient use of resources.
Implemented evaluation methodologies using data analysis software like R and Python, conducting thorough assessments of project outcomes and effectiveness.
Implemented secure data sharing and role-based access controls in Snowflake to ensure HIPAA-compliant data access for cross-functional teams, supporting secure collaboration across HHS sub-agencies.
Provided technical oversight for surveillance and investigations using specialized epidemiological software such as Epi Info and SPSS, ensuring adherence to industry standards.
Automated data ingestion pipelines into Snowflake using Snowpipe and external stages (S3/Azure Blob), reducing manual data loading time by 70% and enabling near real-time health surveillance dashboards.
Reviewed performance data with Excel and Google Sheets, identifying trends and patterns to inform strategic decision-making in programmatic initiatives.
Analyzed program-level surveillance data using dedicated software like CDC WONDER and ESSENCE, guiding resource allocation and strategic planning efforts.
Served as the lead point of contact for data reporting and analytics, utilizing dashboarding tools like Tableau and QlikView to deliver actionable insights to leadership.
Utilized project management software like JIRA and Smartsheet to coordinate and track project progress, ensuring timely completion and delivery of projects.
Fostered collaboration and communication among team members through platforms like Slack and Microsoft Teams, driving efficient workflow and problem-solving within the organization.
Utilized Excel, Tableau, and ETL (Extract, Transform, Load) software to convert raw data into meaningful insights at the Department of Health and Human Services, State of Iowa.
Employed Excel for initial data manipulation and cleaning, ensuring accuracy and consistency in the datasets.
Utilized ETL software such as Informatica or Talend to extract data from multiple sources, transform it into a standardized format, and load it into a centralized database for analysis.
Real-time projects include:
Healthcare Access Analysis: Used Excel to clean and organize data on healthcare facilities and patient demographics. Employed ETL software to integrate data from various sources, including hospital records and census data. Analyzed the data in Tableau to identify areas with limited healthcare access and proposed strategies for improvement.
Disease Surveillance Dashboard: Extracted data from public health databases using ETL tools. Transformed and aggregated the data to create a comprehensive dataset of disease incidence and prevalence. Developed interactive dashboards in Tableau to monitor disease trends in real time and provide actionable insights for public health interventions.
Healthcare Facility Performance Evaluation: Compiled data on healthcare facility performance metrics from disparate sources using ETL software. Standardized and cleansed the data in Excel to ensure accuracy. Created visualizations in Tableau to compare and evaluate the performance of healthcare facilities across different regions, enabling informed decision-making by healthcare administrators.
Community Health Assessment: Integrated data from community surveys, environmental reports, and health records using ETL tools. Processed and analyzed the data in Excel to identify health disparities and environmental risk factors. Visualized the findings in Tableau to facilitate community engagement and prioritize health initiatives based on community needs.
Epidemiological Analysis: Extracted and transformed epidemiological data from public health databases using ETL processes. Cleaned and prepared the data in Excel for statistical analysis. Utilized Tableau to visualize disease trends over time, identify geographic hotspots, and assess the effectiveness of public health interventions.
Generated graphs and reports using the ggplot2 package in RStudio for analytical models.
Generated detailed reports after validating the graphs in R.
Reviewed and modified SAS programs to create customized ad-hoc reports and processed data for publishing business reports.
Used Python and SQL to manipulate data, construct data models, and validate them.
Assisted with automating reporting functionality using Power BI.
Used Python to graphically critique datasets and obtain insights.
Experience using SDLC methodologies such as Agile and Waterfall.
Performed data cleansing and preparation by slicing and dicing data using SQL and Excel.
Created database objects (tables, views, procedures) in MySQL to provide definition and structure and to maintain data efficiently.
Redesigned data model through iterations that improved predictions by 12%.
Performed data analysis and data profiling using complex SQL queries on various source systems including SQL Server.
Created Pivot tables in Excel to analyze data across several dimensions.
Amazon, Hyderabad, India.
Data Analyst/Tableau Developer Aug 2019 – June 2022
Responsibilities:
Developed AI-powered dashboards for automated financial risk assessments and predictive analytics using Tableau and Microsoft Copilot.
Developed and managed Snowflake data pipelines to ingest and process large-scale e-commerce data (orders, inventory, customer behavior) from Amazon Marketplace and Seller Central APIs, improving reporting accuracy and operational efficiency.
Decreased ETL processing time by 40% by optimizing Python scripts and SQL queries with GitHub Copilot, Apache Spark, and Snowflake.
Built AI-driven decision systems, anomaly detection, and predictive analytics for financial risk management using hyperparameter-tuned machine learning models in TensorFlow, PyTorch, and Scikit-Learn.
Integrated Snowflake with AWS services (S3, Lambda, Glue) to automate data loading and transformation workflows, reducing data latency and enabling near real-time sales and inventory dashboards for the Amazon storefront.
Improved data ingestion and transformation efficiency by building serverless data pipelines with AWS Lambda, Glue, and Athena.
Utilized Hadoop for processing large volumes of financial data, enabling scalable risk analysis.
Optimized Hadoop MapReduce jobs to improve financial reporting processing times by 35%.
Optimized Snowflake queries and schema design for high-performance analytics on Amazon transactional data, supporting demand forecasting, pricing optimization, and customer segmentation strategies.
Integrated Teradata with Snowflake for efficient storage and access of large financial risk datasets.
Leveraged Teradata to analyze large-scale financial data, improving decision-making and reporting efficiency.
Optimized SQL queries in Teradata, reducing data retrieval time by 40% for financial reporting.
Optimized Tableau Extracts, Hyper Engine queries, and Live Connections, cutting dashboard refresh time by 50%.
Reduced data latency in reporting by designing and implementing real-time analytics and fraud-monitoring solutions with Kafka, Apache Flink, and Tableau Hyper Engine queries.
Implemented Tableau Prep and Looker Explores for ad hoc reporting while spearheading self-service analytics projects.
Increased fraud detection accuracy by 30% by integrating automated anomaly detection models (Scikit-Learn, TensorFlow, PyTorch) with Tableau's advanced calculations, and developed Tableau dashboards to visualize fraud detection insights.
Implemented Tableau row-level security (RLS) and user access controls to protect data confidentiality and ensure compliance with financial data regulations.
Developed AI-powered Tableau dashboards for financial risk assessments, integrating predictive models to enhance decision-making.
Skilled in using serverless computing (AWS Lambda, Azure Functions) and distributed processing (Apache Spark, Kafka Streams, and Flink) to design highly optimized ETL workflows.
Enabled seamless data warehousing and reporting by integrating Azure Synapse Analytics with Snowflake.
Streamlined data transfers between cloud platforms and internal applications by developing RESTful API integrations.
Developed machine learning models for fraud detection, increasing financial crime detection accuracy by 30%.
Created real-time dashboards for monitoring financial crime, integrating transaction data and external watchlists.
Developed sophisticated computations with parameterized dashboards and LOD expressions for dynamic user inputs.
Developed serverless functions using AWS Lambda and Azure Functions to automate data operations and lower operating expenses.
Strong practical knowledge of data lakehouse architectures using Apache Iceberg, Delta Lake, and Hudi, supporting schema evolution and ACID-compliant data transformations.
Integrated Tableau with Snowflake and AWS Redshift to provide fast data visualization for large datasets.
Created a real-time financial KPI dashboard that combines information from AWS S3, Snowflake, and SQL Server.
Led the effort to improve the performance of SQL queries for large datasets, resulting in 60% faster report creation.
Cognizant Technologies, Hyderabad, India.
Data Analyst Jan 2017 – July 2019
Responsibilities:
Designed and optimized Azure Data Factory pipelines to automate data ingestion from PostgreSQL, MongoDB, and SQL Data Warehouse.
Built a multi-cloud data lake solution integrating AWS S3, and Azure Data Lake for seamless data exchange.
Enhanced strategic decision-making by developing ML-powered business insights reports using Databricks, PySpark, and Power BI.
Streamlined deployments by implementing CI/CD pipelines for data workflows with Jenkins, GitHub Actions, and Kubernetes.
Created Tableau geospatial analytics solutions by combining geographic clustering methods (DBSCAN, H3 Indexing) with ArcGIS APIs.
Developed self-service analytics solutions using Python Dash, DAX, and MDX to provide automated insights for business users.
Developed custom Tableau extract routines using the Hyper API, greatly decreasing memory consumption for large-scale BI deployments.
Built predictive models for financial crime detection using machine learning algorithms like XGBoost and AutoML.
Designed geospatial analytics dashboards in Tableau, utilizing geographic clustering (DBSCAN, H3 Indexing) to optimize sales strategies.
Integrated ArcGIS APIs with Tableau for advanced location-based insights, enhancing regional performance tracking.
Improved customer segmentation analysis by leveraging Tableau’s AI-driven visualizations combined with Databricks ML models.
Designed and deployed real-time analytics dashboards by integrating Tableau with Snowflake and AWS Redshift, improving reporting efficiency.
Automated data pipelines using AWS Glue and Azure Data Factory, ensuring seamless data flow for interactive Tableau dashboards.
Enhanced operational efficiency by implementing self-service analytics solutions, reducing manual reporting efforts by 40%.
Built interactive Tableau dashboards leveraging ML-driven forecasting models to improve sales and revenue projections by 18%.
Created dynamic financial KPI dashboards integrating data from AWS S3, Snowflake, and SQL Server for executive decision-making.
Implemented advanced LOD expressions and parameterized calculations in Tableau to enable customizable forecasting models.
Integrated multiple data sources into a centralized system for efficient financial crime monitoring.
Enhanced Teradata queries to speed up data processing by 30% for business intelligence.
Created real-time streaming analytics dashboards using Looker, Kafka, and Azure Stream Analytics.
Built scalable data pipelines with Hadoop to process financial data for real-time reporting and analytics.
Integrated Hadoop with cloud platforms to enhance financial data access and processing.
Automated data quality checks with PyTest and Great Expectations to ensure data fidelity across platforms.
Integrated Azure Synapse Analytics for multi-cloud analytics tasks.
Increased operational efficiency by using Tableau Server to automate report scheduling and distribution.
Improved customer retention strategy by leading predictive modeling projects using XGBoost and AutoML.
Optimized Tableau live connections by using partition pruning and columnar storage strategies (Parquet, ORC) in SQL Server, Snowflake, and Redshift.
Reduced query execution time by 50% through SQL indexing, partitioning, and optimization algorithms.
Swift Safe, Hyderabad, India.
Junior Data Analyst Sep 2015 - Dec 2016
Responsibilities:
Created SQL-based risk analysis reports using PostgreSQL, NoSQL, and Cassandra for credit risk evaluation.
Developed dynamic Power BI dashboards for portfolio analytics and real-time credit risk monitoring.
Automated data quality validation using Python scripting and Alteryx workflows, reducing errors by 20%.
Built self-service BI solutions to automate data reporting with Power Query and DAX.
Designed real-time data reconciliation procedures using Spark Streaming and Kafka.
Implemented data warehousing best practices using Fact-Dimension modeling and Star Schema.
Assisted in migrating data from on-premises databases to Azure Synapse Analytics.
Developed machine learning classification models for consumer credit scoring using XGBoost and Scikit-Learn.
Optimized pipelines for financial data analytics through advanced SQL tuning techniques.
Built automated data pipelines with Python, PySpark, and Airflow in collaboration with software engineers.
Education Details:
Master's in Computer Science from Wilmington University, Dec 2023