
Data Analyst

Location:
Little Rock, AR
Salary:
$90,000
Posted:
April 25, 2025

Resume:

Senior Data Analyst

Mohammad Zeeshan

Email: *********************@*****.***

Phone: +1-501-***-****

PROFESSIONAL SUMMARY:

Around 7 years of industry experience focused on big data analysis, data mining, statistical inference, A/B testing, machine learning, data visualization, and ETL data pipelines.

Strong analytical skills and the ability to work across diverse tools such as RStudio and Power BI.

Experience in using various packages in R and Python, including ggplot2, dplyr, RWeka, gmodels, rjson, plyr, pandas, NumPy, seaborn, SciPy, Matplotlib, and scikit-learn.

Extensive experience in text analytics, generating data visualizations using R and Python, and creating dashboards using BI tools.

Leveraged Azure services such as Azure SQL Database, Azure Data Factory, and Azure Databricks to design and implement robust data processing pipelines for efficient data analysis.

Experience in configuring reports from different data sources using data blending.

Experience in using variables, expressions, and functions, and in defining queries for generating reports in RStudio.

Expertise in R programming language for statistical analysis, data manipulation, and visualization using packages like dplyr, ggplot2, and tidyr.

Leveraged PySpark for large-scale data processing, data wrangling, and building scalable data pipelines, enhancing the efficiency of data workflows in big data environments.
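
For illustration, a minimal PySpark sketch of this kind of wrangling; the input path, column names, and aggregation below are hypothetical placeholders, not drawn from any actual project.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Start a Spark session (cluster configuration omitted for brevity)
    spark = SparkSession.builder.appName("wrangling-sketch").getOrCreate()

    # Hypothetical input: a CSV of raw transaction records
    df = spark.read.csv("transactions.csv", header=True, inferSchema=True)

    # Typical wrangling steps: filter bad rows, derive a column, aggregate
    cleaned = (
        df.filter(F.col("amount") > 0)
          .withColumn("year", F.year("txn_date"))
          .groupBy("customer_id", "year")
          .agg(F.sum("amount").alias("total_spend"))
    )

    cleaned.write.mode("overwrite").parquet("output/total_spend")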

Proficient in writing complex DAX (Data Analysis Expressions) formulas in Power BI for data analysis, calculations, and generating insights.

Developed and maintained data models in Power BI using DAX functions to create relationships, calculated columns, measures, and KPIs, ensuring accurate and efficient data analysis.

Proficient in developing and maintaining data warehouses on Azure, ensuring optimal performance and scalability for analytical queries.

Experienced in using AWS Glue for ETL (Extract, Transform, Load) processes, automating data preparation and integration tasks, and facilitating seamless data analysis.
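
As a sketch of what driving such a Glue job from Python can look like with boto3 (the job name and argument are hypothetical, and the job is assumed to already exist in Glue):

    import boto3

    glue = boto3.client("glue", region_name="us-east-1")

    # Kick off an existing Glue ETL job (name and argument are placeholders)
    run = glue.start_job_run(
        JobName="daily-orders-etl",
        Arguments={"--input_date": "2025-04-25"},
    )

    # Check the run state (production code would poll with backoff)
    status = glue.get_job_run(JobName="daily-orders-etl", RunId=run["JobRunId"])
    print(status["JobRun"]["JobRunState"])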

Knowledgeable in utilizing Tableau Server for sharing and collaborating on Tableau workbooks.

Proficient in connecting Power BI to various data sources, including SQL databases, Excel, and cloud-based platforms.

Proficient in integrating AWS services with popular data analysis tools like Apache Spark and Jupyter Notebooks, enhancing the flexibility and capabilities of data analytics workflows.

Applied Azure Machine Learning services to build predictive models and enhance data analysis capabilities, contributing to data-driven decision-making.

Familiarity with QlikView Server for distribution and collaboration on QlikView documents.

Proficient in using Base SAS, SAS/STAT, and SAS/SQL for data processing and analysis.

Proficient in working with the Hadoop ecosystem, including HDFS, MapReduce, and Hive.

Advanced proficiency in Microsoft Excel for data analysis, modeling, and visualization.

Familiarity with Jupyter Hub for managing collaborative Jupyter environments.

Proficient in implementing data governance policies and access controls using AWS Lake Formation, ensuring compliance and security in the management of data lakes.

Hands-on experience with MySQL replication and clustering for high availability and fault tolerance. Extensive expertise in designing and implementing Oracle databases.

Skilled in creating custom reports, dashboards, and goals within Google Analytics.

Demonstrated expertise in using Azure Data Factory for Extract, Transform, Load (ETL) processes, ensuring the seamless flow of data from source to destination.

Committed to continuous learning, staying updated on the latest features and techniques in IBM SPSS.

Proficiency in Power BI advanced analysis features such as Calculations, Parameters, Trend Lines, and Forecasting.

Proficient in leveraging AWS Data Exchange for securely sharing and collaborating on datasets with external partners.

Skilled in using AWS Glue Data Catalog for metadata management, facilitating the discovery and exploration of datasets across the organization.

Expert in Bill Inmon and Ralph Kimball methodologies for database architecture.

Proficient in logical and physical database design, and skilled in the complete data warehouse lifecycle, OLAP, and OLTP.

Well-versed in SDLC phases, adept at creating and maintaining various documentation.

Extensive knowledge of Big Data technologies (Hadoop, Spark, Hive, Pig) and Tableau.

Published Tableau workbooks, enabled data exploration, and collaborated with diverse teams for requirements analysis and technical solutions.

Expertise in Power BI reporting tools, dashboard development, and server administration.

Good understanding of Relational database design, data warehouse concepts, and methodologies.

Proficient in PL/SQL development, including stored procedures, triggers, and packages, to implement business logic within Oracle databases.

TECHNICAL SKILLS:

Languages

R, Python, SQL, PL/SQL

Data Analysis & Visualization Tools

R Studio, Power BI, Tableau, Excel, Jupyter Notebooks

Big Data Technologies

PySpark, Hadoop, Spark, Hive

Clouds

Azure, AWS

Database Management Systems

MySQL, Oracle, PostgreSQL, SQL Server

ETL Tools

AWS Glue, Informatica, Azure Data Factory, Snowflake, PySpark

Web Analytics

Google Analytics

Data Governance & Security

AWS Lake Formation, Azure Data Catalog, Data Encryption

Statistical Analysis & Modeling

Regression Analysis, Clustering, Multivariate Analysis, Factor Analysis

Data Warehousing Methodologies

Inmon and Kimball methodologies

BI Administration & Server Management

Power BI Server Administration, Tableau Server Management

Collaboration & Version Control

Jupyter Hub, Version Control Systems

PROFESSIONAL EXPERIENCE:

CDC, Georgia, USA Oct 2023 – Current

Data Analyst III

Responsibilities:

Worked on architecture design to migrate current ETL jobs to the Cloud using AWS Redshift.

Worked extensively with calculations, parameters, trend lines, and statistics to create detail-level summary reports of customer data.

Developed business logic in the semantic layer by creating views in AWS Redshift to provide visibility into transformation logic.

Ensured data security and compliance with industry standards by configuring access controls, encryption, and audit logging in Brickbucket.

Employed PySpark for transforming and processing large datasets, ensuring efficient data workflows and scalable data solutions.

Designed and maintained Oracle DBMS solutions for managing over 10 million customer records with high data integrity and availability.

Integrated PySpark with AWS services like AWS Glue and S3 for seamless data processing and storage solutions, automating data pipelines and reducing manual intervention.

Utilized AWS Glue for ETL processes, ensuring clean and transformed data.

Managed Microsoft Access databases for internal operations, including building queries and reports for department use.

Created custom dashboards using Power BI to visualize Hadoop-based data sources in real time.

Designed and executed ETL processes using PL/SQL to integrate data from multiple internal systems.

Created custom visualizations and charts using ggplot2 to communicate findings effectively.

Collaborated with Portfolio Managers to develop predictive models for asset rebalancing, reducing volatility by 12%.

Created user-friendly Access forms and automated reports for internal stakeholders using VBA modules.

Collaborated with business users to gather requirements and design QlikView data models.

Developed SAS programs for data cleaning, validation, and standardization.

Orchestrated data pipelines with AWS Step Functions for automation. Collaborated with data engineering teams to design and implement Hadoop data pipelines. Created interactive and reproducible data analysis reports in Jupyter Notebooks.
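
A minimal sketch of triggering such a Step Functions pipeline from Python with boto3; the state machine ARN and input payload are placeholders.

    import json
    import boto3

    sfn = boto3.client("stepfunctions", region_name="us-east-1")

    # Start one execution of an existing state machine (ARN is hypothetical)
    response = sfn.start_execution(
        stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:etl-pipeline",
        input=json.dumps({"run_date": "2025-04-25"}),
    )
    print(response["executionArn"])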

Analyzed sales and marketing data using Google Analytics. Developed serverless data processing functions using AWS Lambda. Utilized AWS Batch for batch processing of large datasets.

Designed data marts using dimensional data modeling with star and snowflake schemas.

Collaborated with ETL/Informatica teams, performed data analysis, and developed SSIS packages.

Collaborated with cross-functional teams to gather requirements and translate them into Oracle-based data models and dashboards.

Configured and managed Oracle Real Application Clusters (RAC) for high availability and scalability. Analyzed the customer database using MySQL.

Used various SQL reporting skills to create SQL views and write SQL queries.

Integrated AWS SageMaker for machine learning model deployment. Leveraged AWS Glue for integrating machine learning into ETL processes.

Collected data and business requirements from end users and management.

Conducted performance tuning of PL/SQL scripts and optimized Oracle database queries to reduce processing time.

Developed customer profiles and reports using Python and Power BI, presenting analytical results and their strategic implications to senior leadership for decision-making.

Developed ETL pipelines into and out of the data warehouse and built major regulatory and financial reports using advanced SQL queries in Snowflake. Performed data manipulation on extracted data using Python pandas.
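
A small pandas sketch of the kind of post-extract manipulation involved; the file, keys, and columns are hypothetical stand-ins for a Snowflake extract.

    import pandas as pd

    # Hypothetical extract already pulled from Snowflake
    df = pd.read_csv("snowflake_extract.csv", parse_dates=["report_date"])

    # Typical cleanup: deduplicate on the business key, fill gaps
    df = df.drop_duplicates(subset=["account_id", "report_date"])
    df["balance"] = df["balance"].fillna(0)

    # Roll daily balances up to month level for reporting
    monthly = (
        df.groupby(["account_id", pd.Grouper(key="report_date", freq="M")])["balance"]
          .sum()
          .reset_index()
    )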

Assisted in developing financial models to support portfolio optimization and asset allocation strategies.

Expanded best practices, standards, and processes for effectively carrying out data migration activities, and worked across multiple functional projects to understand data usage and its implications for data migration.

Developed interactive dashboards and reports in Tableau for data visualization and storytelling.

Implemented and optimized full-text search functionality in PostgreSQL.

Environment: Tableau, QlikView, SAS, Hadoop, Jupyter Notebooks, AWS, IBM SPSS, Google Analytics, Erwin, Snowflake, Informatica, SSIS, Oracle, MySQL, SQL, PostgreSQL, Power BI, Python, Machine Learning

University of Illinois Sep 2022 – Aug 2023

Data Analyst

Responsibilities:

Developed interactive visualizations and dashboards using Tableau that enabled business users and executives to explore product usage and customer trends.

Implemented data ingestion pipelines using Azure Data Factory to gather and process diverse data sources.

Created custom SQL queries on various databases such as Teradata, MySQL, DB2 for data analysis and data validation.

Conducted comprehensive data mining and analysis to extract actionable insights, driving data-informed decision-making.

Used PL/SQL to automate generation of monthly operational and financial reports.

Integrated structured RDBMS data with Hadoop HDFS using Sqoop for comprehensive analytics.

Assisted in managing Oracle databases for sales and operations data, ensuring data consistency and performance.

Utilized process mapping techniques to document key workflows and identify areas for improvement and automation.

Conducted regular HIPAA compliance audits, identified gaps, and implemented corrective actions to ensure ongoing compliance with regulatory standards.

Designed and implemented Continuous Integration and Continuous Deployment (CI/CD) pipelines in OpenShift to automate the deployment of data analytics applications and machine learning models.

Built VBA-based tools for data reconciliation across systems, ensuring accuracy of monthly financial reports.

Designed and implemented efficient database structures using MySQL for optimal data storage and retrieval, enhancing performance and scalability in handling large volumes of insurance data. Additionally, managed data storage solutions such as Azure SQL Database, Azure Blob Storage, and Azure Data Lake Storage to leverage cloud capabilities.

Developed and optimized PySpark scripts for large-scale data processing, improving data transformation and analysis speed.
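
One common tuning pattern, sketched with assumed table names: broadcast the small dimension table to avoid a shuffle during the join, and cache only data that is reused.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

    claims = spark.read.parquet("claims/")      # large fact table (hypothetical)
    policies = spark.read.parquet("policies/")  # small dimension table (hypothetical)

    # Broadcasting the small side keeps the join shuffle-free
    joined = claims.join(broadcast(policies), "policy_id")

    # Repartition by the aggregation key and cache before repeated use
    joined = joined.repartition("policy_id").cache()
    summary = joined.groupBy("policy_id").count()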

Designed and implemented data models using Azure Synapse Analytics (formerly SQL Data Warehouse) to support analytical reporting requirements.

Implemented HL7 (Health Level Seven) standards for interoperability, data exchange, and integration between healthcare systems and applications, facilitating seamless communication and data sharing.

Integrated Electronic Medical Records (EMR) data into analytics platforms like Tableau or Power BI for healthcare performance metrics tracking, patient outcomes analysis, and operational efficiency improvements.

Developed and implemented advanced data mining algorithms that improved prediction accuracy by 30%.
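
The bullet does not name the algorithms, so as one plausible illustration, a scikit-learn classifier on synthetic data (not the real records or the actual models):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for the real dataset
    X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    print(accuracy_score(y_test, model.predict(X_test)))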

Crafted advanced SQL queries and leveraged PostgreSQL-specific features for complex data analysis.

Designed and developed Informatica Data Quality (IDQ) solutions to support Master Data Management (MDM) projects with respect to the insurance data, ensuring accurate and reliable data for policy management and customer service.

Configured Jenkins to integrate with version control systems like Git, automating the build and testing process upon code commits and merges.

Designed interactive and dynamic Power BI reports and dashboards with slicers, filters, and drill-through functionalities using DAX expressions for enhanced user experience.

Implemented and updated data governance procedures, standards, and documentation.

Developed QlikView applications for data visualization and business intelligence.

Conducted regression analysis, clustering, and segmentation using SAS, extracting valuable insights from insurance data for risk segmentation, pricing strategies, and customer retention initiatives.

Managed and analyzed large-scale datasets using Hadoop ecosystem tools (Hive, HDFS, MapReduce).

Conducted data cleaning, validation, and analysis using Excel functions and formulas.

Developed and executed data analysis scripts and code in Jupyter Notebooks. Utilized IBM SPSS for statistical analysis and hypothesis testing.

Implemented a one-time data migration of multi-state data from SQL Server to Snowflake using Python and SnowSQL.
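
A hedged sketch of how such a migration might be wired up in Python, assuming pyodbc for the SQL Server side and the Snowflake Python connector for the load; every connection detail and table name below is a placeholder, and bulk loads could equally go through SnowSQL PUT/COPY commands.

    import pandas as pd
    import pyodbc
    import snowflake.connector
    from snowflake.connector.pandas_tools import write_pandas

    # Extract from SQL Server (connection string is a placeholder)
    src = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=host;DATABASE=db;Trusted_Connection=yes"
    )
    df = pd.read_sql("SELECT * FROM dbo.state_claims", src)  # hypothetical table

    # Load into Snowflake (credentials are placeholders)
    conn = snowflake.connector.connect(
        user="USER", password="***", account="ACCOUNT",
        warehouse="WH", database="DB", schema="PUBLIC",
    )
    write_pandas(conn, df, "STATE_CLAIMS")  # target table must already exist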

Assisted in developing working documents to identify and get access permissions to database resources.

Environment: Tableau, SQL, Teradata, MySQL, DB2, R, IBM SPSS, Informatica IDQ/MDM, SSIS, Power BI, Oracle, PostgreSQL, QlikView, SAS, Hadoop, Hive, HDFS, MapReduce, pandas, Machine Learning, Web Analytics, Excel, Jupyter Notebooks, Snowflake, Python.

Radiare Software Solutions, Bangalore, India Sep 2019 – Dec 2021

Data Analyst

Responsibilities:

Implemented data exploration to analyze patterns and select features using Python SciPy.
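
A small illustration of correlation-based feature screening with SciPy; the data is synthetic and the selection rule is just one simple choice among many.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))            # candidate features (synthetic)
    y = 2 * X[:, 0] + rng.normal(size=500)   # target driven mostly by feature 0

    # Rank features by absolute Pearson correlation with the target
    scores = [abs(stats.pearsonr(X[:, j], y)[0]) for j in range(X.shape[1])]
    selected = np.argsort(scores)[::-1][:2]  # keep the two strongest features
    print(selected)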

Developed and implemented R and Shiny applications to showcase machine learning for business forecasting.

Designed and implemented data pipelines using PySpark, enhancing the capability to process and analyze large datasets.

Implemented ETL pipelines in Azure Data Factory for seamless data extraction.

Organized data and performed analysis using R, Excel, and Tableau.

Developed and maintained PL/SQL packages and stored procedures for enterprise-wide data analytics applications.

Chose suitable approaches, analytical and statistical designs, and data management to meet objectives.

Conducted performance tuning activities to enhance MySQL database efficiency.

Implemented and maintained data replication solutions in Oracle, ensuring data consistency across distributed systems.

Ensured data quality through transformations using Azure Databricks.

Implemented concurrency control mechanisms to handle concurrent access to data and maintain data consistency in PostgreSQL. Constructed methods to analyze diverse datasets.

Designed VBA macros to clean, merge, and transform large Excel datasets for marketing analytics.

Conducted customer segmentation and churn analysis using Hadoop Hive datasets.

Created reusable VBA code libraries to standardize processes across reporting teams.

Contributed to the documentation of R-based analytics processes and methodologies. Implemented data transformation and cleansing processes within QlikView. Managed and optimized data storage with Azure SQL Database and Blob Storage.

Designed data models in Azure Synapse Analytics for analytical reporting. Utilized SAS for data manipulation, transformation, and statistical analysis.

Implemented security measures and access controls for Hadoop clusters.

Ensured data accuracy and integrity in Excel spreadsheets. Integrated Jupyter Notebooks with version control systems for code collaboration.

Contributed to the writing of study proposals and reports, including statistical analysis sections and data management. Optimized query performance in Azure SQL Database and other services.

Automated batch jobs in Hadoop environment, reducing report generation time by 50%.

Extracted data from databases, developed projections, statistical analyses, and curve fitting.

Involved in testing and executing SQL scripts for report development, Power BI reports, dashboards, and scorecards before publishing.

Wrote SQL statements, stored procedures, and triggers for extracting and writing data.

Environment: Python, R, Azure, Excel, Tableau, MySQL, Oracle, PostgreSQL, QlikView, SAS, Hadoop, Jupyter Notebooks, Power BI, SQL.

UPS (Manufacturing and Logistics Domain), India June 2018 – Aug 2019

Data Analyst

Responsibilities:

Recorded procedures for statistical analysis, data extraction, organization, and storage.

Built factor analysis and cluster analysis models using Python SciPy to classify customers into different groups.
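
The clustering half of that work can be sketched with scipy.cluster.vq on synthetic features; SciPy itself has no factor-analysis routine, so that part would typically rely on another library such as scikit-learn.

    import numpy as np
    from scipy.cluster.vq import kmeans2, whiten

    rng = np.random.default_rng(1)
    # Synthetic customer features, e.g. spend, frequency, tenure (placeholders)
    customers = rng.normal(size=(300, 3))

    # Scale each feature to unit variance before clustering
    features = whiten(customers)

    # Partition customers into 4 groups; labels maps each row to a cluster
    centroids, labels = kmeans2(features, 4, seed=1)
    print(np.bincount(labels))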

Utilized AWS Glue for ETL (Extract, Transform, Load) processes to clean, transform, and prepare data for analysis.

Utilized and managed PostgreSQL extensions for specialized functionalities, such as PostGIS for spatial data processing.

Orchestrated data pipelines using AWS Step Functions to automate and streamline data processing workflows.

Developed measures for resourceful data extraction and re-use of statistical analysis processes.

Developed data dictionaries as required for reporting and presentations.

Provided Tableau support and troubleshooting for end-users.

Developed serverless data processing functions using AWS Lambda, reducing infrastructure costs and improving scalability. Collaborated with IT teams to troubleshoot and resolve Power BI-related issues.

Integrated QlikView with various data sources to create comprehensive dashboards.

Conducted data quality checks and validation in the Hadoop environment.

Implemented VLOOKUP, HLOOKUP, and other functions for data reconciliation.
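
A Python analogue of that lookup-based reconciliation, using a pandas left join in place of VLOOKUP; the frames and column names are toy placeholders.

    import pandas as pd

    # Hypothetical extracts from the two systems being reconciled
    ledger = pd.DataFrame({"invoice": [1, 2, 3], "amount": [100.0, 250.0, 75.0]})
    erp = pd.DataFrame({"invoice": [1, 2, 4], "amount": [100.0, 240.0, 60.0]})

    # The left join plays the role of VLOOKUP: match by key, then compare
    merged = ledger.merge(erp, on="invoice", how="left", suffixes=("_ledger", "_erp"))
    merged["mismatch"] = merged["amount_ledger"] != merged["amount_erp"]
    print(merged[merged["mismatch"]])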

Developed and maintained Excel templates for standardized reporting.

Environment: Python, AWS, ETL, PostgreSQL, Tableau, Power BI, QlikView, Hadoop, Excel.


