WORK EXPERIENCE
Mayo Clinic, Florida, USA | Senior Data Analyst | April 2023 – Present
OBJECTIVE: As a Senior Data Analyst at Mayo Clinic, I collaborate with interdisciplinary healthcare, research, and operational teams to harness data for strategic decision-making. I play a key role in developing dashboards, performing statistical analyses, and ensuring data integrity across multiple clinical and administrative systems. My work supports initiatives focused on improving patient care quality, optimizing operational efficiency, and advancing healthcare research through actionable insights.
Key Responsibilities and Achievements:
Retrieved and integrated large-scale clinical and administrative datasets from EPIC, Oracle, and other internal systems using SQL and Python. Ensured accuracy and consistency by validating data sources and automating ETL processes.
Collaborated with IT and data engineers to improve data pipelines and warehousing.
Designed and maintained interactive dashboards using Power BI and Tableau to track key healthcare metrics.
Delivered visual reports to clinical departments and senior leadership for operational and research use.
Focused on performance, patient outcomes, and workflow efficiency metrics.
Performed trend analysis, hypothesis testing, and outcome studies using Python (Pandas, NumPy, SciPy).
Analyzed clinical metrics like readmission rates, infection trends, and care quality benchmarks.
Generated reports that influenced care pathway changes and operational protocols.
Developed automated data pipelines and reporting scripts using Python and scheduling tools (see the illustrative sketch at the end of this role).
Reduced manual workload for weekly and monthly reporting cycles by over 40%.
Ensured timely delivery of consistent, validated reports to clinical and business teams.
Worked closely with physicians, nurses, researchers, and IT teams to gather data requirements.
Presented insights to both technical and non-technical stakeholders in clinical and executive meetings.
Translated complex data findings into practical, actionable business recommendations.
Partnered with research and biostatistics teams to prepare clean, de-identified datasets for studies.
Documented data sources and methodologies for reproducibility and compliance.
Contributed to research publications and conference materials through accurate data summaries.
Analyzed department-level financials, patient billing data, and utilization costs using Excel and SQL.
Provided variance and trend reports to finance and administration teams.
Helped reduce operational expenses by identifying high-cost service areas.
Participated in data governance committees to define metrics, standards, and naming conventions.
Conducted regular audits and quality checks to identify anomalies and ensure data reliability.
Created and updated data dictionaries and documentation to support team-wide consistency.
Environment: CI/CD pipelines, GitLab, Jenkins, Python, Pandas, Apache Airflow, MongoDB, MySQL, Alteryx, HDFS, MapReduce, Hive, Hadoop, Snowflake, Excel, TensorFlow, Scikit-learn, PyTorch, Tableau, Power BI, Informatica, Talend, Apache NiFi.
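Sample work (illustrative): a minimal Python/Pandas sketch of the kind of validation and automated readmission reporting described above. All table names, column names, and figures below are hypothetical and not Mayo Clinic data.

    import pandas as pd

    # Hypothetical extract: in practice this frame would come from an EPIC/Oracle
    # query via SQL; it is built inline here so the sketch runs on its own.
    admissions = pd.DataFrame({
        "patient_id": [101, 102, 103, 104],
        "admit_date": pd.to_datetime(["2024-01-02", "2024-01-05", "2024-01-09", "2024-01-20"]),
        "readmitted_30d": [0, 1, 0, 1],
        "department": ["Cardiology", "Cardiology", "Oncology", "Oncology"],
    })

    # Basic validation checks before the data feeds any dashboard or report.
    assert admissions["patient_id"].notna().all(), "missing patient identifiers"
    assert admissions["readmitted_30d"].isin([0, 1]).all(), "unexpected readmission flag"

    # Weekly readmission-rate summary by department, written out for reporting.
    summary = (
        admissions
        .assign(week=admissions["admit_date"].dt.to_period("W"))
        .groupby(["department", "week"])["readmitted_30d"]
        .mean()
        .rename("readmission_rate")
        .reset_index()
    )
    summary.to_csv("weekly_readmission_report.csv", index=False)

In production, a scheduler (for example Apache Airflow or cron) would run a script like this on the weekly and monthly reporting cycles noted above.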
Citadel LLC, Miami, Florida, USA | Senior Data Analyst | April 2022 – March 2023
OBJECTIVE: Citadel LLC is a global financial institution, a leading hedge fund and market maker recognized for its investment management, risk management, and technology-driven trading strategies. In this role, I developed and maintained reports and dashboards to track key performance indicators (KPIs) and business metrics, and supported the design and implementation of data collection and analysis methods to improve operational efficiency.
Key Responsibilities and Achievements:
Analyzed complex financial datasets from Citadel’s trading operations and market data to derive actionable insights for investment strategies and risk management using SQL, Python, and R for data extraction and statistical analysis.
Developed and implemented ETL processes to extract, transform, and load data from multiple financial sources into Citadel’s data storage systems using SQL Server and Apache Airflow for efficient data handling and pipeline automation.
Designed and deployed interactive dashboards and reports for portfolio managers and senior stakeholders using Power BI and Tableau, visualizing key financial metrics and market trends.
Applied advanced machine learning models to predict market trends, assess risk, and optimize trading strategies using scikit-learn, TensorFlow, and XGBoost (see the illustrative sketch at the end of this role).
Leveraged Hadoop and Apache Spark to process and analyze large-scale financial datasets, enabling faster decision-making and real-time analysis of market data and trading patterns.
Worked with NoSQL databases such as MongoDB and Cassandra to store and query unstructured market data, enabling deeper insights into trading patterns and investor behavior.
Optimized data warehousing solutions on Azure Synapse Analytics and Google BigQuery, streamlining querying and reporting on large-scale financial data and investment portfolios.
Automated routine financial reporting and data analysis tasks using Python and PowerShell scripts, enhancing efficiency and reducing manual intervention in Citadel’s risk management processes.
Integrated third-party market data sources and financial APIs to ensure real-time data updates and accurate analysis, supporting Citadel’s trading operations and investment decision-making.
Used Jupyter Notebooks for documentation, data analysis, and visualization, sharing insights across teams to support trading strategy development and risk assessment.
Automated data quality checks and validation using Apache Airflow to ensure the accuracy and consistency of trading data in Citadel’s real-time analytics platform.
Environment: SQL, Python, Azure Data Factory, Power BI, Tableau, scikit-learn, TensorFlow, R, NoSQL databases, Hadoop, Apache Spark, MongoDB, Cassandra, PowerShell, Jupyter Notebooks, Apache Airflow, Jenkins, GitLab, RESTful APIs.
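Sample work (illustrative): a minimal scikit-learn sketch of the kind of predictive modeling described above. Features, labels, and thresholds are synthetic stand-ins; no proprietary Citadel data or signals are shown.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Synthetic features standing in for engineered market signals
    # (e.g., returns, volatility, volume); the real inputs are proprietary.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(1000, 3))
    # Hypothetical binary label: 1 if the position was flagged as high risk.
    y = (0.6 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    model = GradientBoostingClassifier(random_state=0)
    model.fit(X_train, y_train)

    # AUC as a simple check of how well the model separates the two classes.
    print("test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

A comparable workflow applies to the XGBoost and TensorFlow models mentioned above; only the estimator and the feature engineering change.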
Central Bank of India, Hyderabad, India | Data Analyst | May 2020 – Dec 2021
OBJECTIVE: Central Bank of India is one of the oldest and largest public sector banks in India, providing a wide range of financial services to individual customers and businesses. In this role, I cleaned, validated, and transformed data to ensure accuracy and consistency, and used statistical techniques and data analysis tools to uncover insights from complex datasets.
Key Responsibilities and Achievements:
Utilized SQL to query large financial datasets, ensuring data accuracy and consistency for financial reporting and risk analysis at Central Bank of India.
Developed interactive dashboards and visual reports using Tableau and Power BI to provide real-time insights on loan performance, customer transactions, and operational efficiency.
Automated ETL processes with Informatica and Talend to streamline the integration of diverse financial data sources, enabling faster decision-making and reporting.
Leveraged Hadoop and Apache Spark for processing and analyzing large-scale transaction data, improving the bank's ability to perform complex analytics for credit scoring and fraud detection.
Implemented machine learning models using scikit-learn and TensorFlow to optimize credit risk assessment, predict loan defaults, and enhance fraud detection capabilities.
Utilized NoSQL databases like MongoDB to manage unstructured data from customer interactions, enabling better insights into customer behavior and preferences.
Applied Python libraries such as Matplotlib, Seaborn, and Plotly to visualize financial trends, market performance, and customer demographics, providing actionable insights to business stakeholders (see the illustrative sketch at the end of this role).
Deployed scalable data analytics solutions using Docker and Kubernetes, ensuring efficient containerized environments for real-time financial data processing and analytics.
Integrated Elasticsearch and Kibana for advanced search and analysis of transactional data, improving retrieval times and enabling quicker financial decision-making.
Managed version control and collaboration on financial datasets using GitHub and GitLab, enhancing team collaboration and streamlining project tracking for continuous improvements in banking technologies.
Enabled real-time data streaming and analytics with Apache Kafka, supporting continuous monitoring of financial transactions, loan processing, and customer interactions.
Environment: SQL, Tableau, Power BI, Informatica, Talend, Hadoop, Apache Spark, scikit-learn, TensorFlow, MongoDB, Matplotlib, Seaborn, Plotly, Python, Docker, Kubernetes, Elasticsearch, Kibana, GitHub, GitLab, Apache Kafka, Snowflake, Keras.
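Sample work (illustrative): a minimal Matplotlib/Pandas sketch of the kind of trend visualization described above. The monthly figures are invented for demonstration and do not reflect actual bank data.

    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display required
    import matplotlib.pyplot as plt
    import pandas as pd

    # Hypothetical monthly loan figures (in crore INR); in practice the series
    # would be pulled from the bank's transactional systems via SQL.
    data = pd.DataFrame({
        "month": pd.date_range("2021-01-01", periods=6, freq="MS"),
        "disbursed": [120, 135, 128, 150, 162, 158],
        "defaults": [4, 5, 3, 6, 7, 5],
    })

    fig, ax = plt.subplots(figsize=(7, 4))
    ax.plot(data["month"], data["disbursed"], marker="o", label="Loans disbursed")
    ax.plot(data["month"], data["defaults"], marker="s", label="Defaults")
    ax.set_xlabel("Month")
    ax.set_ylabel("Amount (crore INR)")
    ax.set_title("Loan disbursement vs. defaults (illustrative data)")
    ax.legend()
    fig.autofmt_xdate()
    fig.savefig("loan_trend.png", dpi=150)

Seaborn or Plotly can produce the same chart; the choice typically depends on whether the output is a static report or an interactive dashboard.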
Sameera Shaik (Data Analyst)
*************@*****.***
OBJECTIVE
A detail-oriented and results-driven Data Analyst with 4+ years of experience in analyzing complex datasets, providing actionable insights, and delivering data-driven solutions across multiple industries. Proven expertise in statistical analysis, data visualization, and reporting using tools such as SQL, Python, and R.
EDUCATION
Atlantis University, USA: Master's in Information Technology, 2022–2023.
PROFILE SUMMARY:
Data Analyst with 4+ years of experience in interpreting and analyzing complex datasets to drive actionable insights and strategic decision-making.
Proficient in SQL for querying, managing, and optimizing relational databases.
Skilled in programming languages such as Python and R for data analysis, statistical modeling, and automation.
Expertise in data visualization tools like Tableau, Power BI, and Excel to create dynamic dashboards and reports.
Experienced in utilizing big data tools like Hadoop and Spark for handling and processing large datasets.
Hands-on experience with cloud platforms including AWS, Azure, and Google Cloud Platform (GCP) for data storage and processing.
Adept at conducting predictive analytics and machine learning using libraries such as Scikit-learn, TensorFlow, and Keras.
Skilled in performing statistical analysis and hypothesis testing to support business decisions.
Proficient in API integration for seamless data flow between applications.
Proficient in Alteryx for automating data workflows, performing advanced analytics, and streamlining ETL processes to enhance data efficiency and accuracy.
Skilled in leveraging Python libraries such as Pandas, NumPy, Matplotlib, Seaborn, and Scikit-learn for data manipulation, visualization, and machine learning model development.
Experienced in using Apache Spark for large-scale data processing and Apache Kafka for real-time data streaming and integration.
Proficient in Power BI for designing interactive dashboards, creating data models, and delivering actionable insights.
Experienced in utilizing Snowflake for cloud-based data warehousing, performing advanced analytics, and optimizing data storage and retrieval processes.
Experienced in working with AWS and Azure technologies for cloud-based data storage, processing, and analytics, including services like S3, Redshift, Azure Data Lake, and Azure SQL Database.
Proficient in writing and optimizing complex SQL queries for data extraction, manipulation, and reporting to support data analysis and decision-making.
Proficient in using GCP technologies such as BigQuery, Cloud Storage, and Dataflow for scalable data storage, processing, and analytics solutions in the cloud.
Adept at implementing CI/CD pipelines for deploying data workflows and analytics solutions in collaboration with DevOps teams.
Strong expertise in API integration for seamless data communication between systems and applications, and experienced with RESTful APIs and SOAP for web services integration.
TECHNICAL SKILLS:
Programming Languages: SQL, Python, R, JavaScript
Data Analysis & Statistical Tools: Excel (Advanced), SAS
Data Visualization: Tableau, Power BI, Looker, QlikView
ETL Tools: Informatica, Alteryx, Talend, Apache NiFi
Big Data Technologies: Hadoop, Spark, Snowflake, Apache Hive
Machine Learning: Scikit-learn, TensorFlow, PyTorch, Keras
Databases & Data Warehousing: MySQL, PostgreSQL, Oracle, Teradata, Amazon Redshift
Cloud Platforms: AWS, Google BigQuery, Microsoft Azure, Snowflake
Statistical Analysis: SciPy, Stats
Data Wrangling: Pandas, NumPy
Web Scraping & APIs: Beautiful Soup, Selenium, Postman
Automation & Scripting: VBA (Excel), Power Automate, Bash
Version Control: Git, GitHub
Collaboration & Project Management: JIRA, Confluence, Microsoft Teams
Dashboarding & Reporting: Google Data Studio, Zoho Analytics
Natural Language Processing (NLP): NLTK, spaCy
Data Encryption & Security: SSL/TLS, AWS KMS, GDPR
Data Automation & Workflow Orchestration: Apache Airflow, Luigi