Swathi Sarikonda
Senior Data Analyst | H* EAD | Chanhassen, MN
******.****@*******.*** | +1-945-***-**** | https://www.linkedin.com/in/swathi-sarikonda-a717b428/
PROFESSIONAL SUMMARY:
Results-driven IT professional with over 8 years of experience in Business and Data Analysis, ETL Development, ETL Testing, and Data Modeling.
Extensive expertise in Data Profiling, Data Mining, Data Migration, Data Integration, Data Architecture, and Metadata Management.
Linked data lineage to data quality and business glossary initiatives within broader data governance frameworks.
Established and implemented data governance policies, standards, and procedures to ensure regulatory compliance and adherence to industry best practices.
Designed and executed asset quality management strategies to enhance asset performance, reliability, and risk assessment.
Developed assessment frameworks for evaluating asset conditions and mitigating associated risks.
Worked closely with stakeholders to document and address data quality issues efficiently.
Partnered with cross-functional teams to develop and implement remediation plans, ensuring timely resolution of data quality concerns.
Conducted in-depth data analysis and profiling using complex SQL queries on Oracle and Teradata databases.
Skilled in Business Process Modeling, Process Flow Modeling, and Data Flow Modeling.
Adept at analyzing business requirements and creating Business Requirement Documents (BRDs).
Hands-on experience with data modeling tools such as Erwin, specializing in Entity-Relationship modeling, transactional database design, and Dimensional Data Modeling for Data Warehouses.
Proficient in enterprise repository tools, data mapping, and data profiling techniques.
Strong command of SQL, with expertise in developing T-SQL, Oracle PL/SQL scripts, stored procedures, and triggers for business logic implementation.
Experience working with Machine Learning algorithms, including Logistic Regression, Random Forest, Support Vector Machines (SVM), and Natural Language Processing (NLP).
Background in Statistical Analysis, with experience in developing R scripts during master's coursework.
Designed and developed multiple data pipelines using Python and complex SQL queries, scheduling jobs via AutoSys.
Hands-on experience interacting with RESTful APIs, extracting data, and automating tasks through Python scripting.
Well-versed in Agile and Waterfall methodologies.
TECHNICAL SKILLS:
Databases & Programming: SQL (T-SQL, PL/SQL), Oracle, MS SQL Server, MySQL, PostgreSQL, Teradata, Python (Pandas, NumPy, SciPy, Matplotlib, Seaborn), R.
Data Visualization & BI Tools: Tableau, Power BI, QlikSense, QlikView, Looker.
Cloud Platforms: AWS (S3, RDS, Glue, Redshift), Microsoft Azure, Google Cloud Platform (BigQuery, Cloud Storage).
Data Management & ETL: Data Modeling, Data Warehousing, ETL, Data Cleansing, Data Quality, Data Governance, Informatica PowerCenter, SSIS, Alteryx, Talend.
Big Data & Machine Learning: Apache Hadoop, Spark, Databricks, Scikit-learn, TensorFlow, Keras.
Project Management & Workflow Automation: JIRA, ServiceNow, Confluence, Apache Airflow, AWS Glue, Azure Data Factory.
Other Skills: Business Intelligence, Data Profiling, Statistical Analysis, Predictive Analytics, Metadata Management, Data Security & Compliance (GDPR, SOX, HIPAA, PCI DSS).
PROFESSIONAL EXPERIENCE
Guardian Life
Senior Data Analyst May 2023 – Present
Conducted data analysis and profiling using complex SQL on various source systems, including Oracle and Teradata.
Analyzed both functional and non-functional data elements, mapping data from source to target environments and documenting findings for structured remediation.
Built and deployed Machine Learning models using NLP, Logistic Regression, Random Forest, and SVM to classify text-based data, delivering Proof of Concept (PoC) solutions for business challenges.
Spearheaded end-to-end data integration and analysis of PHI/PII datasets across medical, dental, and vision insurance lines, ensuring compliance with HIPAA, GDPR, and Guardian’s internal governance policies.
Designed and maintained secure ETL pipelines in Alteryx and SQL for healthcare claims data, enabling efficient data transformation and reporting while adhering to PHI protection standards.
Partnered with cross-functional teams to implement role-based access controls and data encryption strategies for sensitive healthcare data, supporting Guardian’s enterprise-wide “defense-in-depth” security approach.
Performed advanced data wrangling in Alteryx to cleanse, reshape, and join large datasets from Oracle, Teradata, and external APIs, enabling accurate and timely KPI reporting.
Automated repetitive data wrangling processes using Alteryx macros and workflows, reducing manual effort and improving data pipeline efficiency by 50%.
Applied business rules and transformation logic during data wrangling to ensure alignment with compliance standards for PHI and PII data under HIPAA and GDPR regulations.
Designed and implemented data pipelines to extract and transform customer satisfaction and employee engagement surveys from Qualtrics, supporting enterprise-wide Voice of the Customer (VoC) initiatives.
Built automated workflows in Alteryx to combine Qualtrics data with internal policyholder information for personalized outreach and service optimization.
Collaborated with clinical and business teams to translate Medicare Shared Savings Program (MSSP) requirements into actionable Arcadia dashboard components, aligning analytics with value-based care initiatives.
Worked directly within Arcadia’s data model and warehouse layer to create and optimize source tables used in dashboarding and analytics delivery across the enterprise.
Led reverse engineering efforts of legacy reports into Arcadia, ensuring consistent data definitions and enhancing accuracy and automation across clinical and operational reporting.
Performed data quality assessments, data profiling, and cleansing across both relational and non-relational databases.
Scheduled and monitored ETL workflows using AutoSys and Control-M.
Conducted extensive data validation by writing complex SQL queries, supporting backend testing, and resolving data quality issues.
Worked closely with business users to understand key data concepts, business requirements, and core information structures.
Designed and managed database objects, including tables, joins, nested queries, views, sequences, and synonyms, to support business applications.
Developed PL/SQL scripts, stored procedures, functions, and triggers, ensuring data integrity and automating data processes.
Utilized Oracle Data Integrator (ODI) for designing interfaces, defining data stores, and customizing Knowledge Modules to transform and load data efficiently.
Configured data mappings and optimized ETL workflows to meet project-specific requirements.
Assisted in defining business requirements, creating BRDs, functional specifications, and data mapping documents to guide developers.
Wrote and optimized complex SQL queries in T-SQL and Oracle for data extraction, transformation, and reporting.
Built and maintained metadata and data dictionaries to support ongoing data management and ensure future usability.
Mapped the trade cycle data flow from source to target systems and documented transformation logic.
Optimized SAS programs for data extraction and transformation, improving processing efficiency by 30%.
Standardized corporate variables in metadata, ensuring adherence to best practices in dashboard development and visualization.
Designed interactive dashboards and reports in Tableau and Power BI, tracking key performance indicators (KPIs) such as loan approvals, default rates, and customer acquisition costs.
Automated data preparation workflows using Alteryx, reducing manual efforts by 40% while enhancing data accuracy.
Led data governance initiatives, automating processes with Collibra to ensure data integrity across SQL Server and DB2.
Integrated Alteryx with various data sources, including databases, cloud platforms, and APIs, to enable seamless data ingestion and analysis.
Established data quality monitoring systems, enabling real-time tracking of remediation efforts and issue resolution.
Partnered with senior management to define data insights strategies, goals, and objectives to enhance decision-making.
Developed complex PL/SQL queries for enterprise data warehouses, integrating multiple tables to streamline data pipelines.
Scheduled data jobs using AutoSys for daily data loads.
Designed and developed Python-based data pipelines to extract, transform, and load (ETL) data from multiple sources into data marts.
Built standalone Python scripts for extracting structured data from PDF files using regular expressions, storing the output in databases for further analysis (see the illustrative sketch at the end of this section).
Automated RESTful API interactions to download, transform, and integrate data into business applications.
Re-engineered and optimized existing data ingestion pipelines, improving efficiency and reducing processing time.
Implemented performance tuning strategies to enhance SQL queries and ETL processes.
Provided regular updates to management and internal teams on the status of data initiatives and remediation efforts.
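Illustrative sketch of the PDF-extraction scripting referenced above, as a minimal example rather than the production code: the field names, claim-ID and amount patterns, drop folder, and the pypdf library are assumptions for illustration.

    import re
    import sqlite3
    from pathlib import Path

    from pypdf import PdfReader  # assumed stand-in for the PDF library actually used

    # Hypothetical patterns for the structured fields being captured.
    CLAIM_ID = re.compile(r"Claim\s*ID[:\s]+([A-Z]{2}\d{8})")
    AMOUNT = re.compile(r"Billed\s*Amount[:\s]+\$?([\d,]+\.\d{2})")

    def extract_fields(pdf_path):
        """Read every page of one PDF and return (claim_id, amount) pairs."""
        text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
        ids = CLAIM_ID.findall(text)
        amounts = [float(a.replace(",", "")) for a in AMOUNT.findall(text)]
        return list(zip(ids, amounts))

    def load(rows, db="claims.db"):
        """Persist extracted rows so downstream SQL analysis can query them."""
        with sqlite3.connect(db) as conn:
            conn.execute("CREATE TABLE IF NOT EXISTS claims (claim_id TEXT, amount REAL)")
            conn.executemany("INSERT INTO claims VALUES (?, ?)", rows)

    if __name__ == "__main__":
        for pdf in Path("inbox").glob("*.pdf"):  # hypothetical drop folder
            load(extract_fields(pdf))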
Citibank
Senior Data Analyst June 2020 – March 2023
Performed Data Analysis, Migration, Cleansing, Transformation, Integration, Import, and Export using Python to enhance business operations and analytics.
Developed and optimized PL/SQL stored procedures, functions, triggers, views, and packages, implementing indexing, aggregation, and materialized views to improve query performance.
Built logistic regression models using R and Python to predict customer subscription response rates, leveraging transaction history, prior interactions, promotions, demographics, and behavioral data (see the illustrative sketch at the end of this section).
Worked extensively on Snowflake modeling and data warehousing techniques, including Data Cleansing, Slowly Changing Dimensions (SCD), Surrogate Key Assignment, and Change Data Capture (CDC).
Designed and implemented ETL processes using Oracle Data Integrator (ODI), creating complex packages and automated workflows for seamless data movement.
Utilized ODI Operator for debugging and monitoring job execution.
Created and optimized indexes and dynamic SQL queries for faster data retrieval, improving database performance.
Used SQL*Loader for bulk data loading from external sources into Oracle databases.
Provisioned and managed Google Cloud Platform (GCP) infrastructure using Terraform, setting up VPCs, subnets, storage buckets, GKE clusters, GCP Composer, and Secret Manager.
Built and optimized ETL pipelines for data extraction, transformation, and loading across DB2, Snowflake, and Redshift environments.
Utilized Informatica Data Quality (IDQ) for data profiling, enrichment, and standardization, designing mappings for efficient data cleansing and loading.
Conducted data profiling and scorecard analysis to enhance the data model for improved analytics.
Developed ETL jobs using IBM DataStage, ensuring smooth data integration processes.
Migrated Talend Joblets to support Snowflake functionality, enhancing system compatibility and efficiency.
Implemented Snowpipe for continuous data ingestion and used COPY commands for bulk data loads.
Configured data sharing across Snowflake accounts and established internal and external staging areas for optimized data transformation.
Refactored and optimized Snowflake views, conducting unit testing to ensure data consistency between Redshift and Snowflake.
Designed and built interactive dashboards in Looker and Tableau, utilizing Snowflake connections for real-time insights.
Developed end-to-end ETL workflows, from data ingestion to staging, data marts, and final reporting layers.
Built ETL processes using Alteryx, facilitating seamless data transformations between sources and targets.
Partnered with data engineers and analysts to establish data quality standards and governance policies, leveraging Alation’s data cataloging capabilities.
Collaborated with cross-functional teams to define data requirements and implement customized SAS solutions to address business challenges.
Managed and maintained large-scale datasets in SAS, performing data cleaning, merging, and transformation to support key projects.
Designed data lake architecture, dimensional modeling, and Data Vault 2.0 on Snowflake, leveraging logical data warehouses for optimized compute performance.
Developed Azure Data Factory (ADF) pipelines using Linked Services, Datasets, and Pipelines to extract, transform, and load data from Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
Engineered ADF pipelines to process and transform data from multiple sources, ensuring accuracy and consistency in Azure SQL environments.
Developed use case diagrams and documentation, outlining system specifications for key business applications.
Executed advanced SQL queries and complex joins on Azure SQL Data Warehouse, ensuring data accuracy and consistency across systems.
Performed data profiling and validation, reconciling warehouse and source system data to maintain data integrity (see the illustrative sketch at the end of this section).
Created Tableau dashboards and reports, providing data visualization and analytics to support business decision-making.
Collaborated with senior management to define data insight goals, objectives, and key performance indicators (KPIs).
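Illustrative sketch of the subscription-response modeling referenced above, as a minimal example under assumed inputs: the CSV extract and feature names are hypothetical stand-ins for the transaction, promotion, demographic, and behavioral data described.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Hypothetical extract; the real features drew on transaction history,
    # prior interactions, promotions, demographics, and behavioral data.
    df = pd.read_csv("customer_features.csv")
    X = df[["txn_count_90d", "prior_contacts", "promo_exposures", "age"]]
    y = df["subscribed"]  # 1 = customer responded to the subscription offer

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    # Scale features, then fit a regularized logistic regression.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)

    # AUC measures how well predicted probabilities rank likely responders.
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"Hold-out AUC: {auc:.3f}")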
Cigna Health
Senior Data Analyst Dec 2018 – May 2020
Bridged the gap between business and IT teams by blending technical expertise with business acumen, serving as an IT liaison to align business objectives with technology solutions.
Engaged with stakeholders and customers to gather user requirements and define business goals for healthcare data initiatives.
Integrated and analyzed Press Ganey survey data to assess patient satisfaction trends across multiple healthcare providers, identifying a 12% improvement opportunity in care delivery metrics.
Automated data pipelines using Alteryx and SQL to clean, normalize, and load Press Ganey and internal Qualtrics survey data into GCP-based data marts for executive dashboards.
Ensured PHI/PII compliance with HIPAA and internal data governance policies while handling survey data from Press Ganey and Qualtrics.
Facilitated Joint Application Development (JAD) sessions to elicit requirements, conduct use case and workflow analysis, define business rules, and develop domain object models.
Performed data analysis and validation using complex SQL queries in TOAD against Oracle databases to ensure data accuracy and integrity.
Documented data lineage using Microsoft Visio, providing visual representations of process flows, workstreams, and data movement.
Implemented Collibra to automate data management processes, ensuring efficient data governance and compliance.
Conducted end-to-end data lineage assessments and documented critical data elements (CDEs) to enhance data traceability.
Collaborated with Finance, Risk, and Investment Accounting teams to establish a Data Governance Glossary, Governance Framework, and Process Flow Diagrams.
Assessed data lineage processes to identify vulnerabilities, control gaps, data quality issues, and areas requiring stronger governance.
Designed conceptual models based on input from functional and technical teams, ensuring alignment with business objectives.
Ensured data quality and integrity when pulling data from multiple Db2 tables with large volumes of structured healthcare records.
Supported Power BI dashboard development by connecting directly to Db2 datasets and creating data models for visualizations.
Coordinated with DBA teams to request indexes, manage performance, and validate table structures.
Validated reports by writing SQL queries in TOAD on Oracle databases, ensuring data accuracy and reliability.
Developed process documentation, reporting specifications, training materials, and presentation decks to support application development teams and management.
Authored and maintained Source-to-Target Mapping (STM) documents, outlining data transformation logic for seamless integration.
Worked with Subject Matter Experts (SMEs) to analyze data extracts from legacy systems (Mainframes and COBOL files), assessing data source integrity and formatting requirements.
Translated business requirements into optimized data structures, facilitating efficient data storage, retrieval, and processing.
Partnered with data modelers and ETL developers to create Data Functional Design Documents, ensuring adherence to best practices.
Ensured data models aligned with industry standards, supporting scalability, normalization rules, and cost-effective adaptability.
Created and maintained specifications and process documentation to deliver data profiling, source-to-client mappings, and data flow analysis.
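Illustrative sketch of the warehouse-vs-source reconciliation referenced above, as a minimal example: the extracts, key columns, and amount fields are hypothetical placeholders.

    import pandas as pd

    # Hypothetical extracts pulled from the source system and the warehouse.
    source = pd.read_csv("source_extract.csv")
    warehouse = pd.read_csv("warehouse_extract.csv")

    def unmatched(src, tgt, keys):
        """Return rows present on only one side of the comparison, keyed on `keys`."""
        merged = src.merge(tgt, on=keys, how="outer", indicator=True)
        return merged[merged["_merge"] != "both"]

    print(f"Row counts: source={len(source)}, warehouse={len(warehouse)}")
    mismatches = unmatched(source, warehouse, keys=["member_id", "claim_id"])
    print(f"Unmatched rows: {len(mismatches)}")

    # Column-level totals catch truncation or transformation drift.
    for col in ["paid_amount", "billed_amount"]:
        print(col, source[col].sum(), warehouse[col].sum())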
Infosys, Hyderabad, India
Data Analyst June 2014 – Jan 2017
Designed and developed ETL workflows and datasets using Alteryx, enabling seamless data processing and transformation.
Processed data in Alteryx to generate TDE and Hyper files for enhanced Tableau reporting.
Built analytical applications in Alteryx Designer, deployed them on the Alteryx Server, and enabled access for non-technical users.
Developed in-database Alteryx workflows for optimized data preparation and visualization in Tableau and Power BI.
Established ETL workflows to integrate data warehouses such as Snowflake and Redshift using Alteryx.
Applied Erwin and normalization techniques to design logical and physical data models for optimized relational database structures.
Worked with Informatica Cloud to develop source-to-target mappings, ensuring seamless data transformation.
Conducted Informatica performance tuning, identifying and resolving bottlenecks at source, target, mapping, and session levels.
Developed ETL programs using Informatica PowerCenter 9.6.1/9.5.1, implementing business requirements efficiently.
Collaborated with cloud service providers to ensure compliance with data governance, security policies, and contract management.
Partnered with data integration teams to improve customer data quality (CDQ) across multiple systems, ensuring data consistency and reliability.
Defined and documented data remediation protocols and best practices, ensuring uniform application of data quality standards (see the illustrative sketch at the end of this section).
Designed parallel jobs using various ETL components, including Sequential File, Complex Flat File, ODBC, Join, Merge, Filter, Lookup, Modify, Transformer, Change Capture, and Funnel.
Integrated data lineage tracking with data quality and business glossary workflows, strengthening the data governance framework.
Conducted comprehensive data analysis using SAS Enterprise Guide, generating detailed reports that provided business insights for decision-making.
Implemented Data Governance strategies using Excel and Collibra, ensuring structured data management and compliance.
Automated data management processes through Collibra, improving governance and operational efficiency.
Actively participated in Data Governance working group sessions, contributing to the creation of data governance policies.
Automated ETL processes for dynamic data migration and validation, leveraging Informatica, SSIS, and Alteryx.
Led and delivered Master Data Governance, Data Quality (DQ), and D&B enrichment/cleansing solutions for a global life sciences company.
Developed data mapping, governance policies, transformation rules, and cleansing protocols for Master Data Management (MDM) architecture.
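Illustrative sketch of rule-based customer data quality checks of the kind standardized in the remediation protocols above, as a minimal example: the rule names, columns, and patterns are hypothetical.

    import pandas as pd

    customers = pd.read_csv("customers.csv", dtype=str)  # hypothetical CDQ extract

    # Each rule is a boolean mask marking rows that violate it.
    RULES = {
        "missing_name": customers["name"].isna(),
        "bad_email": ~customers["email"].fillna("").str.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$"),
        "bad_postal": ~customers["postal_code"].fillna("").str.match(r"\d{5}(-\d{4})?$"),
    }

    # One row per violated rule, ready to route to a remediation queue.
    report = pd.concat(
        [customers[mask].assign(failed_rule=rule) for rule, mask in RULES.items()],
        ignore_index=True,
    )
    print(report.groupby("failed_rule").size())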
EDUCATION:
Master’s in Information Management, University of Texas at Arlington - 2018
Bachelor of Engineering in Electrical Engineering, Osmania University, India - 2014