Venkata Naga Sanjay Reddy Narra
DATA ANALYST
Mobile: 551-***-**** Email: ***********.*****@*****.*** Location: Fairfax, Virginia LinkedIn
SUMMARY
Data Analyst with 4 years of experience in analyzing large datasets, developing predictive models, and providing data-driven insights to support business decisions in areas such as healthcare and customer retention.
Proficient in Python, SQL, Tableau, and machine learning techniques; skilled in data cleaning, analysis, and visualization to improve data accuracy and streamline business processes.
Expertise in building ETL pipelines using Python, Apache Airflow, and cloud platforms such as AWS S3 and Redshift, ensuring scalable and secure data storage and management.
Strong background in applying statistical analysis and predictive modeling (regression, classification models, machine learning) to forecast trends, improve operational efficiency, and drive business outcomes, including reducing churn and improving predictions.
Effective team collaborator with experience in Agile environments, maintaining high standards in data governance, and ensuring data accuracy and project delivery within timelines.
TECH - SKILLS
Methodologies: SDLC, Agile, Waterfall
Programming Languages: Python, SQL, R
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, TensorFlow, Seaborn, dplyr, ggplot2
Visualization Tools: Tableau, Power BI, Advanced Excel (Pivot Tables, VLOOKUP)
IDEs: Visual Studio Code, PyCharm, Jupyter Notebook, IntelliJ
Database: MySQL, PostgreSQL, MongoDB, SQL Server
Cloud Platform: Amazon Web Services (AWS), Microsoft Azure
Other Technical Skills: SSIS, SSRS, Machine Learning Algorithms, ETL/ELT Tools, Statistics, ServiceNow, Hadoop, Spark, MapReduce, Alteryx, Google BigQuery, Power Query, Probability Distributions, Mathematics, Confidence Intervals, ANOVA, Hypothesis Testing, Regression Analysis, Linear Algebra, Advanced Analytics, Data Mining, Big Data, Data Integration, Data Interpretation, Data Pipeline, Data Visualization, Data Warehousing, Data Transformation, Data Governance, Association Rules, Clustering, Classification, Regression, A/B Testing, Forecasting & Modelling, Data Cleaning, Data Wrangling, Descriptive Analytics, Git, GitHub, JIRA
Operating Systems: Windows, Linux
EDUCATION
George Mason University August 2023 - May 2025
Master of Science, Data Analytics and Engineering Fairfax, Virginia
Chandigarh University June 2018 - April 2022
Bachelor of Engineering, Mechanical Engineering Chandigarh, Punjab
CERTIFICATION
• Acquired Skill Mastery Certification in SQL from Scaler Academy
• AWS Cloud Practitioner (CLF-C02) and Snowflake Foundation Certification from DataCamp
• Completed SQL, Python, R, Power BI, Tableau certifications from DataCamp
PROJECT
Database management system for Public Library
Designed MySQL database with 8+ relational tables to manage library users, materials, and borrowing operations.
Built 15+ optimized SQL queries using joins, subqueries, and window functions, increasing data retrieval speed by 40 percent.
Automated overdue tracking and deactivation with SQL triggers, reducing manual effort by 60 percent.
Predicting Credit Card Account Cancellations
Built and compared three ML models; Decision Tree achieved 90.49 percent accuracy and 95.05 percent AUC.
Identified churn-prone customers based on demographics, credit limits, spending behavior, and employment type.
Recommended incentives using spending/utilization data to reduce churn and retain revenue-generating customer segments.
Credit Card Default prediction using machine learning
Applied advanced classification algorithms (Random Forest, Gradient Boosting) to predict customer default using 20+ financial and behavioral features.
Applied advanced analytics and PCA to synthesize complex data, enhancing interpretability and model efficiency.
Used SMOTE to balance training data and enhanced minority class recall, improving model sensitivity and fairness.
EXPERIENCE
Aug 2024 - Present DATA ANALYST Inova Health System, USA
Designed, developed, and implemented advanced machine learning models using Scikit-learn and TensorFlow for anomaly detection to identify and flag potentially fraudulent healthcare claims, improving fraud detection accuracy by 35% while reducing false positive rates by 25%.
Extracted, transformed, and loaded (ETL) large-scale healthcare claims data from multiple sources using PostgreSQL and Apache Spark, ensuring efficient data processing and enabling timely analytics, resulting in a 40% increase in data throughput.
Created dynamic, interactive dashboards and detailed data visualizations with Power BI and Advanced Excel to present actionable fraud insights and trends to cross-functional teams and senior management, enhancing decision-making efficiency by 30%.
Leveraged Amazon Web Services (AWS) tools including S3 for secure data storage and SageMaker for scalable machine learning model training, validation, and deployment in a cloud-based environment, reducing model deployment time by 20%.
Applied machine learning algorithms including Random Forest, Gradient Boosting, and Support Vector Machines (SVM), resulting in a significant increase in model precision and recall for detecting complex fraudulent claim patterns.
Applied advanced statistical analysis methods such as ANOVA and A/B testing to rigorously evaluate model performance, optimize detection thresholds, and measure the impact of fraud detection initiatives, contributing to operational cost savings of 45%.
Automated complex data workflows and preprocessing tasks using Alteryx, improving data pipeline efficiency by 50%, increasing repeatability, and reducing manual errors by 30%.
May 2020 - Jun 2023 DATA ANALYST Wipro, India
Partnered with cross-functional Agile teams (Finance, IT, and Business stakeholders) to gather, document, and prioritize data and reporting requirements using JIRA, ensuring alignment with business objectives and smooth sprint delivery.
Designed automated ETL pipelines using Python (pandas, NumPy) and Apache Airflow, enabling seamless extraction, transformation, and loading of financial data from disparate sources such as Excel and SQL databases. Automation reduced manual processing time.
Authored complex SQL queries in SQL Server to join, filter, and aggregate multi-source financial data for dashboards and ad hoc analysis. Utilized window functions, CTEs, and subqueries to improve query performance by 20% and handle large datasets efficiently.
Built dynamic financial dashboards and summary reports using Tableau, integrating slicers, filters, and drill-through capabilities. Empowered stakeholders to explore trends in revenue, expenses, and variances independently, significantly improving reporting turnaround time.
Consolidated operational and financial datasets into a unified analytics layer using Snowflake and Power Query, enabling a single source of truth for enterprise-wide reporting and analysis.
Implemented data governance and compliance protocols through role-based access controls and row-level security, securing sensitive audit and financial data using Azure Blob Storage and SharePoint, ensuring alignment with SOX and GDPR standards.
Established data validation workflows using Great Expectations and custom Python-based quality checks, performing source-to-target reconciliations to maintain data integrity.
Employed unsupervised learning techniques like K-means clustering to segment business units based on performance indicators, guiding targeted strategic initiatives.