UJJWAL GANGOLU Boston, MA 203-***-**** ***********@*****.*** LinkedIn
Data Analyst
SUMMARY
Data Analyst with over 5 years of experience delivering impactful insights through advanced analytics, machine learning, and cloud technologies. Expertise in data cleaning, exploratory data analysis (EDA), and predictive modeling using Python, R, SQL, and Spark. Skilled in utilizing AWS (EC2, S3, Redshift, Lambda, Athena) for scalable data processing and optimizing ETL pipelines. Proficient in creating interactive dashboards and visualizations with Tableau and Power BI to support data-driven decision-making. Led initiatives to enhance fraud detection, improve product analytics, and automate reporting, achieving operational efficiency and reducing costs. Adept at collaborating with cross-functional teams to ensure data governance, compliance, and alignment with business objectives.
TECHNICAL SKILLS
Languages: Python (Pandas, NumPy, Seaborn, Matplotlib), R, SQL, Bash, Excel-VBA
Data Visualization: Power BI, Tableau, Excel, Spotfire, QlikView
Databases/Warehousing: SQL Server, MySQL, PostgreSQL, Oracle, MongoDB, Snowflake, Redshift
Cloud Platforms: AWS (EC2, S3, Lambda, Redshift, EMR, Athena, Bedrock), GCP (BigQuery)
Big Data & ETL Tools: Spark, Databricks, Hadoop, Hive, Kafka, Informatica, Airflow, Oozie, Alteryx, SSIS/SSRS/SSAS
Analytics & ML: Scikit-learn, TensorFlow, PyTorch, A/B Testing, NLP, EDA, Regression
APIs & DevOps: REST API, Git, GitHub, Jenkins, Salesforce, PuTTY
GenAI & LLM: Chatbots, Generative AI, LLM Integration, NLU/NLP (Databricks Genie)
Security & Compliance: HIPAA, GDPR, CCPA, GLBA; Data Governance, Encryption
Methodologies & Tools: Agile, SDLC, Waterfall; Visual Studio Code, PyCharm
PROFESSIONAL EXPERIENCE
Allstate, MA
Data Analyst Aug 2023 – Current
Accelerated insurance claims processing by 25% by developing Python-based predictive models to triage claims and identify high-risk cases, reducing resolution time and minimizing fraudulent payouts across Property & Casualty (P&C) claims.
Developed interactive Tableau dashboards to visualize claim trends, eliminating 80% of manual reporting and enabling 50+ managers to access real-time KPIs.
Collaborated with business stakeholders and claims teams to define key metrics (cycle time, loss ratios), translating requirements into self-service analytical tools with 100% adoption.
Designed and deployed a Generative AI chatbot using Amazon Bedrock, REST APIs, and Python to automate user queries, enhancing self-service capabilities and improving response time by 45%.
Built scalable pipelines in Databricks using PySpark and Spark SQL, processing over 500,000 P&C claims records and optimizing data refresh intervals.
Integrated Salesforce CRM data with internal analytics systems using SQL and Python, building a unified customer view that enhanced churn modeling.
Designed and implemented data-driven credit risk policies for BNPL, POS lending, and card products, improving risk-adjusted returns.
Created Chat-to-SQL functionality via Databricks' Genie workspace to empower non-technical users to run queries using natural language, improving data accessibility by 50%.
Ensured compliance with GDPR and CCPA by embedding data governance rules and collaborating with legal teams during chatbot development.
Delivered ad-hoc reports and exploratory insights using SQL, Excel, and Tableau in response to urgent stakeholder requests, enabling faster underwriting and compliance decisions.
Built self-service reporting templates in Tableau/Power BI, reducing dependency on technical teams by 40%.
Applied data encryption techniques and field-level masking in accordance with HIPAA (healthcare) and GLBA (insurance/financial) standards to protect sensitive insurance and patient data across cloud platforms.
Performed advanced data wrangling using Python (Pandas) and SQL to clean, normalize, and enrich multi-source datasets for predictive analytics and reporting.
Built star schema models and DAX measures in Power BI to enable high-performance KPI dashboards for claims analytics.
Defined functional KPIs and led UAT workshops with underwriters to align analytics solutions with P&C claim workflows.
Performed cohort and churn analysis using SQL and Tableau to support targeted retention strategies for P&C insurance customers.
Used Git and GitHub for version control and collaborative development of analytics pipelines and reporting templates, improving code quality and traceability across the data team.
Delivered business-ready visual stories through Power BI and Tableau dashboards, translating complex insurance data into actionable insights for executives and non-technical stakeholders.
Leveraged Google Analytics data to monitor customer engagement and conversion trends across digital platforms, aligning marketing insights with retention strategies.
Developed and optimized complex SQL queries in Snowflake to support real-time analytics for credit risk assessment and customer segmentation.
Partnered with product owners, underwriters, and claims managers to define reporting goals, enabling 100% adoption of self-service analytics through KPI-driven dashboards.
DXC Technology, India
Data Analyst Jan 2019 – Dec 2021
Project: Fraud Detection and Product Analytics @European Central Bank
Saved millions in fraud losses by implementing a real-time fraud detection system using Python and Spark, increasing transaction scoring coverage from <10% to 100% and reducing false positives by ~30%.
Migrated fraud analytics workflows to AWS (S3, EMR, Glue), improving agility and reducing infrastructure costs by 25%.
Built Tableau and Power BI dashboards to monitor fraud KPIs and compliance metrics, enabling auditors to respond in real time.
Extracted and analyzed operational and financial data from SAP S/4HANA and Oracle PeopleSoft ERP systems for risk profiling and cost optimization.
Developed interactive fraud monitoring dashboards in Power BI, integrating Snowflake and AWS data sources.
Conducted gap analysis between legacy fraud detection processes and redesigned pipeline, documenting workflows in Confluence.
Applied real-time anomaly detection on financial transactions using Spark MLlib, reducing false positive alerts.
Partnered with cross-functional teams including Data Science, Risk Strategy, and Product to align credit policies with business goals at national scale.
Implemented secure data sharing and role-based access controls within Snowflake to ensure compliance with data governance policies.
Integrated ERP master data (GL, vendor, procurement) into data lakes via AWS Glue, improving financial reporting visibility.
Automated reconciliation reports between ERP and CRM (Salesforce) systems, enhancing audit traceability by 20%.
Managed fraud detection pipelines and business intelligence dashboards using Git/GitHub to track iterations, automate testing, and align data assets across distributed teams.
Project: Pharmaceutical Sales and Drug Performance Analytics @AstraZeneca
Increased sales insight impact by 15% by analyzing CRM and sales data using SQL and R, uncovering trends and identifying new market opportunities.
Delivered Power BI reports for clinical trial tracking with drill-through features for site-level performance monitoring.
Collaborated with marketing to define physician segmentation criteria, influencing campaign allocation strategies.
Built demand forecasting models using Python and time-series analysis to optimize inventory planning.
Delivered Power BI dashboards for product performance and physician targeting, influencing strategic marketing allocations.
Presented drug performance trends and physician targeting strategies to cross-functional stakeholders, influencing data-driven decision-making in AstraZeneca’s marketing campaigns.
Designed story-driven dashboards in Power BI to communicate clinical trial progress and sales KPIs to business leaders, boosting engagement and improving reporting accuracy.
Built demand forecasting models using time series and regression in Python, improving inventory planning and reducing overstock by 18%.
Automated clinical trial data reporting using Python, saving 40+ hours/month and improving compliance.
Applied masking and data encryption methods aligned with HIPAA and 21 CFR Part 11 to secure clinical trial data and patient health records, ensuring regulatory compliance and protecting sensitive healthcare information.
Bridged business and technical teams to translate analytics requirements into scalable reporting systems adopted enterprise-wide.
Conducted data wrangling on pharmaceutical datasets to resolve inconsistencies, missing values, and outliers prior to advanced statistical modeling.
Prop Technology, India
Data Analyst Feb 2018 – Dec 2018
Built data pipelines using Databricks (Spark), Informatica, and AWS (S3, RDS), processing millions of records with 99% accuracy.
Integrated data from REST APIs and internal systems, ensuring robust validation and increasing processing throughput by 50%.
Designed a centralized SQL data warehouse for operational data, reducing report generation time by 30%.
Wrote optimized PostgreSQL queries and automated visualizations in Power BI and Python, improving executive visibility into performance metrics.
Delivered Snowflake-powered dashboards to business users by connecting Snowflake views to Tableau, allowing real-time exploration of customer churn and campaign performance.
Conducted ad-hoc EDA to uncover trends and anomalies within large datasets, supporting executive strategy.
Applied data governance policies and Agile methodologies to ensure secure and scalable analytics workflows.
Performed complex data wrangling using Python and SQL to cleanse and structure unorganized real estate datasets for location analysis and investment profiling.
EDUCATION
University of New Haven West Haven, CT
Master of Science in Business Analytics Jan 2022 – May 2023
Vardhman College of Engineering India
Bachelor’s in Electronics and Communication Engineering June 2014 – May 2018
PROJECTS
Credit Card Fraud Detection
Developed a machine learning-based solution for real-time credit card fraud detection, leveraging advanced algorithms and addressing challenges of data imbalance using SMOTE.
Evaluated and optimized models such as Random Forest and Gradient Boosting, achieving an AUC-ROC score of 0.98 through feature engineering and hyperparameter tuning.
Built a scalable fraud detection pipeline using Python (scikit-learn, NumPy, Pandas), enhancing financial security and mitigating potential losses.
Healthcare Supply Chain Optimization
Analyzed and optimized healthcare supply chain operations by identifying inefficiencies in inventory management and delivery processes using data-driven techniques.
Implemented optimization models and analytics tools to streamline operations, resulting in improved efficiency, reduced costs, and enhanced service levels in critical supply chains.
Conducted in-depth data analysis using tools like Python (Pandas, NumPy), Excel, and Tableau, providing actionable insights to support decision-making and resource allocation.
CERTIFICATES
AWS Certified Solutions Architect – Associate Certification
Splunk Enterprise Security Certified Admin