Data Analytics Business Intelligence

Location:
Pune, Maharashtra, India
Posted:
September 11, 2025


Soujanya Sankathala

Glen Allen, VA 669-***-**** ********.************@*****.*** LinkedIn GitHub

(Open to Relocate)

SUMMARY

Data professional with 4+ years of experience in data analytics, data engineering, and business intelligence, with a Master’s in Data Science & Analytics. Actively seeking opportunities to apply data analytics expertise to deliver data-driven insights, optimize business processes, and support strategic decision-making for leadership.

AWARDS, CERTIFICATIONS, AND RECOGNITIONS

● Published Research – IEEE CAI 2025: “InvadeAI: Interactive Advertising with Multimodal AI”

● Salesforce Certified Administrator, Salesforce Certified Platform Developer I

● SnowPro Associate: Platform Certification

Associated with Birlasoft

● Received multiple awards, including Applause (x2), Manager Select, and Individual Excellence (x2), and was consistently recognized for teamwork, responsibility, customer satisfaction, and driving success in project delivery.

● Received Star Award for driving critical success in the Cloud transformation project, delivering high-impact analytics solutions that improved efficiency and earned client and leadership appreciation.

● Awarded ‘Best Trainer’s Choice’ from a cohort of 150+ trainees for exceptional technical aptitude, a rapid learning curve, and demonstrated leadership potential during the training program.

PROFESSIONAL EXPERIENCE

U.S. Bank Remote, US

Data Analyst May 2024 - Present

Initiative 1: Consolidated siloed data systems that slowed compliance reporting by integrating them into a Snowflake platform, designing scalable schema models and secure data flows that improved reporting accuracy, strengthened audit readiness, and delivered actionable business insights.

● Integrated Snowflake with AWS S3, implementing star schema models to unify siloed datasets and enabling Tableau dashboards that improved reporting accuracy by 40% and delivered insights for risk reporting and lending decisions.

● Optimized Snowflake query performance by leveraging clustering keys, micro-partition pruning, schema design strategies, and warehouse workload tuning, reducing compute overhead and cutting dashboard latency by 30%.
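
A minimal sketch of the clustering and pruning pattern this bullet refers to, using the snowflake-connector-python client; the account, table, and column names are hypothetical placeholders, not U.S. Bank’s schema:

```python
# Sketch only: hypothetical table/column names.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",  # placeholder credentials
    warehouse="REPORTING_WH", database="RISK_DB", schema="CURATED",
)
cur = conn.cursor()

# Cluster the fact table on the columns most dashboards filter by, so
# micro-partition pruning can skip irrelevant partitions.
cur.execute("ALTER TABLE FACT_LOANS CLUSTER BY (REPORT_DATE, REGION)")

# Filter directly on the clustering columns (no functions wrapped around
# them) so the optimizer can prune rather than scan the whole table.
cur.execute("""
    SELECT REGION, SUM(EXPOSURE_AMT)
    FROM FACT_LOANS
    WHERE REPORT_DATE >= DATEADD(day, -30, CURRENT_DATE)
    GROUP BY REGION
""")
print(cur.fetchall())
```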

● Led Scrum practices by organizing daily stand-ups and sprint reviews, enhancing team accountability and project delivery while fostering continuous improvement.

● Collaborated with finance and compliance teams to define regulatory KPIs, design data flows (source → staging → curated), and implement validation rules, enhancing trust in audit and compliance reporting.

● Built automated ELT pipelines in Apache Airflow, using Python and SQL for DAG scheduling and dependency management to load data into Snowflake daily, reducing refresh times by ~30% and keeping Tableau dashboards up to date for near-real-time operational and compliance decisions.
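
A hedged sketch of the daily DAG structure described above, using Airflow’s standard PythonOperator; the dag_id, task names, and load functions are illustrative assumptions, not the actual bank pipeline:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_to_stage():
    ...  # pull source files into the staging area (placeholder)

def load_to_snowflake():
    ...  # run COPY INTO / MERGE statements against Snowflake (placeholder)

with DAG(
    dag_id="daily_compliance_elt",  # hypothetical name
    start_date=datetime(2024, 5, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_stage", python_callable=extract_to_stage)
    load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)
    extract >> load  # dependency: load runs only after extraction succeeds
```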

Initiative 2: Due to large volumes of structured and semi-structured financial data received daily, the existing systems struggled to process them within compliance and reporting timelines. To address this, I designed and implemented scalable ETL pipelines on AWS to ensure timely, secure, and efficient data processing.

● Automated scalable ETL pipelines with SQL, PySpark and AWS serverless tools (Lambda, Step Functions, Athena) to ingest and process terabytes of structured and semi-structured data daily, reducing onboarding and refresh times by 60%.
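
A brief PySpark sketch of the ingest step, assuming JSON files landed in S3; the bucket paths and column names are invented for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_financial_ingest").getOrCreate()

# Read semi-structured JSON landed in S3 (schema inferred here for brevity;
# a production job would pin an explicit schema).
raw = spark.read.json("s3://example-bucket/landing/transactions/2024-05-01/")

# Light transformation: derive a date column and partition the output so
# downstream Athena queries only touch the partitions they need.
curated = raw.withColumn("txn_date", F.to_date("event_ts"))
curated.write.mode("overwrite").partitionBy("txn_date").parquet(
    "s3://example-bucket/curated/transactions/"
)
```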

● Implemented granular IAM-based security policies across AWS data pipelines, safeguarding sensitive financial data and ensuring 100% compliance with internal governance and SOC2/GDPR requirements.

● Automated CI/CD workflows with Bitbucket and Terraform for ingestion pipelines, standardizing deployments across dev, test, and production environments and improving reproducibility with built-in version control.

Other Duties:

● Developed predictive models (time series, regression) using Python (Pandas, Scikit-learn) to forecast spending patterns and detect anomalies, improving decision accuracy and reducing false positives by 20%.
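
An illustrative sketch (on synthetic data) of the forecast-plus-anomaly approach: fit a regression to spending history, then flag points that sit far from the fitted trend:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
df = pd.DataFrame({"month_idx": np.arange(36)})
df["spend"] = 100 + 2.5 * df["month_idx"] + rng.normal(0, 5, 36)  # synthetic trend

model = LinearRegression().fit(df[["month_idx"]], df["spend"])
df["predicted"] = model.predict(df[["month_idx"]])

# Flag points more than 3 standard deviations from the fitted trend as anomalies.
residuals = df["spend"] - df["predicted"]
df["anomaly"] = residuals.abs() > 3 * residuals.std()
print(df[df["anomaly"]])
```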

● Automated Excel-based risk assessment and executive reports with VBA, INDEX-MATCH/VLOOKUP and Pivot Tables. Built Power Pivot/DAX models for 1M+ rows and centralized reports in SharePoint for stakeholder access.

● Built reusable Python scripts for data wrangling, data quality, and drift detection to validate schema and category-level changes in financial datasets; integrated alerts into Tableau dashboards, reducing manual QA effort by 70%.
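
A minimal sketch of the kind of reusable schema and category-drift checks this bullet describes; the expected columns and baseline categories are hypothetical:

```python
import pandas as pd

EXPECTED_COLUMNS = {"txn_id", "txn_date", "category", "amount"}  # hypothetical schema

def validate_schema(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable schema problems (empty = clean)."""
    issues = []
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        issues.append(f"missing columns: {sorted(missing)}")
    if "amount" in df.columns and df["amount"].isna().any():
        issues.append("null values in 'amount'")
    return issues

def detect_category_drift(df: pd.DataFrame, baseline: set[str]) -> set[str]:
    """Categories present today that the baseline snapshot has never seen."""
    return set(df["category"].unique()) - baseline

today = pd.DataFrame({"txn_id": [1], "txn_date": ["2024-05-01"],
                      "category": ["wire"], "amount": [120.0]})
print(validate_schema(today))
print(detect_category_drift(today, baseline={"ach", "card"}))
```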

● Leveraged JIRA to track project progress & maintained extensive documentation on Confluence pages.

● Conducted knowledge training sessions on BI dashboards for key users, enabling self-service analytics and improving user adoption, leading to 30% fewer change requests in reports and a higher number of active report users.

● Conducted A/B testing on Tableau dashboard layouts and navigation flows, improving usability and increasing stakeholder engagement by 15%.

BIRLASOFT Pune, India

Data Analyst Dec 2020 - Jun 2023

Client: Invacare

Initiative 1: Invacare, a medical device manufacturer, relied on Oracle Fusion for product lifecycle tracking, but its native reporting lacked flexibility and left stakeholders with limited visibility into supply chain performance. I partnered with stakeholders to design and deliver Power BI and OTBI dashboards that provided real-time insights and enhanced decision-making.

● Designed and delivered Power BI dashboards by integrating Oracle Fusion OTBI subject areas with Power Query/DAX, tracking 10+ KPIs (product lifecycle status, product readiness %, approval rates, user adoption metrics); presented insights to leadership and boosted executive decision-making by 25%.

● Developed scalable data models and SQL queries in Oracle OTBI to design 40+ reports tracking product lifecycle and change management KPIs, reducing dashboard load times from 25s to <15s and enabling self-service analytics.

● Applied statistical methods for trend analysis and anomaly detection on 1.2M+ records, and conducted A/B tests to identify inefficiencies and optimize Invacare’s supply chain for cost savings and revenue growth.

Initiative 2: Stakeholders faced challenges manually compiling data extracts and meeting audit requirements. I automated reporting by leveraging BI Publisher to build SQL data models, and implemented advanced recurring scheduling with pixel-perfect outputs and bursting for role-specific delivery of regulatory and operational reports, reducing effort by 80%.

● Developed custom SQL-based data models in BI Publisher and tuned queries (joins, indexing, parameterization) to extract large medical device datasets (>500k records) from Oracle Fusion database tables; optimized runtime by 40%.
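
BI Publisher data models are defined inside the tool itself, so the snippet below only illustrates the bind-variable parameterization pattern this bullet mentions, using the python-oracledb driver against hypothetical table and column names:

```python
import datetime
import oracledb

conn = oracledb.connect(user="report_user", password="...", dsn="dbhost/ORCLPDB1")
cur = conn.cursor()

# Bind variables (:start_dt, :end_dt) let Oracle reuse the parsed plan and
# keep the extract window configurable, mirroring a report parameter list.
cur.execute(
    """
    SELECT item_number, revision, lifecycle_phase, last_update_date
    FROM   device_items
    WHERE  last_update_date BETWEEN :start_dt AND :end_dt
    """,
    start_dt=datetime.date(2023, 1, 1),
    end_dt=datetime.date(2023, 3, 31),
)
for row in cur.fetchmany(10):
    print(row)
```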

● Automated scheduling of pixel-perfect BI Publisher reports by aligning to business cycles (daily, monthly, quarterly) and staggering heavy jobs during off-peak hours, which reduced server load and guaranteed timely delivery to stakeholders.

● Implemented BI Publisher bursting schedules to consolidate data processing into a single execution and distribute role-specific outputs in multiple formats (Excel, PDF, CSV) via email/SFTP; improved accessibility for 50+ global users.

● Maintained governance and audit readiness for scheduled reports by retaining job history for traceability and validating outputs, parameters, and delivery channels; reduced redundant schedules by 25% and ensured full compliance.

Initiative 3: Improved system reliability and compliance by delivering end-to-end Oracle Cloud PLM support through UAT, regression testing, role-based security, and cross-functional collaboration, ensuring accurate reporting and smoother operations.

● Led UAT and data validation across integrated systems to ensure accuracy and reliability; supported quarterly Oracle Cloud release cycles by executing regression scripts, validating fixes, and maintaining stability of the PD module.

● Conducted root-cause analysis on part revisions and change impacts to prevent inventory errors; provided hypercare support post go-live by resolving defects and integration issues and delivering enhancements.

● Configured Role-Based Access Control (RBAC) for 150+ global users via Oracle Security Console, preventing unauthorized changes, and achieving zero findings in internal IT audits.

● Collaborated with 3+ cross-functional teams (Product, Operations, Supply Chain, Engineering) to gather requirements, create business requirements documentation (BRD) and deliver dashboards to support board-level strategic planning and data-driven decision-making.

Initiative 4: Led migration from legacy Agile PLM to Oracle Fusion SCM Cloud – Product Lifecycle Management (PLM, Product Development) to replace an on-premise system with high costs, complex custom integrations, and limited new feature development. Migrated processes and data to a scalable, compliant Cloud platform by configuring attributes/BOMs, executing change management processes (ECR, ECA, ECO, Change Control), automating OIC workflows and FBDI data loads, and ensuring UDI compliance and data integrity.

● Led Agile PLM to Oracle Cloud PLM migration by leveraging FBDI templates (ZIP via UCM) for item, BOM and ECO data loads. Developed Excel/SQL pre-validation scripts to enforce attribute and LOV compliance, eliminating 95% of staging import errors and ensuring data integrity for go-live.
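
The pre-validation scripts were Excel/SQL; the sketch below re-expresses the same idea in pandas for illustration, with invented column names and a hypothetical list of values:

```python
import pandas as pd

ALLOWED_LIFECYCLE = {"Design", "Production", "Obsolete"}  # hypothetical LOV

items = pd.read_csv("fbdi_item_load.csv")  # flattened FBDI worksheet (placeholder)

errors = []
# Required attributes must be populated before the ZIP is staged via UCM.
for col in ("ITEM_NUMBER", "ORGANIZATION_CODE", "LIFECYCLE_PHASE"):
    blank = items[items[col].isna()]
    if not blank.empty:
        errors.append(f"{len(blank)} rows missing {col}")

# Values outside the configured list of values would fail the staging import.
bad_lov = items[~items["LIFECYCLE_PHASE"].isin(ALLOWED_LIFECYCLE)]
if not bad_lov.empty:
    errors.append(f"{len(bad_lov)} rows with unknown LIFECYCLE_PHASE")

print("\n".join(errors) or "ready for FBDI staging")
```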

● Configured 200+ product and regulatory attributes and BOM structures for the PD module, and built Oracle Integration Cloud (OIC) workflows to automate change orders and data mapping, cutting ECR/ECO cycle times by up to 80%.

● Implemented UDI attributes (FDA product codes, submission numbers, marketing authorization types) and automated UDICO submission workflows to meet medical device regulatory requirements, with validation and error-handling logs, achieving 100% FDA submission accuracy and enhancing compliance tracking.

PROJECTS

InvadeAI: Interactive Advertising with Computer Vision & LLMs (IEEE CAI 2025)

● Developed an AI-powered advertising framework that integrates YOLOv8 for real-time product detection, BLIP3 for image captioning, and LLaMA3 with the Groq API for product link retrieval.
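
A minimal sketch of the detection stage using the public ultralytics API; the fine-tuned weights file and input frame are placeholders, and the captioning and link-retrieval stages are omitted:

```python
from ultralytics import YOLO

model = YOLO("invadeai_yolov8.pt")  # hypothetical fine-tuned 89-class weights

# Run detection on a single frame; in the paper's pipeline the detected
# crops would then be captioned (BLIP) and matched to product links (LLM).
results = model("frame.jpg", conf=0.5)
for box in results[0].boxes:
    cls_name = results[0].names[int(box.cls)]
    print(cls_name, float(box.conf))
```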

● Achieved mAP@50 of 0.535 across 89 product classes and 97% accuracy in matching detected products to purchase links, enhancing multimodal e-commerce engagement.

Airbnb Analytics for Santa Clara - (Python, Pandas, NumPy, NLP, Sentiment Analysis, Tableau)

● Conducted data cleaning and exploratory data analysis (EDA) on Airbnb listings using Python (Pandas, NumPy) to identify trends in pricing, availability, and room types across 18 neighborhoods in Santa Clara.
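
A hedged EDA sketch assuming the public Inside Airbnb listings layout; the file name and column names are assumptions based on that dataset, not the project’s exact files:

```python
import pandas as pd

listings = pd.read_csv("listings.csv")  # assumed Inside Airbnb export

# Basic cleaning: coerce price strings like "$120.00" to floats and drop
# rows without a price or neighbourhood.
listings["price"] = (listings["price"].astype(str)
                     .str.replace(r"[$,]", "", regex=True)
                     .astype(float))
listings = listings.dropna(subset=["price", "neighbourhood"])

# Median price and listing count per neighbourhood and room type.
summary = (listings.groupby(["neighbourhood", "room_type"])["price"]
           .agg(["median", "count"])
           .sort_values("median", ascending=False))
print(summary.head(10))
```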

● Performed sentiment and statistical analysis on 50K+ user reviews, applying NLP techniques to understand guest experience, and created a Tableau dashboard that improved booking decisions and user satisfaction by 25%.

Real-Time Analytics of Arbitrage Opportunities - (Apache Kafka, PySpark, MongoDB, Grafana)

● Designed and deployed a real-time cryptocurrency arbitrage detection pipeline using Apache Kafka for high-throughput data ingestion from exchanges (Bitstamp, Bitfinex, Kraken) and PySpark for distributed stream processing.

● Implemented Bloom filters to ensure transaction uniqueness and applied differential privacy techniques, enabling 100% anonymization of sensitive trading data. Used InfluxDB and MongoDB to store 4,500+ real-time trades per exchange, enabling pattern recognition with K-means clustering and LSH.
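
A self-contained sketch of the Bloom-filter de-duplication described above (pure Python, so no library assumptions); the bit-array size and hash count are illustrative, not the tuned production values:

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k independent positions by salting the hash input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

seen = BloomFilter()
trade_id = "bitstamp:btcusd:7421933"  # hypothetical trade key
if trade_id not in seen:  # no false negatives; small false-positive rate
    seen.add(trade_id)
    # ... forward the trade to the stream-processing stage
```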

● Built a Grafana dashboard with 5-minute refresh intervals to visualize price disparities across exchanges, and developed a Python-based email alerting system to trigger notifications for profitable arbitrage opportunities with <1-minute latency.

Smart Solar Scheduling - (Scikit-learn, Machine Learning, Feature Engineering, Tableau)

● Developed 5 supervised learning models, including Linear Regression, Decision Trees, Random Forest, XGBoost, and SVM, to forecast solar energy usage for EV charging and home consumption.
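
A sketch of this model comparison on synthetic features; the real project used engineered weather and usage features, so the data and numbers here are placeholders:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))  # stand-in for engineered features
y = X[:, 0] * 2 + np.sin(X[:, 1]) + rng.normal(0, 0.3, 500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit each candidate model and compare held-out RMSE.
for name, model in [("LinearRegression", LinearRegression()),
                    ("RandomForest", RandomForestRegressor(n_estimators=200,
                                                           random_state=42))]:
    model.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"{name}: RMSE={rmse:.2f}")
```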

● Achieved R = 0.93 and RMSE = 0.40 with Random Forest and deployed predictions through a Tableau dashboard for real-time monitoring, reducing grid dependency and household energy costs by 25%.

U.S. Accident Trend Analysis - (ETL, Mage, Python, SQL, BigQuery, Power BI)

● Orchestrated a scalable ETL pipeline on GCP utilizing Mage to transform and process over 3 million U.S. traffic accident records (2016–2023); led EDA with Python and ingested data into Google Cloud Storage.
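
A hedged sketch of the kind of BigQuery aggregation this pipeline feeds (the analysis described in the next bullet); the project, dataset, and column names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses ambient GCP credentials

sql = """
    SELECT Weather_Condition, COUNT(*) AS accidents
    FROM `my-project.accidents.us_accidents`
    WHERE EXTRACT(YEAR FROM Start_Time) BETWEEN 2016 AND 2023
    GROUP BY Weather_Condition
    ORDER BY accidents DESC
    LIMIT 10
"""
# Pull the top accident-prone weather conditions into a DataFrame for BI use.
top_weather = client.query(sql).to_dataframe()
print(top_weather)
```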

● Analyzed accident patterns by weather, time, and road conditions via SQL and BigQuery, and visualized key risk factors with a Power BI dashboard, improving stakeholder insight and risk visibility into public safety metrics.

TECHNICAL SKILLS

Programming Languages: Python (NumPy, Pandas, Matplotlib, Scikit-learn, PyTorch, TensorFlow), R

Data Visualization & BI: Power BI (Power Query, DAX), Tableau, Oracle OTBI, OBIEE, Streamlit, Looker, Power Apps

Databases & Warehousing: Snowflake, Microsoft SQL Server, MySQL, Oracle SQL, NoSQL, Google BigQuery

Cloud & Big Data: Oracle Cloud (OCI, SCM Cloud), AWS (S3, Redshift, Glue, Lambda), GCP, Azure Synapse, Apache Airflow, Apache Kafka, PySpark, Hadoop

Machine Learning & AI: Predictive Modeling, Statistical Analysis (Regression, Hypothesis Testing, Time Series), Supervised ML (Logistic Regression, Decision Trees, Random Forest, XGBoost), Unsupervised ML (K-means), NLP, Classification Techniques, A/B Testing, LLMs, RAG, AI Agents, Prompt Engineering

Tools & Technologies: Microsoft Excel, PowerPoint, JIRA, Agile/Scrum, Git, Docker, Salesforce CRM

EDUCATION

San Jose State University San Jose, US

Master of Science in Data Science (GPA: 3.7/4) Aug 2023 - May 2025

Savitribai Phule Pune University Pune, India

Bachelor of Engineering in Computer Science Aug 2016 - May 2020


