SHASHANK BANALA
Data Engineer | BI Developer
+1-984-***-**** ****************@*****.*** LinkedIn
SUMMARY
Data Engineer with 3 years of experience designing scalable cloud-native pipelines and automated reporting systems. Skilled in Python, SQL, and modern platforms like AWS, Azure, and Snowflake, with expertise in ETL, API integrations, and cross-team collaboration. Applied AI-driven automation and LLM APIs for anomaly detection and reporting, cutting latency and saving 480+ engineer hours annually. Recognized for problem-solving, adaptability, and clear communication in delivering impactful dashboards and data products for finance, operations, and compliance while ensuring responsible AI adoption.
KEY ACHIEVEMENTS
•Migrated 12+ legacy pipelines to AWS across graduate projects and early career work, contributing to $6,500/month savings in server costs.
•Streamlined data quality checks that prevented 145+ production issues over 9 months.
•Reduced Tableau dashboard refresh times from 15 minutes to under 3 minutes, enabling finance teams to close monthly reports 2 days faster.
•Embedded 3 external vendor APIs into internal data systems, improving SLA adherence for reporting.
•Experimented with vector database–based retrieval (Pinecone, FAISS) to enable semantic querying of unstructured datasets in graduate projects.
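The semantic-retrieval experiments above (Pinecone, FAISS) come down to nearest-neighbor search over embedding vectors. A minimal pure-Python sketch of that idea, with toy hand-written vectors and hypothetical document ids standing in for a real embedding model and vector store:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """Return the k document ids whose embeddings are most similar to the query."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy "embeddings"; a real pipeline would produce these with an embedding model
index = {
    "refund_policy": [0.9, 0.1, 0.0],
    "late_fees":     [0.8, 0.2, 0.1],
    "office_hours":  [0.0, 0.1, 0.9],
}

print(top_k([1.0, 0.0, 0.0], index, k=2))  # most similar documents first
```

A production setup would replace the brute-force scan with an approximate index (FAISS `IndexFlat`-style or a Pinecone namespace), but the ranking logic is the same.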
TECHNICAL SKILLS
Programming & Querying — Python, SQL, Bash, PowerShell
Databases & Data Warehousing — Snowflake, MySQL, Oracle, Pinecone (Vector DB)
Cloud Platforms — AWS (EC2, S3, Glue, Kinesis), Azure (Data Factory, Synapse, OpenAI Service)
Data Processing & Big Data — Databricks, Delta Lake, PySpark, Apache Flink
ETL & Orchestration — Apache Airflow, Jenkins, Kafka, dbt
BI & Visualization — Tableau, Power BI, Excel (Pivot Tables, Macros)
AI/ML & Advanced Tools — scikit-learn, OpenAI APIs, LLMOps (RAG with vector databases)
Version Control & DevOps — Git, Docker
Collaboration & Workflow — JIRA, Agile (Scrum/Kanban)
PROFESSIONAL EXPERIENCE
Data Engineer, Inspira Financial — Feb 2025 – Present, USA
•Contributed to cloud migration of batch pipelines from on-prem to AWS (EC2, S3), helping eliminate 60+ manual intervention steps per week and improve scalability.
•Leveraged Delta Lake and Databricks to build a real-time ingestion pipeline for 3.8M+ financial records weekly, applying time-travel and schema enforcement to ensure data accuracy, consistency, and auditability.
•Designed and deployed RESTful API ingestion pipelines for 3 external data providers, processing 200K+ records daily with standardized error handling, retry logic, and improved reliability of downstream reporting.
•Contributed to 40+ automated validation scripts and alert systems, helping detect 90+ data anomalies pre-deployment.
•Assisted in containerizing PySpark ETL jobs with Docker and integrated them into Jenkins CI/CD workflows; supported 20+ production releases with smooth rollouts.
•Piloted integration of the OpenAI API with Snowflake to summarize financial trends, demonstrating responsible, AI-driven enhancements to BI workflows.
Data Analyst, Wipro — May 2020 – Dec 2022, India
•Automated ingestion from 5+ external sources (APIs, flat files, Oracle) using Python scripts and schedulers, standardizing workflows and saving ~20 hours per week while improving data accuracy and timeliness.
•Wrote and optimized 70+ SQL queries and stored procedures across Oracle and MySQL to perform transaction-level transformations, improving query efficiency and ensuring reliable reporting outputs.
•Cleaned and migrated 1.6M records using Salesforce Data Loader and Python-based validation, reducing data duplication by over 12,000 entries.
•Created and maintained Tableau dashboards for 15+ stakeholders with hourly refresh cycles tied to Salesforce data, enabling timely insights and more informed decision-making.
•Collaborated with data architects to design star schema models for reporting needs, improving query performance and simplifying stakeholder reporting.
•Assisted in designing and maintaining data warehouse structures to support enterprise reporting and analytics.
•Supported QA and release cycles in Agile teams, helping deploy biweekly reporting updates without delays.
EDUCATION
Master of Science in Data Science — University of Massachusetts Dartmouth, Jan 2023 – Dec 2024, Dartmouth, MA
Relevant Coursework & Graduate Projects: Data Warehousing, Cloud Data Engineering, Big Data Analytics
Bachelor of Technology in Information Technology — Vignana Bharathi Institute of Technology, Aug 2017 – Aug 2021, Hyderabad, India
PROJECTS
NYC Taxi Data Analytics (Graduate Project, 2023), Azure Synapse, Power BI, Python
•Built a pipeline to ingest and clean 25M+ NYC taxi records from Azure Blob into Synapse.
•Used scikit-learn to model fare prediction and passenger density for route planning analysis.
•Designed Power BI dashboards with filters for borough, time, and fare trends, viewed by over 50 users.
Retail Sales Pipeline Automation (Capstone Project, 2024), Snowflake, Airflow, Jenkins, Tableau
•Developed a DAG-based pipeline with Airflow to extract, transform, and load 100K+ daily sales rows.
•Deployed Jenkins CI/CD pipeline to refresh Tableau reports and push Snowflake data models nightly.
•Created dashboards displaying SKU-wise trends, reducing report creation time by over 12 hours/week.
•Incorporated Azure OpenAI service to auto-generate narrative insights alongside Tableau dashboards, enhancing stakeholder adoption.
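The DAG-based pipeline above follows a standard extract-transform-load shape. A minimal sketch of the step functions, in plain Python with hypothetical field names and toy rows; in the real project each function would be wired to an Airflow task, with Snowflake as the load target:

```python
def extract(rows):
    """Extract: copy incoming rows (a real task would pull daily sales from the source)."""
    return [dict(r) for r in rows]

def transform(rows):
    """Transform: drop rows with missing SKUs and compute per-line totals."""
    cleaned = []
    for r in rows:
        if not r.get("sku"):
            continue  # reject rows that fail the SKU check
        r["total"] = r["qty"] * r["unit_price"]
        cleaned.append(r)
    return cleaned

def load(rows, warehouse):
    """Load: append transformed rows (a real task would write to Snowflake)."""
    warehouse.extend(rows)
    return len(rows)

raw = [
    {"sku": "A1", "qty": 2, "unit_price": 5.0},
    {"sku": "",   "qty": 1, "unit_price": 9.0},  # rejected: missing SKU
]
warehouse = []
loaded = load(transform(extract(raw)), warehouse)
print(loaded, warehouse[0]["total"])  # 1 10.0
```

Keeping each step a pure function makes the tasks independently testable and easy to schedule as separate Airflow operators.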
CERTIFICATIONS
SnowPro Core Certification — Snowflake
Oracle Cloud Infrastructure Data Platform Associate (Certification Track)
Microsoft Azure Data Fundamentals (DP-900) — Microsoft
BADGES AND MICRO-CREDENTIALS
dbt Fundamentals Badge
Snowflake Hands-on Labs (Project Badges)