Post Job Free

Data Engineer Analytics

Location:
O'Fallon, MO, 63366
Posted:
May 10, 2025


Shruthi Bashetti

Dublin, OH **************@*****.*** 614-***-**** LinkedIn: Shruthi Bashetti

SUMMARY:

Data Engineer with 6 years of IT experience, specializing in designing and implementing scalable cloud data solutions. Skilled in GCP services such as BigQuery, Dataflow, Pub/Sub, and orchestration tools like Airflow. Proficient in SQL, Python, PySpark, and dimensional data modeling using Star and Snowflake schemas. Experienced in building real-time and batch data pipelines, optimizing ETL workflows, and improving data performance.

SKILLS

Programming: SQL, Python, R

Cloud Platforms (GCP): GCS, BigQuery, Cloud Composer, Dataproc, Pub/Sub, Dataflow

Big Data Frameworks: PySpark, Hadoop

Workflow Orchestration: Airflow

Databases: MySQL, Oracle

Workflow & Issue Tracking: Jira, ServiceNow

Development Environments: Jupyter Notebook, RStudio, Anaconda, SQL Server Management Studio (SSMS), SAP HANA Studio

Text Manipulation & Pattern Matching: Regex, template creation

Data Visualization & BI Tools: Tableau, Power BI

Dimensional Data Modeling: Star Schema, Snowflake Schema

Version Control & Collaboration: GitHub

ERP Systems: Oracle ERP, SAP BW on HANA

Other Tools: Curve Explorer, MS Office Suite (Excel with pivot analysis, Word, PowerPoint)

WORK EXPERIENCE

Vish Consultants Inc, Data Engineer Dec 2024 to Present

•Developed and optimized scripts to transfer data from Google Cloud Storage (GCS) to BigQuery, ensuring efficient and reliable data ingestion.

•Designed and created partitioned tables in BigQuery to enhance query performance and data management.

•Analyzed raw data based on business requirements and structured it into well-organized datasets for effective reporting and analytics.

•Built and maintained BigQuery datasets, tables, and data pipelines to support business objectives and streamline data processing workflows.

•Utilized Google Cloud Shell and the Cloud SDK for tasks such as data loading, bucket creation, and dataset management across GCS and BigQuery environments.

•Worked extensively with various file formats, including CSV, JSON, and Avro, ensuring seamless data integration and consistency.

•Created authorized views in BigQuery to securely share processed data with downstream applications while maintaining data governance.

•Leveraged GitHub for version control, maintaining clean, collaborative, and well-documented codebases for data engineering projects.

LTI Mindtree, GCP Data Engineer June 2022 to Nov 2024

•Designed and implemented data integration solutions on Google Cloud Platform (GCP) using services such as Cloud Storage, Dataflow, Pub/Sub, and BigQuery. Developed efficient scripts to transfer data from GCS to BigQuery, applying partitioning strategies for optimized performance and data management.

•Built and managed large-scale data pipelines using technologies like Apache Beam, Apache Spark, and Google Cloud Dataflow. Utilized Google Cloud Shell for seamless data transfers, data bucket creation, and dataset management, handling diverse file formats such as CSV, JSON, and AVRO.

•Established well-structured BigQuery datasets, tables, and pipelines to support business objectives, ensuring data was aligned with specific requirements. Created authorized views in BigQuery for enhanced data security and controlled access, enabling stakeholders to access relevant data efficiently.

•Maintained version control and facilitated collaboration using GitHub as the Source Code Management tool. Documented data pipelines, data models, and architectural decisions to support knowledge sharing and team productivity.

ICE Data Services, Data Engineer Nov 2019 to Jul 2022

•Designed and implemented real-time data pipelines using Cloud Dataflow and Apache Beam to process streaming data from Pub/Sub and ingest it into BigQuery, enabling low-latency analytics and business insights while ensuring data consistency with windowing and transformation techniques.

•Developed reusable Dataflow templates to streamline pipeline deployments, reducing development time and improving scalability for future ingestion tasks. Integrated these pipelines with Cloud Composer (Apache Airflow) for automating ETL workflows, including data extraction, transformation, and loading into BigQuery and Cloud Storage.

•Created and optimized Airflow DAGs, incorporating task retries, monitoring, and alerts to manage failures and minimize downtime. Integrated Airflow with GCP services like Dataflow, Pub/Sub, and BigQuery to achieve seamless end-to-end data orchestration.

•Optimized SQL queries in BigQuery by applying techniques such as partitioning, clustering, and denormalization, resulting in a 30% reduction in query execution time and improved data processing efficiency.

GENPACT India Pvt Ltd, Data Engineer Apr 2017 to Oct 2019

•Wrote complex SQL queries to retrieve, update, and manipulate data, and designed and optimized database structures (tables, indexes, constraints) for efficient storage and retrieval.

•Tuned query performance by analyzing execution plans, optimizing indexes, and applying query rewriting techniques to resolve performance issues.

•Built ETL processes to extract, transform, and load data between databases and disparate systems, ensuring data integrity and accuracy throughout the pipeline.

•Developed stored procedures, user-defined functions, and triggers to encapsulate business logic within the database and support automated data processing.

•Generated insights from large datasets and integrated SQL queries with reporting tools to support business decision-making; implemented data quality checks and used version control to manage SQL scripts and database objects.

PROJECTS

Chicago Crime Analysis - Conducted an in-depth analysis of the Chicago crime dataset using R, revealing significant spikes in crime rates during summer months and holiday seasons. The project involved comprehensive data cleaning and transformation, followed by detailed visualizations using dplyr and ggplot2 to highlight temporal and geographic crime trends. Additionally, developed a predictive model to forecast high-risk periods, enabling data-driven decision-making for resource allocation.

Data Warehouse Integration & Mobile Analytics at Global Bikes - Built a data warehouse for Global Bikes Inc. using Eclipse with SAP BW Modeling Tools and SAP GUI, incorporating key figures, dimensions, and analytical cubes. Designed a structured data flow, including data sources, transformations, and ETL processes, to ensure efficient loading of master and transactional data. Developed a mobile analytics app with SAP BusinessObjects Design Studio that enables real-time monitoring of revenue and net sales, empowering decision-makers with actionable insights for strategic planning.

North-Point Software Production Company Consortium - Developed classification models to accurately target potential purchasers and regression models to predict spending behavior. Reduced mailing costs by selectively targeting high-potential customers, optimizing resource allocation. Demonstrated expertise in data-driven decision-making, contributing to sustainable business growth. Leveraged R, Python, and Tableau, along with collaborative teamwork, for successful project execution.

CERTIFICATIONS:

•Google Professional Data Engineer


