Srujan Chinta
Arlington, Texas • **************@*****.*** • +1-682-***-**** • linkedin.com/in/srujanchinta
EDUCATION
The University of Texas at Arlington • Arlington, Texas
M.S. Computer Science, GPA: 3.90/4.0 • Aug 2023 – Dec 2025
National Institute of Technology, Calicut • Kozhikode, Kerala
B.Tech. Computer Science and Engineering, GPA: 7.86/10 • Jul 2019 – May 2023
SKILLS
• Languages: Python, SQL, Bash, JavaScript, C++, YAML, Jinja
• Data Engineering: dbt, PySpark, Airflow, Kafka, AWS Glue, Delta Lake, Batch/Streaming Pipelines, Query Optimization, Data Modeling, Analytics Engineering, dbt Testing
• Cloud & Databases: Snowflake, PostgreSQL, MySQL, AWS (S3, EC2, Lambda, Redshift, Glue, Athena, QuickSight, EventBridge, SNS, SSM, CloudFormation), Azure (ADF, Databricks, Synapse), GCP (BigQuery, Composer)
• Visualization: Looker Studio, Power BI, QuickSight
• DevOps & Tools: Docker, Git, Linux, Shell Scripting
• AI & NLP: LangChain, LLaMA 3, Regex, NLP, Tabulate, Streamlit (planned)
• APIs: OpenRouteService API
EXPERIENCE
Research Assistant – AI for Supply Chain Optimization • April 2025 – Present
University of Texas at Arlington • Arlington, Texas
• Built a conversational AI app using LangChain and LLaMA 3 to automate facility location and shipping cost optimization.
• Developed Python tools with regex to extract variables from natural language for optimization models (see sketch below).
• Integrated OpenRouteService API for geospatial distance and cost calculations.
• Designed ReAct-style agent for multi-turn interaction and error recovery.
• Technologies: Python, LangChain, LLaMA 3, OpenRouteService, Streamlit (planned), Tabulate, Regex, NLP
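A minimal sketch of the regex-based variable extraction described above; the pattern names, units, and example request are hypothetical, and the project's actual extraction rules may differ.

```python
import re

# Hypothetical patterns for pulling optimization inputs out of a user request.
PATTERNS = {
    "num_facilities": re.compile(r"(\d+)\s+(?:facilities|warehouses)", re.IGNORECASE),
    "demand_tons": re.compile(r"demand\s+of\s+(\d+(?:\.\d+)?)\s*tons", re.IGNORECASE),
    "budget_usd": re.compile(r"budget\s+of\s+\$?(\d+(?:,\d{3})*(?:\.\d+)?)", re.IGNORECASE),
}

def extract_variables(text: str) -> dict:
    """Extract numeric optimization variables from a natural-language request."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = float(match.group(1).replace(",", ""))
    return found

if __name__ == "__main__":
    request = "Open 3 warehouses to cover a demand of 120.5 tons within a budget of $250,000."
    print(extract_variables(request))
    # {'num_facilities': 3.0, 'demand_tons': 120.5, 'budget_usd': 250000.0}
```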
Data Engineer • Jan 2022 – May 2023
Ceyline Shipping Services Pvt. Ltd. • Navi Mumbai, India (Remote)
• Automated ingestion of 10K+ daily shipment records by developing an S3-triggered AWS Lambda function in Python that parsed uploaded CSVs and inserted clean data into RDS MySQL, reducing manual effort by over 90% (see sketch below).
• Converted raw CSV datasets into partitioned Parquet files using Athena CTAS and INSERT INTO queries, reducing S3 storage costs by 65% and improving query performance by 70%.
• Scheduled Glue Python Shell jobs using EventBridge for daily transformations, ensuring timely updates to dashboards and SLA reports.
• Implemented SNS notifications to alert the team of Lambda failures, reducing missed data loads by 25% and improving pipeline visibility.
• Practiced deploying sample AWS infrastructure with CloudFormation templates to provision S3, Lambda, and SNS, building foundational infrastructure-as-code skills.
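A minimal sketch of the S3-triggered Lambda load described above, assuming hypothetical bucket, table, and column names and PyMySQL as the MySQL driver; the production function may differ.

```python
import csv
import io
import os

import boto3
import pymysql  # assumed MySQL driver; the original function may use a different client

s3 = boto3.client("s3")

def lambda_handler(event, context):
    """Triggered by an S3 upload: parse the CSV and load rows into RDS MySQL."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Read the uploaded CSV object from S3.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    rows = [
        (r["shipment_id"], r["origin"], r["destination"], r["weight_kg"])  # hypothetical columns
        for r in csv.DictReader(io.StringIO(body))
        if r.get("shipment_id")  # skip rows missing a primary identifier
    ]

    # Insert cleaned rows into RDS MySQL (connection settings come from environment variables).
    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )
    try:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO shipments (shipment_id, origin, destination, weight_kg) "
                "VALUES (%s, %s, %s, %s)",
                rows,
            )
        conn.commit()
    finally:
        conn.close()

    return {"loaded_rows": len(rows), "source_key": key}
```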
Data Engineer Intern • Jun 2021 – Dec 2021
Ceyline Shipping Services Pvt. Ltd. • Navi Mumbai, India (Remote)
• Developed an ETL pipeline using Python and Apache Airflow to process 10K+ shipment records per day (see sketch below).
• Optimized SQL queries used in logistics dashboards, reducing average execution time by 25%.
• Automated S3 ingestion workflows using Boto3, ensuring reliable delivery of 50+ daily log files with zero data loss.
• Applied data transformations (type casting, standardization, enrichment) and quality checks (nulls, duplicates, value ranges), exporting clean data to Parquet format for reporting and analytics.
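A minimal Airflow DAG sketch for the daily shipment ETL described above; the DAG id, schedule, and task bodies are placeholders, not the production pipeline.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_shipments(**context):
    # Placeholder: download the day's shipment CSVs from S3 (e.g. via boto3).
    pass

def transform_shipments(**context):
    # Placeholder: type casting, standardization, null/duplicate checks, Parquet export.
    pass

with DAG(
    dag_id="shipments_daily_etl",      # hypothetical DAG name
    start_date=datetime(2021, 6, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_shipments", python_callable=extract_shipments)
    transform = PythonOperator(task_id="transform_shipments", python_callable=transform_shipments)

    extract >> transform
```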
PROJECTS
Azure Data Engineering Pipeline – Adventure Sales Data
Technologies: Azure Data Factory, Databricks, Synapse, PySpark, Power BI, Delta Lake
• Designed a scalable ETL pipeline using Azure Data Factory to ingest data from CSV and GitHub into Azure Data Lake Gen2, following the Medallion Architecture (Bronze, Silver, Gold).
• Transformed and optimized data in Azure Databricks (PySpark), leveraging Delta Lake for ACID compliance and schema evolution (see sketch below).
• Built external tables and views in Azure Synapse Analytics, enabling efficient querying and structured analysis.
• Integrated Power BI with Synapse Analytics, creating real-time dashboards for business intelligence.
• Optimized data storage and performance, reducing query execution time by 50% and improving scalability and governance.
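A minimal PySpark/Delta Lake sketch of the bronze-to-silver step described above; the ADLS Gen2 paths and column names are hypothetical, and it assumes a Databricks cluster (or a local Spark session with the Delta package) is available.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Bronze layer: raw CSV landed by Azure Data Factory (hypothetical ADLS Gen2 paths).
bronze_path = "abfss://bronze@adventurelake.dfs.core.windows.net/sales/"
silver_path = "abfss://silver@adventurelake.dfs.core.windows.net/sales/"

raw = spark.read.format("csv").option("header", "true").load(bronze_path)

# Silver layer: typed, deduplicated records (column names are illustrative).
clean = (
    raw.withColumn("order_date", F.to_date("order_date"))
       .withColumn("sales_amount", F.col("sales_amount").cast("double"))
       .dropDuplicates(["order_id"])
)

# Delta Lake write with schema evolution enabled for newly arriving columns.
(clean.write.format("delta")
      .mode("overwrite")
      .option("mergeSchema", "true")
      .save(silver_path))
```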
Stock Market Real-Time Pipeline
Technologies: Kafka, AWS S3, Glue, Athena, EC2, Python, SQL
• Developed an end-to-end real-time data pipeline to process and analyze stock market data using Apache Kafka and AWS services.
• Implemented Kafka Producers & Consumers to stream real-time stock market data, ensuring low-latency data ingestion (see sketch below).
• Stored processed data in AWS S3 and leveraged AWS Glue Crawlers & Glue Catalog for automated schema inference and metadata management.
• Integrated AWS Athena for efficient querying and analysis of real-time stock data using serverless SQL.
• Deployed and managed Kafka on AWS EC2, ensuring high availability and scalable data streaming.
• Optimized data processing and storage, reducing query execution time by 50% and enabling real-time analytics for financial insights.
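A minimal kafka-python producer sketch for the streaming ingestion described above; the broker address, topic name, and tick payload are hypothetical, and the consumer and S3 sink are omitted.

```python
import json
import time

from kafka import KafkaProducer  # kafka-python client (assumed)

# Broker runs on an EC2 instance; address and topic name are hypothetical.
producer = KafkaProducer(
    bootstrap_servers="ec2-xx-xx-xx-xx.compute.amazonaws.com:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_tick(symbol: str, price: float) -> None:
    """Send one stock tick to the Kafka topic consumed by the S3 writer."""
    producer.send("stock_ticks", value={"symbol": symbol, "price": price, "ts": time.time()})

if __name__ == "__main__":
    publish_tick("AAPL", 189.42)   # illustrative tick
    producer.flush()               # block until the message is delivered
```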
Netflix Data Analytics Pipeline
Technologies: dbt, Snowflake, Amazon S3, SQL, Looker Studio, Power BI, Git
• Built an end-to-end ELT pipeline to analyze Netflix movie metadata and user ratings using dbt, Snowflake, and Amazon S3.
• Ingested raw CSV datasets into Amazon S3 and loaded them into Snowflake using COPY INTO for scalable data warehousing (see sketch below).
• Developed layered dbt models (raw, staging, dim, fact, mart) using Jinja and YAML for modular, reusable transformations.
• Created analytical tables including movie_analysis, genre_rating_distribution, user_engagement_summa and tag_relevance_analysis for BI and dashboarding.
• Applied 14+ dbt tests to validate data quality, covering null checks, uniqueness, referential integrity, and tag relevance logic.
• Connected final models to Looker Studio and Power BI to create dashboards visualizing top-rated movies, genre trends, and user behavior.
• Used GitHub for version control and dbt CLI for orchestration (dbt run, dbt test, dbt compile) to enable CI-ready, reproducible workflows.
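A minimal sketch of the S3-to-Snowflake load step using the Snowflake Python connector; the stage, table, warehouse, and credential names are hypothetical, and the project may issue the same COPY INTO from a Snowflake worksheet instead.

```python
import os

import snowflake.connector  # Snowflake Python connector (assumed)

# Connection details are illustrative; real values would come from a secrets store.
conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="COMPUTE_WH",
    database="NETFLIX",
    schema="RAW",
)

# COPY INTO from an external stage pointing at the S3 bucket (stage/table names hypothetical).
copy_sql = """
    COPY INTO raw_movies
    FROM @netflix_s3_stage/movies.csv
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
    ON_ERROR = 'ABORT_STATEMENT'
"""

cur = conn.cursor()
try:
    cur.execute(copy_sql)
    print(cur.fetchall())  # per-file load status returned by COPY INTO
finally:
    cur.close()
    conn.close()
```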