Data Engineer Processing

Location:

Bengaluru, Karnataka, India

Posted:

November 06, 2025

Resume:

UJWAL K

DATA ENGINEER

Email Id : *************@*****.***

Mobile : +91-916*******

LinkedIn : UJWAL K

Innovative Data Engineer with hands-on experience in architecting scalable ETL Pipelines, Data Validation, and Cloud-Native Data Solutions. Targeting challenging opportunities in Data Engineering or Analytics to leverage expertise in Big Data Technologies, Cloud Platforms, and Automation to drive Data-Driven Decision-Making and Operational Excellence in Bengaluru.

PROFILE SUMMARY

Dynamic Data Engineer with 3 years of proven experience in designing, developing, and deploying end-to-end ETL Pipelines on AWS cloud environments, driving a 90%+ improvement in data processing speed.

Expert in Apache Spark and PySpark for distributed Data Processing and Workflow Orchestration using Apache Airflow, managing complex pipelines supporting multi-source ingestion.

Proficient in SQL (PostgreSQL), Snowflake, and Elasticsearch to optimize Data Storage, Retrieval, and Real-time Analytics Queries, resulting in improved query efficiency.

Skilled in leveraging AWS Services such as S3, Athena, Glue, and Lambda for automated, event-driven data transformations.

Developed multilingual NLP-based classification pipelines integrating Azure OpenAI, enhancing Product Categorization Accuracy and Automating Manual Tagging Processes.

Strong collaborator and communicator, actively driving cross-team coordination with data scientists, engineers, and stakeholders to resolve critical data issues and ensure pipeline reliability.

ACHIEVEMENTS

Awarded Gold Medalist for outstanding academic performance in M.Tech program.

Recognized as Best Employee at Ai Palette for designing and developing a scalable Spark data processing pipeline, significantly enhancing system efficiency.

Automated Google Trends Data Collection and Validation Pipeline achieving 98% data coverage by auto-recovery of missing datasets.

Built and maintained 4 Spark + Airflow pipelines on AWS EC2 for automated ingestion, managing over 5 million high-volume data events and processing 10 GB of data monthly.

Spearheaded the integration of an Instagram data collection Airflow pipeline using an external vendor API, enabling automated ingestion and scalable storage in AWS S3 for downstream analytics.

Implemented multilingual foodscore scoring pipelines indexing results in Elasticsearch, supporting language-specific model selection and real-time insights.

CORE COMPETENCIES

End-to-end Data Pipeline Development

Distributed Data Processing & Optimization

Cloud-native Data Architecture

Automated Data Validation & Quality Checks

Real-time Search and Analytics

Multilingual NLP and AI-based Classification

CI/CD and Infrastructure Automation

Cross-team Collaboration & Agile Methodologies

Data Governance and Compliance

SOFT SKILLS

Strong analytical and problem-solving abilities

Effective communicator and team collaborator

Agile and adaptable to fast-paced environments

Detail-oriented with a commitment to data quality

Proactive in continuous learning and knowledge

Good Adaptability Skills

EDUCATION

2024-2025, M.Sc. in Data Science, Liverpool John Moores University, UK

2023-2024, PG Programme in Data Science, International Institute of Information Technology, Bengaluru

2016-2018, M.Tech, Manipal Institute of Technology, Manipal (Gold Medalist)

2012-2016, Bachelor of Engineering (BE), National Institute of Engineering, Mysore

TECHNICAL SKILLS

Big Data & ETL: Apache Spark, PySpark, Apache Airflow, MageAI

Databases: SQL, PostgreSQL, Snowflake, Elasticsearch, Data Warehousing, Data Modeling, DataOps

Cloud Platforms: AWS (S3, Athena, Glue, Lambda), Azure OpenAI

Programming: Python, Shell scripting

Containerization & Orchestration: Docker, Kubernetes

Version Control & CI/CD: Git, Jenkins

Data Visualization & Reporting: Kibana, MS Excel, Power BI

Others: Data Quality Assurance, Automation, Machine Learning basics

PERSONAL DETAILS

Date of Birth: 2nd September, 1994

Languages Known: English, Kannada, Hindi and Tulu

Address : 2AM-709, 2nd A main road, B Channasandra, OMBR Layout, Kasthuri Nagar, Bengaluru, Karnataka, 560043, India

WORK EXPERIENCE

Dec’22 - Present, Ai Palette (A GlobalData Company), Bengaluru, Data Engineer

Enhancing data storage and retrieval efficiency by optimizing Elasticsearch queries and utilizing AWS Athena, achieving significant improvements in query response times.

Designing and building 4 Spark + Airflow pipelines using AWS EMR and EC2, automating ingestion and enrichment of 5M+ high-volume data events and processing 10+ GB of data monthly with minimal manual intervention.

Engineering a robust pipeline for Google Trends data collection and validation, ensuring 98% data completeness through automated recovery of missing entries.

Automating product classification workflow with Azure OpenAI and prompt engineering, boosting classification accuracy from 70% to 85-90%, thereby enhancing operational efficiency.

Building a New Trend Addition Pipeline with historical tagging and LLM-based cleanups to enhance classification accuracy and trend analytics scalability.

Developing a multilingual foodscore pipeline that dynamically selects scoring models based on document language and indexes results into Elasticsearch, resulting in improved scoring accuracy.

Collaborating across 3-4 cross-functional teams to diagnose and resolve critical data issues, improving pipeline reliability and ensuring system uptime.

Implementing rigorous data quality checks throughout pipelines, achieving 100% data consistency and accuracy.

Documenting all processes and workflows in Confluence, fostering knowledge sharing and seamless collaboration across 3-4 projects.
