UJWAL K
DATA ENGINEER
Email Id : *************@*****.***
Mobile : +91-916*******
LinkedIn : UJWAL K
Innovative Data Engineer with hands-on experience in architecting scalable ETL Pipelines, Data Validation, and Cloud-Native Data Solutions. Targeting challenging opportunities in Data Engineering or Analytics to leverage expertise in Big Data Technologies, Cloud Platforms, and Automation to drive Data-Driven Decision-Making and Operational Excellence in Bengaluru.
PROFILE SUMMARY
Dynamic Data Engineer with 3 years of proven experience in designing, developing, and deploying end-to-end ETL Pipelines on AWS cloud environments, driving a 90%+ improvement in data processing speed.
Expert in Apache Spark and PySpark for distributed Data Processing and Workflow Orchestration using Apache Airflow, managing complex pipelines supporting multi-source ingestion.
Proficient in SQL (PostgreSQL), Snowflake, and Elasticsearch to optimize Data Storage, Retrieval, and Real-time Analytics Queries, resulting in improved query efficiency.
Skilled in leveraging AWS Services such as S3, Athena, Glue, and Lambda for automated, event-driven data transformations.
Developed multilingual NLP-based classification pipelines integrating Azure OpenAI, enhancing Product Categorization Accuracy and Automating Manual Tagging Processes.
Strong collaborator and communicator, actively driving cross-team coordination with data scientists, engineers, and stakeholders to resolve critical data issues and ensure pipeline reliability.
ACHIEVEMENTS
Awarded Gold Medal for outstanding academic performance in the M.Tech program.
Recognized as Best Employee at Ai Palette for designing and developing a scalable Spark data processing pipeline, significantly enhancing system efficiency.
Automated Google Trends Data Collection and Validation Pipeline achieving 98% data coverage by auto-recovery of missing datasets.
Built and maintained 4 automated Spark + Airflow pipelines on AWS EC2 for ingestion, handling over 5 million data events and processing 10 GB of data monthly.
Spearheaded the integration of an Instagram data collection Airflow pipeline using an external vendor API, enabling automated ingestion and scalable storage in AWS S3 for downstream analytics.
Implemented multilingual foodscore scoring pipelines indexing results in Elasticsearch, supporting language-specific model selection and real-time insights.
CORE COMPETENCIES
End-to-end Data Pipeline Development
Distributed Data Processing & Optimization
Cloud-native Data Architecture
Automated Data Validation & Quality Checks
Real-time Search and Analytics
Multilingual NLP and AI-based Classification
CI/CD and Infrastructure Automation
Cross-team Collaboration & Agile Methodologies
Data Governance and Compliance
SOFT SKILLS
Strong analytical and problem-solving abilities
Effective communicator and team collaborator
Agile and adaptable to fast-paced environments
Detail-oriented with a commitment to data quality
Proactive in continuous learning and knowledge
Good Adaptability Skills
EDUCATION
2024-2025, M.Sc. in Data Science, Liverpool John Moores University, UK
2023-2024, PG Programme in Data Science, International Institute of Information Technology, Bengaluru
2016-2018, M.Tech, Manipal Institute of Technology, Manipal (Gold Medalist)
2012-2016, Bachelor of Engineering (BE), National Institute of Engineering, Mysore
TECHNICAL SKILLS
Big Data & ETL: Apache Spark, PySpark, Apache Airflow, MageAI
Databases: SQL, PostgreSQL, Snowflake, Elasticsearch, Data Warehousing, Data Modeling, DataOps
Cloud Platforms: AWS (S3, Athena, Glue, Lambda), Azure OpenAI
Programming: Python, Shell scripting
Containerization & Orchestration: Docker, Kubernetes
Version Control & CI/CD: Git, Jenkins
Data Visualization & Reporting: Kibana, MS Excel, Power BI
Others: Data Quality Assurance, Automation, Machine Learning basics
PERSONAL DETAILS
Date of Birth: 2nd September, 1994
Languages Known: English, Kannada, Hindi and Tulu
Address: 2AM-709, 2nd A Main Road, B Channasandra, OMBR Layout, Kasthuri Nagar, Bengaluru, Karnataka, 560043, India
WORK EXPERIENCE
Dec’22 - Present, Ai Palette (A GlobalData Company), Bengaluru, Data Engineer
Enhancing data storage and retrieval efficiency by optimizing Elasticsearch queries and utilizing AWS Athena, achieving significant improvements in query response times.
Designing and building 4 Spark + Airflow pipelines using AWS EMR and EC2, automating ingestion and enrichment of 5M+ high-volume data events and processing 10+ GB of data monthly with minimal manual intervention.
Engineering a robust pipeline for Google Trends data collection and validation, ensuring 98% data completeness through automated recovery of missing entries.
Automating product classification workflow with Azure OpenAI and prompt engineering, boosting classification accuracy from 70% to 85-90%, thereby enhancing operational efficiency.
Building a New Trend Addition Pipeline with historical tagging and LLM-based cleanups to enhance classification accuracy and trend analytics scalability.
Developing a multilingual foodscore pipeline that dynamically selects scoring models based on document language and indexes results into Elasticsearch, resulting in improved scoring accuracy.
Collaborating across 3-4 cross-functional teams to diagnose and resolve critical data issues, improving pipeline reliability and ensuring system uptime.
Implementing rigorous data quality checks throughout pipelines, achieving 100% data consistency and accuracy.
Documenting all processes and workflows in Confluence, fostering knowledge sharing and seamless collaboration across 3-4 projects.