Bejoy Chate
Data Analyst
IN 219-***-**** ***********@*****.*** LinkedIn
SUMMARY
Data Analyst with about 3 years of experience in data analytics, data engineering, and business intelligence, specializing in SQL, Python, Power BI, and Big Data technologies to support data-driven decision-making.
Expertise in ETL pipeline development, data warehousing (Snowflake, Redshift, PostgreSQL), and automation using Apache Airflow, AWS Glue, and dbt.
Business Intelligence (BI) & dashboard development using Power BI, Looker, and Tableau, providing actionable insights for stakeholders.
Real-time data streaming and API integration with Kafka, Spark, and REST APIs, improving data accessibility for large-scale datasets.
Machine Learning & Predictive Analytics using Scikit-learn, TensorFlow, and NLP, supporting fraud detection, price prediction, and customer sentiment analysis.
Cloud Data Engineering expertise in AWS (S3, Redshift, Lambda, Glue, EC2), Snowflake, and Azure (Data Factory, Databricks, Synapse).
TECHNICAL SKILLS
Programming Languages: Python, SQL, Scala, Bash Scripting
Big Data Technologies: Apache Spark, Hadoop (HDFS, Hive, HBase), Kafka, Airflow
Cloud Computing: AWS (S3, Redshift, Lambda, Glue, EMR, EC2), Snowflake, Azure (Data Factory, Databricks, Synapse)
Database & Data Warehousing: Snowflake, SQL Server, PostgreSQL, MySQL, MongoDB, Cassandra, Amazon Redshift, Teradata
ETL & Data Integration: Apache Airflow, AWS Glue, Informatica, Talend, dbt (Data Build Tool)
Data Visualization: Power BI, Tableau, Looker, Matplotlib, Seaborn, Plotly
Machine Learning & Analytics: Pandas, NumPy, Scikit-learn, TensorFlow, NLP, Predictive Analytics
DevOps & CI/CD: Git, Jenkins, Docker, Kubernetes, Terraform
Project Tools: Jira, Confluence, ServiceNow, MS Office Suite (Excel, Word, PowerPoint)
Methodologies: Agile (Scrum, Kanban), Waterfall
WORK EXPERIENCE
Data Analyst / Software Engineer IN, USA
Merck Oct 2024 – Current
Built an automated data pipeline for processing clinical trial records daily, reducing ingestion time from 10 hours to 3 hours using Apache Airflow and AWS Glue.
Developed RESTful APIs in Python (FastAPI, Flask) to facilitate secure data exchange between clinical trial management systems, reducing API response time from 800ms to 250ms.
Built interactive Power BI dashboards integrated with Snowflake to provide real-time tracking of 50+ ongoing patient enrollments, improving visibility for research teams.
Optimized SQL queries in PostgreSQL, reducing compliance report generation time from 15 minutes to 3 minutes, improving response time for regulatory audits.
Implemented Kafka-based real-time streaming pipelines, monitoring daily drug safety events across multiple clinical studies, ensuring faster detection of potential risks.
Designed predictive analytics models using Scikit-learn, analyzing historical patient records to enhance outcome predictions, assisting researchers in treatment evaluation.
Engineered Python-based data validation scripts, identifying and correcting 500+ data inconsistencies per month before ingestion into Snowflake, improving dataset reliability.
Deployed machine learning models via AWS Lambda, delivering near real-time insights into adverse drug reactions, reducing detection time from 24 hours to under 1 hour.
Data Analyst India
Airbnb Nov 2021 – Dec 2022
Designed SQL-based data models in Amazon Redshift, processing 1 million+ booking records to analyze global trends in reservations and cancellations.
Built ETL pipelines using dbt (Data Build Tool) to automate daily data transformations across multiple sources, reducing manual effort and improving operational efficiency.
Processed over 2 TB of customer interaction data using Hadoop and Spark, enabling real-time sentiment analysis for guest reviews and improving customer experience insights.
Developed anomaly detection models in Python, identifying fraudulent activity across transactions, helping prevent revenue loss.
Created Looker dashboards with real-time revenue insights, enabling leadership to make data-driven pricing and marketing decisions across 10+ global regions.
Automated cloud infrastructure deployment using Terraform, reducing cloud provisioning time from 2 hours to under 10 minutes, ensuring seamless scalability for data pipelines.
SQL Developer Intern India
Trigent Software Nov 2020 – Oct 2021
Optimized SQL queries handling records, improving data retrieval speed for enterprise applications.
Redesigned MySQL database schemas, reducing data redundancy and cutting query execution time.
Implemented indexing and partitioning strategies, reducing slow query execution time from 30 seconds to under 5 seconds.
Documented database structures and process changes, ensuring seamless handover and reducing onboarding time for new developers.
Used Jira to track and manage database enhancement tasks, contributing to 15 successful sprint releases within the internship period.
PROJECTS
Audiobook Generator using Python
Engineered an automated audiobook converter that transforms any PDF document into speech using pyttsx3 and PyPDF2, reducing manual reading time for users.
Optimized text-to-speech processing, enabling conversion of 50+ pages in under 2 minutes, improving accessibility for visually impaired individuals.
Screen Recorder with Face Detection
Created a real-time screen recording tool using cv2, NumPy, and ImageGrab, capturing both on-screen activities and facial expressions for enhanced usability.
Reduced video processing lag by optimizing frame capture techniques, ensuring smooth 30 FPS recordings in 1080p resolution.
iPhone Price Prediction using Machine Learning
Trained a linear regression model using Pandas and Matplotlib, analyzing 100+ historical price records to predict iPhone pricing trends.
Hands-Free Social Media Navigation using Computer Vision
Built a gesture-based control system using OpenCV (cv2) and NumPy, allowing hands-free navigation for Facebook and Instagram.
EDUCATION
Master of Science in Information Technology IN
Valparaiso University, Valparaiso Jan 2023 – May 2024
Bachelor of Computer Applications India
B.V. Bhoom Reddy College Nov 2018 – Oct 2021
CERTIFICATIONS
Udemy: Machine Learning Course
Udemy: Python Developer Course