
Data Analytics and Engineering

Location: Folsom, CA
Salary: 80k - 100k
Posted: May 08, 2025

Gauri Parmar

Sacramento, CA +1-408-***-**** **************@*****.*** linkedin.com/in/gauriparmar

EDUCATION

MS in Computer Science – California State University, Sacramento, CA – May 2025
Relevant Coursework – Database System Design, Algorithms and Paradigms, Artificial Intelligence, Machine Learning, Computer Systems Structure, Programming Language Principles, Software Verification and Validation

BE in Computer Engineering – Savitribai Phule Pune University, India – May 2018

TECHNICAL SKILLS

Languages: Python (Pandas, NumPy, Seaborn, Matplotlib), SQL, R, PL/SQL, HTML, CSS, SML
Data Analytics: Tableau, Excel, Hadoop, Impala, Hue, Hive, HDFS, Informatica, Jupyter Notebook
Databases: PostgreSQL, MySQL, OracleDB, MSSQL, SSMS, HiveQL, HBase, Hive, MongoDB
Frameworks & Services: Django, Flask, Git, JIRA, REST API, UNIX, Visual Studio Code, ServiceNow

PROJECTS

Flight Data Analysis (Tableau, Hadoop, Impala, Hive, Hue, SQL)

Processed large-scale flight data with Apache Impala and Hive via the Hue interface on a Cloudera Hadoop cluster, using Impala's MPP engine to optimize query performance for big data analytics.

Designed and implemented a real-time data pipeline, integrating Tableau with the Hadoop cluster to enable live analytics.

Addressed performance bottlenecks by tuning Impala queries and optimizing resource allocation for efficient query execution.

Developed interactive Tableau dashboards to analyze flight delays and passenger traffic trends, providing actionable insights.
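
Illustrative sketch of the kind of Impala aggregation behind the delay dashboards. This is not the project's actual code: the impyla client, host, and the flights table schema (carrier, month, dep_delay, cancelled) are assumptions for demonstration.

    # Aggregate average departure delay per carrier per month -- the shape of
    # query tuned for MPP execution and then surfaced in Tableau.
    from impala.dbapi import connect   # impyla client, assumed here

    conn = connect(host="impala-host.example.com", port=21050)  # placeholder host
    cur = conn.cursor()
    cur.execute("""
        SELECT carrier,
               month,
               AVG(dep_delay) AS avg_dep_delay,
               COUNT(*)       AS num_flights
        FROM flights
        WHERE cancelled = 0
        GROUP BY carrier, month
        ORDER BY avg_dep_delay DESC
    """)
    for carrier, month, avg_delay, n in cur.fetchall():
        print(f"{carrier} {month:02d}: {avg_delay:.1f} min avg delay over {n} flights")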

ChatGPT-Powered Assistant (Python, ChatGPT API, LangChain, Google API, OAuth)

Developed an AI-driven virtual assistant by integrating the ChatGPT API with Google Calendar and Google Docs using Python, Flask, and OAuth 2.0 for authentication and API access.

Leveraged the Google Docs and Google Calendar APIs to enable real-time document updates and event creation and editing via natural language commands.

Enabled seamless natural language interactions using LangChain for prompt engineering and text processing.
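
Illustrative sketch of the natural-language-to-calendar flow. Not the assistant's actual code: it calls the OpenAI SDK directly instead of LangChain for brevity, and the model name, prompt, and credential handling are assumptions.

    import json
    from openai import OpenAI
    from googleapiclient.discovery import build

    def create_event_from_text(command: str, creds) -> dict:
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        # Ask the model to extract structured event fields from the user's request.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model name
            messages=[
                {"role": "system",
                 "content": "Extract summary, start, and end (RFC 3339) from the "
                            "request. Reply with JSON only."},
                {"role": "user", "content": command},
            ],
        )
        fields = json.loads(resp.choices[0].message.content)

        # Create the event through the Google Calendar API with OAuth 2.0 credentials.
        service = build("calendar", "v3", credentials=creds)
        event = {
            "summary": fields["summary"],
            "start": {"dateTime": fields["start"]},
            "end": {"dateTime": fields["end"]},
        }
        return service.events().insert(calendarId="primary", body=event).execute()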

Vendia Carbon Dashboard (Python, Pandas, NumPy, Matplotlib, Seaborn, ML)

Developed Python scripts using Pandas, NumPy, Seaborn, and Matplotlib to analyze and visualize carbon emission data from transport vehicles, identifying trends based on fuel type, vehicle category, and mileage data.

Built a predictive modeling module using scikit-learn and XGBoost, enabling forecasting of future emissions based on vehicle usage, fuel consumption trends, and regulatory impact factors.
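
Illustrative sketch of the forecasting step. Not the dashboard's actual code: the CSV file, column names (fuel_type, vehicle_category, mileage, co2_grams_per_km), and hyperparameters are assumptions.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from xgboost import XGBRegressor

    df = pd.read_csv("vehicle_emissions.csv")  # placeholder dataset
    X = pd.get_dummies(df[["fuel_type", "vehicle_category", "mileage"]], drop_first=True)
    y = df["co2_grams_per_km"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Gradient-boosted trees forecast emissions from usage and fuel features.
    model = XGBRegressor(n_estimators=300, max_depth=5, learning_rate=0.05)
    model.fit(X_train, y_train)
    print("R^2 on held-out data:", model.score(X_test, y_test))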

Fire Detection using YOLO and Python (Django, Python, YOLO, OpenCV, HTML, CSS)

Developed an AI-driven fire detection system using YOLO (You Only Look Once) for real-time fire identification.

Implemented Python-based image preprocessing techniques, including Gaussian blurring, edge detection, and histogram equalization, to enhance fire feature extraction and improve detection accuracy.

Designed and deployed an interactive visualization dashboard using Matplotlib and Seaborn, with heatmaps and summary statistics for improved monitoring and decision-making.

Optimized YOLO model parameters, fine-tuning anchor boxes and confidence thresholds to achieve higher detection accuracy.

Integrated OpenCV for real-time video stream processing, enabling continuous detection.
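
Illustrative sketch of the detection loop. Not the system's actual code: it assumes the Ultralytics YOLO package, placeholder weights ("fire.pt"), a webcam video source, and an example confidence threshold.

    import cv2
    from ultralytics import YOLO   # assumed YOLO implementation

    model = YOLO("fire.pt")        # fine-tuned fire weights (placeholder path)
    cap = cv2.VideoCapture(0)      # webcam or RTSP stream

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # Light preprocessing before inference, as in the pipeline above.
        frame = cv2.GaussianBlur(frame, (3, 3), 0)
        results = model(frame, conf=0.4, verbose=False)  # tuned confidence threshold
        annotated = results[0].plot()                    # draw detections on the frame
        cv2.imshow("Fire detection", annotated)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()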

EXPERIENCE

Data Engineer – Saama Technologies, Pune, India – Dec 2018 – Jan 2021

Produced and deployed interactive Tableau dashboards, giving stakeholders immediate insights for project evaluations and shortening reporting timelines.

Cleansed and integrated clinical trial data from multiple sources, including patient records and site reports, using Python and SQL, enhancing data accuracy and reliability for analysis.

Conducted exploratory data analysis (EDA) on clinical trial data to resolve ambiguities, leveraging domain knowledge to identify key participant drop-off factors.

Automated data validation and reporting workflows using Python and SQL scripts, increasing data accuracy and reducing manual intervention, thereby streamlining data pipeline operations.

Assisted in data modeling to comply with HIPAA regulations, ensuring data security and integrity.

Revamped database schemas and optimized data-extraction queries using advanced indexing and query-tuning techniques, reducing query execution time and enhancing overall system performance.
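
Illustrative sketch of the kind of automated validation check described above. Not Saama's actual code or schema: the connection string, visits table, and rules are assumptions.

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:pass@host/clinical")  # placeholder DSN
    visits = pd.read_sql("SELECT subject_id, visit_date, site_id FROM visits", engine)

    # Flag records that break simple integrity rules, then export for review.
    issues = pd.concat([
        visits[visits["visit_date"].isna()].assign(issue="missing visit_date"),
        visits[visits.duplicated(["subject_id", "visit_date"])].assign(issue="duplicate visit"),
    ])
    issues.to_csv("validation_report.csv", index=False)
    print(f"{len(issues)} records flagged for review")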


