Analytical Data Scientist for Real Estate Tech

Location: Chicago, IL
Posted: January 06, 2026

Resume:

NEHA K. NAYAK

Chicago, IL (open for relocation) 872-***-**** *******@****.************.*** in/neha-kiran-nayak github.com/NEHAKIRANNAYAK

SUMMARY

Data Science graduate student at the Illinois Institute of Technology with 2+ years of hands-on experience in machine learning, big data engineering, and AI-driven analytics. Built and optimized real-time data pipelines processing 100K+ records using Apache Kafka, AWS, and Spark. Developed deep learning and Explainable AI models improving prediction accuracy by up to 15% across healthcare and finance projects. Published author (3 Springer papers) with proven expertise in Python, R, SQL, TensorFlow, and large-scale data systems.

SKILLS

• Programming Languages: Python, R, SQL, C

• Databases & Storage: PostgreSQL, AWS S3, DynamoDB, NoSQL Databases

• Big Data & Streaming: Apache Spark, Apache Kafka (v3.5.1), Hadoop, Spark Streaming, AWS Glue, Athena

• Containerization & Orchestration: Docker, Kubernetes

• Machine Learning: Supervised Learning (Regression, Classification – SVM, Random Forest, Gradient Boosting), Unsupervised Learning (Clustering Techniques), Model Evaluation & Validation, Scikit-learn, TensorFlow, Keras, XAI

• Deep Learning: Neural Networks (CNNs, RNNs, LSTMs), Image Processing, PyTorch, TensorFlow, Computer Vision

• Natural Language Processing (NLP): Text Mining, Sentiment Analysis, NLTK, spaCy

• Emerging AI Paradigms: Agentic AI, Retrieval-Augmented Generation (RAG), Neurosymbolic AI

• Data Visualization: Matplotlib, Seaborn, Tableau, Power BI, IBM SPSS

• Statistical Analysis: Descriptive & Inferential Statistics, Hypothesis Testing, A/B Testing, ANOVA

• Cloud Computing & Infrastructure: AWS, GCP, Azure, Server Setup, SSH Connections

• Data Engineering Practices: ETL Design, Streaming Ingestion, Schema Design, Data Modeling, Pipeline Optimization

• Mathematics for Data Science: Linear Algebra, Calculus, Differential Equations, Graph Theory, Probability, Sampling

• Software Development & Tools: Git, Agile Project Management, GitHub Portfolio, Jupyter Notebooks

WORK EXPERIENCE

Data Science Fellow Build Fellowship, Chicago, IL Feb 2025 – Apr 2025

• Analyzed 100,000+ hospital encounters using Python and R, improving readmission prediction accuracy by 14%.

• Built logistic regression and ANOVA models, identifying risk factors in 95% of high-priority patient cases.

• Created interactive Tableau dashboards, visualizing trends for 200+ hospital staff to optimize patient management strategies.

• Optimized data pipelines, reducing preprocessing time by 35% for faster and reproducible analytics workflows.

Data Analyst Kasturba Medical College, Manipal, India Apr 2024 – May 2024

• Processed 1,500+ patient records using Python Pandas, maintaining 100% compliance with data privacy regulations.

• Trained deep learning models with TensorFlow and Keras, improving tumor classification accuracy from 78% to 90%.

• Applied SHAP-based Explainable AI, reducing misclassification by 15% and increasing clinician confidence in predictions.

• Visualized patient trends with Power BI and Seaborn, supporting decisions by 50+ healthcare professionals.

Research Analyst IIIT Allahabad, India Oct 2022 – Jan 2023

• Developed a real-time water quality pipeline with Apache Kafka and Spark, reducing data latency by 25%.

• Applied ML models in Scikit-learn, improving water quality classification accuracy from 81% to 93%.

• Managed ETL processes for 500,000+ records, improving data reliability and integration efficiency by 30%.

• Collaborated with UP Government, providing actionable insights impacting 10+ regional environmental monitoring initiatives.

EDUCATION

Illinois Institute of Technology, Chicago, IL Aug 2024 – May 2026
Master of Data Science, GPA 3.50

Visvesvaraya Technological University, Bengaluru, KA Aug 2020 – Jun 2024
Bachelors in AI & Data Science, GPA 3.9

PROJECTS

Real-Time Stock Market Data Pipeline Apr 2025 – Jul 2025

• Built a real-time data pipeline using Apache Kafka (v3.5.1), efficiently processing 1M+ live stock records daily.

• Deployed EC2-hosted brokers with Python producers/consumers, achieving 99.8% reliable message delivery across nodes.

• Integrated DynamoDB for low-latency data queries, reducing retrieval time by 40% and enhancing user responsiveness.

• Automated AWS orchestration and monitoring pipelines, improving system scalability, uptime, and fault-tolerance by 30%.

Navigation Assistant for Visually Impaired Jan 2025 – Apr 2025

• Developed a real-time pipeline integrating video streams with CNN inference, achieving 95.68% detection accuracy.

• Built an end-to-end computer vision workflow using Python, OpenCV, and TensorFlow, ensuring sub-5-second latency.

• Applied statistical validation and visualization to assess detection trends, improving model reliability across test environments.

• Optimized inference architecture and GPU utilization, enhancing processing efficiency and overall system performance by 25%.

Driver Drowsiness Detection System Oct 2024 – Dec 2024

• Designed a real-time driver state analytics system using TensorFlow, OpenCV, and dlib, analyzing 100+ frames/second.

• Implemented blink-rate detection on streaming video, achieving 92% accurate driver state monitoring in real-time conditions.

• Built a low-latency inference pipeline with parallel frame processing, reducing alert-generation time by 2 seconds (33% faster).

• Optimized feature engineering and performance, achieving 93.04% accuracy and enabling deployment for 1,000+ drivers.

HONORS AND ACHIEVEMENTS

• University Gold Medalist in Artificial Intelligence and Data Science Jun 2024

• High-Speed Visual Navigation for the Visually Impaired with Real-Time Mapping and Voice Interaction, ERCICA, Apr 2024

• Driver Drowsiness Alarm System: A Simulation Using CNNs, Springer, Feb 2024

• Analysis and Prediction of PCOD using ML Pipelines and Ensemble Techniques, Springer, Apr 2022
