
Machine Learning Business Intelligence

Location:
Ahmedabad, Gujarat, India
Salary:
70000
Posted:
October 15, 2025


Resume:

HARSH MAKWANA

California, USA +1-980-***-**** ****************@*****.*** LinkedIn

SUMMARY

• Data Analytics Engineer with 4+ years of experience designing, developing, and optimizing end-to-end data pipelines and architectures to support advanced analytics and business intelligence initiatives across multiple industries.

• Proficient in implementing data integration solutions using cloud-based data warehouses such as Snowflake and Redshift, and orchestrating ETL/ELT workflows with tools like dbt, Apache Airflow, and SSIS to ensure data quality and reliability.

• Adept at applying machine learning and statistical techniques using Python (scikit-learn, pandas) for customer segmentation, predictive modeling, and personalization engines that enhance business decision-making and customer engagement.

• Strong background in building scalable data infrastructure, including feature stores and real-time scoring APIs, to enable rapid deployment of data-driven applications and support cross-functional teams.

• Skilled in developing interactive dashboards and reporting solutions with Power BI, Tableau, and SQL-based tools to deliver actionable insights and facilitate strategic planning for stakeholders at all organizational levels.

• Collaborative team player experienced in Agile development environments, leveraging tools such as Jira and Confluence for project management and ensuring alignment between data engineering, analytics, and business objectives.

SKILLS

Methodologies: SDLC, Agile, Waterfall

Data Analysis: Exploratory Data Analysis (EDA), Descriptive & Inferential Statistics, Hypothesis Testing, ANOVA, Regression Analysis, Statistical Analysis, Data Mining, Data Cleaning, ETL Processes, Quantitative Analysis, A/B Testing, Cohort Analysis, Funnel Analysis

Programming Languages: Python, MySQL, JavaScript, R, C++, Scala

Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, TensorFlow, Seaborn, dplyr, ggplot2, NLTK

Data Visualization: Tableau, Power BI, Looker, Excel Dashboards, Advanced Excel (VLOOKUP, Pivot Tables, Macros, Power Query), SSRS

IDEs: Visual Studio Code, PyCharm, Jupyter Notebook, IntelliJ

Databases: Cosmos DB, SQL Server RDBMS, PostgreSQL, Snowflake

Cloud Platforms: Microsoft Azure, Amazon Web Services (AWS)

Other Technical Skills: SSIS, SAS, Alteryx, Machine Learning Algorithms, ETL/ELT Tools, Spark, Kafka, Hadoop, Docker, Kubernetes, Apache Airflow, dbt, Data Wrangling, Data Cleaning, Data Transformation, Data Warehousing, Data Lakes, Data Storytelling, Executive Presentations, Technical Documentation, Cross-functional Collaboration, KPI Development, Metric Definition, Business Requirements Analysis, Executive Reporting, Requirements Gathering, HIPAA, SOX, GDPR, Data Governance, Data Privacy, Healthcare Analytics, Financial Reporting, Insurance Analytics, Sales & Marketing Analytics, JIRA, Confluence, Trello, Git, GitHub

Soft Skills: Time Management, Leadership, Strategy Planning, Problem-Solving, Negotiation, Decision-Making, Documentation and Presentation, Analytical Thinking, Attention to Detail, Verbal and Written Communication

WORK EXPERIENCE

Lugano USA

Data Analytics Engineer Nov 2024 – Jun 2025

• Integrated Salesforce CRM, e-commerce transactions, and in-store appointment data into a centralized Snowflake warehouse using automated ETL pipelines, consolidating over 1.2M client records for segmentation analysis.

• Designed dbt transformation models to standardize purchase history, engagement frequency, and demographic variables, improving overall data accuracy from 78% to 96% and ensuring consistency across downstream analytics.

• Developed segmentation algorithms in Python (pandas, scikit-learn, embedding techniques) to cluster ultra-high-net-worth clients into six actionable personas, enabling concierge teams to tailor engagement strategies and increase bespoke offer acceptance by 22%; a simplified sketch of this clustering step follows this role's bullets.

• Implemented a feature store to centralize attributes such as lifetime value, spend tier, and product affinity, which reduced ML retraining cycles by 30% and ensured uniform feature definitions across analytics models.

• Deployed a real-time scoring API that delivered personalized recommendations directly to the sales portal, reducing concierge profiling time by 45 seconds per interaction and enhancing client experience during live consultations.

• Conducted controlled A/B experiments to measure the impact of segmentation-driven outreach; targeted campaigns achieved an average order value increase of $18K per client segment and improved campaign ROI by 12%.

• Built Power BI dashboards to track segment profitability, engagement trends, and campaign performance, cutting manual reporting by 25 hours per month and equipping executives with real-time decision support.

• Coordinated delivery through Agile sprints using Jira and Confluence, aligning data engineering deliverables with marketing and sales objectives, and directly contributing to a 15% year-over-year improvement in repeat purchase velocity.
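
A minimal sketch of the kind of scikit-learn clustering workflow referenced above; the feature columns, file path, and standardization choice are illustrative assumptions, not the production pipeline.

```python
# Illustrative client-segmentation sketch; column names and file path are hypothetical.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Load engineered client attributes (placeholder source).
clients = pd.read_csv("client_features.csv")
features = clients[["lifetime_value", "purchase_frequency", "avg_order_value"]]

# Standardize so no single attribute dominates the distance metric.
scaled = StandardScaler().fit_transform(features)

# Cluster clients into six personas, mirroring the segmentation described above.
kmeans = KMeans(n_clusters=6, random_state=42, n_init=10)
clients["persona"] = kmeans.fit_predict(scaled)

# Persona-level profile for concierge teams.
print(clients.groupby("persona")[["lifetime_value", "purchase_frequency"]].mean())
```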

Outlier California, USA

AI Consultant – Data Scientist Jan 2024 – Nov 2024

• Optimized large language models (LLMs) and multimodal generative AI systems by implementing unit testing and advanced prompt engineering, enhancing model accuracy and reducing error propagation.

• Developed Reinforcement Learning from Human Feedback (RLHF) pipelines using Python and PyTorch to align AI outputs with human preferences, improving decision consistency and reliability.

• Designed and implemented AI algorithms and prototypes in Python, Scala, and Swift, covering NLP, computer vision, and multimodal data processing for production deployment.

• Integrated Retrieval-Augmented Generation (RAG) techniques into AI chatbots with LangChain and vector databases, improving contextual response accuracy by 30%.
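
A bare-bones illustration of the retrieval step behind the RAG integration above. It uses plain NumPy cosine similarity over precomputed embeddings rather than LangChain's abstractions, and embed() is a hypothetical stand-in for a real embedding model.

```python
# Toy retrieval-augmented generation sketch: retrieve the most similar passage
# by cosine similarity, then prepend it to the prompt. embed() is a placeholder.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedding: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

documents = [
    "Return policy: items may be returned within 30 days.",
    "Shipping: standard delivery takes 5-7 business days.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

question = "How long does shipping take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```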

Stanford Health Care California, USA

Data Analyst May 2022 – Aug 2022

• Extracted and consolidated over 2M patient records from Epic EHR, laboratory systems, and insurance claims using SQL Server and Python (Pandas, NumPy), ensuring a comprehensive dataset for predictive modeling of 30-day hospital readmissions.

• Built automated ETL pipelines with SSIS and PySpark to integrate structured clinical data (lab values, vitals, admission history) and unstructured physician notes, reducing manual data handling efforts by 40% and enabling near real-time updates to analytic datasets.

• Applied advanced data cleaning and transformation techniques to standardize metrics, resolve missing values, and normalize clinical variables, which improved data accuracy and consistency across disparate systems by 30%.

• Designed and implemented feature engineering workflows in Python, creating 150+ predictive variables such as comorbidities, medication adherence, and prior readmissions, contributing to an increase in model AUC; a simplified sketch of one such feature appears after this role's bullets.

• Conducted exploratory and statistical analysis using Python (Matplotlib, Seaborn) to identify clinical and demographic risk factors most strongly correlated with readmissions, enabling physicians to design targeted interventions for high-risk populations.

• Developed interactive Tableau dashboards and SQL-based reporting layers that visualized patient-level risk scores, cohort trends, and operational KPIs, reducing manual chart reviews by 30% and providing executives with actionable performance insights.

• Partnered with clinicians, data scientists, and IT staff in an Agile environment using Jira and Confluence to translate medical requirements into measurable data metrics while ensuring full HIPAA compliance in data workflows.

• Delivered insights that directly supported a 12% reduction in 30-day readmissions, generating an estimated $4.5M in annual savings from avoidable hospitalization costs and aligning with CMS regulatory performance benchmarks.
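
A simplified pandas sketch of one feature-engineering pattern mentioned above: counting prior 30-day readmissions per patient. The column names, sample data, and 30-day threshold are illustrative assumptions.

```python
# Hypothetical prior-readmission feature built with pandas on sample data.
import pandas as pd

admissions = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "admit_date": pd.to_datetime(
        ["2022-01-05", "2022-01-20", "2022-03-01", "2022-02-10", "2022-02-25"]),
    "discharge_date": pd.to_datetime(
        ["2022-01-10", "2022-01-25", "2022-03-05", "2022-02-12", "2022-03-01"]),
}).sort_values(["patient_id", "admit_date"])

# Days between this admission and the same patient's previous discharge.
prev_discharge = admissions.groupby("patient_id")["discharge_date"].shift()
gap_days = (admissions["admit_date"] - prev_discharge).dt.days

# Flag admissions occurring within 30 days of a prior discharge.
admissions["readmit_30d"] = gap_days <= 30

# Count of earlier 30-day readmissions for each admission, a typical model feature.
admissions["prior_readmits"] = (
    admissions.groupby("patient_id")["readmit_30d"].cumsum() - admissions["readmit_30d"]
)
print(admissions)
```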

Capgemini India

Data Analyst Jul 2019 – Jul 2021

• Consolidated transactional data from 20+ banking systems by developing advanced SQL queries in Amazon Redshift and Google BigQuery, creating a centralized repository that supported real-time analysis of 10M+ transactions per day.

• Designed and deployed Python-based ETL pipelines (pandas, NumPy) to cleanse and normalize streaming data, reducing latency by 35% and ensuring >99% dataset accuracy for fraud detection models.

• Created and integrated 150+ fraud-specific features such as velocity checks, merchant profiling, device fingerprinting, and geo-location patterns, which improved model detection performance by 25%; a simplified velocity-check sketch appears after this role's bullets.

• Partnered with data scientists to validate models using statistical testing, EDA, and interpretability techniques (SHAP, LIME), ensuring compliance with internal audit and regulatory standards.

• Developed real-time dashboards in Tableau and Power BI to track flagged transactions, investigator workloads, and fraud patterns, which accelerated case resolution by 40% and optimized resource allocation.

• Automated daily fraud monitoring and investigation reports with SQL and Excel VBA macros, reducing manual effort by 15+ hours per week and providing leadership with timely risk insights.

• Authored detailed business and technical documentation in JIRA and Confluence, aligning data workflows with compliance requirements and supporting transparency during regulatory reviews.

• Delivered enhancements in bi-weekly Agile sprints, including new validation scripts, reporting modules, and dashboard features, which reduced investigation backlog by 30% and improved delivery velocity.
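
A short pandas sketch of the velocity-check idea referenced above: counting how many transactions the same card made in a trailing one-hour window. The column names, sample data, and window length are illustrative assumptions.

```python
# Hypothetical velocity-check feature: per-card transaction count in a trailing 1-hour window.
import pandas as pd

txns = pd.DataFrame({
    "card_id": ["A", "A", "A", "B"],
    "ts": pd.to_datetime(
        ["2021-06-01 10:00", "2021-06-01 10:20", "2021-06-01 11:45", "2021-06-01 10:05"]),
    "amount": [120.0, 80.0, 45.0, 300.0],
}).sort_values(["card_id", "ts"])

# Rolling count of transactions by the same card in the preceding hour (including the current one).
txns["txn_count_1h"] = (
    txns.set_index("ts")
        .groupby("card_id")["amount"]
        .rolling("1h")
        .count()
        .values
)
print(txns)
```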

PROJECTS

Real-Time Stock Price Prediction Using LSTM

• Developed an LSTM-based model to predict stock prices using historical data from financial APIs, and deployed it via Django for real-time predictions. Created interactive visualizations to compare model predictions with actual stock prices. Tech Stack Used: Python, TensorFlow, Pandas, NumPy, Matplotlib, Django
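
A condensed sketch of the model architecture this project describes, using Keras on synthetic data; the window length, layer sizes, and training settings are illustrative assumptions rather than the original configuration.

```python
# Minimal LSTM price-prediction sketch on synthetic data (illustrative hyperparameters).
import numpy as np
from tensorflow import keras

window = 30  # days of history used to predict the next closing price
prices = np.cumsum(np.random.randn(500)) + 100  # synthetic price series

# Build supervised (X, y) pairs from sliding windows over the series.
X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
y = prices[window:]
X = X[..., np.newaxis]  # shape: (samples, window, 1 feature)

model = keras.Sequential([
    keras.Input(shape=(window, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)

# Estimate the next price from the most recent window.
print(model.predict(X[-1:], verbose=0))
```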

Natural Language Processing (NLP) for Chatbot Development

• Developed and deployed an NLP-based chatbot using Flask, including intent recognition and entity extraction. Integrated the chatbot with a web interface for a seamless user experience. Tech Stack Used: Python, NLTK, PyTorch, Flask
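
A stripped-down illustration of the intent-recognition step in such a chatbot. It substitutes a TF-IDF plus logistic-regression classifier for the original NLTK/PyTorch stack, and the intents and training phrases are made up for demonstration.

```python
# Toy intent classifier standing in for the chatbot's intent-recognition step.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training phrases and intent labels.
phrases = ["hi there", "hello", "what are your hours", "when do you open",
           "bye", "see you later"]
intents = ["greet", "greet", "hours", "hours", "goodbye", "goodbye"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(phrases, intents)

print(clf.predict(["are you open tomorrow"]))  # expected: 'hours'
```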

EDUCATION

Master of Science in Data Science - University of Massachusetts, Dartmouth, USA Sep 2021 – Dec 2023

Bachelor of Engineering in Computer Engineering - Gujarat Technological University, India Aug 2016 – Aug 2020


