Archana Shete
+1-248-***-**** *************@*****.*** Wixom, MI https://www.linkedin.com/in/archana-shete-903939b/
•8+ years of experience in IT and healthcare. Highly motivated data scientist, AI/ML engineer, and visual storyteller skilled at building and deploying impactful solutions across industries and creating insightful data visualizations to gather stakeholder feedback and improve process efficiency. Designs, develops, and deploys advanced AI/ML models, with a focus on Large Language Models (LLMs), to address complex healthcare challenges.
•Proficient in SQL, Python, AI/ML frameworks, technologies and tools, Tableau, Power BI for data wrangling, analysis, and visualization.
Proven ability to present complex data findings to stakeholders in clear and actionable formats. Experienced in generative AI, predictive modeling, and data exploration.
•Experience working effectively with engineers and product managers to translate data insights into actionable solutions using Power BI and Tableau, including creating Data Analysis Expressions (DAX) calculations and measures to support data analysis.
•Developed and optimized ETL processes to ingest, transform, and load data from multiple sources into a data warehouse, ensuring data quality and consistency.
•Proven experience in designing, developing, and deploying data pipelines and data warehouses on Snowflake and AWS, leveraging AWS services such as S3.
•Experience with deep learning for object detection, image classification, image segmentation, and video analysis, as well as NLP and LLMs (GPT, BERT).
•Passionate and eager to contribute expertise to a fast-paced, innovative team and continue learning in the dynamic field of AI/ML.
•Experience generating, testing, and interpreting product experiments, including formulating hypotheses as clear, testable statements about how changes in a product affect specific metrics.
•Expertise in LLMs such as GPT-3 and similar models. Built a recommendation engine using ML techniques such as collaborative filtering, matrix factorization, and content-based filtering, evaluated with recommendation metrics (precision, recall, NDCG). Developed and deployed predictive models using machine learning frameworks such as TensorFlow and Keras on GCP.
•AI-Powered Chatbot: Developed a chatbot using LangChain to provide customer support, answer FAQs, and guide users through complex tasks.
•Text Summarization Tool: Built a text summarization tool using Transformers and NLP techniques to condense long articles into concise summaries.
•Applied statistical testing techniques (e.g., odds ratios, t-tests, chi-squared tests, ANOVA).
TECHNICAL SKILLS
Programming Languages: Python, SQL, VB.NET, ASP.NET, C, C++, HTML, Visual Basic 6.0, C#, .NET, Excel
Visualization and tools: Tableau, Power BI, DAX Query, Seaborn, Plotly, Bokeh, Tableau Public, UI, HTML.
Machine Learning Frameworks/Algorithms: Scikit-learn, TensorFlow, PyTorch, Keras, Image Processing, Deep Learning, ANN, NLP, Generative AI, Transfer Learning, RNN, BERT, Autoencoder, OpenAI.
Deep Learning Libraries: NumPy, Pandas, Matplotlib, TensorFlow/PyTorch, CNN architectures, model training, evaluation metrics (mAP, precision, recall).
Databases and tools: RDBMS, MS SQL Server, MS Access, PostgreSQL, Oracle, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Snowflake. Business intelligence tools – Looker, Trello.
Additional Tools and Technologies: Jupyter Notebook, GitHub, JIRA, Trello, Crystal Reports, Seagate Crystal Report.
Image Processing/Computer Vision: OpenCV, image preprocessing, feature extraction.
Cloud Platform: AWS/GCP
Training and Certifications:
University of Michigan Nexus AI & Machine Learning
National Institute of Information Technology (Dot Net)
SQL Server Integration Services and SQL Server Reporting Services
PARTICIPATION AND PUBLICATIONS
Hackathon / Datathon / SQL and Python Bootcamp / Blog Writing
Blogs: https://www.numpyninja.com/post/how-to-connect-python-to-sql-server-using-pyodbc
https://public.tableau.com/app/profile/archana5506/viz/Sepsis_FinalPresentation/LiverAnlysis-2
https://public.tableau.com/app/profile/archana5506/vizzes
EXPERIENCE
Data Scientist/Analyst Medical /Health Care Numpy Ninja DE, USA /Remote November 2022 – Present
Project: Sepsis Diagnosis
●Using Python and a clinical dataset, contributed to the design of a KPI dashboard (length of stay, throughput, etc.) for ICU Sepsis patients using Tableau calculated fields, Level of Detail expressions (INCLUDE, EXCLUDE, FIXED), filters, parameters, and sets, which improved ICU operational efficiency by 30%.
●Data Modeling: Created data models, designed data flows, and transformed raw data into a usable format.
●Data Visualization: Built interactive dashboards and reports using Power BI's rich visualization capabilities.
●DAX: Used DAX (Data Analysis Expressions) to create calculated columns, measures, and custom calculations.
●Data Sources: Connected to various data sources, including SQL databases, Excel files, and cloud-based data warehouses.
●Identified the dominant biomarkers of Sepsis using advanced Tableau visualization techniques such as dual-axis and blended-axis charts, control charts, sparklines, funnel charts, Pareto charts, and correlation charts.
●Created basic charts such as bar plots, box plots, scatter plots, and heat maps to enable regular monitoring by clinicians, and created dashboards for SIRS (Systemic Inflammatory Response Syndrome) patients to help stakeholders monitor patients at risk of progressing to further stages of Sepsis.
●Built visually appealing Power BI visualizations using DAX functions to communicate findings based on demographics and health information to stakeholders.
●Documented and communicated data cleansing and preprocessing performed with Python libraries such as NumPy and Pandas. Prepared data from a PostgreSQL relational database using SQL queries.
●Utilized Agile tools such as Jira and Trello to track progress, manage the backlog, and facilitate communication.
●Contributed to and reviewed code collaboratively through pull requests and issue tracking. Used GitHub for version control, collaboration, and project management, leveraging branches, forks, and merges to maintain code quality and efficiency.
●Proven expertise in Snowflake, AWS, and Python, with a strong focus on data warehousing, ETL/ELT processes, and data quality. Passionate about leveraging data to drive business decisions and improve operational efficiency.
●Worked with Amazon SageMaker, a feature-rich AWS platform for building, training, and deploying machine learning models.
Project: Diabetes Prediction
●Conducted a data science project to predict the onset of diabetes. Developed a diabetes prediction model using a combination of machine learning and deep learning techniques. Utilized Python data science packages (scikit-learn, matplotlib, seaborn) for analyzing and visualizing large datasets. Developed and evaluated machine learning models using train-test splits, with a focus on regression models.
●Performed in-depth exploratory data analysis on a diabetes dataset, identifying key features and handling missing values.
●Implemented and evaluated various machine learning algorithms (e.g., Logistic Regression, Support Vector Machines, Random Forest) using Python libraries like scikit-learn.
●Evaluated models using precision, recall, F1-score, and AUC to assess prediction accuracy, demonstrating a strong understanding of data science methodologies and model evaluation.
●Collaborated with cross-functional teams to understand business requirements and translate them into data science solutions.
●Developed Tableau dashboards for ad hoc data reporting. Delivered projects within tight deadlines, demonstrating adaptability and proficiency in managing individual and team tasks.
●Worked on a project analyzing the effect of Diabetes on Cerebrovascular disease in elderly patients.
●Conducted descriptive and statistical analysis, resulting in key findings including a 38% increase in the likelihood of developing Dementia for individuals with diabetes.
●Analyzed patient data and determined that 96% of patients with a history of Diabetes of more than 15 years develop Hypertension.
●Conducted correlation analysis and identified a strong positive correlation between blood pressure and poor diabetes management.
●Designed and implemented a Power BI dashboard and visualization report for Gestational Diabetes Mellitus (GDM), gaining knowledge of key medical biomarkers and producing a data definition document with cutoff values for all biomarkers.
Data Scientist/Analyst ATOS Syntel Troy, USA June 2018-Feb 2021
●Leveraged machine learning techniques, including data cleaning, outlier identification and removal, feature selection, and data transformations, to optimize ML models. Applied data preprocessing methods, such as standardizing and normalizing data, to enhance machine learning model performance. Knowledgeable in tokenization techniques and their impact on model performance, attention mechanisms, and prompt engineering.
●Model Architecture: Familiar with LLM architectures such as Transformer, BERT, and GPT.
●Utilized Python libraries, including pandas and NumPy, for comprehensive data cleaning, transformation, and analysis across various projects.
●Developed a scalable e-commerce platform for a fashion brand, enabling customers to browse, select, and purchase clothing items online. Incorporated features such as product catalog, shopping cart, secure checkout, order management, and customer account management. Improved conversion rate by 25% by implementing personalized product recommendations.
●Utilized Python data science packages (scikit-learn, matplotlib, seaborn) for analyzing and visualizing large datasets.
●Developed and evaluated machine learning models using train-test splits, with a focus on regression models.
●Automated data flow tasks using Python and SAS, streamlining day-to-day operations, and improving efficiency.
●Collaborated with cross-functional teams to understand business requirements and translate them into data science solutions.
●Developed Tableau dashboards for ad hoc data reporting. Delivered projects within tight deadlines, demonstrating adaptability and proficiency in managing individual and team tasks.
●Designed and developed user interfaces (UI) for the project to drive user engagement.
●Processed a high volume of medical claims, ensuring accuracy and timely submission. Mastered complex insurance guidelines and payer requirements. Effectively resolved claim denials through clear communication with providers and payers.
Analyst Programmer TriZetto / Humana Health care system Syntel Ltd. Pune, India September 2010 - March 2011
●Analyzed requirements to create the System Design Document (SDD) for the assigned service/operation. Created a semantic data model (SDM) and developed ER diagrams and data flow diagrams.
●Designed use case diagrams, class diagrams, and sequence diagrams.
●Developed the project using C# and ASP.NET, and tested the code using a manual testing tool.
●Implemented the MVC architecture and applied appropriate object-oriented programming concepts.
●Worked on SQL Server Integration Services for database integration and migration. Built SQL queries and created stored procedures, triggers, and cursors. Ensured compliance with healthcare regulations and data privacy standards, such as HIPAA.
Transaction Processing Officer Project Sphere-Releases Mphasis/HP Company Pune, India February 2010 - August 2010
●This system automates the release of land property by checking the loans and mortgages taken against the land; once the full payout figure is cleared, the mortgage is released and the customer regains the property in their own name.
●Designed and implemented the Semantic Data Model (SDM).
●Verified documents such as CT, property, and mortgage documents for amount, loan number, surplus amount, etc.
Sr. Software Engineer Land Acquisition & Biometric Attendance System SSTS Pvt. Ltd. Pune, India January 2006 – January 2009
●This system automates the process of acquiring land for purposes under the control of the land acquisition department. It provides case-wise and stage-wise tracking of the land acquisition process and includes district-level and commissioner-level reports.
●Developed a biometric attendance system for a hospital, using a biometric device to track patient attendance.
●Designed and implemented Crystal Reports, an executive dashboard, and other analytic/programming tools.
●Designed and implemented the database schema using SQL Server, ensuring data integrity and efficient data retrieval. Utilized T-SQL queries to manipulate data and integrate with the .NET and Visual Basic 6.0 applications. Developed a comprehensive reporting system using SQL Server Reporting Services (SSRS) to deliver insightful data visualizations and facilitate informed decision-making.
TECHNICAL PROJECTS
Object Detection Project (Autonomous vehicle) June 2024
Developed an object detection system for autonomous driving using YOLO, SSD, and Python. Trained and evaluated the model on a custom dataset of road scenes, achieving 92% mAP in detecting vehicles, pedestrians, and traffic signs. Gained hands-on experience in computer vision, deep learning, and model deployment.
Explored using two neural networks, or a mix of machine learning and hard-coded solutions (e.g., SSD with a hard-coded classifier), for traffic light detection and classification. Worked with Faster R-CNN, which consists of a region proposal network, region-of-interest pooling, and final classification and bounding-box regression. Performed image preprocessing using OpenCV.
Utilized a pre-trained model, froze most of the layers, and retrained only the classification layers to classify only red, yellow, and green traffic lights. Trained a single model on both simulator and real-world data.
Created the traffic light classifier by retraining an existing model from the TensorFlow Object Detection Model Zoo. Chose the "Faster R-CNN" model because its accuracy on traffic lights was good and it was still fast enough for the application.
Achieved 92% accuracy in object detection and image segmentation, exceeding the target threshold for safe autonomous operation.
Gained valuable experience in computer vision, deep learning, and the application of AI in the automotive industry.
Retail sales recommendation system February 2024
Developed a recommendation engine for Retail Co., improving customer experience and driving sales through personalized product suggestions. Built a classification model to predict customer income, enabling targeted marketing campaigns.
Performed feature engineering and exploratory data analysis (EDA): visualized the distribution of income levels and the relationship between features and income, and corrected the imbalanced dataset by oversampling.
Model Evaluation: Evaluated model accuracy, F1-score, recall, and precision. The work improved customer satisfaction and identified new market opportunities.
Developed classification models in Python, including K-Nearest Neighbors (KNN), Naive Bayes, Random Forest, Logistic Regression, SVM, and Decision Tree classifiers, and improved recommendation accuracy by 25% through feature engineering and model optimization. Applied hyperparameter tuning to improve model performance. Generated visualizations and a comprehensive report for Retail Co. Led a team to develop a predictive churn model for a retail company using ML tools, resulting in a 15% reduction in customer churn.
Developed and implemented a recommendation engine using collaborative filtering techniques to suggest personalized products.
Analyzed customer purchase history, browsing behavior, and product attributes to identify patterns and relationships. The recommendation engine led to a 15% increase in average order value.
Sentiment Analysis System: Implemented a sentiment analysis system using PyTorch to analyze customer feedback and identify trends.
EDUCATION
University of Michigan Nexus AI/ML Certification Program, USA December 2023 - June 2024
Powered by Fullstack Academy
Immersive AI & Machine Learning program that used active learning to build proficiency in data technologies and tools, including Python, Keras, TensorFlow, NLP, and neural networks, along with an understanding of machine learning, recommendation engines, ensemble learning, deep learning, and applied data science processes.
Savitribai Phule Pune University, India
Master's in Computer Science (MSc) June 2005 - May 2007
Savitribai Phule Pune University, India
Bachelor's in Computer Science (BCS) June 2001 - May 2004