Machine Learning Data Scientist

Location: San Francisco, CA, 94102
Salary: 100000
Posted: January 18, 2024

Resume:

Microsoft – Power BI Data Analyst Associate

PCAP – Certified Associate in Python Programming

Microsoft – Azure AI Engineer Associate

Microsoft – Azure AI Fundamentals

IBM Cognitive Class – Deep Learning Fundamentals

Open SAP – Getting started with Data Science

Dipannita Ghosh

ad2vz2@r.postjobfree.com

510-***-****

Power BI-certified data analyst, PCAP-certified Python programmer, and Microsoft-certified Azure AI Engineer with 2 years of industry experience as a data scientist, applying machine learning, artificial intelligence, and algorithm development to deliver action-oriented solutions to real-life business problems. Leverages structured and unstructured data analysis to extract actionable business insights and enhance data science processes. Demonstrated competence in developing advanced mathematical and statistical techniques to predict business outcomes.

Data Scientist/Machine Learning Engineer with experience in data analytics, deep learning, predictive modeling, business analytics, machine learning, and artificial intelligence. Proficient in Python and Python libraries such as NumPy, Pandas, and TensorFlow. Experienced in using SQL and data science tools for data manipulation, analytical modeling, and data visualization with Power BI on a variety of structured and unstructured data.

Core Skills:

Programming Languages, Libraries, Frameworks: Python, SQL, NumPy, Pandas, Scikit-learn, TensorFlow 2, Keras, OpenCV, PyTorch, PySpark

IDEs/Development Tools: Jupyter Notebook, Google Colab

BI Tools and Data Visualization: Tableau, Microsoft Excel, Power BI, seaborn, matplotlib

Machine Learning Algorithms: Linear Regression, Logistic Regression, Random Forest, Naïve Bayes, K-Means, SVM, CNN

Cloud Services: Azure, AWS

WORK EXPERIENCE

SAM LLC - Austin, TX (Remote), March 2023 - May 2023

Data Scientist

Project: Object and anomaly detection from geospatial survey imaging

Description: The goal of the project is to solve a two-step problem. The first step is to detect the components present in the images, and the second step is to determine whether those components are anomalous. The dataset consisted of images of transmission and distribution lines captured at different geographic locations. The images contain six classes of components, and each component is also labeled as defective (anomalous) or non-defective.

Preprocessing structured and unstructured data, cleansing and validating data for analysis, and analyzing large amounts of data to find patterns and solutions. Creating and optimizing classifiers using machine learning algorithms. Developing prediction and machine learning algorithms. Explaining results and proposing solutions to business problems.

Roles and Responsibilities:

• Determined the problem statement by analyzing project requirements, interpreting data, and processing geospatial survey images using technologies such as Python, Microsoft SQL Server Management Studio, and Microsoft Azure.

• Provided support in data extraction from the database, ensuring the availability of relevant images for training and validation, and utilizing tools like Azure Blob Container for image storage.

• Contributed to the project plan, ensuring proper coordination and alignment with the team's objectives and timelines, and leveraged technologies such as NumPy, Pandas, Seaborn, and Scikit-learn for data manipulation and analysis.

• Extracted required images from the Azure blob container and performed image annotation using the SuperbAI suite.

• Participated in training a Mask R-CNN model on the annotated images, analyzing and evaluating results with TensorFlow, and fine-tuning the models for improved performance.

Technologies: Python 3, Microsoft SQL Server Management Studio, Microsoft Azure, NumPy, Pandas, Seaborn, Sklearn, TensorFlow

SynergisticIT - Fremont, CA, January 2022 - February 2023

Data Scientist/ML Engineer

Project: Fire Risk Assessment for Buildings

Description: The goal of this project is to develop a machine-learning model for assessing the risk of fire incidents in buildings based on attributes such as location and structure. By analyzing historical fire data and relevant building characteristics, the model helps identify buildings that are more prone to fire accidents, enabling proactive measures to mitigate the risks.

Roles and Responsibilities:

• Collected a diverse dataset consisting of historical fire incident records and building attributes.

• Performed data preprocessing steps to ensure data quality and compatibility, including handling missing values, feature engineering, encoding categorical variables, and exploratory data analysis.

• Utilized a multiclass classification model to classify buildings into different risk categories (i.e., percentage of damage) based on their attributes.

• Evaluated the results using appropriate metrics and applied k-fold cross-validation to assess the performance on unseen data.

• Generated a visualization report highlighting buildings with higher fire risk to aid end users in making informed decisions regarding fire prevention, safety measures, and insurance coverage.

Technologies: Python 3, NumPy, Pandas, Sklearn, seaborn

SynergisticIT - Fremont, CA

Data Scientist/ML Engineer

Project: Risk Assessment of Online Payment Transactions

Description: To protect personal property and help credit card companies, it is crucial to detect fraudulent activity as quickly as possible. The dataset consisted of credit card transactions and was highly imbalanced. Several sampling techniques were used to address this imbalance, and different machine-learning models were tested to determine whether a transaction was fraudulent or genuine.

Roles and Responsibilities:

• Interacted with the team to analyze the project requirements, interpret data, and determine the problem statement based on the given data.

• Responsible for data cleaning and feature engineering. Used Python libraries such as NumPy, pandas, and Seaborn to perform exploratory data analysis and check for outliers.

• Handled imbalanced data using SMOTE.

• Executed different ML algorithms such as Logistic Regression, Naïve Bayes, and Random Forest.

• Prepared visualizations using Tableau.
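As a rough illustration of the oversampling idea behind SMOTE mentioned above, here is a minimal pure-Python sketch. It is not the project's code: the function name `smote_oversample`, the toy data points, and the parameters `k` and `seed` are assumptions made for this example. Each synthetic sample is placed on the line segment between a minority-class point and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; `minority` is a list of
    equal-length numeric tuples belonging to the rare class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority points, nearest to `base` first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class (made-up feature values, not project data).
fraud = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote_oversample(fraud, n_new=6)
print(len(new_points))
```

In practice, a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used, applied to the training split only so that synthetic points do not leak into the evaluation data.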

Technologies: Python 3, Jupyter Notebook, NumPy, Pandas, Seaborn, Sklearn, Matplotlib

SynergisticIT - Fremont, CA

Data Scientist/ML Engineer

Project: Sentiment Analysis: Depression and Anxiety from Social Media Comments

Description: The goal is to investigate user sentiment in social media comments using sentiment analysis. User texts and their sentiments were examined, and a machine learning model was used to classify each user's sentiment as positive, negative, or neutral.

Roles and Responsibilities:

• Loaded the raw text data and performed text cleaning using stemming, lemmatization, and regex.

• Extracted meaningful data from comments using NLP techniques such as tokenization and stop-word removal.

• Used CountVectorizer to convert the text data into numerical features for the machine learning models.

• Created the target column of sentiments (positive, negative, neutral) using the Opinion Lexicon.

• Built a machine-learning model that classifies the comments as positive, negative, or neutral.

Technologies: Python 3, NumPy, pandas, NLTK, Sklearn, seaborn, TensorFlow, Keras

Education:

Doctor of Philosophy (Ph.D.) in Computer Science and Engineering, National Institute of Technology, Durgapur, India

Publications:

Dipannita Ghosh, Amish Kumar, Palash Ghosal, Amritendu Mukherjee, and Debashis Nandi, Filtering Super-resolution Scan Conversion of Medical Ultrasound Frames, Wireless Personal Communications, 116, (2021): 883–905.

Dipannita Ghosh, Amish Kumar, Palash Ghosal, Tamal Chowdhury, Anup Sadhu, Debashis Nandi, Breast Lesion Segmentation in Ultrasound Images Using Deep Convolutional Neural Networks, In 2020 IEEE Calcutta Conference (CALCON), pp. 318-322. IEEE, February 2020.

Dipannita Ghosh, Amish Kumar, Palash Ghosal, and Debashis Nandi, Speckle Reduction of Ultrasound Image via Morphological Based Edge Preserving Weighted Mean Filter, In Advances in Communication, Devices and Networking, pp. 307-316. Springer, Singapore, 2019.

Debashis Nandi, Sudipta Mukhopadhyay, Dipannita Ghosh, and Baisakhi Chakroborty, A Novel Framework of Speckle Reducing Scan Conversion in Ultrasound Imaging Systems, IETE Technical Review 35, no. 6 (2018): 618-630.

Dipannita Ghosh, Debashis Nandi, Palash Ghosal, and Amish Kumar, A Novel Speckle Reducing Scan Conversion in Ultrasound Imaging System, In Progress in Intelligent Computing Techniques: Theory, Practice, and Applications, pp. 335-345. Springer, Singapore, 2018.


