Lead Data Scientist

Company:
Sahaj Software
Location:
Bengaluru, Karnataka, India
Posted:
May 21, 2024

Description:

Data scientist with a strong background in data mining, machine learning, recommendation systems, and statistics. Should possess the signature strengths of a qualified mathematician, with the ability to apply concepts from mathematics and applied statistics, and a specialisation in one or more of NLP, computer vision, speech, or data mining, to develop models that provide effective solutions. A strong data engineering background with hands-on coding capabilities is needed to own and deliver outcomes.

A Master’s or PhD degree in a highly quantitative field (Computer Science, Machine Learning, Operational Research, Statistics, Mathematics, etc.) or equivalent experience, and 7+ years of industry experience in predictive modelling, data science and analysis, with prior experience in an ML or data scientist role and a track record of building ML or DL models.

Responsibilities and skills:

Work with our customers to deliver an ML/DL project from beginning to end, including understanding the business need, aggregating data, exploring data, building and validating predictive models, and deploying completed models to deliver business impact to the organisation.

Select features, and build and optimise classifiers using ML techniques.

Mine data using state-of-the-art methods, and create text mining pipelines to clean and process large unstructured datasets, using machine learning techniques to reveal high-quality information and hidden insights.

Should be able to appreciate and work on:

Computer vision problems, for example extracting rich information from images to categorise and process visual data, and developing machine learning algorithms for object and image classification. Experience in using DBSCAN, PCA, Random Forests and Multinomial Logistic Regression to select the best features to classify objects.

OR

Deep understanding of NLP, including the fundamentals of information retrieval, deep learning approaches, transformers, attention models, text summarisation, attribute extraction, etc. Experience in one or more of the following areas is preferable: recommender systems, moderation of user-generated content, sentiment analysis, etc.

OR

Experience of working in these areas: speech recognition, speech-to-text and text-to-speech, an understanding of NLP and IR, text summarisation, and statistical and deep learning approaches to text processing.

Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc. Should have an appreciation of deep learning frameworks such as MXNet, Caffe2, Keras and TensorFlow.

Experience in working with GPUs to develop models and in handling terabyte-scale datasets.

Experience with common data science toolkits, such as R, Weka, NumPy, MATLAB, mlr, MLlib, Scikit-learn, caret, etc. Excellence in at least one of these is highly desirable.

Should be able to work hands-on in Python, R, etc. Should collaborate closely with engineering teams to iteratively analyse data using Scala, Spark, Hadoop, Kafka, Storm, etc.

Experience with NoSQL databases and familiarity with data visualisation tools will be a great advantage.