Senior Machine Learning Engineer

Company:
Centric Software
Location:
Madrid, Spain
Posted:
May 10, 2024

Description:

Centric Pricing™ (formerly StyleSage) is an AI-driven competitive assortment benchmarking and market trend insights solution for fashion, beauty and home goods brands and retailers.

We are a key innovation partner for iconic and emerging brands across the world.

Our platform analyzes data from more than 1,000 retailers and more than 600,000 brands, tracking millions of products!

The Data Science team is responsible for enriching the data that our crawlers collect at scale from fashion-related websites, using our own machine learning models. Our models add information to existing products, such as categories (clothing, footwear, beauty…), genders, attributes, colors, bounding boxes, etc. The database already contains more than 500 million products (growing daily) and we process 1-2M new products every week.

To do that, you will use the latest and best open-source technologies out there. We code in Python (and we love it, you may want to come to the PyCon Spain conference with us!), using Keras as our main deep learning framework (although we are starting to use PyTorch for certain projects) along with other machine learning and computer vision libraries like scikit-learn or OpenCV.

On the engineering side, we use Django as our main framework for accessing the data. We are a cloud-native company, so our code runs on AWS. Our massive amount of data lives in PostgreSQL databases, and we keep an eye on all this using observability tools like Grafana, InfluxDB and Telegraf.

If you do not know a lot about some of those technologies, worry not: our engineers will be happy to support you on your journey to becoming an expert in them.

Responsibilities:

As a data scientist you will be responsible for ensuring that our current data science pipelines run smoothly over time with the best performance, as well as developing new machine learning pipelines and algorithms by:

Creating datasets from our huge data lake of products and social media data, selecting the most relevant items for your use case and ensuring data quality.

Training, deploying, productionizing and operating machine learning models and pipelines at scale, hands-on, for both batch and real-time use cases.

Contributing to expanding and improving the infrastructure that supports all stages of the machine learning model lifecycle, including feature engineering, the feature store, model training, testing, monitoring and deployment in a production environment.

Proactively identifying and implementing internal process improvements, including automating manual work, optimizing data delivery and redesigning infrastructure for greater scalability.

Staying up to date with the latest industry trends and technologies to ensure our ML capabilities remain competitive and cutting-edge.

Onboarding and enabling data scientists with different levels of engineering expertise.

Your Skills:

5+ years of experience working as a software engineer.

Bachelor’s degree in Computer Science, Engineering or related field

3+ years of experience as a production-level Python developer, including deep learning frameworks such as TensorFlow, Keras or PyTorch

Machine learning and Python data libraries such as scikit-learn, pandas or NumPy

Experience with SQL databases, preferably PostgreSQL

Familiarity with the Linux shell and command line

Version control with Git in a collaborative environment

Strong communication skills (written and oral) in English

Bonus Points:

Additionally, it would be nice if you are familiar with:

Diffusion models, especially text-to-image models like Stable Diffusion

Django ORM

Image processing libraries like OpenCV or Pillow

NLP libraries such as spaCy or NLTK

Asynchronous processes with RabbitMQ and Celery

System monitoring with InfluxDB and Grafana

Working knowledge of containers (Docker)

Experience working with cloud-based infrastructure (AWS, Azure…)
