Data Science

Company:

Swornim Consulting

Location:

Arlington, TX

Posted:

August 22, 2025

Description:

Job Title: Data Scientist – Machine Learning, Big Data, GenAI (8–10 Years Experience)

Location: Remote

Employment Type: Contract

About the Role

We are seeking a highly experienced Data Scientist with 8–10 years of experience delivering production-grade AI/ML solutions at scale. This role requires deep technical proficiency in Machine Learning, Big Data, Generative AI (GenAI), and Large Language Models (LLMs), combined with hands-on cloud experience (AWS, Azure, or GCP) and expertise in data platform migration. You will lead end-to-end projects, from architecture design to deployment, while mentoring teams, optimizing performance, and aligning solutions with business objectives.

Key Responsibilities

Design, develop, and deploy end-to-end AI/ML solutions in cloud-native environments.

Architect and implement Generative AI solutions leveraging LLMs (e.g., GPT, LLaMA, Claude, Mistral) and Retrieval-Augmented Generation (RAG) pipelines with vector search.

Build and optimize Big Data pipelines using Apache Spark, PySpark, and Delta Lake integrated with cloud storage (AWS S3, Azure Data Lake, GCP Cloud Storage).

Design and maintain scalable lakehouse architectures using Databricks, Snowflake, or Delta Lake.

Deploy MLOps pipelines using MLflow, SageMaker, Azure ML, or Vertex AI with Docker, Kubernetes, and CI/CD tools (e.g., Jenkins, GitHub Actions).

Implement and manage vector databases (e.g., Pinecone, FAISS, Milvus, Weaviate, ChromaDB) for RAG applications.

Oversee ETL/ELT workflows using tools like Airflow, dbt, or Azure Data Factory.

Lead cloud migration projects, including Hadoop-to-Databricks and Hive-to-Delta Lake transitions, ensuring minimal downtime and data integrity.

Integrate real-time streaming data solutions using Apache Kafka.

Conduct feature engineering, hyperparameter tuning, and model optimization to ensure scalability and performance.

Mentor junior data scientists and guide best practices for AI/ML development and deployment.

Collaborate closely with product, engineering, and executive teams to align AI solutions with business KPIs and compliance requirements.

Required Skills & Experience

8–10 years of experience in data science, machine learning, and AI/ML solution delivery.

Strong hands-on expertise in at least one major cloud platform (AWS, Azure, or GCP), with proven production deployments.

Expertise in Python, PySpark, and SQL for building scalable ML pipelines.

Proven experience with Apache Spark, Hadoop, and Big Data processing.

Hands-on experience with Generative AI frameworks, including Hugging Face, LangChain, or LlamaIndex.

Deep expertise in RAG architectures and vector databases (e.g., Pinecone, FAISS, Milvus, Weaviate).

Experience deploying MLOps workflows using MLflow, Docker, Kubernetes, and CI/CD tools (e.g., GitLab CI, GitHub Actions).

Proven migration experience, including AI/ML workloads and Big Data platform migrations (e.g., Hadoop to Databricks, Hive to Delta Lake).

Familiarity with cloud storage and data warehouse solutions (AWS S3, Redshift, Azure Synapse, GCP BigQuery) and infrastructure-as-code (e.g., Terraform, CloudFormation, ARM templates).

Experience with streaming data technologies (e.g., Apache Kafka) and distributed query engines (e.g., Hive, Presto, Trino).

Strong foundation in statistics, probability, and ML algorithms for model optimization.

Preferred Qualifications

Experience with knowledge graphs and semantic search.

Background in NLP, transformer architectures, and deep learning frameworks (e.g., TensorFlow, PyTorch).

Exposure to BI tools (e.g., Power BI, Tableau, Looker).

Domain expertise in finance, healthcare, or e-commerce.
