Please send your updated resume at

Data Scientist
Hybrid: 3 days onsite, 2 days remote
Locations: Pittsburgh, PA; Cleveland, OH; Strongsville; Birmingham, AL; Dallas, TX; Phoenix
Contract

Roles and Responsibilities:
Collaborate with other data scientists, data engineers, and DevOps engineers to build and deploy models using SageMaker in a hybrid environment
Coordinate the build and automation of the entire MLOps pipeline, including data and features, model (re)development, deployment, and ongoing monitoring of inference endpoints and model performance
Implement automated monitoring and alerting systems to detect and remediate potential issues proactively
Look for opportunities to optimize the timelines, resource utilization, and resiliency of the end-to-end MLOps process
Collaborate on the development and integration of customized LLMs to enhance data analysis, natural language understanding, and generation tasks for agentic systems
Stay current on the latest developments; explore and experiment to push boundaries and contribute to team and intellectual property development

Must Have Technical Skills:
Proficiency in Python and PySpark
Statistical analysis, with data cleaning and augmentation experience
Strong grounding in ML algorithms and their suitability for varied use cases
Deep learning and NLP experience (TensorFlow/PyTorch, BERT/GPT-3, etc.)
AWS SageMaker and additional AWS services (Lambda, Step Functions, etc.)

Flex Skills/Nice to Have:
Fine-tuning LLMs, SageMaker Pipelines, Infrastructure as Code (IaC), CI/CD, Model Monitoring, Explainable AI (XAI)

Education/Certifications:
AWS Certified Machine Learning – Specialty
AWS Certified DevOps Engineer – Professional
Other Cloud Solution Provider (CSP) certifications in these areas will also count
Additional Data Science and LLM-focused certifications will be a plus

Screening Questions:
Explain MLOps and its key components, drawing on your experience with AWS SageMaker or a similar platform.
Explain an end-to-end MLOps implementation on SageMaker. How would it change if the same pipeline had to be implemented in a hybrid environment?
What are some common LLM architectures, and how do they work?
How would you approach fine-tuning an existing LLM for a specific domain?
How do you evaluate a model’s performance, and specifically which metrics would you use for an LLM or a model grounded on an LLM?
Ref: #404-IT Pittsburgh
Full Time