Job Title: Senior Data Engineer (AI/ML)
What You Will Do:
Drive optimization of AI/ML data science models to improve performance and efficiency.
Establish consistent, secure access patterns for data and cloud resources (AWS focus).
Implement and manage robust data pipelines, and deploy AI/ML models into production.
Integrate diverse data sources and standardize data formats to facilitate efficient AI/ML model consumption.
Orchestrate the deployment and execution of machine learning models (e.g., SageMaker) and statistical models (e.g., SAS) across cloud and, potentially, on-premises environments.
Partner closely with data scientists, operations teams, and end-users to ensure thorough testing, achievement of business objectives, and ongoing support.
Take a hands-on leadership role in consolidating and distributing data from various systems, including machine and sensor data processed in batch, near real-time, and real-time streams.
Engage in collaborative design and development of data pipelines, optimizing code for cloud platforms and utilizing services such as data warehousing solutions (e.g., Redshift), object storage (e.g., S3), serverless compute (e.g., Lambda), ETL services (e.g., Glue), and other cloud-based tools.
Proactively manage personal learning and contribute to the technical growth of the team.
Apply an engineering mindset and systems-level thinking to challenges. Collaborate with architecture teams to envision and design future-oriented data solutions.
Support the data needs of various business stakeholders for both existing and new data assets and domains.
Demonstrate strong technical, team, and solution leadership through clear communication, providing actionable, data-driven recommendations.
Work collaboratively with team members, business experts, and data subject matter experts to gather requirements and develop comprehensive technical designs.
Who You Are (Basic Qualifications):
Proven experience in optimizing AI/ML data science models.
Demonstrated ability to standardize data access patterns and manage cloud resources (AWS).
Experience in deploying and maintaining data pipelines and AI/ML models.
Track record of successfully aggregating and harmonizing data for efficient AI/ML model utilization.
Experience orchestrating machine learning and statistical models across different environments (cloud and/or on-premises).
Strong collaborative skills, working effectively with data science teams, operations, and business stakeholders to ensure successful outcomes.
Hands-on experience leading data consolidation and syndication efforts from diverse source systems, including real-time and batch data.
Experience in collaborative software design and development of data pipelines, with a focus on optimizing code on cloud technologies such as data warehousing, object storage, serverless compute, and ETL services.
What Will Put You Ahead:
Bachelor’s degree in a technical field such as Analytics, MIS, or Computer Science.
Familiarity with infrastructure-as-code tools (e.g., Terraform) and other CI/CD and DevOps automation practices.
Experience with code management and version control systems (e.g., Git/GitHub, GitLab, ADO/TFS).
Minimum of 2 years of active development experience in big data environments.
Solid understanding of cloud-based Business Intelligence solution deployments, particularly within the AWS ecosystem.
Experience developing backend data solutions to support data science models or front-end BI tools (e.g., Tableau, Power BI, Qlik Sense).
Proficiency with common data interchange formats (JSON, XML, YAML).
Ability to integrate complex and varied data sources, establish data warehouses, and design a foundational architecture for BI and analytics in a dynamic data landscape.
Familiarity with statistical modeling tools (e.g., SAS).