Machine Learning Infrastructure Engineer
CA, United States
ML Systems Engineer - Lead Role
Join our team as the Lead ML Systems Engineer and drive the development of cutting-edge machine learning systems for video foundation models (VFM) and language models (VLM) in production. In this role, you'll lead a talented team, set technical strategy, and ensure our systems exceed user expectations for speed, efficiency, and reliability.
Responsibilities:
Lead the development of machine learning systems for VFM & VLM production
Optimize inference infrastructure for scalability and reliability
Oversee ML deployment & operations (VFMOps / VLMOps) for model optimization and automation
Manage data infrastructure and preparation for high-quality video data
Design effective team processes for operational efficiency
Coach and develop team members for career growth
Drive recruiting efforts during rapid growth periods
Requirements:
10+ years of software development experience, including ML engineering
5+ years building end-to-end ML systems, including MLOps and data management
2+ years of experience managing high-output engineering teams
Proficiency in video processing and data pipelining
Experience with secure software development environments
Desired Experience:
MS or PhD in CS, Math, or equivalent experience
Startup engineering experience in a fast-paced environment
Familiarity with large-scale models in both cloud and on-premises environments
ML research experience is a plus
Understanding of large-scale computing systems and cloud scaling approaches
Tech Stack:
Languages: Python, Golang, C++, CUDA
ML/Platform: PyTorch, Docker, Kubernetes, Terraform
MLOps: MLflow, Weights & Biases
Data: Pachyderm, DVC
Automation: Airflow, Kubeflow
Model Serving: Triton, FasterTransformer