Machine Learning Engineer Job at Evolve Group, San Mateo, CA

  • Evolve Group
  • San Mateo, CA

Job Description

Machine Learning Engineer

Tech start-up

San Francisco based

We’ve partnered with one of the most ambitious and technically rigorous AI research labs in the world. Based in San Francisco, this team is building foundation models entirely from scratch.

They are now hiring ML Infrastructure Engineers to design and scale the systems that power large-scale, distributed model training. If you’ve built infrastructure that runs across hundreds of GPUs, thrive under technical complexity, and want to work side-by-side with elite AI researchers — this is the role.

Key Responsibilities:

  • Build and scale distributed training systems for large-scale model training across LLMs, vision, and robotics.
  • Set up and run large-scale training across many GPUs using tools like Kubernetes, DeepSpeed, and FSDP.
  • Troubleshoot system issues (GPU errors, network problems) and build tools to monitor and recover from failures.
  • Optimize PyTorch pipelines, sharding, and sampling strategies.
  • Collaborate closely with researchers to support novel model training at scale.

Requirements:

  • 3–15 years in ML infrastructure, systems, or research engineering roles.
  • Proven experience scaling distributed training for large models.
  • Strong skills in PyTorch, CUDA, NCCL, and Kubernetes.
  • Familiar with setting up distributed training clusters.
  • Deep understanding of PyTorch dataloaders, data sharding, and sampling.
  • Strong communicator with a collaborative, mission-driven mindset.

This is a fully in-person role based in San Francisco. It's ideal for engineers excited to build at the edge of what's possible in AI.

Job Tags

Immediate start
