Distributed AI Systems Engineer (m/f/d) - #2186453
Sereact
Date: 19 hours ago
City: Stuttgart
Contract type: Full-time
Work schedule: Full day

Who We Are:
We are a rapidly growing embodied AI company revolutionizing human labor. Leveraging cutting-edge robotics and advanced artificial intelligence, we develop transformative technologies that redefine how work is done across multiple industries—empowering businesses to streamline operations, boost productivity, and unlock new possibilities.
Overview:
As a Distributed Systems Engineer, you will design, implement, and optimize scalable systems that power modern AI and machine learning applications. You will work closely with engineering teams to build robust infrastructure, ensuring system reliability and performance.
Your Responsibilities:
- Design and develop distributed systems and tools in Python and C++.
- Write shell scripts, build Docker containers, set up training/inference clusters, and automate the training/inference pipeline.
- Write communication layers and connectors between different ML-related microservices/components.
- Optimize system performance and scalability.
- Debug and resolve complex distributed system issues.
- Implement fault-tolerant and high-availability features.
- Collaborate with cross-functional teams to integrate distributed systems into larger platforms.
Education and Experience:
- Bachelor’s or Master’s degree in Computer Science or a related field.
- Experience working with asynchronous, parallel ML serving frameworks such as TorchServe, vLLM, LMDeploy, and NVIDIA Triton.
- Experience in designing and deploying distributed systems.
- Experience with training ML models and LLMs in distributed settings.
- Proficiency in Python and PyTorch.
- Proficiency in Bash scripting, building Docker images, and ML orchestration tools such as Kubernetes, KServe, Slurm, and Torch Distributed.
- Proficiency working with cloud-based ML platforms such as AWS SageMaker and GCP Vertex AI, as well as other cloud services such as storage and Docker registries.
- Proficiency working with distributed communication systems such as MPI and NCCL, as well as general-purpose communication systems such as RabbitMQ and MQTT.
- Strong understanding of networking, concurrency, and multithreading.
Benefits:
- Wellpass (gym membership)
- Free meals at the workplace
- Flexible working hours
- A motivated team and an open corporate culture
- Competitive compensation and excellent career development opportunities