Software Engineer, AI Training and Infrastructure
We are looking for a Software Engineer to work at the forefront of developing and optimizing the software infrastructure and tools necessary for training cutting-edge AI models. You will focus on building robust, scalable, and efficient training pipelines and frameworks that support the entire machine learning lifecycle, from data preparation to model deployment. You will collaborate with researchers and machine learning engineers to ensure seamless integration and operation of training systems, pushing the boundaries of what AI can achieve in real-world robotics applications. You will explore new ways to efficiently make use of many types of data in our training pipeline.

Responsibilities:
- Develop and maintain robust, scalable, and distributed training pipelines (data preprocessing, training orchestration, and model evaluation) and frameworks for large-scale AI models.
- Optimize training processes for performance and resource utilization, ensuring scalability and reliability.
- Collaborate with researchers and machine learning engineers to integrate state-of-the-art algorithms and techniques into training pipelines.
- Monitor and analyze training, identifying bottlenecks and proposing solutions to improve efficiency and performance.
- Ensure the robustness and reliability of the training infrastructure, including automated testing and continuous integration.

Preferred Qualifications:
- BS, MS or higher degree in Computer Science, Robotics, Engineering or a related field, or equivalent practical experience.