We are seeking a highly skilled Machine Learning Engineering Director with a strong background in ML Ops, infrastructure, and software engineering. The ideal candidate will have at least 10 years of industry experience, including a minimum of 3 years in a leadership role. This position requires both leadership and hands-on technical expertise: the Director will manage a team of engineers while actively contributing to the design, development, and deployment of machine learning models and systems.
Key Responsibilities:
- Lead, mentor, and manage a team of machine learning engineers, providing guidance on best practices in ML Ops, infrastructure, and software engineering.
- Lead development of infrastructure for ML model training, testing, and deployment.
- Be hands-on in the design, development, and deployment of machine learning models and systems, ensuring they meet high standards of performance, scalability, and reliability.
- Collaborate with data scientists, product managers, and other stakeholders to define project requirements and deliverables.
- Develop and maintain ML Ops pipelines, ensuring efficient model training, deployment, and monitoring.
- Implement and manage infrastructure for large-scale data processing, model training, and inference.
- Drive continuous improvement in engineering practices, including code quality, testing, and deployment automation.
- Stay up to date with the latest trends and advancements in machine learning, software engineering, and cloud infrastructure to inform team strategy and direction.
- Manage project timelines, resources, and deliverables, ensuring projects are completed on time and within budget.
- Foster a culture of innovation, collaboration, and continuous learning within the engineering team.
Qualifications:
- Bachelor’s, Master’s, or PhD degree in Computer Science, Engineering, or a related field.
- 10+ years of experience in software engineering, with a focus on machine learning, ML Ops, and infrastructure.
- Minimum of 3 years of experience in a leadership or management role, with a proven track record of leading engineering teams to successful project outcomes.
- Strong understanding of machine learning frameworks, tools, and libraries (e.g., TensorFlow, PyTorch, Scikit-learn).
- Experience with ML Ops practices, including model versioning, continuous integration, and automated deployment.
- Proficiency in software engineering practices, including object-oriented design, code versioning, and testing.
- Experience with cloud platforms (e.g., AWS, Google Cloud, Azure) and distributed computing.
- Strong problem-solving skills, with the ability to lead teams in troubleshooting complex technical challenges.
- Excellent communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams.
- Demonstrated ability to manage multiple projects simultaneously, prioritizing tasks and managing resources effectively.
Preferred Qualifications:
- Experience with containerization technologies (e.g., Docker, Kubernetes).
- Knowledge of big data technologies (e.g., Hadoop, Spark) and data engineering practices.
- Experience with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI).