Job Description
Do you want to take on real responsibility and, together with a strong team, build an app that sustainably improves the working lives of millions of people?
As part of our engineering team, you will ensure that the non-functional qualities of our system (performance, stability, efficiency and security) meet the high expectations of our customers. You're an enthusiastic Infrastructure Software Engineer, and your code supports our Development and Customer Success teams in delivering the best employee app for frontline workers. Do you enjoy modern technologies and love to go above and beyond with your creativity in solving challenging problems? Then you've come to the right place!
What awaits you
- Elastic Computing: You'll help us scale to the sky by continuing to build out our Kubernetes-based stack and driving the evolution of our system and software architecture
- Zero Downtime: You'll work to keep Flip available at all times. Zero-downtime rollouts, redundancy concepts and migration strategies are your domain
- Visibility and Troubleshooting: You develop our system monitoring, profiling and log aggregation and identify emerging problems.
- Security and Privacy: You ensure that Flip remains a secure application and protects the privacy of our users
- Safety and Resilience: You take care of the operational safety and resilience of our system and drive improvements in this field.
- Infrastructure Engineering: You design, develop and optimize our production, development and hosting infrastructure
- Platform Management: You develop our developer tooling and our provisioning and management system for platform operations (Python/Django), and make sure that our engineers and our customer access team can work efficiently
- CI/CD: You'll help us further improve our CI/CD pipeline, giving the team even faster feedback cycles and very good test platforms
Qualifications
What you bring to the table
- Experience with Kubernetes and Docker
- Metrics, alerting and logging (e.g. Loki, Grafana, Zabbix and Sentry)
- Software development (Python)
- Cloud infrastructure (e.g. Azure, AWS, GCP)
- Business-level fluency in German and English
Should-Have
- Operation of highly scalable, distributed cloud and cluster systems
- Linux system administration know-how
Nice-To-Have
- Helm
- Message Queueing, Event Streaming (RabbitMQ, NATS, ...)
- Django Framework
- On-premises know-how
- GitLab CI
- Pulumi
- ArgoCD
- Agile work structures: Scrum and Kanban