We are currently looking for a talented and motivated Site Reliability Engineer (SRE) to join our dynamic team of five SREs. As an SRE, you will play a crucial role in ensuring the reliability, availability, resilience, and performance of the Sekoia.io platform. If you are a curious problem-solver with a solid understanding of cloud technologies, Linux, and Kubernetes, you'll fit right in!
Your missions:
Design, build, and maintain a scalable and highly available cloud infrastructure to support our cybersecurity SaaS platform.
Implement and enhance monitoring, alerting, and incident response systems to proactively identify and resolve any performance or availability issues.
Optimize system performance and troubleshoot infrastructure bottlenecks to maintain high reliability and responsiveness.
Automate deployment, configuration, and management processes using industry-standard tools and frameworks.
Participate in designing and implementing disaster recovery strategies to ensure business continuity in the face of potential failures or disasters.
Conduct thorough root cause analysis for incidents and implement preventive measures to minimize the risk of recurrence.
Troubleshoot and resolve infrastructure issues promptly to minimize downtime.
Stay updated with the latest industry trends and emerging technologies related to cloud computing, Linux, Kubernetes, and other relevant areas.
📍 The position is available in Rennes, Paris, or fully remote.