A company that manages a data platform
-Instrument and maintain our production systems to ensure a reliable and observable production environment.
-Equip the team with tools for debugging production issues, monitoring systems, and identifying performance problems.
-Monitor performance and plan system capacity.
-Define and instrument SLIs and SLOs for our products.
-Ensure that the code is efficient, maintainable and covered by appropriate test automation.
-Perform post-release monitoring and check telemetry in production.
-Experience in managing infrastructure with at least one public cloud provider.
-Experience deploying cloud-native applications.
-Hands-on experience building CI/CD pipelines (using CircleCI, GitHub Actions, Jenkins, or similar).
-Experience in instrumenting applications using OpenTelemetry.
-In-depth knowledge of at least one monitoring/instrumentation service (such as Sumo Logic, Honeycomb, Datadog, New Relic, ELK stack, or similar).
-Fluency in English.