SRE

Location Tokyo
Discipline Information Technology
Job type Permanent
Salary Negotiable
Reference 36205

【COMPANY OVERVIEW】

A company that manages a data platform.

 

【JOB RESPONSIBILITIES】

  • Instrument and maintain our production systems to ensure a reliable and observable production environment.

  • Equip the team with tools for debugging production problems, monitoring systems, and identifying performance problems.

  • Monitor performance and plan system capacity.

  • Define and instrument SLIs and SLOs for our products.

  • Ensure that the code is efficient, maintainable and covered by appropriate test automation.

  • Perform post-release monitoring and check telemetry in production.

 

【REQUIREMENTS】 

  • Able to understand code in common scripting languages used in web applications (such as Ruby, Python, PHP, JavaScript, TypeScript, or similar). Familiarity with at least one of these languages is required.

  • Experience in managing infrastructure with at least one public cloud provider.

  • Experience deploying cloud-native applications.

  • Hands-on experience building CI/CD pipelines (using CircleCI, GitHub Actions, Jenkins, or similar).

  • Experience in instrumenting applications using OpenTelemetry.

  • In-depth knowledge of at least one monitoring/instrumentation service (such as Sumo Logic, Honeycomb, Datadog, New Relic, ELK stack, or similar).

  • Fluent in English.