Software Engineer - Site Reliability Engineer
Sunnyvale, CA, United States
Description
Qualifications
A love of solving hard problems
Putting your customers first, whether they be internal or external, and making them more productive, happy, and successful
Experience with Azure AKS, AWS
Experience with Kubernetes, ECS, EKS, or another container orchestration system
Experience with an infrastructure-as-code system: Ansible, Terraform, CloudFormation, CDK, etc.
Experience with logging systems: Splunk, EventHub, ELK, etc.
Bachelor's degree in Computer Science or a similar field, or equivalent experience
Experience creating automated solutions and an eagerness to automate
Responsibilities
Experience monitoring services and infrastructure, log collection, analytics, and application performance monitoring (APM)
Improve metrics on our main services, and act as a subject matter expert for dev teams
Recommend and guide improved monitoring and alerting processes
Identify performance bottlenecks and provide recommendations for improvement
Proactively identify and solve problems that we didn't even know we had
Help build, deploy, and scale a load testing environment that is analogous to production
Enforce security and operational safety controls
Experience with performance testing or chaos testing is a plus
Contribute to architectural improvements to meet future scaling and observability requirements
Strong performance-issue triage skills: log analysis, thread dump analysis, and heap dump analysis
Self-motivated individual who is proactive in driving tasks to completion.
Participate in the on-call rotation (the team is spread across America and Europe, so you can sleep at night!), answer developers' questions, and attend incidents
At least 5 years in a Reliability Engineering, DevOps, or infrastructure-focused role
Advanced experience with programming languages (GoLang, Python, Java)
Passion for designing and building reliable systems
Experience in managing and scaling distributed systems in a public, private, or hybrid cloud environment
Deep systems and infrastructure knowledge
Advanced knowledge and hands-on experience with CI/CD systems
Automation advocate - you truly believe in removing operational load with software
Education: Bachelor's degree