Senior Site Reliability Engineer - USA, Remote - $120,000 - $130,000

Job Description

Location

This is a remote position, open to candidates located in OR, WA, CA, CO, ID, AZ, TX, or IL.

 

Who We Are

We are the largest crowd-sourced, community-driven database of recorded music information in the world. Every day, millions of people use our Marketplace to connect, learn about music, and buy and sell vinyl records, CDs, and cassettes. As we continue to grow, we are looking for bright, dedicated, creative, and highly motivated people to help us realize our mission to serve the music fan in everyone. We are relatively small, so individual contributions can have a large impact. We place a high value on quality, critical thinking, and continuous improvement. Our teams work collaboratively but are distributed geographically, and open-source tools are important to who we are and how we work. We value the experiences and skills each team member contributes to helping us serve our music community.

 

Who We’re Looking For

The Senior Site Reliability Engineer has wide latitude to automate and improve service reliability. The role is also responsible for diagnosing, investigating, and resolving service issues. The role will help teams build and adopt Service Level Objectives (SLOs) to broadly improve service reliability at the company. The role will also have a hand in supporting and improving both our technical infrastructure and the platform services built on it.

 

What You’ll Accomplish

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

·        Maintain the organization's cloud presence in AWS

·        Automate and deploy infrastructure configurations using Infrastructure as Code (IaC)

·        Implement monitoring, as well as metric and log collection

·        Assist other teams with capacity planning and infrastructure budgeting

·        Participate in an evening/night and weekend on-call rotation

·        Continuously improve infrastructure by automating away repetitive tasks and toil

·        Build out the functionality of the Platform used by other engineering teams

·        Debug issues in application code and related services.

·        Demonstrate a consistent commitment to core values and operating principles.

·        Work with your team on planning and completing department goals that align with overall business objectives.

·        Be an effective communicator by listening carefully, asking questions, and being transparent, timely, and diplomatic across all levels of the organization.

·        Stay informed on what is happening within the business and help others understand business decisions and the company direction by positively representing the company view.

·        Provide technical knowledge, and coach and mentor others in the department and company.

·        Stay informed on new technologies or processes within your specialization and implement them when necessary.

·        Use analytic skills to communicate and drive decisions for your team based on available data.

·        In partnership with your manager, start to plan, evaluate, and improve how your department operates to enhance speed, quality, efficiency, and output.

 

What You’ll Contribute

·        5+ years of software development experience

·        3 years of experience with AWS

·        3 years of experience using Terraform to manage AWS resources

·        2 years of experience with Kubernetes (EKS preferred)

·        Experience with Change Data Capture and Kafka

·        2 years of experience with a scripting language (e.g., Python, Bash)

·        1 year of experience with cloud network configuration

·        1 year of experience configuring CI/CD pipelines

·        1 year of experience supporting 24/7 web applications

·        Experience configuring monitoring and alerting

·        Experience with Kubernetes configuration tools like Helm and Kustomize

·        Excellent written communication skills

 

Great to have:

·        Experience with systems programming languages like Rust or Go

·        Experience implementing observability through code instrumentation

·        Experience with GitOps (e.g., Argo CD, Flux)

·        Experience migrating applications from an on-prem environment to the cloud

·        Bachelor's degree in computer science or a related field from a four-year college or university, or equivalent technical work experience