Productboard Technology Radar

Service Level Definition

Adopt

First, it is important to clarify the terminology we use when talking about service levels. People often use SLO, SLA, and SLI interchangeably, but they refer to different, albeit connected, concepts:

  • SLA - Service Level Agreement - a formal agreement (think part of a contract) between a service provider (Productboard) and the customer that outlines the expected level of service. The SLA section of the contract often includes the agreed-upon service levels and the consequences for failing to meet them, which can range from financial compensation to contract termination. SLAs are crucial for managing customer expectations and building trust. They ensure that both the service provider and the customer have a clear understanding of the service standards. For the engineering team, SLAs also provide a clear framework within which to operate, ensuring that its work aligns not only with internal objectives (as defined by SLOs) but also with external commitments to our customers.

  • SLO - Service Level Objective - a specific, measurable goal that a service should achieve. For instance, an engineering team might set an SLO for its service that defines a target uptime of 99.99%. SLOs are important because they give the team a quantifiable target to aim for. This helps align engineering effort with the overall business objectives and ensures that the team is working towards a common goal. When setting SLOs, teams need to strike a balance between ambitious service quality and realistic operational capabilities, so that the service delivers value to customers while remaining feasible for the team to maintain. An SLO should be stricter than the corresponding SLA: your objectives should aim higher than your commitments, which leaves a margin before a contractual breach.

  • SLI - Service Level Indicator (or Index) - a metric used to measure the performance of a service against its SLO. For an engineering group, SLIs provide a way to objectively monitor and assess how the service is performing. Continuing with the previous example, if the SLO is to maintain 99.99% uptime, the corresponding SLI would be the actual uptime percentage measured over a given period. SLIs are essential for the engineering team because they offer clear, data-driven insight into how well the service is performing. These metrics help identify areas where the service is not meeting its objectives, allowing the team to make informed decisions about where to focus improvement efforts. They usually live in our engineering reporting tools (DataDog, HoneyComb, JIRA, etc.) and are reported in Looker.
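
To make the relationship between the three terms concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the 99.99% SLO comes from the example above, while the 99.9% SLA, the 30-day window, the 10 minutes of downtime, and all class and function names are hypothetical and do not reflect any real Productboard contract or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Uptime targets expressed as fractions (0.9999 means 99.99%)."""
    sla_target: float  # external commitment (hypothetical value in the usage below)
    slo_target: float  # internal objective, expected to be at least as strict

    def __post_init__(self) -> None:
        # Objectives should aim higher than commitments.
        if self.slo_target < self.sla_target:
            raise ValueError("SLO should be at least as strict as the SLA")


def uptime_sli(total_minutes: float, downtime_minutes: float) -> float:
    """SLI: fraction of the measurement period during which the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return (total_minutes - downtime_minutes) / total_minutes


def allowed_downtime_minutes(target: float, total_minutes: float) -> float:
    """Downtime that a given uptime target tolerates over the period."""
    return (1.0 - target) * total_minutes


def evaluate(level: ServiceLevel, sli: float) -> str:
    """Compare a measured SLI against the SLO and the SLA."""
    if sli < level.sla_target:
        return "SLA breached"
    if sli < level.slo_target:
        return "SLO missed, SLA still met"
    return "SLO met"


# Hypothetical scenario: 30-day window with 10 minutes of downtime.
PERIOD_MINUTES = 30 * 24 * 60
level = ServiceLevel(sla_target=0.999, slo_target=0.9999)
sli = uptime_sli(PERIOD_MINUTES, downtime_minutes=10)

print(f"SLI: {sli:.5%}")
print(f"SLO allows {allowed_downtime_minutes(level.slo_target, PERIOD_MINUTES):.2f} min of downtime")
print(f"SLA allows {allowed_downtime_minutes(level.sla_target, PERIOD_MINUTES):.2f} min of downtime")
print(evaluate(level, sli))  # SLO missed, SLA still met
```

In this scenario the 99.99% SLO tolerates roughly 4.32 minutes of downtime over 30 days, while the assumed 99.9% SLA tolerates roughly 43.2 minutes. Ten minutes of downtime therefore misses the internal objective without breaching the external commitment, which is the safety margin the gap between SLO and SLA is meant to provide.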