Performance Importance

Our systems meet or exceed their users’ expectations of performance. Degradations in performance are investigated, understood, and either remedied or accepted as appropriate in the business context.

Rationale

Systems that are fast and responsive enable users to carry out tasks more efficiently. For the customer-facing areas of our business, there is a strong correlation between slower websites and a negative impact on revenue and customer retention.

We must maintain a continued focus on how changes affect the performance of our applications, or we risk harming user experience, our ability to scale cost-effectively, and the stability of our systems.

Implications

  • Set a performance budget per page or endpoint as early as possible and incorporate testing of this into the software delivery pipeline. Teams should have an awareness of how their systems have been engineered to meet performance criteria, and the means to avoid accidentally introducing changes that breach the expected user experience.
  • Incorporate the analysis of performance tests into the release process. This is typically done through frequent automated testing (potentially on every commit) using short component performance tests which leverage mocks/stubs. Understand what tools are available to help with the collection and analysis of these results.
  • Choose the appropriate performance testing technique for your workload and platform. For example, client-side performance testing tools are more appropriate for applications with a frontend. Longer “soak” tests are only required for platforms that are long-lived, and “stress” tests are more appropriate when addressing specific stability concerns or risks.
  • Frequent or continuous performance testing should be implemented for the performance-sensitive components of our systems. Integration or end-to-end performance testing should be adopted very cautiously, due to the high cost and complexity of maintaining fully integrated and always-available testing environments - consider whether upstream dependencies can be adequately mocked under a variety of load profiles to validate any performance risks.
  • Live load tests can be a useful means of validating overall end-to-end performance for our larger and more complex systems, subject to the cost and complexity caveats that apply to end-to-end performance testing. They should not be relied on to assure the performance of smaller components of the system.
  • Aim to use the same tools and techniques for observing performance behaviour as for assuring Production Readiness - this helps ensure that the same metrics can be relied on in the live environment, and builds engineer familiarity with them.
  • Consider using release strategies that allow the performance of significant changes to be validated with a subset of end users and rolled forward or back quickly - such as canarying or dark launching with feature toggles.
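
As an illustration of the first two implications above, a short component performance test against a performance budget might look something like the following. This is only a minimal sketch, not a recommended implementation: the handler, the stubbed upstream dependency, the 50 ms budget and the choice of the 95th percentile are all hypothetical, and in practice a dedicated performance testing tool would usually collect and analyse the results.

```python
import math
import sys
import time
from typing import Callable, List


def percentile(samples: List[float], pct: float) -> float:
    """Return the nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_latencies_ms(operation: Callable[[], object], runs: int) -> List[float]:
    """Call the operation `runs` times and record each duration in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def within_budget(latencies_ms: List[float], budget_ms: float, pct: float = 95.0) -> bool:
    """True if the chosen percentile of the latencies is within the budget."""
    return percentile(latencies_ms, pct) <= budget_ms


def stub_pricing_service() -> dict:
    """Hypothetical stub standing in for an upstream dependency."""
    return {"price": 100}


def handle_request() -> dict:
    """Hypothetical component under test, wired to the stub above."""
    quote = stub_pricing_service()
    return {"total": quote["price"] * 2}


if __name__ == "__main__":
    BUDGET_MS = 50.0  # illustrative budget only; agree the real figure per endpoint
    results = measure_latencies_ms(handle_request, runs=200)
    p95 = percentile(results, 95)
    print(f"p95 latency: {p95:.3f} ms (budget {BUDGET_MS} ms)")
    if not within_budget(results, BUDGET_MS):
        sys.exit(1)  # a non-zero exit code fails the pipeline stage
```

Run as a pipeline step on each commit, a script of this shape fails the build when the budget is breached, so an accidental regression surfaces before release. Because the upstream dependency is stubbed, the result reflects only the component itself, which is why it cannot replace validation of end-to-end performance.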