Bohdan Mushta, Quality Assurance Engineer

Apr 02, 2024 14 min read

Canary Testing: What It Is and Why It Matters

As a testing approach, canary testing is designed to gradually implement new software updates or features, initially to a small segment of users, before they are deployed to the entire user base. The technique involves releasing a small percentage of the update or feature to a select group of users, known as "canaries," and monitoring their feedback. The development team can use this information to identify issues overlooked during development, such as bugs that could affect the user experience.

By identifying flaws early in the development cycle, businesses can save time and money by avoiding costly maintenance and updates and ensuring their customers enjoy a seamless experience.

 What is Canary Testing? 

This method incrementally introduces a new software version to a limited user base before a full-scale release, minimizing the chance of introducing errors in the production environment. This technique is highly recommended as a best practice for production deployments. In a Kubernetes setup, for example, the ingress controller routes canary traffic to a separate service and deployment that runs a different image.

You can efficiently identify and resolve software bugs by gathering feedback from a small group of users. We recommend performing a canary test before any wider rollout because it helps prevent faulty code from spreading to a massive user base. Limiting the number of affected users makes detecting and resolving errors easier. The canary version can be selected via a request header, in this case “canary: testing”, set with the curl command or in the browser with a header plugin.
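To make the header-based approach more concrete, here is a minimal sketch of the routing decision in Python. It is an illustration only, not the configuration of any particular ingress controller: the backend names `app-canary` and `app-stable` are hypothetical, and the only detail taken from the text is the “canary: testing” header.

```python
CANARY_HEADER = "canary"    # header name from the example above
CANARY_VALUE = "testing"    # header value from the example above


def choose_backend(headers: dict[str, str]) -> str:
    """Return the (hypothetical) backend that should serve a request."""
    # HTTP header names are case-insensitive, so normalise them before comparing.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    if normalised.get(CANARY_HEADER) == CANARY_VALUE:
        return "app-canary"
    # Every request without the exact header value stays on the stable version.
    return "app-stable"
```

A request sent with the header, for example from curl or a browser header plugin, would reach the canary backend, while all other traffic stays on the stable one.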

Why is Canary Testing Effective?

A canary release is a great way to implement code changes associated with adding new features or creating a new software version. It lets the development team quickly assess whether the code changes provide the desired or expected results. Canary testing also enables developers to try out new application functionality on a small group of users. Because only a portion of users is involved, the impact of potential issues with the new functionality is kept small. It is also easier for developers to roll back changes before errors in the new version of the application affect all users.

How does Canary Testing Work?  

It is conducted systematically, step by step. Below are the steps involved in canary testing:

01

The development team selects users to form a test group. This group consists of a few users, but enough to obtain results for a comprehensive statistical analysis. Users are unaware that they are part of the test group.  

02

The team creates a test environment that runs parallel to the current production environment. They also configure the system's load balancer to direct user requests to the new environment.    

03

Developers conduct canary tests by directing requests from selected users to the new environment. They also observe the users for a certain period to ensure that the new version of the application performs as expected.    

04

If the new version meets expectations, the new functionality or software can be released to all users. However, if the new application version contains many bugs, performs poorly, or creates other user issues, testers revert to the original version of the software.  

05

The team fixes the identified bugs and then releases the application to a broader audience. 
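The selection and routing steps above can be sketched in a few lines of Python. This is one possible approach, not the only one: it assumes users are identified by a string ID and assigns them to the canary environment by hashing that ID, so the same user always lands in the same environment without knowing it. The `salt` value and the environment names are hypothetical.

```python
import hashlib


def bucket_for(user_id: str, salt: str = "release-2024-04") -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def environment_for(user_id: str, canary_percent: int) -> str:
    """Decide which environment serves this user."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Users whose bucket falls below the threshold form the test group.
    return "canary" if bucket_for(user_id) < canary_percent else "production"
```

Changing the salt for each release picks a different test group, so the same users are not exposed to every canary.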

Three Phases of Canary Testing 

The process of the canary test involves three main phases, which are very simple to execute. Below are the main phases of the canary test: 

Canary Deployment Planning  

Canary deployment planning encompasses the following factors:   

  • User and Stage Count: The total number of users receiving the canary deployment and the number of stages are crucial factors to consider. Typically, no more than 5% to 10% of the total user base is involved in this process. Sometimes, the development team selects a group of "test" users based on a specific geographic region.
  • Timelines/Duration: A canary test can last minutes to hours, depending on the application and the code being tested. When planning a canary deployment, it's essential to consider the time required for all testing aspects.   
  • Evaluation Criteria: Like any other software testing, the success or failure of this type of testing can only be determined if evaluation criteria are predefined.   
  • Performance Metrics: Gathering metrics is necessary for analyzing testing progress, evaluating application performance, determining CPU and memory loads, and tracking errors.  
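The planning factors above can be written down as a small data structure before any deployment starts. The sketch below is only an illustration: the class names, the fields, and the example numbers are assumptions, not part of any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    percent_of_users: float   # share of the user base exposed in this stage
    duration_minutes: int     # how long to observe before moving on


@dataclass(frozen=True)
class CanaryPlan:
    total_users: int
    stages: tuple[Stage, ...]
    max_error_rate: float         # evaluation criterion, agreed before the test
    max_p95_latency_ms: float     # performance metric threshold

    def validate(self) -> None:
        """Raise ValueError if the plan is empty or does not widen exposure."""
        percents = [stage.percent_of_users for stage in self.stages]
        if not percents:
            raise ValueError("a plan needs at least one stage")
        if any(p <= 0 or p > 100 for p in percents):
            raise ValueError("stage percentages must be in (0, 100]")
        if percents != sorted(percents):
            raise ValueError("each stage should expose at least as many users as the last")

    def users_in_stage(self, index: int) -> int:
        """Number of users exposed to the canary during the given stage."""
        return round(self.total_users * self.stages[index].percent_of_users / 100)

    def total_duration_minutes(self) -> int:
        return sum(stage.duration_minutes for stage in self.stages)
```

For a hypothetical user base of 200,000 and stages of 1%, 5%, and 10%, `users_in_stage` gives the head count per stage, which helps check whether each stage is large enough for meaningful statistics.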

Canary Deployment Execution  

Upon completing the planning phase, the development team executes the canary deployment by giving access to the new project version. Developers prepare deployment files and configurations and write necessary code parts and scripts for testing.   

Next, the team creates a canary node using the load-balancing process and duplicates the actual production environment. This type of testing requires at least two production environments, one representing the original application without any code changes.  

Canary Deployment Analysis  

When the new code is released to the selected user group, the load balancer directs traffic to both the base (regular) and test nodes. This allows the development team to compare the performance of both versions and determine if the tested application version meets the expectations set during the planning stage.

Analyzing logs can identify and address any issues or errors before releasing the application's new version to a broader audience.  
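As an illustration of this comparison, the following sketch summarises request logs from the base and canary nodes and reports how they differ. The log record shape (`node`, `status`, `latency_ms`) is an assumption made for the example; real logs from your monitoring tool will look different.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class RequestLog:
    node: str           # "base" or "canary" (hypothetical labels)
    status: int         # HTTP status code
    latency_ms: float


def summarise(logs: list[RequestLog], node: str) -> dict[str, float]:
    """Compute request count, error rate and p95 latency for one node."""
    rows = [row for row in logs if row.node == node]
    if len(rows) < 2:
        raise ValueError(f"not enough requests logged for node {node!r}")
    errors = sum(1 for row in rows if row.status >= 500)
    # quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles([row.latency_ms for row in rows], n=20)[-1]
    return {
        "requests": len(rows),
        "error_rate": errors / len(rows),
        "p95_latency_ms": p95,
    }


def compare(logs: list[RequestLog]) -> dict[str, float]:
    """Compare the canary node against the base node."""
    base = summarise(logs, "base")
    canary = summarise(logs, "canary")
    return {
        "error_rate_delta": canary["error_rate"] - base["error_rate"],
        "p95_latency_ratio": canary["p95_latency_ms"] / base["p95_latency_ms"],
    }
```

A positive `error_rate_delta` or a `p95_latency_ratio` well above 1 suggests the canary performs worse than the base version and should be checked against the criteria set during planning.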

Feature Flags in Canary Releases

Canary testing often incorporates feature flags, also known as feature toggles, where code is released, but features are not automatically activated. Developers remotely enable these features for a subset of users. The feature flags method allows targeting the application, for example, to 1% of users while monitoring metrics such as error rates, latency, and the impact of the activated feature on business metrics. This approach helps avoid negative consequences for all users if something goes wrong with the update.   

Conversely, it's easy to deactivate the update by toggling the feature flag and opting out if users disapprove or if technical difficulties arise. Everything can be done incrementally by gradually "unveiling" functionality to users.  
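A percentage-based feature flag with a kill switch might look like the sketch below. It is a simplified illustration rather than the API of any real feature-flag service; the class and field names are hypothetical. Hashing the flag name together with the user ID keeps each user's experience stable, and users who saw the feature at 1% keep seeing it when the rollout widens.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureFlag:
    name: str
    rollout_percent: float = 0.0   # share of users who get the feature
    enabled: bool = True           # kill switch: False turns it off for everyone

    def is_active_for(self, user_id: str) -> bool:
        if not self.enabled:
            return False
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode("utf-8")).digest()
        # Use the first four bytes as a bucket in [0, 10000), i.e. 0.01% steps.
        bucket = int.from_bytes(digest[:4], "big") % 10_000
        return bucket < self.rollout_percent * 100
```

Raising `rollout_percent` step by step "unveils" the feature gradually, and setting `enabled` to `False` deactivates it at once without a new deployment.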

Canary Testing Tools  

  • New Relic: Provides real-time insights into application performance, user experience, and infrastructure metrics. Offers distributed tracing and anomaly detection.  
  • Datadog: Cloud-based platform for monitoring and analytics. Tracks performance of applications, servers, and components. Offers customizable dashboards and alerts.
  • Splunk: Widely used platform for managing and analyzing logs. It enables centralized log collection, monitoring, and troubleshooting.  
  • Dynatrace: AI-powered observability platform offering comprehensive visibility into application performance and dependencies.   
  • Elastic Stack (ELK Stack): Collection of open-source tools, including Elasticsearch, Logstash, and Kibana. It enables log collection, storage, search, and visualization for monitoring and analysis.  

The choice of tools depends on project requirements, technology stack, and the desired complexity of canary deployment and monitoring processes.  

Advantages of Canary Testing 

This type of testing offers several advantages for developing high-quality software:    

  • Simplicity and Flexibility: Canary testing is straightforward and minimally intrusive, requiring few changes to the application itself. If issues arise, the development team can easily reverse any changes made, so most end users remain unaffected.
  • Low Maintenance: Canary testing requires little maintenance because it runs for a short time, and developers can move to the next stage as soon as the results are evaluated. Monitoring needs are also lower because only a small subset of end users is involved.
  • Cost-Efficiency: The infrastructure needed for canary testing is minimal, which reduces the cost of addressing issues. Rolling back to the previous version is often the best solution, and because only a small fraction of users is affected, the development team is spared from dealing with widespread service interruptions and upset clients.
  • System Compatibility: Software canary testing is a versatile approach suitable for various systems without restrictions based on geographic location or deployment size. It expands the capabilities of DevOps teams, allowing them to effectively apply this method across a wide range of deployment conditions.    
  • Beta Testing: It facilitates the creation of software beta versions by allowing developers to invite users to participate in testing. It allows quick feedback to existing users, helps identify problem areas and bugs, and enhances collaborative relationships with users.  

Canary software testing promotes agility, cost savings, and user-driven improvement, making it an essential tool for software development teams.   

Common Challenges with Canary Testing 

Organizations may run into challenges with canary testing that affect the quality of their products or services.

  • Identifying a Representative Canary Group: One of the main challenges is identifying a canary group that represents the wider audience. To succeed, organizations must thoroughly evaluate the demographics and behaviors of potential users. It is crucial to select a diverse group that can provide valuable feedback.
  • Defining Clear Metrics: When conducting canary testing, it is crucial to establish precise metrics to monitor the canary group's performance and behavior. Without well-defined metrics, evaluating the testing process's effectiveness can be challenging. Organizations should prioritize defining clear and measurable metrics to overcome this challenge before initiating this type of testing.   
  • Maintaining Consistency: Ensuring consistency between the canary group and the wider audience can be challenging, which could result in inaccurate or misleading outcomes. To tackle this challenge, organizations should cautiously manage the deployment of changes to the canary group and ensure they represent the broader audience.   

Organizations can overcome these challenges by building trust and transparency with stakeholders and maintaining regular communication throughout testing.

Canary Testing vs. A/B Testing  

Canary and A/B testing are two methods used in software development to assess changes and improve product quality. However, they are often confused. A/B testing allows the comparison of two versions of a product or feature by simultaneously implementing them for different groups of users. The goal is to determine which version more effectively achieves the set objectives, such as increasing conversion on a website or enhancing user satisfaction levels.

Both A/B testing and canary testing are methods of controlled implementation of new updates or features, but they have different aims and are used in different contexts. Let's clarify their differences:

| Comparison | Canary testing | A/B testing |
| --- | --- | --- |
| Scope | Tests new features or changes on a smaller group of users before broader release | Evaluates and compares different versions of a feature or design element side by side |
| Sample size | Small group of users | Larger group of users |
| Goals | Focused on identifying bugs early in the development process | Focused on comparing various options to determine which one is more effective |
| Approach | Involves bringing specific changes to a specific group of users | Involves comparing different options to determine which is most effective |
| Timeframe | Done over a shorter period | Done over a more extended period |

Canary testing is best for detecting issues early, while A/B testing is better for comparing different variables. 

Setting Up Your Canary Test  

Setting up a canary test involves several key steps to ensure that new updates or features are introduced to your infrastructure in a controlled and gradual manner. The goal is to mitigate risks and identify potential issues before the update reaches all of your users. Here's a structured approach to setting up your canary test:

  • Determine the Metrics: Set specific metrics to monitor, such as error rates, performance indicators, user engagement metrics, or any other relevant data that will help you measure the impact of a new update.  
  • Select the Canary Group: Identify a single, representative segment of your user base to serve as the canary group. This group should be large enough to gather meaningful data but small enough to limit impact in case of issues. When selecting this group, consider geography, device type, or user behavior.
  • Prepare the Environment: Ensure your infrastructure supports canary testing. It might involve setting up separate environments for the canary and the rest of the users or using feature flags to control who sees the new update. The setup should allow for easy monitoring, rollback, and incremental rollout.  
  • Deploy the Update: Release the new feature or update to the selected canary group only. This step is critical in isolating the impact of the update, allowing for focused monitoring and analysis.
  • Monitor and Analyze: Closely monitor the canary release's performance and behavior. Collect data on the defined metrics and compare it against the baseline established before the rollout. Look for any issues or unexpected behavior.
  • Plan: Based on the analysis, decide whether to:
  1. Roll out the update to all users if the canary test is successful and the new update performs as expected without significant issues.
  2. Roll back the update if you identify problems, to prevent it from affecting more users. Analyze the issues, fix them, and consider running another canary test before attempting a full rollout again.
  3. Extend the canary phase if the results are inconclusive or if you need more data to decide. Adjust the size of the canary group or the duration of the test.
  • Document and Learn: Regardless of the outcome, document the process, decisions, and findings from your canary test. Use these insights to improve future releases and testing strategies.  
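The three outcomes in the Plan step can be expressed as a simple decision function. The sketch below is a deliberately simplified assumption of how such a rule might look: it uses only the error rate, and the `min_requests` and `tolerance` defaults are made-up values that each team would replace with its own criteria.

```python
from enum import Enum


class Decision(Enum):
    ROLL_OUT = "roll out to all users"
    ROLL_BACK = "roll back and fix"
    EXTEND = "extend the canary phase"


def decide(
    canary_requests: int,
    canary_error_rate: float,
    baseline_error_rate: float,
    *,
    min_requests: int = 1000,   # below this, results count as inconclusive
    tolerance: float = 0.005,   # accepted increase over the baseline error rate
) -> Decision:
    """Pick the next step from the canary metrics and the baseline."""
    if canary_requests < min_requests:
        return Decision.EXTEND
    if canary_error_rate > baseline_error_rate + tolerance:
        return Decision.ROLL_BACK
    return Decision.ROLL_OUT
```

In practice, a team would likely add more metrics and an immediate rollback rule for severe failures, even when little data has been collected.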

Setting up a canary test requires thoughtful planning and careful execution but is a valuable practice for enhancing the quality and reliability of software deployments.  

Best Practices for Canary Testing    

  • Define clear goals: Establishing clear goals and objectives is crucial before implementing canary testing. It helps guide the testing process and ensures that specific outcomes are achieved.   
  • Select a Representative Canary Group: Choose a canary group that accurately represents the wider audience and is large enough to provide meaningful feedback. Ensure diversity within the group to capture various user behaviors.   
  • Establish a Monitoring Plan: Develop a monitoring plan with specific metrics and tools to track the canary group's performance and behavior. Early identification of issues or bugs allows for quick resolution.   
  • Gradual Rollout: Roll out changes to the canary group gradually to minimize the impact of any issues or bugs. This approach enables quick adjustments as needed.
  • Automated Rollback: Have an automated rollback plan so that any issues or bugs detected during testing can be addressed swiftly. It ensures the stability and reliability of the product or service.

Organizations can ensure product or service quality and mitigate risks by implementing canary testing best practices.   
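The gradual rollout and automated rollback practices can be combined in one small control loop. The sketch below assumes two hypothetical callbacks supplied by your own tooling: `set_traffic_percent`, which changes how much traffic the new version receives, and `is_healthy`, which checks the monitored metrics for the current stage.

```python
from typing import Callable, Sequence


def run_gradual_rollout(
    stages: Sequence[int],
    set_traffic_percent: Callable[[int], None],
    is_healthy: Callable[[int], bool],
) -> bool:
    """Widen exposure stage by stage; return True if every stage stayed healthy."""
    for percent in stages:
        set_traffic_percent(percent)
        if not is_healthy(percent):
            # Automated rollback: send all traffic back to the previous version.
            set_traffic_percent(0)
            return False
    return True
```

A real pipeline would also wait for an observation period between stages; that delay is left out here to keep the example short.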

Is Canary Testing Appropriate for All Types of Companies?

When deciding whether it is appropriate for a given company or product, some factors to consider include:   

  • Complexity: Complex applications or systems prone to bugs or issues may benefit more from canary testing than simpler ones.
  • Size of the user base: It can help companies with a large user base identify and address issues before they affect a significant number of users.    
  • Resources: This type of testing requires additional resources such as servers, infrastructure, and personnel to monitor and analyze results. Companies must ensure they have the necessary resources to support canary testing.   
  • Frequency of releases: Companies that frequently release updates or new versions may find canary testing valuable for validating each release before it goes out more broadly.
  • Risk tolerance: Minimizing issues in new releases can be essential for companies with a low risk tolerance. Conversely, companies with a higher risk tolerance may prioritize other things over the additional safety net that canary testing provides.

Conclusion   

Canary testing is a valuable technique for assessing new versions of applications or systems before releasing them to a broader user base. This method allows companies to identify and address potential issues early in development, reducing the risk of costly rollbacks or deployment delays. However, before implementing this type of testing, companies must consider factors such as complexity, user base size, resource availability, release frequency, and risk tolerance. Overall, it effectively ensures the reliability and stability of new releases and improves user experience.   

Ready to enhance your deployment strategy with canary testing? Contact us today to discover how our expert solutions can streamline your next software release.

Frequently Asked Questions

Why is canary testing effective?

It allows developers to quickly assess code changes and new features, minimizing risks by testing them on a small user group before full-scale release. It helps detect issues early and enables easy rollback if necessary, reducing user impact.   

How does canary testing work?

It involves systematically selecting a small group of users for testing, creating a parallel test environment, directing user requests to the new environment, observing user behavior, and rolling out changes based on the results.  

What factors are considered in canary deployment planning?

Canary deployment planning involves considering factors such as the number of users involved, duration of testing, evaluation criteria, and performance metrics to ensure a smooth and successful deployment process.   

What are the advantages of canary testing?

It offers simplicity and flexibility, low maintenance, cost-efficiency, system compatibility, and beta testing opportunities.

Is canary testing suitable for all types of companies?

Factors such as the complexity of applications, size of the user base, available resources, frequency of releases, and risk tolerance levels should be considered when determining the appropriateness of canary testing for a given company or product.  
