Test case failures from test-case-mgmt blocking all releases

Our production releases are completely blocked due to test case failures in the test-case-mgmt module. We’re using Azure DevOps 2024 with Test Plans, and our release pipeline has quality gates configured. The problem is that even minor test failures (like UI validation tests with 2-3% flakiness) are blocking critical production deployments.

Our current gate configuration:

gates:
- task: TestResultsGate@1
  inputs:
    testRunTitle: 'Regression Suite'
    minimumPassRate: 100

We’ve had three production releases delayed by 12+ hours in the past week because of intermittent test failures that aren’t actually blocking issues. The test impact analysis shows these failures don’t affect the changed components, but the gate doesn’t consider that. We need releases to proceed when critical tests pass, even if non-critical UI tests fail. How can we configure more intelligent test gates that consider test impact and severity rather than requiring 100% pass rate across all test cases?

Test impact analysis in Azure DevOps 2024 can help here, but it needs to be explicitly enabled and configured. The gate task doesn't automatically use impact analysis results. You'll need to query the test results API, filter the results down to tests affected by the code changes, and evaluate only those in your gate logic.
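As a sketch of that filtering step (TestRunId and the CHANGED_COMPONENTS pattern are illustrative variables you'd populate yourself; the endpoint is the standard Test Results REST API):

- task: PowerShell@2
  displayName: 'Evaluate only impacted test results'
  inputs:
    targetType: 'inline'
    script: |
      $headers = @{ Authorization = "Bearer $(System.AccessToken)" }
      $uri = "$(System.TeamFoundationCollectionUri)$(System.TeamProject)/_apis/test/runs/$(TestRunId)/results?api-version=7.1"
      $results = Invoke-RestMethod -Uri $uri -Headers $headers
      # Keep only failures whose containing assembly matches a component touched by this change
      $relevantFailures = $results.value | Where-Object {
        $_.outcome -eq 'Failed' -and $_.automatedTestStorage -match $env:CHANGED_COMPONENTS
      }
      if ($relevantFailures.Count -gt 0) { exit 1 }

Run this as a pre-gate step (or from an agentless "Invoke REST API" check) so failures outside the changed components never reach the pass-rate calculation.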

The 100% pass rate is too strict for real-world scenarios. You should use test categories or tags to separate critical from non-critical tests. Configure your gate to only evaluate critical tests at 100% pass rate, while allowing non-critical tests to have a lower threshold like 95%. This requires proper test categorization in your test cases first.
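For example, if your tests carry TestCategory attributes, a pair of gates (reusing the TestResultsGate task from your snippet, whose testCaseFilter support I'm assuming works like the standard VSTest filter syntax) could look like:

gates:
- task: TestResultsGate@1
  inputs:
    testRunTitle: 'Regression Suite'
    testCaseFilter: 'TestCategory=Critical'
    minimumPassRate: 100
- task: TestResultsGate@1
  inputs:
    testRunTitle: 'Regression Suite'
    testCaseFilter: 'TestCategory!=Critical'
    minimumPassRate: 95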

The flaky test problem needs to be addressed at the source. Use the Test Plans analytics to identify tests with inconsistent results over the past 30 days. Tests with less than 95% consistency should be marked as ‘Investigation’ and excluded from gate evaluation until they’re fixed. Azure DevOps 2024 has built-in flaky test detection in the test results dashboard.
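One way to pull those consistency numbers programmatically is the Analytics OData feed. The TestResultsDaily entity and property names below follow the Analytics schema, but treat this as a sketch to adapt, not a verified query:

$headers = @{ Authorization = "Bearer $(System.AccessToken)" }
$uri = "https://analytics.dev.azure.com/{org}/{project}/_odata/v4.0-preview/TestResultsDaily?" +
  "`$apply=groupby((TestSK, Test/TestName), aggregate(ResultCount with sum as Total, ResultPassCount with sum as Passed))"
# Tests passing less than 95% of the time are candidates for 'Investigation'
(Invoke-RestMethod -Uri $uri -Headers $headers).value |
  Where-Object { $_.Passed / $_.Total -lt 0.95 } |
  ForEach-Object { $_.Test.TestName }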

Check your variable groups configuration. You can define different pass rate thresholds for different test suites using pipeline variables. Set up a variable group with thresholds like CriticalTestsPassRate=100, UITestsPassRate=95, IntegrationTestsPassRate=98. Then reference these in separate gate tasks for each test category. This gives you granular control without manual approvals.
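A minimal sketch of wiring such a group into a gate, assuming the gate is defined in YAML alongside the stage and uses the variable names suggested above:

variables:
- group: TestGateThresholds

gates:
- task: TestResultsGate@1
  inputs:
    testRunTitle: 'Regression Suite'
    testCaseFilter: 'TestCategory=UI'
    minimumPassRate: $(UITestsPassRate)

Changing a threshold then means editing the variable group once, not touching every pipeline that references it.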

We solved this by implementing a two-tier gate system. The first gate requires 100% pass rate for P0/P1 severity tests only. The second gate is an approval gate that triggers when P2/P3 tests fail, allowing a release manager to review and approve if the failures are acceptable. This gives you automation for critical paths while maintaining human oversight for edge cases. You can filter tests by priority using the test plan query in the gate configuration.
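In YAML, the second (human-review) tier can be sketched with the built-in ManualValidation task in an agentless job; the notify address, timeout, and instructions here are illustrative:

- job: waitForApproval
  pool: server
  timeoutInMinutes: 720
  steps:
  - task: ManualValidation@0
    inputs:
      notifyUsers: 'release-managers@example.com'
      instructions: 'P2/P3 test failures detected. Review the test results and approve if the failures are acceptable.'
      onTimeout: 'reject'

Gate this job with a condition on the P2/P3 gate's outcome so it only runs when lower-priority tests have actually failed.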

I’ve implemented exactly this solution for multiple teams using Azure DevOps 2024. The key is addressing all five aspects of your gate configuration systematically.

Test Gate Configuration: Replace your single gate with a multi-tier approach using test categories:

gates:
- task: TestResultsGate@1
  displayName: 'Critical Tests Gate'
  inputs:
    testRunTitle: 'Regression Suite'
    testCaseFilter: 'Priority=0|Priority=1'
    minimumPassRate: 100

- task: TestResultsGate@1
  displayName: 'Standard Tests Gate'
  inputs:
    testRunTitle: 'Regression Suite'
    testCaseFilter: 'Priority=2'
    minimumPassRate: 95

Pass Rate Thresholds: Create a variable group named ‘TestGateThresholds’ with these values:

  • P0_PassRate: 100
  • P1_PassRate: 100
  • P2_PassRate: 95
  • P3_PassRate: 90

Reference them in your gates:

minimumPassRate: $(P0_PassRate)

Test Impact Analysis: Enable impact analysis in your test execution task. The documented VSTest@2 switch is runOnlyImpactedTests; pair it with runAllTestsAfterXBuilds so a full run still happens periodically as a safety net:

- task: VSTest@2
  inputs:
    testSelector: 'testAssemblies'
    testAssemblyVer2: '**\*Tests.dll'
    runOnlyImpactedTests: true
    runAllTestsAfterXBuilds: '50'

With this in place, each run contains only the tests relevant to the change, so the gate's pass rate is computed over impacted tests automatically and no extra gate-side impact filter is needed.

Variable Groups: Set up environment-specific thresholds. For production:

TestGateThresholds-Prod:
  CriticalPassRate: 100
  StandardPassRate: 98
  UIPassRate: 95

For staging, use more lenient thresholds to catch issues early without blocking:

TestGateThresholds-Staging:
  CriticalPassRate: 95
  StandardPassRate: 90
  UIPassRate: 85
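Wiring the two groups to their stages is then straightforward (stage names here are illustrative):

stages:
- stage: DeployStaging
  variables:
  - group: TestGateThresholds-Staging
- stage: DeployProduction
  variables:
  - group: TestGateThresholds-Prod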

Flaky Test Handling: Add a pre-gate script to exclude flaky tests:

- task: PowerShell@2
  inputs:
    targetType: 'inline'
    script: |
      $headers = @{ Authorization = "Bearer $(System.AccessToken)" }
      # Backtick escapes `$top so PowerShell doesn't expand it as a variable
      $uri = "$(System.TeamFoundationCollectionUri)$(System.TeamProject)/_apis/test/runs/$(TestRunId)/results?api-version=7.1&outcomes=Failed&`$top=100"
      $results = Invoke-RestMethod -Uri $uri -Headers $headers
      # Collect known-flaky failures and expose them once, as a comma-separated list
      $flaky = ($results.value | Where-Object { $_.failureType -eq 'KnownIssue' } |
        ForEach-Object { $_.testCase.name }) -join ','
      Write-Host "##vso[task.setvariable variable=ExcludeTests]$flaky"
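The gate can then exclude those names via the filter. The !~ (not contains) operator is standard VSTest filter syntax, though as written this assumes a single flaky test name in ExcludeTests; with several, expand the filter to one clause per name:

testCaseFilter: 'Priority=0|Priority=1&FullyQualifiedName!~$(ExcludeTests)'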

This approach reduced our production deployment delays from 12+ hours to under 1 hour, while maintaining quality standards. The key is categorizing tests properly in Test Plans before implementing the gates.