How to weight defect severity vs priority for dashboard metrics

Our executive dashboard is giving misleading signals about defect health because we’re not properly weighting severity versus priority in our KPI calculations. Right now we just count total open defects, which treats a cosmetic UI issue the same as a data corruption bug.

I’m trying to design custom reports in Insight Reporting that apply different weights to severity levels (Critical=10, High=5, Medium=2, Low=1) and combine that with priority scoring to create a composite “defect health score.” The challenge is figuring out the right KPI formula - should severity and priority combine multiplicatively or additively? Has anyone implemented weighted defect scoring that actually improved decision-making?

Here’s a simple formula I’m considering:


```
Defect_Score = (Severity_Weight * Priority_Weight) * Age_Factor
Total_Health = Sum(All_Open_Defect_Scores) / Baseline_Target
```

But I’m not sure if this accurately reflects risk or if we should use a different approach for combining the dimensions.

For custom reports in ELM, I’d recommend creating calculated attributes rather than doing the math in the report layer. Define a “Risk Score” calculated attribute on the defect work item type that applies your formula, then your reports just sum or average that attribute. This keeps the logic consistent across all dashboards and lets you use the score in queries and filters. You can update the calculation formula in one place rather than modifying multiple report definitions.

After analyzing defect resolution patterns across 50+ releases, I can share what we’ve learned about severity weighting and priority scoring for meaningful KPI formulas:

Core Formula Structure: Use additive weighting for severity and priority, with age as a multiplicative factor. Multiplicative severity×priority creates exponential scaling that distorts the metric - with a 10/5/2/1 scale on both axes, a Critical/P1 (10×10 = 100) scores 25x a Medium/P3 (2×2 = 4) and 100x a Low/P4, which doesn’t match actual business impact.
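The distortion is easy to see with the 10/5/2/1 weights from the original question. A quick comparison sketch (the function names are mine, not from any tool):

```python
# Compare multiplicative vs additive combination of severity and priority
# using the illustrative 10/5/2/1 weights from the question.
SEVERITY = {"Critical": 10, "High": 5, "Medium": 2, "Low": 1}
PRIORITY = {"P1": 10, "P2": 5, "P3": 2, "P4": 1}

def multiplicative(sev, pri):
    return SEVERITY[sev] * PRIORITY[pri]

def additive(sev, pri):
    return SEVERITY[sev] + PRIORITY[pri]

for sev, pri in [("Critical", "P1"), ("Medium", "P3"), ("Low", "P4")]:
    print(f"{sev}/{pri}: mult={multiplicative(sev, pri)}  add={additive(sev, pri)}")

# Multiplicative: Critical/P1 = 100, Medium/P3 = 4, Low/P4 = 1 (25x / 100x spread)
# Additive:       Critical/P1 = 20,  Medium/P3 = 4, Low/P4 = 2 (5x / 10x spread)
```

The additive form compresses the spread to something closer to how teams actually rank the work.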

Our Production Formula:


```
Base_Score = (Severity_Weight * 0.65) + (Priority_Weight * 0.35)
Age_Multiplier = 1 + (Weeks_Open * 0.12)
Final_Score = Base_Score * Age_Multiplier * Status_Factor
```

Severity weights: Critical=100, High=60, Medium=30, Low=10

Priority weights: P1=100, P2=70, P3=40, P4=15

Status factors: New=1.0, In Progress=0.7, Blocked=1.3
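Putting the formula and the three weight tables together, a minimal Python sketch (the function and table names are my own, not part of any product API):

```python
# Weight tables copied from the production formula above.
SEVERITY_W = {"Critical": 100, "High": 60, "Medium": 30, "Low": 10}
PRIORITY_W = {"P1": 100, "P2": 70, "P3": 40, "P4": 15}
STATUS_F = {"New": 1.0, "In Progress": 0.7, "Blocked": 1.3}

def defect_score(severity, priority, weeks_open, status):
    # 65/35 additive split between severity and priority.
    base = SEVERITY_W[severity] * 0.65 + PRIORITY_W[priority] * 0.35
    # Linear age growth: +12% of the base score per week open.
    age_multiplier = 1 + weeks_open * 0.12
    return base * age_multiplier * STATUS_F[status]

# A brand-new Critical/P1 scores 100; the same defect blocked for
# 8 weeks scores roughly 100 * 1.96 * 1.3 ≈ 254.8.
print(round(defect_score("Critical", "P1", 0, "New"), 1))
print(round(defect_score("Critical", "P1", 8, "Blocked"), 1))
```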

Why 65/35 Split: Our historical data showed severity is a better predictor of customer escalations and release blockers than priority. Priority captures business judgment but is more subjective and variable across teams.

Age Multiplier Calibration: 12% per week means a defect doubles in score after about 8 weeks, which aligns with when stakeholders typically start escalating. Adjust this based on your release cadence.

Handling Severity/Priority Conflicts: Build a separate “alignment score” that flags when severity and priority diverge by 2+ levels. Display this as a dashboard widget showing count of misaligned defects. Don’t try to hide the disagreement in a composite score - make it visible for triage discussions.
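One way to sketch the alignment check, assuming severity and priority each map to four ordinal levels (the level tables and threshold below are illustrative):

```python
# Map severity and priority onto comparable ordinal levels (1 = most urgent).
SEV_LEVEL = {"Critical": 1, "High": 2, "Medium": 3, "Low": 4}
PRI_LEVEL = {"P1": 1, "P2": 2, "P3": 3, "P4": 4}

def is_misaligned(severity, priority, threshold=2):
    """Flag defects whose severity and priority diverge by 2+ levels."""
    return abs(SEV_LEVEL[severity] - PRI_LEVEL[priority]) >= threshold

# Sample triage list: the widget just counts the flagged defects.
defects = [("Critical", "P4"), ("High", "P2"), ("Low", "P1")]
misaligned = [d for d in defects if is_misaligned(*d)]
print(len(misaligned))  # count shown on the dashboard widget
```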

Custom Reports Implementation: Create the calculated attribute in your work item type definition, then reference it in Insight Reporting custom reports. This ensures consistency and lets you filter/sort by risk score. You can also use the REST API to bulk-calculate scores for existing defects if you’re changing the formula.

Dashboard Design: Show three metrics side by side - Total Defect Count (unweighted), Total Risk Score (weighted), and Average Score Per Defect. This helps teams understand whether they have many low-risk issues or few high-risk ones. We also trend the 4-week moving average of new score added vs score resolved to show if risk is accumulating or declining.
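The three metrics and the trend line are straightforward to compute from per-defect scores; a sketch with made-up sample numbers:

```python
# Per-defect risk scores for currently open defects (sample values).
scores = [254.8, 100.0, 35.5, 12.4, 8.0]

total_count = len(scores)                    # unweighted defect count
total_risk = sum(scores)                     # weighted total risk score
avg_per_defect = total_risk / total_count    # average score per defect

def moving_avg(weekly_totals, window=4):
    """Trailing moving average over weekly totals."""
    return [sum(weekly_totals[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(weekly_totals))]

# Weekly totals of new score opened vs score resolved (sample values).
added_per_week = [120, 90, 200, 60, 150]
resolved_per_week = [80, 110, 95, 140, 100]

print(total_count, round(total_risk, 1), round(avg_per_defect, 2))
print(moving_avg(added_per_week))      # rising trend -> risk accumulating
print(moving_avg(resolved_per_week))
```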

The biggest improvement in decision-making came from separating technical risk (severity-driven) from business impact (priority-driven) in our dashboards, as another commenter mentioned. Executives care about business impact, engineering cares about technical risk, and showing both prevents talking past each other in release readiness meetings.

We actually built separate scores for technical risk (severity-driven) and business impact (priority-driven), then display both in our dashboard. This makes the disagreements visible rather than trying to hide them in a single number. When severity and priority diverge by more than 2 levels, the dashboard flags it in yellow so the team can discuss whether the priority assignment makes sense. Sometimes the divergence is legitimate (a low-usage feature with a high-severity defect); sometimes it reveals a mis-prioritization that needs correction.

One issue with age factors is that some defects legitimately take longer to fix due to complexity, not neglect. We added a complexity adjustment to our scoring - defects tagged as “architectural” or “requires-research” get a 0.5x adjustment on the age multiplier so they don’t unfairly inflate the health score. You might also want to consider defect status - a defect actively in progress should score lower than one that’s just sitting in the backlog.
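As a rough sketch of that complexity damping (tag names from the comment above; the function name and 12%-per-week rate are assumptions matching the earlier formula):

```python
# Tags that mark legitimately slow defects, per the adjustment described above.
SLOW_TAGS = {"architectural", "requires-research"}

def age_multiplier(weeks_open, tags):
    """Age factor with 0.5x damping for complexity-tagged defects."""
    damping = 0.5 if SLOW_TAGS & set(tags) else 1.0
    return 1 + weeks_open * 0.12 * damping

print(age_multiplier(8, []))                 # 1.96
print(age_multiplier(8, ["architectural"]))  # 1.48
```

After 8 weeks, an untagged defect roughly doubles in score while a tagged one grows only about half as fast.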