NSPIRE Scoring Trends in Multifamily and PBRA Properties: Part 2 of NAHRO’s NSPIRE Data Series
This article is part of an ongoing series of analyses of physical inspection score data as HUD transitions from the Uniform Physical Condition Standards (UPCS) protocol to the National Standards for the Physical Inspection of Real Estate (NSPIRE).
Executive Summary
NAHRO has identified five key trends in NSPIRE data:
- NSPIRE scores are strong and generally an improvement from UPCS;
- HUD really will fail agencies with too many deficiencies in units;
- Scores remain consistent over time within each program and scoring trends are similar between public housing, PBRA, and broader multifamily housing;
- Scoring trends are similar regardless of development size; and
- Multifamily and PBRA units have seen a rapid acceleration in NSPIRE inspection frequency.
Each finding is discussed below.
Background
Since 2019, HUD has worked to implement a new physical inspection protocol for all of its housing programs. Previously, Project-Based Rental Assistance (PBRA) was assessed via the Uniform Physical Condition Standards (UPCS) protocol. In October 2023, all of the main notices governing the new protocol, the National Standards for the Physical Inspection of Real Estate (NSPIRE), were published, ending UPCS inspections.[1]
Agencies did not know when exactly NSPIRE inspections would resume, how many developments would be inspected immediately, or how their scores would vary. And physical inspection scores do have tangible consequences for PBRA managers—poor scores can mean additional inspections, increased frequency of inspections, and even administrative referrals.
This report takes a first look at the patterns new scores take.
One final background note: NSPIRE replaced UPCS for other HUD programs too, including public housing. NAHRO published an analysis of those very early trends in December 2024. NAHRO will continue to update its NSPIRE analyses.
Partial Implementation Biases
HUD will not count affirmative requirements until October 1, 2026. Currently, these deficiencies are identified but do not factor into scores. While it is possible to determine the categories of deficiencies found under UPCS, the published NSPIRE data do not include those details. Additionally, it remains unclear whether the norming of NSPIRE standards during this period will reduce or increase the frequency of these deficiencies. As such, not counting affirmative requirements toward the scores currently reported means these scores are artificially high, but we cannot determine how high. NAHRO will continue reporting on trends as new data are published and welcomes anecdotes from readers.
Data and Methodology
This report relies on HUD data that were publicly available at the time of its publication. Rather than publishing multiple vintages of a dataset online, HUD replaces previous data with new iterations, so many of these datasets have been superseded by newer versions and can no longer be found online.[2] NAHRO thanks the Public and Affordable Housing Research Corporation (PAHRC) for helping identify missing data and providing additional datasets no longer published online.
This report uses multiple early publications of NSPIRE data. NAHRO combined and deduplicated the data obtained from HUD. For Multifamily Housing, each HUD data report provides the three most recent scores for each development. We separated these data into individual entries, which allowed us to see and compare each UPCS and NSPIRE inspection score for an individual development since 2012. The deduplicated data for the multifamily portfolio included 109,883 UPCS scores and 8,259 NSPIRE scores. Next, NAHRO refined the broader multifamily portfolio into a smaller sample of only Section 8 Project-Based Rental Assistance projects using HUD’s Multifamily – Assisted dataset, selecting only those projects with an iREMS number identified under the Section 8 indicator with active assistance.[3] This sample had 64,789 UPCS inspections and 5,040 NSPIRE inspections.
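The separation and deduplication step described above can be sketched roughly as follows. This is a minimal illustration, not NAHRO's actual processing code: the field names (`property_id`, `inspections`) and the sample rows are hypothetical, and HUD's real files use their own column layout.

```python
from datetime import date

# Hypothetical release rows: one row per development per data release,
# each carrying up to three (date, score, protocol) slots.
# Field names and values are illustrative, not HUD's actual columns.
releases = [
    {"property_id": "A1", "inspections": [
        (date(2023, 11, 2), 88, "NSPIRE"),
        (date(2019, 5, 14), 79, "UPCS"),
        (date(2016, 3, 1), 82, "UPCS"),
    ]},
    {"property_id": "A1", "inspections": [  # older release with overlapping history
        (date(2019, 5, 14), 79, "UPCS"),
        (date(2016, 3, 1), 82, "UPCS"),
        (date(2013, 8, 9), 75, "UPCS"),
    ]},
]

def to_long(rows):
    """Flatten release rows into one record per inspection, dropping repeats."""
    seen = set()
    records = []
    for row in rows:
        for inspected_on, score, protocol in row["inspections"]:
            key = (row["property_id"], inspected_on, protocol)
            if key in seen:
                continue  # same inspection already captured from another release
            seen.add(key)
            records.append({
                "property_id": row["property_id"],
                "date": inspected_on,
                "score": score,
                "protocol": protocol,
            })
    return records

long_records = to_long(releases)
print(len(long_records))  # unique inspections across both releases
```

Keying on property, date, and protocol is one plausible definition of a duplicate; a different key (for example, an inspection ID) may be more appropriate depending on what the source files provide.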
As with Public Housing, we used these data to identify the most recent UPCS and NSPIRE scores for each development. Compared to the quantity of UPCS inspections, the number of NSPIRE inspections is too low to draw final conclusions about how the protocol is working. Instead, this analysis is meant to provide agencies with a glimpse of how scores are changing based on the first inspections completed.
Finally, because we’re interested in both the broader Multifamily portfolio’s performance and the more specific Project-Based Rental Assistance program’s performance, most findings present a Multifamily analysis followed by a PBRA analysis. Accordingly, most “figures” include two similar graphs: one for the broader Multifamily portfolio and one for PBRA only.
For more information on the data and methodology used to prepare this report, see the “Data and Methodology Appendix.”
The Findings
Finding #1: NSPIRE scores are strong and generally an improvement from UPCS.
Whether and how scores changed under NSPIRE is not a simple question because there are three ways to compare the protocols: all UPCS scores from 2012 to 2023 vs. NSPIRE scores, the final UPCS score for each property vs. NSPIRE scores, or the final UPCS score for projects that have also had an NSPIRE inspection vs. the corresponding NSPIRE score for that same sample. The third method most resembles an experiment in which outcomes are compared before and after changing an independent variable, because it compares the final UPCS score for each project with the NSPIRE score for that same project. The first two methods use more data, resulting in huge disparities in the size of the two samples being considered (roughly 100,000 and 32,000 UPCS scores respectively for the Multifamily portfolio vs. just 4,943 NSPIRE scores), and they consider the portfolio across several presidential administrations.
Using either of the first two methods (all UPCS score data available or the final UPCS score for every property) results in a smaller positive change when compared with NSPIRE scores.
However, comparing the final UPCS score for projects that have had NSPIRE inspections and their corresponding NSPIRE score results in a positive shift overall as shown in Figure 1.
Figure 1: Each dot represents the change in score for a Multifamily or PBRA property that has had both a UPCS and an NSPIRE inspection
Overall, more properties in each portfolio improved their scores than did not from their last UPCS inspection to their first NSPIRE inspection. A difference of means test found that, on average, NSPIRE scores were roughly 6 points higher than the final UPCS score for both Multifamily and PBRA developments that have had both inspection types. Both differences are statistically significant. See the Finding #1 Appendix for this test’s statistical outputs. HUD has conducted more inspections of these properties than of public housing, and the average final UPCS score of these developments was in the low 80s, which left less room to improve. Even so, these projects saw average increases of more than 6 points. It would be reasonable to expect NSPIRE scores to be lower than final UPCS scores. After all, these are the same developments, and they are now older than when their final UPCS inspections were conducted. The fact that NSPIRE scores are markedly and significantly higher may mean that NSPIRE inspections really are paying attention to different features and weighting the deficiencies they find differently. Once HUD begins counting affirmative requirements, scores may decrease, but it is not possible to predict the magnitude or significance of these changes.
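For readers who want to check the difference of means test, the sketch below recomputes the t statistic and degrees of freedom from the summary statistics reported in the Finding #1 Appendix, using the standard unequal-variance (Welch) formulas. The function name `welch_t` is ours; the input numbers are copied from the appendix tables.

```python
import math

def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    se2_a = var_a / n_a  # squared standard error of each sample mean
    se2_b = var_b / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df

# Multifamily: last NSPIRE vs. last UPCS (means, variances, observations).
t_mf, df_mf = welch_t(87.46002288, 197.7172867, 7867,
                      81.11287657, 179.6094286, 7867)

# PBRA: last NSPIRE vs. last UPCS.
t_pbra, df_pbra = welch_t(86.69101578, 223.808375, 4942,
                          80.49413193, 185.4894414, 4942)

print(round(t_mf, 2), round(df_mf))      # compare with the Multifamily table
print(round(t_pbra, 2), round(df_pbra))  # compare with the PBRA table
```

Because each property appears in both columns, a paired test would be another reasonable choice. The appendix reports the two-sample unequal-variance form, so that is what is reproduced here.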
NSPIRE and UPCS scores are also positively correlated, as shown in Figure 2.
Figure 2: A visual representation of the relationship between UPCS and NSPIRE scores
The line of best fit indicates that, on average, a one-point increase in final UPCS score is associated with a 0.33-point increase in NSPIRE score, well short of a 1:1 ratio and slightly flatter than the relationship between the protocols in the public housing portfolio. This slope is statistically significant. Additionally, the associated “r-squared” value means that the final UPCS score alone, with no other variables, explains roughly 9% of the variation in NSPIRE scores. There is clearly more to an NSPIRE score than a project’s prior UPCS score, but the prior score is an important factor. See the Finding #1 Appendix for the full statistical output.
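To make the regression result concrete, the short sketch below plugs a few final UPCS scores into the fitted line and recomputes r-squared from the sums of squares. The coefficients and sums of squares are copied from the PBRA regression output in the Finding #1 Appendix; the function name and the example UPCS scores are ours.

```python
# Values reported in the Finding #1 Appendix regression of final NSPIRE
# score on final UPCS score (PBRA only).
INTERCEPT = 60.27124791
SLOPE = 0.3282198
SS_REGRESSION = 98733.28793
SS_TOTAL = 1105837.181

def expected_nspire(final_upcs):
    """Fitted NSPIRE score for a given final UPCS score."""
    return INTERCEPT + SLOPE * final_upcs

# Share of NSPIRE score variation explained by the final UPCS score.
r_squared = SS_REGRESSION / SS_TOTAL

for upcs in (60, 80, 100):  # illustrative UPCS scores
    print(upcs, round(expected_nspire(upcs), 1))
print(round(r_squared, 3))
```

The fitted values are averages only. With roughly 9% of variation explained, an individual project's NSPIRE score can land far from the line.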
NAHRO discusses the line of scores equaling exactly 59 points under NSPIRE next.
Finding #2: The data suggest that HUD is failing units due to excessive per-unit deficiencies
Many projects scored exactly 59 points under NSPIRE; some had lower UPCS scores, some higher. In fact, inspections were roughly 1.9 percentage points more likely to result in a score of exactly 59 under NSPIRE than under UPCS. Under UPCS, an inspection had about a 0.4% probability of scoring exactly 59 points, so the NSPIRE rate of roughly 2.3% is nearly six times higher. Both estimates are statistically significant. See the Finding #2 Appendix for the full output.
HUD did not provide a code book with their data saying that this is due to per-unit deficiencies. However, in the scoring notice, they say: “In the NSPIRE final rule and proposed Scoring notice, HUD identified three inspectable areas: Unit, Inside, and Outside. For scoring, HUD proposed that properties be rated against two performance thresholds: (1) Properties need to score 60 or above in all inspectable areas (“Property Threshold of Performance”), and (2) a “Unit Threshold of Performance”; where a loss of 30 points or more in the Unit portion of the inspection will result in a score adjustment to 59 or failing, even if the Inside and Outside portions of the inspection allowed it to score over 60 [….] Additionally, HUD will only lower the score to 59 if it was previously 60 or above. HUD will not further adjust scores that were already below 60.”[4] The results of this second way to fail are striking.
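The adjustment described in the quoted notice can be sketched as a simple rule. This is our reading of the quoted language only, not HUD's scoring software: the function and constant names are hypothetical, and the sketch takes the overall score and the points lost in the Unit area as given rather than computing them from deficiencies.

```python
PASSING_SCORE = 60    # scores of 60 or above pass
UNIT_LOSS_LIMIT = 30  # Unit-area point loss that triggers the adjustment
ADJUSTED_SCORE = 59   # score assigned when the unit threshold is failed

def apply_unit_threshold(score, unit_points_lost):
    """Lower an otherwise passing score to 59 when Unit losses reach the limit.

    Scores already below 60 are left unchanged, per the quoted notice.
    """
    if score >= PASSING_SCORE and unit_points_lost >= UNIT_LOSS_LIMIT:
        return ADJUSTED_SCORE
    return score

print(apply_unit_threshold(85, 30))  # passing score, heavy unit losses
print(apply_unit_threshold(85, 10))  # passing score, light unit losses
print(apply_unit_threshold(45, 40))  # already failing, not adjusted further
```

A rule of this shape would also produce a gap just above 59, since any project losing 30 or more points in units cannot receive a score in the low 60s.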
In Figure 2, the row of projects scoring exactly a 59 occurs among projects whose final UPCS resulted in a score in the 40s all the way through projects whose final UPCS score exceeded 90. This consistency means that unit deficiencies were present in many UPCS inspections but simply were not weighted the way they are under NSPIRE.
In fact, the trend is visible not only in Figure 2 above but also when considering all physical inspections conducted since 2020, as shown in Figure 3 below.
Figure 3: All inspections conducted since the restart of physical inspections in 2021 plotted by date
It seems extremely implausible that developments are scoring 59 at such a rate by coincidence. The more likely explanation is that HUD is finding enough unit deficiencies for whole developments to fail the “unit threshold of performance,” so scores that would otherwise pass are lowered to exactly 59. The health and safety of residents is the most important reason for a physical inspection, and HUD weights life-threatening and severe deficiencies in units most heavily. Agencies should focus their efforts there. The same trend is present in public housing, so NSPIRE also appears to be meeting the goal of creating uniformity across programs.
Finding #3: Scores remain consistent over time within each program and scoring trends are similar between public housing, PBRA, and broader multifamily housing.
The following physical inspection score patterns look remarkably similar between the Multifamily, PBRA, and Public Housing programs:
- All UPCS scores over time,
- All NSPIRE scores over time,
- The proliferation of 59s under NSPIRE,
- The absence of scores in the low 60s under NSPIRE,
- The relationship when regressing NSPIRE scores on UPCS scores,
- The improved average score from UPCS to NSPIRE, and
- The low number of inspections just after converting from UPCS to NSPIRE.
Between programs, NSPIRE seems to be resulting in consistent trends, one of HUD’s goals. Within the Multifamily portfolio, scores are also internally consistent, as shown in Figure 4. In the Multifamily portfolio, the lowest average score per year since 2012 was 81 and the highest was 87. In PBRA, the lowest and highest yearly averages were 80 and 87. Public housing saw slightly more variance, but scores were similarly close together. Overall, the variation in scores remains low. This is less surprising in Multifamily because owners can access debt for capital improvements. Although the changes were larger when comparing projects that have received both inspection types, it is important to remember that these programs have been producing consistently high-quality places to live.
Figure 4: The average inspection score by year and by protocol.
Finding #4: NSPIRE scores are generally an improvement over UPCS scores for developments of all sizes that have had both inspection types
NAHRO examined whether PBRA developments of different sizes all mirror the trend of improved scores after receiving both UPCS and NSPIRE inspections. In short: they do. In order to compare trends, NAHRO grouped property sizes into four buckets: 1-50 units, 51-250 units, 251-499 units, and 500 or more units. We chose these sizes for two reasons: these thresholds are common in different HUD programs, and there is not one PBRA size methodology that is widely accepted. Furthermore, because these scores are reported at the development level, there are few developments with hundreds of units, so it is important to be granular at the small end of the size range.
Each group’s average difference in score from UPCS to NSPIRE for projects that had both inspection types was between 6 and 9 points, as shown in Figure 5.
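The size grouping can be sketched as a small helper that assigns each development to a bucket and then averages the score change within each bucket. The boundaries follow the four groups named in this report, with the third group ending at 499 units as in the appendix table. The function name and the sample developments are hypothetical.

```python
from collections import defaultdict

def size_bucket(units):
    """Map a unit count to one of the four size groups used in this report."""
    if units < 1:
        raise ValueError("a development must have at least one unit")
    if units <= 50:
        return "1-50"
    if units <= 250:
        return "51-250"
    if units <= 499:
        return "251-499"
    return "500+"

# Hypothetical (units, final UPCS score, first NSPIRE score) triples.
sample = [(24, 78, 86), (50, 81, 88), (120, 83, 90), (300, 80, 87), (640, 79, 85)]

changes = defaultdict(list)
for units, upcs, nspire in sample:
    changes[size_bucket(units)].append(nspire - upcs)

# Average UPCS-to-NSPIRE change per size group.
average_change = {bucket: sum(v) / len(v) for bucket, v in changes.items()}
print(average_change)
```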
Figure 5: Average UPCS and NSPIRE scores for projects that received both inspection types by size.
Additionally, when plotting each protocol’s scores by property size in Figure 6, the typical score is nearly the same across sizes for both UPCS and NSPIRE.
Figure 6: All NSPIRE scores for projects that have received both inspections, plotted by development size.
And finally, at the risk of sounding like a broken record, it’s clear even when considering scores by property size in Figure 6 that developments of all sizes run the risk of failing with exactly 59 points due to the unit threshold of performance.
Finding #5: Multifamily and PBRA units have seen a rapid acceleration in NSPIRE inspection frequency.
These data provide one of the first looks into the number of NSPIRE inspections HUD has performed and the speed at which they have ramped up inspections. Prior to the conversion from UPCS to NSPIRE, HUD was inspecting 900-1200 Multifamily developments and 500-700 projects per month in the PBRA portfolio. Both programs saw the monthly number of NSPIRE inspections increase to UPCS levels by early 2024, a few months after the conversion. For this reason, Multifamily owners can likely expect NSPIRE inspections to occur more quickly than public housing agencies can.
As we saw earlier, it is not impossible for developments to “slip through the cracks” and go beyond three years without an inspection. Though it was partially due to COVID, many projects’ final UPCS scores were more than six years old, twice as long as the maximum duration should have been. NAHRO will watch this metric in future data publications.
Figure 7: The number of inspections performed per month and recorded in published data by protocol.
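A monthly tally like the one plotted in Figure 7 can be produced by counting inspections by calendar month and protocol. The sketch below uses a handful of made-up inspection dates to show the mechanics; the function name and sample data are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical (inspection date, protocol) pairs. Real input would be the
# deduplicated inspection records described in the methodology.
inspections = [
    (date(2023, 8, 3), "UPCS"),
    (date(2023, 8, 21), "UPCS"),
    (date(2023, 11, 14), "NSPIRE"),
    (date(2024, 1, 9), "NSPIRE"),
    (date(2024, 1, 30), "NSPIRE"),
]

def monthly_counts(rows):
    """Count inspections per (YYYY-MM, protocol) pair."""
    return Counter((d.strftime("%Y-%m"), protocol) for d, protocol in rows)

counts = monthly_counts(inspections)
for (month, protocol), n in sorted(counts.items()):
    print(month, protocol, n)
```

Counts built this way reflect only inspections that appear in published data, so a recent dip may indicate a publication lag rather than fewer inspections.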
Conclusion
It is “too early to call” exactly how NSPIRE score patterns will differ from UPCS in the long term. So far, scores look to be an improvement for properties that have had both kinds of inspection, HUD does appear to be weighting units most heavily, developments of all sizes are seeing similar outcomes, and HUD returned Multifamily and PBRA inspections to UPCS-era monthly volumes within months of the conversion.
Appendices
Data and Methodology Appendix
UPCS and NSPIRE data from HUD were used to determine NSPIRE and UPCS inspection scores from 2012 to 2024. HUD reported data from before 2012, but it was not possible to verify that scores from before that year were free of redundancies. The data identified projects and inspections through unique ID numbers. Corresponding inspection scores, inspection dates, and geographical information were also provided. NAHRO deduplicated published HUD data, encompassing all publicly available inspections. Final UPCS scores and final NSPIRE scores were collected by identifying the most recent inspection date and score for each development. Due to the recent implementation of these standards, most developments did not have an NSPIRE score.
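Selecting the final score under each protocol and pairing the two for developments that have both can be sketched as below. This is an illustration under assumed record fields (`property_id`, `date`, `score`, `protocol`), not the code used for this report, and the sample records are invented.

```python
from datetime import date

# Hypothetical deduplicated inspection records.
records = [
    {"property_id": "A1", "date": date(2016, 3, 1), "score": 82, "protocol": "UPCS"},
    {"property_id": "A1", "date": date(2019, 5, 14), "score": 79, "protocol": "UPCS"},
    {"property_id": "A1", "date": date(2023, 11, 2), "score": 88, "protocol": "NSPIRE"},
    {"property_id": "B2", "date": date(2018, 7, 20), "score": 91, "protocol": "UPCS"},
]

def latest_by_protocol(rows):
    """Keep the most recent inspection per (property, protocol)."""
    latest = {}
    for row in rows:
        key = (row["property_id"], row["protocol"])
        if key not in latest or row["date"] > latest[key]["date"]:
            latest[key] = row
    return latest

def paired_changes(latest):
    """Final NSPIRE minus final UPCS for properties that have both."""
    changes = {}
    for (property_id, protocol), row in latest.items():
        if protocol != "NSPIRE":
            continue
        upcs = latest.get((property_id, "UPCS"))
        if upcs is not None:
            changes[property_id] = row["score"] - upcs["score"]
    return changes

latest = latest_by_protocol(records)
changes = paired_changes(latest)
print(changes)  # B2 is excluded because it has no NSPIRE inspection
```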
Finding #1 Appendix: NSPIRE scores are strong and generally an improvement from UPCS.
As mentioned in this section, there were three ways to evaluate the change to NSPIRE scores: all UPCS vs. all NSPIRE scores, the final UPCS score recorded for each project vs. all NSPIRE scores, or the final UPCS score vs. final NSPIRE score for projects that received both types of inspection. More details on the final method are below. NAHRO only presents the third method here because comparing NSPIRE scores against all UPCS scores means that you are comparing the portfolio in 2023-2024 to scores from more than a decade ago, many of which are no longer the most recent.
As referenced in the paper, the third method resulted in a positive 6-point difference from UPCS to NSPIRE for the properties that received both types of inspection. The output is below in Figure 8.
Difference of Means Test for the Multifamily Portfolio:
| t-Test: Two-Sample Assuming Unequal Variances | | |
| | Last NSPIRE | Last UPCS |
| Mean | 87.46002288 | 81.11287657 |
| Variance | 197.7172867 | 179.6094286 |
| Observations | 7867 | 7867 |
| Hypothesized Mean Difference | 0 | |
| df | 15696 | |
| t Stat | 28.98173005 | |
| P(T<=t) one-tail | 2.9616E-180 | |
| t Critical one-tail | 1.644950713 | |
| P(T<=t) two-tail | 5.9232E-180 | |
| t Critical two-tail | 1.960115135 | |
Difference of Means Test for the PBRA Portfolio:
| t-Test: Two-Sample Assuming Unequal Variances | | |
| | Last NSPIRE | Last UPCS |
| Mean | 86.69101578 | 80.49413193 |
| Variance | 223.808375 | 185.4894414 |
| Observations | 4942 | 4942 |
| Hypothesized Mean Difference | 0 | |
| df | 9796 | |
| t Stat | 21.53302343 | |
| P(T<=t) one-tail | 7.9842E-101 | |
| t Critical one-tail | 1.645009192 | |
| P(T<=t) two-tail | 1.5968E-100 | |
| t Critical two-tail | 1.960206181 | |
Finally, the regression output showing the relationship between final UPCS score and NSPIRE score is below in Figure 9.
| SUMMARY OUTPUT | | | | | | | | |
| REGRESS FINAL NSPIRE ON FINAL UPCS FOR PBRA ONLY | | | | | | | | |
| Regression Statistics | | | | | | | | |
| Multiple R | 0.298803862 | | | | | | | |
| R Square | 0.089283748 | | | | | | | |
| Adjusted R Square | 0.089099392 | | | | | | | |
| Standard Error | 14.27820664 | | | | | | | |
| Observations | 4942 | | | | | | | |
| ANOVA | | | | | | | | |
| | df | SS | MS | F | Significance F | | | |
| Regression | 1 | 98733.28793 | 98733.29 | 484.302 | 1.8E-102 | | | |
| Residual | 4940 | 1007103.893 | 203.8672 | | | | | |
| Total | 4941 | 1105837.181 | | | | | | |
| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| Intercept | 60.27124791 | 1.217583643 | 49.5007 | 0 | 57.88424 | 62.65825 | 57.88424 | 62.65825 |
| LastUPCS | 0.3282198 | 0.014914429 | 22.00686 | 1.8E-102 | 0.298981 | 0.357459 | 0.298981 | 0.357459 |
Finding #2 Appendix: HUD really will fail agencies with too many significant deficiencies in living units
| Regression Statistics | | | | | | | | |
| Multiple R | 0.067922308 | | | | | | | |
| R Square | 0.00461344 | | | | | | | |
| Adjusted R Square | 0.004605033 | | | | | | | |
| Standard Error | 0.072812831 | | | | | | | |
| Observations | 118397 | | | | | | | |
| ANOVA | | | | | | | | |
| | df | SS | MS | F | Significance F | | | |
| Regression | 1 | 2.909258356 | 2.909258 | 548.7398 | 4.5E-121 | | | |
| Residual | 118395 | 627.695757 | 0.005302 | | | | | |
| Total | 118396 | 630.6050153 | | | | | | |
| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| Intercept | 0.003982832 | 0.000219567 | 18.13949 | 1.96E-73 | 0.003552 | 0.004413 | 0.003552 | 0.004413 |
| NSPIRE Indicator | 0.019281263 | 0.000823099 | 23.4252 | 4.5E-121 | 0.017668 | 0.020895 | 0.017668 | 0.020895 |
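One way to read the output above: when the outcome is a 0/1 flag for “scored exactly 59” and the only regressor is a 0/1 NSPIRE indicator, the intercept is the share of UPCS inspections scoring exactly 59 and the coefficient is the difference in shares between protocols. The sketch below confirms that property on a small made-up sample and then converts the reported coefficients into percentages. The helper name and the toy scores are ours; only the two coefficients come from the table.

```python
def ols_one_regressor(y, x):
    """Ordinary least squares of y on a single regressor x: (intercept, slope)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Toy scores (hypothetical): 1 of 8 UPCS and 2 of 4 NSPIRE scores equal 59.
upcs_scores = [59, 72, 85, 90, 64, 77, 88, 93]
nspire_scores = [59, 59, 91, 84]
y = [int(s == 59) for s in upcs_scores + nspire_scores]
x = [0] * len(upcs_scores) + [1] * len(nspire_scores)
toy_intercept, toy_slope = ols_one_regressor(y, x)
print(round(toy_intercept, 3), round(toy_slope, 3))  # UPCS share, difference in shares

# Coefficients reported in the table above.
INTERCEPT = 0.003982832
NSPIRE_COEF = 0.019281263
upcs_share = INTERCEPT
nspire_share = INTERCEPT + NSPIRE_COEF
print(f"{upcs_share:.1%} under UPCS, {nspire_share:.1%} under NSPIRE, "
      f"{NSPIRE_COEF * 100:.1f} percentage points higher")
```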
Finding #4 Appendix: NSPIRE scores are generally an improvement over UPCS scores for developments of all sizes that have had both inspection types
It’s important to note that the common trends seen in properties of all sizes are not due to disproportionate sampling. The percent of NSPIRE inspections for properties of each size generally matches the distribution of property sizes in the PBRA portfolio, so no size category has its average score skewed by over- or under-sampling. Figure 11 below compares the percent of projects inspected in each size group with the proportion of projects of that size that exist.
| Size | Count – NSPIRE | Percent – NSPIRE | Count – PBRA | Percent – PBRA |
| 1 to 50 | 2139 | 43.4% | 8520 | 48.5% |
| 51 to 250 | 2668 | 54.1% | 8615 | 49.0% |
| 251 to 499 | 126 | 2.6% | 413 | 2.3% |
| 500+ | 9 | 0.2% | 29 | 0.2% |
| Total | 4933 | | 17577 | |
[1] Economic Growth Regulatory Relief and Consumer Protection Act: Implementation of National Standards for the Physical Inspection of Real Estate (NSPIRE). U.S. Department of Housing and Urban Development (HUD). https://www.federalregister.gov/documents/2023/05/11/2023-09693/economic-growth-regulatory-relief-and-consumer-protection-act-implementation-of-national-standards
[2] Physical Inspection Scores. U.S. Department of Housing and Urban Development (HUD). https://www.huduser.gov/portal/datasets/pis.html, and
Multifamily Housing – Physical Inspection Scores by State. U.S. Department of Housing and Urban Development (HUD). https://www.hud.gov/stat/mfh/inspection-scores
[3] Multifamily Properties – Assisted. U.S. Department of Housing and Urban Development. https://hudgis-hud.opendata.arcgis.com/datasets/f4721da932a94b218bdb5a861fd7429e_0/explore?location=2.178501%2C-8.179588%2C2.19&showTable=true
[4] National Standards for the Physical Inspection of Real Estate and Associated Protocols, Scoring Notice. U.S. Department of Housing and Urban Development (HUD). https://www.federalregister.gov/documents/2023/07/07/2023-14362/national-standards-for-the-physical-inspection-of-real-estate-and-associated-protocols-scoring