2025  Journal Article

Developing Nationwide Estimates of Built Environment Quality Characteristics Using Street-View Imagery and Computer Vision

Pub TLDR

How can we measure what makes neighborhoods look and feel good to people across the entire United States?

DOI: 10.1021/acs.est.5c00966    PubMed ID: 40607680
 

Abstract

Environmental health studies commonly rely on urban composition measures for built environment exposure assessment. However, quality measures are equally important, as they directly influence health behaviors. We leveraged computer vision and street-view imagery to estimate five components of built environment quality (perceived beauty, relaxation potential, nature quality, safe for walking, and safety from crime) across all U.S. cities, explicitly addressing socio-demographic and temporal biases. We collected 72 516 surveys via Amazon Mechanical Turk, where participants ranked street-view images and provided socio-demographic data. Deep learning models predicted quality metrics at 120 million street locations for 2008, 2012, 2016, and 2020. Cross-validation accuracy ranged from 73% (nature quality) to 59% (safety from crime) compared to 50% expected by random chance. Adjusting sampling weights based on demographics reduced but did not eliminate biases for Hispanic/Latino and Native Hawaiian or Pacific Islander groups (3.5 and 4% lower accuracy, respectively). We also adjusted model predictions for seasonal biases, correcting higher scores from late spring and early summer imagery (p < 0.001). The resulting nationwide estimates of street-level beauty, relaxation, nature quality, and safety for walking (but not safety from crime) can inform epidemiological research, urban planning strategies, and public health interventions.

Larkin, A., Huang, T., Chen, L., Lin, P.D., Hart, J.E., Zhang, W., Coull, B.A., Yi, L., Suel, E., Hankey, S., James, P., Hystad, P. (2025). Developing Nationwide Estimates of Built Environment Quality Characteristics Using Street-View Imagery and Computer Vision. Environmental Science & Technology.
 
Publication FAQ

FAQ: Built Environment Quality Assessment Using Street-View Imagery

What is the primary objective of this research?

The primary objective of this study is to develop nationwide estimates of built environment quality characteristics across all U.S. cities, using street-view imagery and deep learning computer vision models. The research provides data on perceived beauty, nature quality, relaxation potential, safety for walking, and safety from crime, while evaluating and adjusting for socio-demographic, geographic, and seasonal biases in the data.

How does "built environment quality" differ from "built environment composition"?

"Built environment composition" refers to objectively measurable elements of an urban area, such as road density, tree canopy coverage, or the presence of sidewalks and parks. These are quantitative assessments. In contrast, "built environment quality" is subjective and relies on visual perception, encompassing elements like perceived beauty, safety, or the relaxing potential of a space. While composition focuses on what is physically present, quality emphasizes how those elements are perceived and experienced by individuals. For instance, two areas might have similar amounts of trees (composition), but their perceived "nature quality" could differ based on the health, arrangement, and integration of those trees within the overall visual scene.

What are the five specific built environment quality characteristics measured in this study?

The study focuses on five key components of built environment quality:

  • Perceived beauty: How aesthetically pleasing an area is.
  • Relaxation potential: The degree to which an environment promotes a sense of calm and relaxation.
  • Nature quality: The perceived presence, health, and integration of natural elements.
  • Safe for walking: The perception of safety for pedestrians in terms of traffic and general environment.
  • Safety from crime: The perceived absence of risk from criminal activity in the area.

How was the data for training the computer vision models collected and what were its limitations?

The data for training the deep learning models was collected through 72,516 surveys administered via Amazon Mechanical Turk (AMT) between February 2021 and February 2022. Participants were shown pairs of Google Street View (GSV) images and asked to choose which image rated higher on each of the five characteristics, using a slider to indicate the difference. Participants also provided socio-demographic data.
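
As a rough illustration of what one survey response might look like as data, the sketch below models a single pairwise comparison. The field names, the metric labels, and the slider range of -1 to 1 (negative favoring the left image) are assumptions made for this example; the study's actual storage format is not described here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the five quality characteristics.
METRICS = (
    "beauty",
    "relaxation",
    "nature_quality",
    "safe_for_walking",
    "safety_from_crime",
)


@dataclass(frozen=True)
class Comparison:
    """One pairwise judgment between two street-view images (illustrative only)."""

    left_image_id: str
    right_image_id: str
    metric: str
    slider: float  # assumed range [-1, 1]; negative favors left, positive favors right

    def __post_init__(self) -> None:
        if self.metric not in METRICS:
            raise ValueError(f"unknown metric: {self.metric}")
        if not -1.0 <= self.slider <= 1.0:
            raise ValueError("slider must be within [-1, 1]")

    def preferred(self) -> Optional[str]:
        """Return the id of the preferred image, or None when the slider is centered."""
        if self.slider == 0:
            return None
        return self.right_image_id if self.slider > 0 else self.left_image_id
```

Keeping the raw slider value rather than only the winning image preserves the strength of each preference, which a model could use as a regression target.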

A significant limitation of the collected data was its demographic representation. The survey respondents were not representative of the overall U.S. population; they were predominantly young (62% between 18 and 39), well-educated (71.8% with a bachelor's degree or more), and a large majority were non-Hispanic (82.8%) and White (78.0%). This demographic bias highlighted a key challenge in built environment quality assessment, as perceptions can be subjective and influenced by individual, cultural, and contextual differences.

How did the researchers address socio-demographic and seasonal biases in their models?

To address socio-demographic biases, the researchers dynamically adjusted the percentage of training records sampled from each demographic group during model training to minimize differences in prediction error (RMSE) between groups. While this reduced many significant demographic differences, some biases remained, notably for Hispanic/Latino and Native Hawaiian or Pacific Islander participants.
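
The idea of shifting sampling toward groups with higher error can be sketched as follows. This is a minimal illustration, not the authors' procedure: the multiplicative update rule, the `step` parameter, and the function names are assumptions, since the description above states only that sampling percentages were adjusted dynamically to reduce RMSE differences between groups.

```python
import math


def group_rmse(errors_by_group):
    """Root-mean-square error per group; each group must have at least one error."""
    return {
        group: math.sqrt(sum(e * e for e in errors) / len(errors))
        for group, errors in errors_by_group.items()
    }


def update_sampling_weights(weights, rmse, step=0.5):
    """Raise the sampling share of groups whose RMSE is above the mean.

    Assumed update rule: multiply each weight by (rmse / mean_rmse) ** step,
    then renormalize so the weights sum to 1.
    """
    mean_rmse = sum(rmse.values()) / len(rmse)
    raw = {g: weights[g] * (rmse[g] / mean_rmse) ** step for g in weights}
    total = sum(raw.values())
    return {g: w / total for g, w in raw.items()}
```

In a training loop, an update like this would run after each evaluation pass, so that groups whose predictions lag behind are drawn more often in the next round.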

For seasonal biases, the study recognized that street-view images taken at different times of the year (e.g., summer vs. winter) could significantly influence perceived quality due to changes in vegetation, sky conditions, and lighting. To counter this, the researchers applied monthly temporal adjustments to the built environment quality scores. These adjustments were positive for winter months and negative for summer months, consistent with the finding that imagery from late spring and early summer received higher scores (p < 0.001).
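
A simple way to picture a monthly adjustment is to shift each score by the gap between the overall mean and the mean for the month in which the image was captured. The sketch below does exactly that; it is an assumed, simplified stand-in for the study's adjustment, and the function names are hypothetical.

```python
from collections import defaultdict


def monthly_offsets(scores, months):
    """Offset per month = overall mean score minus that month's mean score.

    Months that score below average (for example winter) get a positive
    offset; months that score above average get a negative one.
    """
    overall = sum(scores) / len(scores)
    by_month = defaultdict(list)
    for score, month in zip(scores, months):
        by_month[month].append(score)
    return {m: overall - sum(v) / len(v) for m, v in by_month.items()}


def adjust(scores, months, offsets):
    """Apply the monthly offsets; months without an offset are left unchanged."""
    return [s + offsets.get(m, 0.0) for s, m in zip(scores, months)]
```

A plain month-mean correction like this would mix seasonal and geographic effects if capture months are unevenly distributed across regions, so a real adjustment would likely need to account for location as well.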

Which built environment quality metric had the highest and lowest model accuracy, and why?

The "nature quality" metric had the highest model accuracy at 73.1%, while "safety from crime" had the lowest accuracy at 58.9%.

Nature quality likely achieved higher accuracy because it is driven by visually identifiable features such as vegetation density, tree types, greenness, and spatial arrangement of natural elements, which are clearly captured in street-view imagery. In contrast, perceptions of safety from crime are more subtle, context-dependent, and socially constructed. They can involve cues not easily visible in static images, such as signs of disorder, socio-cultural context, or historical crime events. This makes it more challenging for computer vision models relying solely on visual input to accurately predict safety from crime.
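
One way to read these accuracy figures, assuming accuracy means the share of image pairs that the model orders the same way as the respondent, is sketched below. Under that reading, a model guessing at random would be right about half the time, which matches the 50% baseline. The function name and the slider convention (positive favors the right image) are assumptions for this example.

```python
def pairwise_accuracy(pred_left, pred_right, slider):
    """Share of pairs where the predicted ordering matches the respondent's choice.

    slider > 0 means the respondent preferred the right image, slider < 0 the left.
    Pairs with a centered slider or equal predictions are skipped.
    """
    correct = 0
    counted = 0
    for left, right, s in zip(pred_left, pred_right, slider):
        if s == 0 or left == right:
            continue
        counted += 1
        if (right > left) == (s > 0):
            correct += 1
    return correct / counted if counted else float("nan")
```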

What are the potential applications and implications of these nationwide built environment quality estimates?

These nationwide estimates of built environment quality have significant implications for several fields:

  • Epidemiological Research: The data can inform studies investigating how environmental factors influence health outcomes, allowing researchers to differentiate quality-based assessments from purely quantitative indicators of the environment. This can lead to a deeper understanding of environmental determinants of health.
  • Urban Planning Strategies: Urban planners can use these metrics to prioritize and design targeted infrastructure improvements. For example, they can identify areas needing enhanced green space quality or the creation of safer, more pedestrian-friendly environments.
  • Public Health Interventions: Public health practitioners can leverage this data to develop novel interventions aimed at improving community well-being. By understanding how people perceive their surroundings, interventions can be tailored to increase physical activity, reduce stress, and improve overall physical and mental health.

How does this study advance previous work in quantifying urban perception using street-view imagery?

This study significantly advances previous work by providing comprehensive, nationwide estimates of multiple built environment quality metrics, rather than focusing on single cities or limited geographical scopes. Crucially, it explicitly evaluates and corrects for socio-demographic, geographic, and seasonal biases, which were limitations in earlier foundational studies. By generating standardized and comprehensive data from 36 million street-view locations across the U.S. at four distinct time points (2008, 2012, 2016, and 2020), the study enhances methodological rigor and provides broadly applicable data for large-scale research and practical applications.