Use of individual Google Location History data to identify consumer encounters with food outlets

2025  Journal Article

Pub TLDR

This study introduces a novel method for analyzing individual-level food environment interactions by leveraging Google Location History (GLH) data. This approach aims to overcome limitations of traditional GIS methods that simplify food access dynamics by focusing solely on residential neighborhoods.

DOI: 10.1186/s12942-025-00387-w    PubMed ID: 39955543
 

Abstract

Background

Addressing key behavioral risk factors for chronic diseases, such as diet, requires innovative methods to objectively measure dietary patterns and their upstream determinants, notably the food environment. Although GIS techniques have pushed the boundaries by mapping food outlet availability, they often simplify food access dynamics to the vicinity of home addresses, possibly misclassifying neighborhood effects. Leveraging Google Location History Timeline (GLH) data offers a novel approach to assess long-term patterns of food outlet utilization at an individual level, providing insights into the relationship between food environment interactions, diet quality, and health outcomes.

Methods

We leveraged GLH data previously collected from a subset of participants in the Washington State Twin Registry (WSTR). GLH included more than 287 million location records from 357 participants. We developed methods to identify visits to food outlets using outlet-specific buffer zones applied to the InfoUSA data on food outlet locations. This methodology involved the application of minimum and maximum stay durations, along with revisit intervals. We calculated metrics from the GLH data to quantify the frequency of visits to different food outlet classifications (e.g. grocery stores, fast food, convenience stores) important to health. Several sensitivity analyses were conducted to examine the robustness of our food outlet metrics and to examine visits occurring within 1 and 2.5 km of residential locations.

Results

We identified 156,405 specific food outlet visits for the 357 study participants. Of these, 60% were to full-service restaurants, 15% to limited-service restaurants, and 16% to supermarkets. The mean number of visits per person per month to any food outlet was 12.795. Only 8%, 10%, and 11% of visits to full-service restaurants, limited-service restaurants, and supermarkets, respectively, occurred within 1 km of residential locations.

Conclusions

GLH data presents a novel method to assess individual-level food utilization behaviors.

Oje, O., Amram, O., Hystad, P., Gebremedhin, A.H., Monsivais, P. (2025). Use of individual Google Location History data to identify consumer encounters with food outlets. International Journal of Health Geographics, 24(1).
 
Publication FAQ

FAQ on Using Google Location History (GLH) Data for Food Environment Research

What is Google Location History (GLH) data and how can it be used to study food environments?

GLH data comprises longitude and latitude coordinates, timestamps, and accuracy measurements collected passively from smartphone users who have enabled location tracking in their Google accounts. In the context of food environment research, it provides an objective record of an individual's visits to various locations, including food outlets like restaurants, grocery stores, and convenience stores, over extended periods. This allows researchers to quantify and analyze individual-level food outlet utilization patterns, providing insights into dietary behavior and its relationship with the surrounding food environment.
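As an illustration of what a single GLH point carries, the sketch below parses one raw entry into a small record type. It is not the study's code. The field names (`timestamp`, `latitudeE7`, `longitudeE7`, `accuracy`) and the 1e7 coordinate scaling are assumptions about the export layout and should be checked against the actual file; `LocationRecord` and `parse_record` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LocationRecord:
    """One GLH point: when, where, and how precise (accuracy radius in metres)."""
    timestamp: datetime
    lat: float
    lon: float
    accuracy_m: float


def parse_record(raw: dict) -> LocationRecord:
    """Convert one raw export entry into a LocationRecord.

    Assumed layout (not confirmed by the source): integer coordinates
    scaled by 1e7, an ISO 8601 timestamp, and an accuracy radius in metres.
    """
    ts = datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00"))
    return LocationRecord(
        timestamp=ts.astimezone(timezone.utc),
        lat=raw["latitudeE7"] / 1e7,
        lon=raw["longitudeE7"] / 1e7,
        accuracy_m=float(raw["accuracy"]),
    )


# Made-up example entry, for illustration only.
example = parse_record({
    "timestamp": "2019-06-01T18:30:00Z",
    "latitudeE7": 476062000,
    "longitudeE7": -1223321000,
    "accuracy": 20,
})
print(example)
```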

What are the key methodological steps involved in using GLH data to identify consumer encounters with food outlets?

The key steps include: 1) Initial Data Filtration: Refining raw location data based on accuracy (using a precision threshold); 2) Construction of Preliminary Visit List: Isolating location points falling within buffer zones around food outlets; 3) Consolidation of Visit List: Merging contiguous visits into a single event using a revisit interval parameter; 4) Finalization of Visit List: Applying minimum and maximum stay duration parameters to validate visits; and 5) Standardization of Visits: Standardizing visit frequency and duration based on time spent within an active area.
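Steps 1 to 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `detect_visits` and `haversine_m` are hypothetical names, distance is computed with the haversine formula, and the gap between detections is measured from the last in-buffer point. The coordinates and the 30 m buffer in the example are invented.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def detect_visits(points, outlets, max_accuracy_m=50.0,
                  revisit_interval=timedelta(hours=3)):
    """points: list of (timestamp, lat, lon, accuracy_m), in any order.
    outlets: dict of outlet_id -> (lat, lon, buffer_radius_m).
    Returns a list of (outlet_id, start, end) candidate visits.
    """
    # Step 1: keep only points whose accuracy radius is under the threshold.
    precise = sorted(p for p in points if p[3] < max_accuracy_m)

    visits = []
    latest = {}  # outlet_id -> index of that outlet's most recent visit
    for ts, lat, lon, _ in precise:
        for outlet_id, (olat, olon, radius) in outlets.items():
            # Step 2: skip points outside this outlet's buffer zone.
            if haversine_m(lat, lon, olat, olon) > radius:
                continue
            idx = latest.get(outlet_id)
            # Step 3: merge into the previous visit if the gap since its
            # last point is shorter than the revisit interval.
            if idx is not None and ts - visits[idx][2] < revisit_interval:
                visits[idx] = (outlet_id, visits[idx][1], ts)
            else:
                latest[outlet_id] = len(visits)
                visits.append((outlet_id, ts, ts))
    return visits


t0 = datetime(2019, 6, 1, 12, 0)
points = [
    (t0, 47.6062, -122.3321, 20),
    (t0 + timedelta(minutes=5), 47.6062, -122.3321, 200),   # imprecise, dropped
    (t0 + timedelta(minutes=10), 47.6062, -122.3321, 20),
    (t0 + timedelta(hours=5), 47.6062, -122.3321, 15),      # later return
    (t0 + timedelta(hours=6), 47.7000, -122.3321, 15),      # far from the outlet
]
outlets = {"cafe": (47.6062, -122.3321, 30.0)}
visits = detect_visits(points, outlets)
for v in visits:
    print(v)
```

In this example the first two precise points merge into one ten-minute candidate visit, and the return five hours later starts a second one. Stay-duration validation (step 4) and standardization (step 5) are left out of this sketch.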

How does this methodology differ from traditional GIS-based approaches to studying food environments?

Traditional GIS-based approaches primarily focus on analyzing the food environment around residential addresses, often using proximity and density measures. These methods may oversimplify food access dynamics by assuming individuals primarily interact with food outlets near their homes. GLH data provides a more nuanced approach by tracking actual visits to food outlets regardless of their proximity to home, capturing patterns that traditional methods might miss. It also overcomes the recall bias inherent in self-reported dietary data.

What parameters are used to define a valid visit to a food outlet using GLH data, and why are they important?

The parameters are:

  • Location Precision: The accuracy radius of the location point (default < 50m). This ensures that only reliable location data are used.
  • Outlet Buffer Zone: A circular zone around each outlet, sized according to the median size of that type of outlet, used to detect a potential visit. This accounts for the varying sizes and consumer behaviors at different outlet types.
  • Minimum Stay Duration: The shortest time an individual must spend in the buffer zone to count as a visit (default 3 minutes). This helps distinguish genuine visits from incidental passing-by.
  • Maximum Stay Duration: A limit on how long an individual can stay in the buffer zone before the visit is no longer considered patronage (default 3 hours). This helps exclude non-consumer stays, like employee shifts.
  • Revisit Interval: The time that must pass before a return to the buffer zone is counted as a new visit (default 3 hours). Detections separated by less than this interval are merged into a single visit.
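The defaults listed above can be collected in one configuration object, with the stay-duration check applied to each candidate visit. This is a hedged sketch: `VisitParams` and `is_valid_visit` are hypothetical names, and treating both duration bounds as inclusive is an assumption. Buffer radii are omitted because they vary by outlet type and no values are given here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class VisitParams:
    """Default thresholds from the list above; names are illustrative."""
    max_accuracy_m: float = 50.0
    min_stay: timedelta = timedelta(minutes=3)
    max_stay: timedelta = timedelta(hours=3)
    revisit_interval: timedelta = timedelta(hours=3)


def is_valid_visit(start, end, params=VisitParams()):
    """Keep a candidate visit only if min_stay <= stay <= max_stay.

    Inclusive bounds are an assumption, not stated in the source.
    """
    stay = end - start
    return params.min_stay <= stay <= params.max_stay


arrive = datetime(2019, 6, 1, 12, 0)
print(is_valid_visit(arrive, arrive + timedelta(minutes=1)))   # False: passing by
print(is_valid_visit(arrive, arrive + timedelta(minutes=45)))  # True: plausible meal
print(is_valid_visit(arrive, arrive + timedelta(hours=8)))     # False: likely a work shift
```

Keeping the thresholds in one frozen object makes it straightforward to rerun the pipeline with alternative values, as a sensitivity analysis would require.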

What were the main findings regarding visit patterns to different types of food outlets in the Washington State Twin Registry (WSTR) study?

The study found that participants spent more time at full-service restaurants and fruit/vegetable markets than at convenience stores, suggesting preferences for sit-down meals or fresh produce over quick purchases. Full-service restaurants were visited far more often than limited-service restaurants.

What is the "active area" and why is it used in the data processing workflow?

The "active area" is a 50 km radius around all listed food outlets, representing the area where participant activity is considered relevant to the study. It is used to standardize visit data by calculating the number of active days per participant within this area, ensuring a consistent basis for analyzing visit patterns and addressing variability in data coverage across participants. Periods spent outside the active area for more than three hours are not counted as active.
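The standardization idea can be illustrated with a simplified calculation. The sketch below counts an active day as any calendar day with at least one record inside the active area, then expresses visits as a rate per month of active days. Both the day-counting rule and the 30-day month are assumptions made for illustration. The three-hour rule described above is not modeled, and the function names are hypothetical.

```python
from datetime import datetime


def active_days(timestamps_in_area):
    """Count distinct calendar days with at least one in-area record.

    Simplification: the three-hour out-of-area rule is not modeled.
    """
    return len({ts.date() for ts in timestamps_in_area})


def visits_per_month(n_visits, n_active_days, days_per_month=30.0):
    """Standardize a visit count to a rate per month of active days.

    The 30-day month is an assumed convention.
    """
    if n_active_days == 0:
        raise ValueError("participant has no active days")
    return n_visits * days_per_month / n_active_days


# Made-up timestamps of records that fall inside the active area.
in_area = [
    datetime(2019, 6, 1, 9, 0),
    datetime(2019, 6, 1, 17, 30),
    datetime(2019, 6, 2, 12, 15),
    datetime(2019, 6, 2, 20, 45),
]
days = active_days(in_area)
rate = visits_per_month(3, days)
print(days, rate)
```

Dividing by active days rather than calendar days keeps participants with sparse or interrupted location histories comparable with those who have continuous coverage.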

What are some of the limitations of using GLH data for food environment research?

Limitations include: 1) Geographical Limitation: Findings are specific to the study area (Washington State) and may not generalize to other regions; 2) Temporal Limitation of Outlet Data: The outlet data lacks a temporal dimension, so it cannot be determined whether an outlet was operating during the data collection period; 3) Ambiguities in Outlet Locations: Multiple outlets reported with the same coordinates create uncertainty about which outlet was visited; 4) Policy Changes Impacting Data Availability: Changes to Google's Location History policy, effective December 2024, introduce limitations for research reliant on this data.

How can future research build upon this study to provide more comprehensive insights into the relationship between food environments and dietary behaviors?

Future research should: 1) Expand the geographic scope to include more diverse locations; 2) Incorporate dynamic outlet data with opening and closing dates; 3) Refine spatial data processing techniques to distinguish between outlets with the same coordinates; 4) Integrate GLH data with detailed dietary intake information to assess the influence of the built environment on food choices and health outcomes; and 5) Consider alternative data collection methods to account for the recent changes to Google's Location History policy.