
This function prepares data retrieved from the weekly aggregated NHSN hospital respiratory data API. The data must first be retrieved using get_nhsn_weekly. Once the data have been pulled from the API, this function will conditionally adjust for partial reporting and add extended time series data (see 'Details' for more information). The preparation also includes a join to internal data prepared to estimate the historical severity of each epiweek.

Usage

prep_nhsn_weekly(
  dat,
  adjust_partial = TRUE,
  trim = NULL,
  statesonly = TRUE,
  augment = FALSE,
  augment_stop = "2020-10-18"
)

Arguments

dat

Weekly hospital utilization data from get_nhsn_weekly

adjust_partial

Logical as to whether or not the partial reporting should be adjusted (see 'Details' for more); default is TRUE

trim

Named list with elements for epiyear and epiweek corresponding to the minimum epidemiological week to retain; default is NULL, in which case the data will not be trimmed; to override the default, use a named list (e.g., list(epiyear=2020, epiweek=43))

statesonly

Logical as to whether or not the data should be limited to states and DC (i.e., no other territories included); default is TRUE

augment

Logical as to whether or not the data should be augmented with NHSN hospitalizations imputed backwards in time (see 'Details' for more); default is FALSE

augment_stop

Date at which the time series imputation data should stop; yyyy-mm-dd format; only used if "augment" is TRUE; default is "2020-10-18"
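The "trim" argument can be illustrated with a minimal base R sketch. The helper name trim_weeks and the toy data are hypothetical, and the comparison logic is an assumption about what "minimum epidemiological week to retain" means, not the function's actual implementation.

```r
# Hypothetical helper: keep rows at or after the epiyear/epiweek given in `trim`.
# This is an illustration only, not code from prep_nhsn_weekly.
trim_weeks <- function(dat, trim = NULL) {
  # NULL mirrors the documented default: no trimming
  if (is.null(trim)) {
    return(dat)
  }
  keep <- dat$epiyear > trim$epiyear |
    (dat$epiyear == trim$epiyear & dat$epiweek >= trim$epiweek)
  dat[keep, , drop = FALSE]
}

# Toy data (made-up counts)
toy <- data.frame(
  epiyear = c(2020, 2020, 2020, 2021),
  epiweek = c(42, 43, 44, 1),
  flu.admits = c(5, 7, 9, 11)
)

trimmed <- trim_weeks(toy, trim = list(epiyear = 2020, epiweek = 43))
untrimmed <- trim_weeks(toy)
```

In this sketch, epiweek 42 of 2020 is dropped and the week in epiyear 2021 is retained even though its epiweek number is smaller, because the comparison considers the year first.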

Value

A tibble with hospitalization data summarized to epiyear/epiweek with the following columns:

  • abbreviation: Abbreviation for the location

  • location: FIPS code for the location

  • week_start: Date of beginning (Sunday) of the given epidemiological week

  • monday: Date of Monday of the given epidemiological week

  • week_end: Date of end (Saturday) of the given epidemiological week

  • epiyear: Year of reporting (in epidemiological week calendar)

  • epiweek: Week of reporting (in epidemiological week calendar)

  • flu.admits: Count of flu cases among admitted patients in the previous week

  • flu.admits.cov: Coverage (number of hospitals reporting) for incident flu cases

  • ili_mean: Estimate of historical ILI activity for the given epidemiological week

  • ili_rank: Rank of the given epidemiological week in terms of ILI activity across season (1 being highest average activity)

  • hosp_mean: Estimate of historical flu hospitalization rate for the given epidemiological week

  • hosp_rank: Rank of the given epidemiological week in terms of flu hospitalizations across season (1 being highest average activity)
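The relationship between the three date columns above can be sketched in base R. The example dates are chosen for illustration, and the offsets follow only from the column descriptions (Sunday start, Monday, Saturday end); how the function itself derives these columns is not documented here.

```r
# Two example week start dates (both Sundays)
week_start <- as.Date(c("2020-10-18", "2020-10-25"))

# Monday is one day after the Sunday start; Saturday is six days after
monday <- week_start + 1
week_end <- week_start + 6

# "%w" gives the weekday as a number with Sunday = 0, independent of locale
weekday_check <- data.frame(
  week_start = format(week_start, "%w"),
  monday = format(monday, "%w"),
  week_end = format(week_end, "%w")
)
```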

Details

The weekly aggregated data from NHSN includes locations that may have incomplete coverage of hospitals reporting (see https://data.cdc.gov/Public-Health-Surveillance/Weekly-Hospital-Respiratory-Data-HRD-Metrics-by-Ju/mpgq-jmmr/about_data for more information). The preparation in this function includes an optional step, triggered by the "adjust_partial" argument, that finds the maximum coverage at any time point for each location and then adjusts the reported counts by a factor of X / Y_t, where X is the maximum coverage and Y_t is the coverage at time point t. If the coverage for the given week is near or equal to the maximum observed coverage, then the adjustment will have little to no effect on the counts. Note that this adjustment should be used with caution, as some locations may have non-uniform reporting behaviors, especially during non-mandatory NHSN reporting windows. In other words, the counts may be adjusted using reported values from healthcare facilities that may be of a different size, serve different communities, or otherwise have different characteristics than the facilities that did not report.
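The X / Y_t adjustment described above can be sketched in base R as follows. This is a minimal illustration, not the function's internal code: the toy data are made up, the columns adj_factor and flu.admits.adj are hypothetical names, and details such as rounding or handling of weeks with zero coverage are not addressed.

```r
# Toy data: two locations with weekly counts and hospital coverage
# (column names follow the 'Value' section; the numbers are made up)
toy <- data.frame(
  location = c("01", "01", "01", "02", "02", "02"),
  epiweek = c(1, 2, 3, 1, 2, 3),
  flu.admits = c(40, 45, 30, 10, 12, 8),
  flu.admits.cov = c(80, 100, 60, 20, 20, 10)
)

# X: maximum coverage observed at any time point, computed per location
max_cov <- ave(toy$flu.admits.cov, toy$location, FUN = max)

# Scale each count by X / Y_t, where Y_t is the coverage in that week
toy$adj_factor <- max_cov / toy$flu.admits.cov
toy$flu.admits.adj <- toy$flu.admits * toy$adj_factor
```

Weeks at maximum coverage get a factor of 1 and are unchanged, while a week with half the maximum coverage has its count doubled, which is why the adjustment is sensitive to which facilities happened to report.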

The preparation for the weekly flu hospitalization data includes an option to "augment" the input time series. The augmentation is based on an extended time series that was developed with an imputation approach. The extended time series estimates flu hospitalizations at the state level in years before NHSN reporting became available. If the user decides to include the imputed data, then the time series is extended backwards in time from the "augment_stop" date (defaults to October 18, 2020). The prepended data augmentation is formatted to match the true NHSN reporting. For more details on the data augmentation approach, refer to the publication: https://www.medrxiv.org/content/10.1101/2024.07.31.24311314.
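The idea of prepending imputed data can be sketched in base R. Everything here is an assumption made for illustration: the toy values, the source column, and in particular the choice to keep imputed rows strictly before "augment_stop" and observed rows on or after it. The function's actual boundary handling may differ.

```r
augment_stop <- as.Date("2020-10-18")

# Toy observed NHSN series (made-up counts)
observed <- data.frame(
  week_start = as.Date(c("2020-10-11", "2020-10-18", "2020-10-25")),
  flu.admits = c(3, 4, 6),
  source = "observed"
)

# Toy imputed series covering earlier weeks (made-up counts)
imputed <- data.frame(
  week_start = as.Date(c("2020-09-27", "2020-10-04", "2020-10-11")),
  flu.admits = c(1, 2, 2),
  source = "imputed"
)

# Assumed rule: imputed rows before the stop date, observed rows from it onward
augmented <- rbind(
  imputed[imputed$week_start < augment_stop, ],
  observed[observed$week_start >= augment_stop, ]
)
augmented <- augmented[order(augmented$week_start), ]
```

Because both pieces share the same columns, the result is a single continuous weekly series, which mirrors the statement that the prepended data are formatted to match the NHSN reporting.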