9  Infection Dates

To support date-based analyses, we aggregate infection records by MMWR year and week. For both the Los Angeles County and California datasets, we create two new columns, mmwr_year and mmwr_week, and then remove the original date fields. We also add start_date and end_date columns, which hold the first and last day of each MMWR week, as reference points should we need them later.

In the California dataset, the field time_int encodes the year and MMWR week as a six-digit integer (YYYYWW). To create the new fields, we extract the first four digits as mmwr_year and the last two digits as mmwr_week, then drop the original time_int column.
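As a small illustration of the digit extraction described above, the sketch below applies integer division and the modulo operator to a few made-up YYYYWW values. The sample values and the range check are assumptions added for illustration; they are not part of the original pipeline.

```r
# Hypothetical example values in YYYYWW form (not taken from the dataset)
time_int <- c(202322L, 202301L, 202252L)

mmwr_year <- time_int %/% 100L  # integer division keeps the leading four digits
mmwr_week <- time_int %% 100L   # the remainder keeps the trailing two digits

# MMWR weeks run from 1 to 53, so a value outside that range
# would point to a malformed time_int code
stopifnot(all(mmwr_week >= 1L & mmwr_week <= 53L))

data.frame(time_int, mmwr_year, mmwr_week)
```

Because the week is recovered as a number, a code such as 202301 yields week 1 rather than the string "01", which matters if the week is later converted to a factor or used as a join key.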

In the Los Angeles County dataset, the codebook identifies the field dt_report as the last day of the MMWR week. However, this field contains only missing values, so we removed it. Instead, we convert the infection date field, dt_dx, to a proper date format and then use the MMWRweek package to derive mmwr_year and mmwr_week.

##-- Packages used below
library(dplyr)      # mutate(), select(), relocate(), %>%
library(lubridate)  # parse_date_time()

##-- California dataset:
step2_ca_df <- step1_ca_df %>%
  ##-- pull MMWR year and week from the time_int field (YYYYWW)
  mutate(
    mmwr_year = factor(time_int %/% 100),
    mmwr_week = factor(time_int %% 100)
  ) %>%
  ##-- add the first and last day of each MMWR week
  add_start_end_dates() %>%
  select(-time_int) %>%
  relocate(mmwr_year, mmwr_week, start_date,
           end_date, .before = everything())


##-- LA county dataset:
step2_la_cnty_df <- step1_la_cnty_df %>%
  ##-- parse dt_dx (day, abbreviated month, year) into a Date;
  ##-- as.Date() needs no format argument for a parsed date-time
  mutate(
    DATE_FIX = as.Date(parse_date_time(dt_dx, "%d%b%Y"))
  ) %>%
  ##-- use the date to create the new MMWR fields
  add_mmwr_week_columns(date_col = "DATE_FIX") %>%
  add_start_end_dates() %>%
  select(-c(DATE_FIX, dt_dx)) %>%
  relocate(mmwr_year, mmwr_week, start_date,
           end_date, .before = everything()) %>%
  relocate(county, .before = age_cat)

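To streamline this process, we created two helper functions, both used in the code above: add_mmwr_week_columns(), which derives mmwr_year and mmwr_week from a date column, and add_start_end_dates(), which adds the start_date and end_date of each MMWR week.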
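The definitions of add_mmwr_week_columns() and add_start_end_dates() are not reproduced in this section. The sketch below is one possible reconstruction, not the authors' actual code. It assumes the dplyr and MMWRweek packages are installed, that the week start is the Sunday returned by MMWRweek2Date(), and that the end date falls six days later. The demo data frame and its dx_date column are hypothetical.

```r
library(dplyr)
library(MMWRweek)

# Hypothetical reconstruction: derive mmwr_year and mmwr_week from a Date column.
add_mmwr_week_columns <- function(df, date_col) {
  # MMWRweek() returns a data frame with MMWRyear, MMWRweek and MMWRday
  mmwr <- MMWRweek(df[[date_col]])
  df %>%
    mutate(
      mmwr_year = factor(mmwr$MMWRyear),
      mmwr_week = factor(mmwr$MMWRweek)
    )
}

# Hypothetical reconstruction: add the first and last day of each MMWR week.
add_start_end_dates <- function(df) {
  # The columns may be factors, so go through character to recover the numbers
  yr <- as.integer(as.character(df$mmwr_year))
  wk <- as.integer(as.character(df$mmwr_week))
  df %>%
    mutate(
      # With MMWRday omitted, MMWRweek2Date() returns the first day of the week
      start_date = MMWRweek2Date(MMWRyear = yr, MMWRweek = wk),
      end_date   = start_date + 6
    )
}

# Usage on a tiny made-up data frame
demo <- data.frame(dx_date = as.Date(c("2023-05-28", "2023-06-10")))
demo_out <- demo %>%
  add_mmwr_week_columns(date_col = "dx_date") %>%
  add_start_end_dates()
demo_out
```

Converting the factor columns through as.character() before as.integer() matters here: calling as.integer() directly on a factor returns the internal level codes rather than the year or week values.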

The dataframes now have a structure that looks like this:

mmwr_year  mmwr_week  start_date  end_date    county       age_cat  new_infections
2023       22         2023-05-28  2023-06-03  Los Angeles  0-17     15
2023       23         2023-06-04  2023-06-10  Los Angeles  0-17     17
2023       24         2023-06-11  2023-06-17  Los Angeles  0-17     23
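A quick consistency check on this structure can be sketched as follows, using only the date columns from the three example rows above. The check itself is an assumption added for illustration and is not a step in the original pipeline: it confirms that each week starts on a Sunday and ends six days later.

```r
# Date columns copied from the three example rows shown above
weeks <- data.frame(
  mmwr_year  = c(2023, 2023, 2023),
  mmwr_week  = c(22, 23, 24),
  start_date = as.Date(c("2023-05-28", "2023-06-04", "2023-06-11")),
  end_date   = as.Date(c("2023-06-03", "2023-06-10", "2023-06-17"))
)

# Every MMWR week should begin on a Sunday ("%w" gives "0" for Sunday)
starts_on_sunday <- format(weeks$start_date, "%w") == "0"

# ... and end six days later, on a Saturday
spans_seven_days <- as.integer(weeks$end_date - weeks$start_date) == 6L

stopifnot(all(starts_on_sunday), all(spans_seven_days))
```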