10  Race and Ethnicity

Each of the three datasets encodes race and ethnicity differently. The California dataset uses numeric codes, the Los Angeles County dataset uses full text labels, and the population dataset uses abbreviated text labels.

To resolve this, we created a crosswalk file (race_ethnicity_map.csv) that aligns the three formats. Joining this crosswalk to each dataset gives all three a consistent set of race and ethnicity variables: the numeric code (race_coded), the abbreviated label (race_short), and the full text label (race_long).

Code
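# Load the crosswalk. clean() is defined outside this section; applying it
# here keeps the labels comparable with the cleaned dataset values.
race_ethnicity_map <- read.csv(file = here("data/race_ethnicity_map.csv")) %>%
  mutate(
    race_long = clean(race_long),
    race_short = clean(race_short),
    # Store the codes as text so the join key type matches the datasets
    race_coded = as.character(race_coded)
  )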
Code
# California: race_ethnicity holds numeric codes, so join on race_coded
step2_ca_df <- step2_ca_df %>%
  rename("race_coded" = "race_ethnicity") %>%
  mutate(race_coded = as.character(race_coded)) %>%
  left_join(race_ethnicity_map, by = "race_coded") %>%
  relocate(race_coded, race_short, race_long, .after = sex)

# Los Angeles County: race_ethnicity holds full text labels, so join on race_long
step2_la_cnty_df <- step2_la_cnty_df %>%
  mutate(race_long = clean(race_ethnicity)) %>%
  select(-race_ethnicity) %>%
  left_join(race_ethnicity_map, by = "race_long") %>%
  relocate(race_coded, race_short, race_long, .after = sex)
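
Because left_join() keeps every row of the left-hand table, a code or label that is missing from the crosswalk does not raise an error; it only leaves NA in the joined columns. The following is a minimal sketch of a coverage check, not part of the original pipeline. The toy_map and toy_df objects and the code "8" are made up for illustration.

```r
library(dplyr)

# Toy crosswalk holding two of the codes from race_ethnicity_map.csv
toy_map <- data.frame(
  race_coded = c("1", "7"),
  race_short = c("White NH", "Hispanic"),
  race_long  = c("White, Non-Hispanic", "Hispanic (any race)")
)

# Toy dataset; code "8" is deliberately absent from the crosswalk
toy_df <- data.frame(race_coded = c("1", "7", "8"))

# anti_join() returns the left-hand rows with no match in the crosswalk
unmatched <- anti_join(toy_df, toy_map, by = "race_coded")

# The same rows surface as NA in the joined columns after left_join()
joined <- left_join(toy_df, toy_map, by = "race_coded")
n_missing <- sum(is.na(joined$race_short))
```

Running the same anti_join() against step2_ca_df or step2_la_cnty_df before the real join would list any values the crosswalk does not cover.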

Race and Ethnicity Crosswalk Table:

race_coded  race_long                                               race_short
1           White, Non-Hispanic                                     White NH
2           Black, Non-Hispanic                                     Black NH
3           American Indian or Alaska Native, Non-Hispanic          AIAN NH
4           Asian, Non-Hispanic                                     Asian NH
5           Native Hawaiian or Pacific Islander, Non-Hispanic       NHPI NH
6           Multiracial (two or more of above races), Non-Hispanic  MR NH
7           Hispanic (any race)                                     Hispanic
9           Unknown                                                 Unknown
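
A crosswalk is only safe to join on if each key value appears once; a repeated code or label would duplicate rows in the joined dataset. The sketch below rebuilds the crosswalk from the table above (rather than reading the CSV, so it runs on its own) and counts repeated values in each column. The crosswalk and dup_counts names are illustrative, and the labels are typed as printed, which may differ from what clean() produces.

```r
# Crosswalk rows copied from the table above
crosswalk <- data.frame(
  race_coded = c("1", "2", "3", "4", "5", "6", "7", "9"),
  race_long = c(
    "White, Non-Hispanic",
    "Black, Non-Hispanic",
    "American Indian or Alaska Native, Non-Hispanic",
    "Asian, Non-Hispanic",
    "Native Hawaiian or Pacific Islander, Non-Hispanic",
    "Multiracial (two or more of above races), Non-Hispanic",
    "Hispanic (any race)",
    "Unknown"
  ),
  race_short = c(
    "White NH", "Black NH", "AIAN NH", "Asian NH",
    "NHPI NH", "MR NH", "Hispanic", "Unknown"
  )
)

# Count repeated values per column; a zero means that column is a unique key
dup_counts <- sapply(crosswalk, function(x) sum(duplicated(x)))
```

All three counts being zero means any of the three columns can serve as the join key without multiplying rows.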

Each dataset now contains three race/ethnicity columns, one per format. For example:

Code
step2_ca_df %>%
  select(
    county, age_cat, sex, race_coded,
    race_short, race_long
  ) %>%
  # Keep the first row for each race/ethnicity category
  group_by(race_short) %>%
  slice_head(n = 1) %>%
  arrange(race_coded) %>%
  # Highlight the three race/ethnicity column headers (columns 4 to 6)
  rename_with(
    ~ cell_spec(
      .x,
      "html",
      bold = TRUE,
      color = "#0f172a",
      background = "#ffde59",
      font_size = 16
    ),
    4:6
  ) %>%
  # escape = FALSE lets the HTML in the renamed headers render
  kbl(escape = FALSE, align = "c") %>%
  row_spec(
    0,
    bold = TRUE,
    background = "#0f172a",
    extra_css = "font-size: 16px!important;color:#ffffff;"
  ) %>%
  kable_styling(bootstrap_options = c("bordered"))

county   age_cat  sex     race_coded  race_short  race_long
Alameda  0-17     FEMALE  1           White NH    White, Non-Hispanic
Alameda  0-17     FEMALE  2           Black NH    Black, Non-Hispanic
Alameda  0-17     FEMALE  3           AIAN NH     American Indian or Alaska Native, Non-Hispanic
Alameda  0-17     FEMALE  4           Asian NH    Asian, Non-Hispanic
Alameda  0-17     FEMALE  5           NHPI NH     Native Hawaiian or Pacific Islander, Non-Hispanic
Alameda  0-17     FEMALE  6           MR NH       Multiracial (two or more of above races), Non-Hispanic
Alameda  0-17     FEMALE  7           Hispanic    Hispanic (any race)