Each of the three datasets defines Race / Ethnicity differently. The California dataset uses numeric codes, the Los Angeles County dataset uses full text labels, and the population dataset uses abbreviated text.
To resolve this, we created a crosswalk file (race_ethnicity_map.csv) that aligns the three formats. By joining this crosswalk to each dataset, we ensure that all three carry the same set of race and ethnicity variables: the numeric code, the abbreviated text, and the full text label.
```r
# Read the crosswalk and put all three key columns in a joinable form
race_ethnicity_map <- read.csv(file = here("data/race_ethnicity_map.csv")) %>%
  mutate(
    race_long = clean(race_long),
    race_short = clean(race_short),
    race_coded = as.character(race_coded)
  )

# California: race_ethnicity holds numeric codes, so join on race_coded
step2_ca_df <- step2_ca_df %>%
  rename("race_coded" = "race_ethnicity") %>%
  mutate(race_coded = as.character(race_coded)) %>%
  left_join(race_ethnicity_map, by = "race_coded") %>%
  relocate(race_coded, race_short, race_long, .after = sex)

# Los Angeles County: race_ethnicity holds full text labels, so apply the
# same clean() used on the crosswalk and join on race_long
step2_la_cnty_df <- step2_la_cnty_df %>%
  mutate(race_long = clean(race_ethnicity)) %>%
  select(-race_ethnicity) %>%
  left_join(race_ethnicity_map, by = "race_long") %>%
  relocate(race_coded, race_short, race_long, .after = sex)
```

| race_coded | race_long | race_short |
|---|---|---|
| 1 | White, Non-Hispanic | White NH |
| 2 | Black, Non-Hispanic | Black NH |
| 3 | American Indian or Alaska Native, Non-Hispanic | AIAN NH |
| 4 | Asian, Non-Hispanic | Asian NH |
| 5 | Native Hawaiian or Pacific Islander, Non-Hispanic | NHPI NH |
| 6 | Multiracial (two or more of above races), Non-Hispanic | MR NH |
| 7 | Hispanic (any race) | Hispanic |
| 9 | Unknown | Unknown |
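
Because each dataset joins on a different column of the crosswalk, each of those columns has to identify a row uniquely; a repeated key would make `left_join()` duplicate rows in the joined dataset. The following is a minimal sketch of that check, not part of the original pipeline. It rebuilds the crosswalk inline from the table above so that it runs on its own; the names `toy_map` and `dup_counts` are illustrative.

```r
# Toy copy of the crosswalk shown above (illustrative; the real one is read from CSV)
toy_map <- data.frame(
  race_coded = c("1", "2", "3", "4", "5", "6", "7", "9"),
  race_long = c(
    "White, Non-Hispanic",
    "Black, Non-Hispanic",
    "American Indian or Alaska Native, Non-Hispanic",
    "Asian, Non-Hispanic",
    "Native Hawaiian or Pacific Islander, Non-Hispanic",
    "Multiracial (two or more of above races), Non-Hispanic",
    "Hispanic (any race)",
    "Unknown"
  ),
  race_short = c(
    "White NH", "Black NH", "AIAN NH", "Asian NH",
    "NHPI NH", "MR NH", "Hispanic", "Unknown"
  ),
  stringsAsFactors = FALSE
)

# Count repeated values in each key column; every count should be zero
dup_counts <- sapply(toy_map, function(col) sum(duplicated(col)))
print(dup_counts)

# Stop early if any key column is not unique
stopifnot(all(dup_counts == 0))
```

Running the same check on the real `race_ethnicity_map` after it is read in would catch an accidental duplicate row in the CSV before any join is made.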
Each dataset now contains three race/ethnicity columns, one for each format. For example:
```r
step2_ca_df %>%
  select(
    county, age_cat, sex, race_coded,
    race_short, race_long
  ) %>%
  # Keep one example row per race/ethnicity category
  group_by(race_short) %>%
  slice_head(n = 1) %>%
  arrange(race_coded) %>%
  # Highlight the three race/ethnicity column headers (columns 4 to 6)
  rename_with(
    ~ cell_spec(
      .x,
      "html",
      bold = TRUE,
      color = "#0f172a",
      background = "#ffde59",
      font_size = 16
    ),
    4:6
  ) %>%
  kbl(escape = FALSE, align = "c") %>%
  row_spec(
    0,
    bold = TRUE,
    background = "#0f172a",
    extra_css = "font-size: 16px!important;color:#ffffff;"
  ) %>%
  kable_styling(bootstrap_options = c("bordered"))
```

| county | age_cat | sex | race_coded | race_short | race_long |
|---|---|---|---|---|---|
| Alameda | 0-17 | FEMALE | 1 | White NH | White, Non-Hispanic |
| Alameda | 0-17 | FEMALE | 2 | Black NH | Black, Non-Hispanic |
| Alameda | 0-17 | FEMALE | 3 | AIAN NH | American Indian or Alaska Native, Non-Hispanic |
| Alameda | 0-17 | FEMALE | 4 | Asian NH | Asian, Non-Hispanic |
| Alameda | 0-17 | FEMALE | 5 | NHPI NH | Native Hawaiian or Pacific Islander, Non-Hispanic |
| Alameda | 0-17 | FEMALE | 6 | MR NH | Multiracial (two or more of above races), Non-Hispanic |
| Alameda | 0-17 | FEMALE | 7 | Hispanic | Hispanic (any race) |
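
A left join keeps every row of the dataset and fills the crosswalk columns with `NA` when a key has no match, so a value missing from the crosswalk would pass silently. The sketch below shows one way to surface such rows with `dplyr::anti_join()`. It is an illustration under assumptions rather than a step from the original analysis: `toy_map` and `toy_df` are made-up data, and the code `"8"` is deliberately left out of the toy crosswalk.

```r
library(dplyr)

# Hypothetical crosswalk excerpt (illustrative only)
toy_map <- data.frame(
  race_coded = c("1", "2", "7"),
  race_short = c("White NH", "Black NH", "Hispanic"),
  stringsAsFactors = FALSE
)

# Hypothetical dataset; "8" is deliberately absent from the toy crosswalk
toy_df <- data.frame(
  county = c("Alameda", "Alameda", "Alameda"),
  race_coded = c("1", "7", "8"),
  stringsAsFactors = FALSE
)

# Rows whose key has no partner in the crosswalk
unmatched <- anti_join(toy_df, toy_map, by = "race_coded")
print(unmatched)

# After the left join, those same rows carry NA in the label column
joined <- left_join(toy_df, toy_map, by = "race_coded")
n_missing <- sum(is.na(joined$race_short))
print(n_missing)
```

The same pattern would apply to the text-keyed join by swapping `by = "race_coded"` for `by = "race_long"`, where differences in spelling or whitespace are the more likely cause of a miss.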