Estimate Usual Nutrient Intake using the NRC method with adaptive repeater policy
Source: R/estimate_usual_nutrient_intake.R
Estimates usual nutrient intake distributions from 24-hour recall data following
the NRC/IOM methodology (see doi:10.17226/10666). The method adjusts observed
intakes for within-person variability using variance-component shrinkage based
on respondents with repeated recalls. The repeater_policy argument controls how
the function behaves when replicate information is limited.
Arguments
- recall_data
A data frame containing repeated 24-hour recall data, with one row per observation (respondent-day).
- id_col
Character scalar. The name of the column identifying respondents. Each unique ID represents one participant who may have one or more recall days.
- nutrient_cols
Character vector of one or more column names containing nutrient intake values to be processed (e.g. "Energy.kcal_intake", "Protein.g_intake"). All must be numeric and non-negative.
- transform
Transformation applied prior to variance estimation to improve normality. Options are "cuberoot" (default), "log", "sqrt", or "none".
- jitter
Logical; if TRUE, adds a deterministic small numeric offset after transformation to prevent ties (useful when values are identical after rounding).
- warn_negative_between
Logical; if TRUE, issues warnings when the estimated between-person variance component is negative before flooring to zero.
- repeater_policy
Character scalar specifying how strictly to enforce the minimum amount of replicate information:
  "auto" – chooses a balanced adaptive rule based on available replicate information (default).
  "strict" – enforces higher thresholds for replicate data before adjusting.
  "lenient" – proceeds with adjustment even when replicate information is limited.
- detailed
Logical; if TRUE, includes diagnostic columns such as observed mean, between- and within-person standard deviations, degrees of freedom, replicate count, and shrinkage ratio.
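As an illustration of the transform options, the following base R sketch pairs each option with its inverse. The `transforms` list and its `fwd`/`inv` names are hypothetical and are not taken from the package source; the package may handle edge cases differently (for example, `log(0)` is `-Inf` in base R, which matters for zero intakes).

```r
# Hypothetical forward/back-transform pairs for the four documented options.
# This is an illustrative sketch, not the package's internal code.
transforms <- list(
  cuberoot = list(fwd = function(x) x^(1 / 3), inv = function(z) z^3),
  log      = list(fwd = log,                   inv = exp),
  sqrt     = list(fwd = sqrt,                  inv = function(z) z^2),
  none     = list(fwd = identity,              inv = identity)
)

x <- c(1800, 2200, 1500)

# Round trip: transforming and back-transforming should recover x
# (up to floating-point error) for every option.
roundtrip <- sapply(transforms, function(tf) tf$inv(tf$fwd(x)))
roundtrip
```

Each column of `roundtrip` corresponds to one option and should reproduce `x`.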
Value
A tibble containing one row per respondent and estimated usual intakes
for each nutrient. If detailed = TRUE, additional columns include:
  *_observed_mean – back-transformed observed mean intake.
  *_sd_between, *_sd_observed – variance components.
  *_df_resid, *_R – residual degrees of freedom and total replicate info.
  *_shrink_ratio – the shrinkage factor applied.
Details
This function implements the NRC (1986) / IOM (2003) recommended approach for adjusting observed 24-hour recall data to estimate the distribution of usual nutrient intakes within a population. The workflow is:
1. Apply the chosen transformation (transform).
2. Identify individuals with >=2 recall days (repeaters).
3. Estimate within- and between-person variance using ANOVA among repeaters.
4. Derive shrinkage ratio = SD(between) / SD(observed).
5. Shrink each individual's mean intake toward the population mean using the shrinkage ratio, which reflects the relative size of within- and between-person variation.
6. Back-transform to original units.
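The workflow above can be sketched in base R as follows. This is a minimal illustration on made-up data, not the package's implementation: the `recalls` data, all variable names, the balanced two-day design, and the definition of the observed variance as `var_between + var_within / k` are assumptions made for the example.

```r
# Toy data (hypothetical): 4 respondents, 2 recall days each.
recalls <- data.frame(
  id     = rep(1:4, each = 2),
  intake = c(1800, 2200, 1500, 1600, 2400, 2100, 1900, 2000)
)

# 1. Transform (cube root).
recalls$z <- recalls$intake^(1 / 3)

# 2. Repeaters: respondents with >= 2 recall days.
n_days    <- table(recalls$id)
repeaters <- recalls[recalls$id %in% names(n_days)[n_days >= 2], ]

# 3. One-way ANOVA among repeaters gives the mean squares.
fit <- anova(lm(z ~ factor(id), data = repeaters))
msb <- fit[["Mean Sq"]][1]  # between-person mean square
msw <- fit[["Mean Sq"]][2]  # within-person (residual) mean square
k   <- 2                    # recalls per person in this balanced toy design

var_within  <- msw
var_between <- max((msb - msw) / k, 0)  # floor negative estimates at zero

# 4. Shrinkage ratio = SD(between) / SD(observed).
#    Assumption: observed variance of a k-day mean is
#    var_between + var_within / k.
var_observed <- var_between + var_within / k
shrink <- sqrt(var_between / var_observed)

# 5. Shrink each person's mean toward the population mean.
person_mean <- tapply(recalls$z, recalls$id, mean)
pop_mean    <- mean(person_mean)
z_usual     <- pop_mean + (person_mean - pop_mean) * shrink

# 6. Back-transform to original units.
usual <- z_usual^3
usual
```

Because `shrink` lies between 0 and 1 here, each adjusted value sits closer to the population mean than the corresponding observed mean, which narrows the estimated distribution of usual intakes.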
When no repeaters are available, observed means are returned unchanged.
If insufficient replicate information exists, the behaviour depends on repeater_policy.
When the estimated between-person variance is non-identifiable (<= 0), the NRC adjustment is skipped and observed mean intakes are returned with a warning.
References
Institute of Medicine (2003). Dietary Reference Intakes: Applications in Dietary Planning. Washington (DC): National Academies Press. Appendix E. (https://www.ncbi.nlm.nih.gov/books/NBK221370/)
Examples
# Example with Energy and Protein
df <- tibble::tibble(
  id = c(1, 1, 2, 2, 3),
  Energy.kcal_intake = c(1800, 2200, 1500, 1600, 2000),
  Protein.g_intake = c(55, 65, 40, 42, 50)
)
estimate_usual_nutrient_intake(
  recall_data = df,
  id_col = "id",
  nutrient_cols = c("Energy.kcal_intake", "Protein.g_intake"),
  transform = "cuberoot"
)
#> Warning: Very limited replicate information for Energy.kcal_intake (df_resid = 2, R = 2). Skipping adjustment and returning observed means.
#> Warning: Very limited replicate information for Protein.g_intake (df_resid = 2, R = 2). Skipping adjustment and returning observed means.
#> # A tibble: 3 × 3
#>      id Energy.kcal_intake_usual Protein.g_intake_usual
#>   <dbl>                    <dbl>                  <dbl>
#> 1     1                    1993.                   59.9
#> 2     2                    1549.                   41.0
#> 3     3                    2000                    50
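
The unadjusted values in this output differ slightly from the arithmetic means (for example 1993 rather than 2000 for respondent 1). They appear consistent with averaging on the cube-root scale and cubing the result. The base R check below illustrates that reading; it is an independent sketch with hypothetical variable names, not a call into the package.

```r
# Energy recalls per respondent, copied from the example data above.
energy <- list(`1` = c(1800, 2200), `2` = c(1500, 1600), `3` = 2000)

# Mean on the cube-root scale, then back-transformed by cubing.
backtransformed_mean <- sapply(energy, function(x) mean(x^(1 / 3))^3)

# Rounds to 1993, 1549, 2000, in line with the Energy column above.
round(backtransformed_mean)
```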