Compute the entropy of transmission trees
Computes the entropy of the inferred infector for each case in the posterior transmission trees from outbreaker2, quantifying uncertainty in who infected whom.
By default, entropy is normalised to range from 0 (complete certainty) to 1 (maximum uncertainty).
Details
Entropy measures uncertainty in inferred infectors across posterior samples. It is computed as:
$$H(X) = -\sum_i p_i \log p_i$$
where \(p_i\) is the proportion of posterior samples in which infector \(i\) is inferred for the case.
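For illustration only (this is a minimal sketch, not the package's implementation), the raw entropy of a single case can be computed directly from a hypothetical vector of its posterior infectors:

# hypothetical posterior draws of the infector for one case
infectors <- c("2", "2", "3", "2", "3", "2")
p <- table(infectors) / length(infectors)  # p_i: proportion of each inferred infector
H <- -sum(p * log(p))                      # H(X) = -sum_i p_i log(p_i)
H
#> [1] 0.6365142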
If normalise = TRUE, entropy is scaled by its maximum possible value, \(\log K\), where \(K\) is the number of distinct inferred infectors:
$$H^*(X) = \frac{H(X)}{\log K}$$
This ensures values range from 0 to 1, where:
0 indicates complete certainty: the same infector is inferred across all samples.
1 indicates maximum uncertainty: all infectors are equally likely.
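Continuing the sketch above (again, an illustration rather than the package code), the normalised entropy divides by \(\log K\), where \(K\) is the number of distinct inferred infectors, here \(K = 2\):

H_norm <- H / log(length(unique(infectors)))  # divide by log(K)
H_norm
#> [1] 0.9182958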
Examples
# High entropy
out <- data.frame(alpha_1 = sample(c("2", "3"), 100, replace = TRUE),
                  alpha_2 = sample(c("1", "3"), 100, replace = TRUE))
class(out) <- c("outbreaker_chains", class(out))
get_entropy(out)
#> alpha_1 alpha_2
#> 0.9997114 0.9953784
# Low entropy
out <- data.frame(alpha_1 = sample(c("2", "3"), 100, replace = TRUE, prob = c(0.9, 0.1)),
                  alpha_2 = sample(c("1", "3"), 100, replace = TRUE, prob = c(0.9, 0.1)))
class(out) <- c("outbreaker_chains", class(out))
get_entropy(out)
#> alpha_1 alpha_2
#> 0.5842388 0.4999160