
Computes the average distance between each sample and a set of base samples.

Usage

get_avg_distances_to_set(d_obj, base_samples, sample_data = NULL)

Arguments

d_obj

A distance object describing the pairwise distances between all the samples.

base_samples

The set of samples against which every sample is measured, for example given as indices such as c(1, 2).

sample_data

(optional) Extra data that can be added to the data frame containing the average distances. Useful when you want to plot average distances against other covariates.

Value

A data frame with one column containing the average distance between each sample and the base samples. Specifically, for sample i, the average distance is \(\frac{1}{|\text{base_samples}|} \sum_{bs \in \text{base_samples}} d(i, bs)\). The remaining columns contain the sample names and sample_data (if it was provided).
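
The averaging formula can be illustrated with base R alone. The following is a minimal sketch on toy data, not the package's implementation: the vector `x`, the objects `d_toy`, `base_idx`, and `avg_by_hand`, and the column name `sample` are all made up for illustration, and the distance object is assumed to convert cleanly with `as.matrix()`.

```r
# Toy data (hypothetical): four points on a line, so distances are easy to check.
x <- c(a = 0, b = 1, c = 3, d = 7)
d_toy <- dist(x)        # a base R "dist" object with labels a, b, c, d
base_idx <- c(1, 2)     # hypothetical base samples: "a" and "b"

# For each sample i: sum d(i, bs) over the base samples,
# then divide by the number of base samples.
d_mat <- as.matrix(d_toy)
avg_dist <- rowSums(d_mat[, base_idx, drop = FALSE]) / length(base_idx)

# Collect the result in a data frame (the column names here are illustrative).
avg_by_hand <- data.frame(avg_dist = avg_dist, sample = names(avg_dist))
avg_by_hand
```

In this sketch a base sample's zero distance to itself is included in its own average, as the formula implies, so "a" and "b" each average to 0.5.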

Examples

data(small_otutab)
data(small_tree)
mpqr_distances = get_mpq_distances(small_otutab, small_tree, rvec = c(0, .5, 1))
# get the MPQr distances with r = .5
d_obj = mpqr_distances$distances[["0.5"]]
## for each site, get the average of the distances between that site and the base sites (sites 1 and 2)
get_avg_distances_to_set(d_obj, base_samples = c(1,2)) |> head()
#>            avg_dist     site
#> ankarif  0.07115132  ankarif
#> banyong  0.07115132  banyong
#> beza1    0.07115132    beza1
#> beza2    1.84962896    beza2
#> brisefer 0.07115132 brisefer
#> korup    0.07115132    korup
## compare to doing this by hand for site 3:
(d_3_12 = as.matrix(d_obj)[3, c(1,2)])
#>   ankarif   banyong 
#> 0.0000000 0.1423026 
mean(d_3_12)
#> [1] 0.07115132
mean(d_3_12) == get_avg_distances_to_set(d_obj, base_samples = c(1,2))$avg_dist[3]
#> [1] TRUE