The tsvr
package provides an implementation of a
timescale-specific extension and generalization of the variance ratio of
Peterson (1975). The variance ratio is used
commonly in community ecology. The extension implemented in the
tsvr
package is described in detail by Zhao et al. (In
prep). The tsvr
package supports that paper and
provides an implemetation of the tools developed there for anyone to
use. The mathematical formulas for the variance ratio and extensions are
detailed elsewhere (Peterson 1975; Hallett et al. 2014; Zhao et al. In prep). The mathematics
are also summarized here, but the main purpose of this vignette is to
provide a decription of how to use the tsvr
package.
Preparing data
A typical dataset for analysis using tsvr
is an N × T matrix of nonnegative
numeric values where rows correspond to species in a community (so the
number of species is N) and
columns correspond to evenly spaced times during which sampling was
conducted (so the number of times sampling was conducted is T). Matrix entries may be densities,
or percent cover values for plant species within a quadrat, or
biomasses, or other measures of abundance of the species. For
instance:
library(tsvr)
class(JRGdat)
## [1] "data.frame"
## [1] "year" "agoseris.heterophylla"
## [3] "astragalus.gambelianus" "bombycilaena.californica"
## [5] "brodiaea.sp" "bromus.berteroanus"
## [7] "bromus.hordeaceus" "calandrinia.ciliata"
## [9] "calycadenia.multiglandulosa" "castilleja.densiflora"
## [11] "chlorogalum.pomeridianum" "crassula.connata"
## [13] "elymus.multisetus" "epilobium.brachycarpum"
## [15] "hemizonia.congesta" "hesperevax.sparsiflora"
## [17] "lasthenia.californica" "layia.platyglossa"
## [19] "lepidium.nitidum" "lotus.wrangelianus"
## [21] "microseris.douglasii" "nassella.pulchra"
## [23] "plantago.erecta" "poa.secunda"
## [25] "trifolium.albopurpureum" "trifolium.sp"
## [27] "vulpia.microstachys"
d<-t(as.matrix(JRGdat[,2:dim(JRGdat)[2]]))
dim(d)
## [1] 26 28
Here d
is a 26 × 28
matrix containing percent cover measurements for each year from 1983 to 2010
for 26 species which occurred in a 1 m2 plot in the Jasper Ridge
Biological Preserve serpentine grassland site. Species names are in the
row names of JRGdat
, which is an example dataset embedded
in the tsvr
package. Documentation for the dataset can be
viewed via ?JRGdat
. See (Hallett et al. 2014) for details of
the Jasper Ridge ecosystem and of these and other related data.
Standard implementations of Fourier transforms require time series
consisting of measurements taken at evenly spaced times, with no missing
data. The core functions provided in tsvr
make these same
assumptions and throw an error if data are missing. The user is left to
decide on and implement a reasonable way of filling missing data, if
data are missing. We have previously used the simple approach of
replacing missing values in a time series by the median of the
non-missing values in the time series (Sheppard et al. 2016). This approach,
and other related simple procedures (Sheppard et al. 2016), seem unlikely
to artefactually produce significant synchrony, or coherence
relationships with other variables, but rely on the percentage of
missing data being fairly low and may obscure detection of synchrony or
significant coherence relationships if too many data are missing. For
applications which differ meaningfully from the prior work for which the
tools of this package were developed (Zhao et al. In prep), different ways
of filling missing data may be more appropriate.
The timescale-specific variance ratio techniques which are the focus
of this package use Fourier methods to decompose by timescale the
classic variance ratio and related quantities. Detrending and variance
standardization across time series, techniques which are often applied
before doing Fourier analysis, may not be approriate except in cases for
which it makes sense to calculate the classic variance ratio and related
quantities after performing those techniques.
The classic variance ratio and related quantities
Let xi(t)
be a measure of the population of species i at time t, for i = 1, …, N and t = 1, …, T. Define $x_{\text{tot}}(t)=\sum_{i=1}^N x_i(t)$ to be
the total of all species populaton measures. Define CVcom2 to be the square
of the coefficient of variation of xtot(t) over
time, i.e., vart(xtot(t))/(meant(xtot(t)))2,
where vart and
meant represent
variance and mean computed through time. This equals ∑i, jcovt(xi(t), xj(t))/(meant(xtot(t)))2,
where covt is
covariance through time. The abbreviation “com” in CVcom2 stands for
“community” since CVcom2 is the squared
coefficient of variation of the whole-community population. Define CVcom_ip2 to be the
value of CVcom2
that would pertain if the population dynamics of different species were
independent, so that covt(xi(t), xj(t)) = 0
for all i ≠ j. The
abbreviation “ip” stands for “independent populations”. Thus CVcom_ip2 = ∑icovt(xi(t), xi(t))/(meant(xtot(t)))2 = ∑ivart(xi(t))/(meant(xtot(t)))2.
The classic variance ratio is defined as φ = CVcom2/CVcom_ip2,
so that CVcom2 = φCVcom_ip2.
A variance ratio greater than 1
suggests “synchronous” dynamics of the species comprising the community,
so that community variability (CVcom2) is greater than
it would be if populations were independent (CVcom_ip2). A variance
ratio less than 1 suggests
“compensatory” dynamics of the species comprising the community (i.e.,
increases/decreases in some species are partly compensated for by
decreases/increases in others), so that community variability is less
than it would be if populations were independent.
The quantities CVcom2, CVcom_ip2 and φ can be computed using the
vreq_classic
function in the tsvr
package, we
here do so for the dataset d
of the previous section, which
we will continue to use throughout this vignette:
res<-vreq_classic(d)
class(res)
## [1] "vreq_classic" "vreq" "list"
## [1] "com" "comnull" "vr"
## class: vreq_classic
## com: 0.04695127
## comnull: 0.1028208
## vr: 0.4566318
## Object of class vreq_classic:
## CVcom2: 0.0469512705521134
## CVcomip2: 0.102820846939826
## classic vr: 0.456631820778433
all.equal(res$com,res$comnull*res$vr)
## [1] TRUE
The vreq_classic
S3 class, of which
vreq_classic
is the generator function, inherits from the
generic S3 class vreq
(generator function
vreq
) and the list
class, and has the three
slots com
, comnull
, and vr
. These
slots correspond to CVcom2, CVcom_ip2 and φ. Both the vreq
and
vreq_classic
classes have print
and
summary
methods and set_*
and
get_*
methods where *
represents any of the
class slot names. See documentation for vreq
,
vreq_classic
, vreq_methods
,
vreq_classic_methods
, setget_methods
for
details. The “classic” in vreq_classic
references the fact
that this version of the variance ratio is the original version used in
community ecology (Peterson 1975), and is probably still
the most commonly used. Alternative versions have been proposed (Loreau and Mazancourt
2008) - see the next section of this vignette.
The timescale-specific classic variance ratio and related
quantities
Next we describe a timescale-specific extension of the classic
variance ratio, and its related quantities, that uses spectral methods,
a set of standard statistical tools in ecology and other fields. The
power spectrum of the time series xi(t),
here denoted sii(σ)
and defined for timescales σ = T/(T − 1), T/(T − 2), …, T/2, T,
decomposes vart(xi(t))
by timescale in that sii(σ)
is nonnegative, will tend to be larger for timescales on which xi(t)
shows greater variation through time, and ∑σsii(σ) = vart(xi(t)).
Likewise, the cospectrum sij(σ)
decomposes covt(xi(t), xj(t))
by timescale in a similar way, being larger for timescales on which the
two time series predominantly covary.
We define a timescale-specific generalization of CVcom2 as CVcom2(σ) = ∑i, jsij(σ)/(meant(xtot(t)))2.
It is straightforward to see that ∑σCVcom2(σ) = CVcom2,
so CVcom2(σ)
decomposes CVcom2
by timescale. CVcom2(σ)
reveals to what extent variation on each timescale contributes to
community variability. Likewise, CVcom_ip2 = ∑ivart(xi(t))/(meant(xtot(t)))2
can be decomposed by timescale as CVcom_ip2(σ) = ∑isii(σ)/(meant(xtot(t)))2.
Finally, we define a timescale-specific version of the classic variance
ratio as the quotient of these two quantities, i.e., φts(σ) = (∑i, jsij(σ))/(∑isii(σ)),
so that CVcom2(σ) = φts(σ)CVcom_ip2(σ).
The timescale-specific variance ratio quantifies the extent to which
species’ oscillations are synchronous ( > 1) or compensatory ( < 1) on a timescale-by-timescale basis.
For further details, see (Zhao et al. In prep).
The quantities CVcom2(σ),
φts(σ)
and CVcom_ip2(σ)
can be computed using the tsvr
package:
res<-tsvreq_classic(d)
class(res)
## [1] "tsvreq_classic" "tsvreq" "list"
## [1] "ts" "com" "comnull" "tsvr" "wts"
## class: tsvreq_classic
## ts_start: 1.037037
## ts_end: 28
## ts_length: 27
## com_length: 27
## comnull_length: 27
## tsvr_length: 27
## wts_length: 27
## Object of class tsvreq_classic:
## ts, a length 27 numeric vector: 1.04 1.08 1.12 ... 9.33 14 28
## CVcom2, a length 27 numeric vector: 0.00221 0.000986 0.00179 ... 0.00179 0.000986 0.00221
## CVcomip2, a length 27 numeric vector: 0.00927 0.00291 0.018 ... 0.018 0.00291 0.00927
## tsvr, a length 27 numeric vector: 0.238 0.339 0.0995 ... 0.0995 0.339 0.238
## wts, a length 27 numeric vector: 0.0901 0.0283 0.175 ... 0.175 0.0283 0.0901
all.equal(res$com,res$tsvr*res$comnull)
## [1] TRUE
all.equal(sum(res$com),vreq_classic(d)$com)
## [1] TRUE
all.equal(sum(res$comnull),vreq_classic(d)$comnull)
## [1] TRUE
Here, ts
is a vector of timescales (T/(T − 1), T/(T − 2), …, T/2, T),
and the other elements are vectors of the same length containing
timescale-specific information. The element com
contains
CVcom2(σ),
comnull
contains CVcom_ip2(σ),
and tsvr
contains φts(σ).
The tsvreq_classic
S3 class inherits from the generic class
tsvreq
and from the list
class.
The timescale-specific variance ratio, φts(σ),
is not a decomposition of φts,
i.e., summing φts(σ)
across timescales does not yield φ. However, defining w(σ) = ∑isii(σ)/(∑ivart(xi(t))),
one can show ∑σw(σ) = 1,
so the w(σ) are
weights, and ∑σw(σ)φts(σ) = φ.
So φ is a weighted average of
the values of φts(σ)
across timescales. The wts
field of a
tsvreq_classic
object contains w(σ).
## [1] 1
res2<-vreq_classic(d)
all.equal(sum(res$tsvr),res2$vr)
## [1] "Mean relative difference: 0.9794545"
all.equal(sum(res$wts*res$tsvr),res2$vr)
## [1] TRUE
The plot method for the tsvreq_classic
class displays
the various components as functions of timescale:
plot(res,filename="Tsvreq_classic_demo")
## png
## 2
knitr::include_graphics("Tsvreq_classic_demo.pdf")
The plots are symmetric about the middle, because Fourier transforms
of real-valued time series have this property. The middle timescale is
associated with the Nyquist frequency. The gray shading is a reminder of
the symmetry - one should typically interpret the left, un-grayed side
of plots. Both symmetric sides are plotted because sums must be computed
over all displayed timescales to equal the corresponding
frequency-nonspecific analogues. Additionally, non-smoothed Fourier
transforms are used so that sums will exactly equal
frequency-nonspecific analogues. Unsmoothed Fourier spectra and
cospectra are jagged, so peaks of plots should not be given undue
interpretive weight unless smoothing or averaging over timescales (see
below) or significance testing is performed.
The lack of a timescale-specific Loreau-de Mazancourt variance
ratio
It is difficult to envision an analogous timescale-specific version
of the Loreau-de Mazancourt approach. The quantity $\text{CV}_{\text{pop}}^2=\left(\sum_i
\sqrt{\operatorname{var}_t
(x_i(t))}\right)^2/(\operatorname{mean}_t(x_{\text{tot}}(t)))^2$
cannot be decomposed by replacing the variances in the numerator by
power spectra because of the square root.
Aggregating the timescale-specific classic variance ratio and
related quantities to timescale bands
If Ω is a set of
timescales, and defining CVcom2(Ω) = ∑σ ∈ ΩCVcom2(σ),
CVcom_ip2(Ω) = ∑σ ∈ ΩCVcom_ip2(σ),
and $\bar{\varphi}_{ts}(\Omega)=\frac{\sum_{\sigma \in
\Omega} \varphi_{ts}(\sigma) w(\sigma)}{\sum_{\sigma \in \Omega}
w(\sigma)}$, it has been shown (Zhao et al. In prep) that CVcom2(Ω) = φ̄ts(Ω)CVcom_ip2(Ω).
Aggregating over timescales mitigates the jaggedness resulting from
unsmoothed Fourier transforms (see above). The tsvr
package
provides tools for aggregating over any collection of timescales:
res<-tsvreq_classic(d)
aggresLong<-aggts(res,res$ts[res$ts>=4])
aggresShort<-aggts(res,res$ts[res$ts<4])
class(aggresLong)
## [1] "vreq_classic_ag" "vreq" "list"
## [1] "com" "comnull" "vr" "ts"
## Object of class vreq_classic_ag:
## CVcom2: 0.0227087709439967
## CVcomip2: 0.0850891648065253
## vr: 0.266882052440303
## ts: 1.04 1.08 1.12 ... 9.33 14 28
## Object of class vreq_classic_ag:
## CVcom2: 0.0242424996081167
## CVcomip2: 0.0177316821333008
## vr: 1.36718555103062
## ts: 1.4 1.47 1.56 ... 2.8 3.11 3.5
The aggts
function is the generator function for the
vreq_classic_ag
class, which inherits from the
vreq
class and from list
.
Note that the best way to specify the timescales over which to
aggregate is by using conditions such as
aggts(res,res$ts[res$ts<4])
. If you type in timescales,
e.g., aggts(res,c(1.03,1.07,1.12))
, and the timescales do
not match exactly with timescales in res$ts
, they
will be removed. No error or warning will be triggered unless there are
no remaining timescales. The reason for this is, the code removes all
timescales that are not among the canonical Fourier timescales less than
2, the Nyquist timescale. Remaining
timescales are then reflected about the Nyquist timescale to account for
the symmetry of Fourier transforms. This setup makes it possible to
specify, say, aggregation over timescales less than 4 by
aggts(res,res$ts[res$ts<4])
instead of
aggts(res,res$ts[res$ts<4 & res$ts>4/3])
(which
gives the same results if run, but is an inconvenient format with which
to specify timescales). In other words, only timescales greater than or
equal to the Nyquist timescale (2) need be specified in the argument
ts
, and the symmetric timescales on the other side of the
Nyquist timescale are included automatically. See also the documentation
for aggts
.
Acknowledgements
This material is based upon work supported by the National Science
Foundation under grant numbers 1714195 and 1442595, by the James S
McDonnell Foundation, and by a working group grant from the Long Term
Ecological Research Network Communications Office of the National Center
for Ecological Analysis and Synthesis. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of
the authors and do not necessarily reflect the views of these funders.
We thank all users of the package who have reported or will later report
ways in which the package could be improved.
References
Hallett, LM, JS Hsu, EE Cleland, SL Collins, TL Dickson, EC Farrer, LA
Gherardi, et al. 2014. “Biotic Mechanisms of Community Stability
Shift Along a Precipitation Gradient.” Ecology 95:
1693–1700.
Loreau, M, and C de Mazancourt. 2008. “Species Synchrony and Its
Drivers: Neutral and Nonneutral Community Dynamics in Fluctuating
Environments.” American Naturalist 172: E48–66.
Peterson, CH. 1975. “Stability of Species and of Community for the
Benthos of Two Lagoons.” Ecology 56: 958–65.
Sheppard, LW, J Bell, R Harrington, and DC Reuman. 2016. “Changes
in Large-Scale Climate Alter Spatial Synchrony of Aphid Pests.”
Nature Climate Change 6: 610–13.
Zhao, L, S Wang, L Hallett, A Rypel, MCN Castorani, LG Shoemaker, KL
Kottingham, K Suding, and DC Reuman. In prep. “Decomposition of
the Variance Ratio Illuminates Timescale-Specific Population and
Community Variabilty.” In Prep.