EDCA Exam Notes
analysis of spatial distribution at fixed points in time. characterize quantitatively and use this to interpolate points and create a map
need to fill in gaps in space given limited data points
basic geostatistics model equation
Z(s) = µ(s) + ε(s). µ is trend, ε is residual
lowercase vs uppercase Z (geostatistics)
Z is the statistical model, z is the unknown reality
quantifies spatial variation. Half the expected squared difference between the values of the variable of interest at two locations
Z is measurement at location s, h is difference between the two points.
plot of semivariance as a function of distance. Comprised of range, nugget, and sill
Distance up to which there is spatial variation
Variance of the variable of interest.
Limit value of the semivariance when the separation decreases to zero
two assumptions of semivariogram
stationarity and isotropy
semivariance of Z(s) and Z(s+h) only depends on the distance h and not on the locations s and s+h
the semivariance is a function of the length of h, not of its direction
spatial correlation is stronger in one direction than another
nugget-to-sill ratio trends
small = little random variation, large = a lot of random variation
graph that estimates semivariogram from observations
averages semivariogram cloud over lags
experimental semivariogram calculation
How large should the lag size be for an experimental semivariogram?
at least 30 points per lag, at least 10 lags, maximum distance 0.5x diagonal of study area, variable lag size is a sensible option
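A minimal sketch of averaging the semivariogram cloud over lags, assuming 1-D coordinates, equal-width lag bins, and the usual estimator (half the mean squared difference of all pairs in a lag). The function name and the toy data are hypothetical, and the toy data has far too few pairs per lag to satisfy the rule of thumb above.

```python
def experimental_semivariogram(coords, values, lag_size, n_lags):
    """Average the semivariogram cloud over equal-width lag bins (1-D coordinates)."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = abs(coords[i] - coords[j])      # separation distance of the pair
            k = int(h // lag_size)              # index of the lag bin
            if k < n_lags:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2   # one cloud point
                counts[k] += 1
    # Semivariance per lag; None where a lag contains no pairs
    gamma = [s / c if c > 0 else None for s, c in zip(sums, counts)]
    return gamma, counts


coords = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [1.0, 2.0, 4.0, 3.0, 5.0]
gamma, counts = experimental_semivariogram(coords, values, lag_size=1.5, n_lags=2)
```

Returning the pair counts alongside the semivariances makes it easy to check whether each lag holds enough pairs before fitting a function shape.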
last step of structural analysis for creating experimental semivariogram
choose a function shape. spherical, linear, exponential, Gaussian. Estimate parameters of the chosen shape by eye
How many observations are needed for variogram estimation?
No simple answer! 50 minimum, 200 sufficient. Depends on configuration.
geostatistical interpolation. prediction at a location is a linear combination of observations nearby
predict Z(s0) at an unobserved location s0 using observations Z(si), i = 1, ..., n
Ordinary kriging formula
where kriging weight = λi
Preferred distribution of the prediction error Zhat(s0) - Z(s0)
Centered on 0, unbiased, with small prediction errors. Choose λ so that it's as narrow as possible and centered on 0
Computation of kriging weights
minimize the expected squared prediction error
computation of kriging weights formula
Make as small as possible under unbiasedness condition so that Σλi = 1
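A sketch of how the kriging weights could be computed, assuming the standard ordinary kriging system in semivariogram form, where a Lagrange multiplier enforces Σλi = 1. The spherical model, its parameters, the 1-D locations and all function names are illustrative assumptions, not taken from the notes.

```python
def spherical(h, nugget, sill, a_range):
    """Spherical semivariogram model (assumed shape and parameters)."""
    if h == 0:
        return 0.0
    if h >= a_range:
        return sill
    r = h / a_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ok_weights(obs, s0, gamma):
    """Ordinary kriging weights for 1-D observation locations obs and target s0."""
    n = len(obs)
    a = [[gamma(abs(obs[i] - obs[j])) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])              # unbiasedness row: weights sum to 1
    b = [gamma(abs(si - s0)) for si in obs] + [1.0]
    sol = solve(a, b)
    return sol[:n], sol[n]                   # weights, Lagrange multiplier


def gamma(h):
    return spherical(h, nugget=0.1, sill=1.0, a_range=10.0)


weights, mu = ok_weights([0.0, 2.0, 7.0], s0=3.0, gamma=gamma)
```

With these assumed inputs the observation closest to the target gets the largest weight, and the weights sum to one up to rounding.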
a mathematical technique for extracting dominant waves from a time series. extracts the dominant oscillations in a time-series through transformation to the frequency domain
random process; counterpart to deterministic process where X(t) is described by a set of equations
continuous, finite signal
X(tn) or X(n)
discrete, finite signal
time at sample n (s)
sample index 1...N
sampling period (s)
sample interval (Ps/N). Data is spaced at regular, fixed intervals with no data gaps
sample frequency = N/Ps = 1/ΔTs (Hz). Read aloud as X measurements per time unit.
round brackets for spectral analysis
indicate a continuous signal
square brackets for spectral analysis
indicate a discrete signal
underlying principle of spectral analysis. decomposition of a time series signal into N waves with certain frequency and amplitude
phase shift (s)
wave mode (# of waves in period)
wave frequency (1/tm). # of waves per period, Hz
wave time or period. time per wave (s). P/m
sinusoidal functions: frequency vs wavelength
time series (size of waves characterized by frequency) and spatial data (size of waves characterized by wavelength). Knowing the wave velocity allows you to change the spectral domain from wave number to frequency
you need at least 2 samples per wave period to resolve a sine of a certain frequency (if you don't sample fast enough, you will miss oscillations with a frequency higher than the Nyquist frequency). Nyquist frequency = 0.5 * sample frequency
undersampled waves are interpreted as waves with a lower fm (wave frequency)
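A small illustration of undersampling, assuming a sample frequency of 10 Hz (so a Nyquist frequency of 5 Hz) and a 9 Hz cosine. All values are made up for the example. At the sample times, the 9 Hz wave cannot be told apart from a 1 Hz wave.

```python
import math

fs = 10.0                   # sample frequency (Hz), assumed
f_nyquist = 0.5 * fs        # highest frequency that can be resolved
f_true = 9.0                # wave frequency above the Nyquist frequency
f_alias = fs - f_true       # lower frequency the samples appear to follow

t = [n / fs for n in range(20)]
sampled = [math.cos(2 * math.pi * f_true * tn) for tn in t]
apparent = [math.cos(2 * math.pi * f_alias * tn) for tn in t]

# Largest difference between the two sampled series (zero up to rounding)
max_diff = max(abs(a - b) for a, b in zip(sampled, apparent))
```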
Fourier series of a signal
sum of cosine and sine functions where the fourier coefficients Am and Bm define the weights of the sines and cosines at wave-mode m
Fourier Transform of a signal
technique to determine fourier coefficients Am and Bm. transforms X[tn] from the time domain (n or tn) to the frequency domain (m or fm)
fourier spectral analysis can help to identify frequency of structures (waves) and their average amplitude.
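A sketch of how the Fourier coefficients Am and Bm could be obtained for one wave mode, assuming a regularly sampled signal without gaps and the convention X[n] = Σ (Am cos(2πmn/N) + Bm sin(2πmn/N)). The function name and the test signal are hypothetical.

```python
import math


def fourier_coefficients(x, m):
    """Return (A_m, B_m) for wave mode m of a regularly sampled signal x."""
    n_samples = len(x)
    a = sum(xn * math.cos(2 * math.pi * m * n / n_samples) for n, xn in enumerate(x))
    b = sum(xn * math.sin(2 * math.pi * m * n / n_samples) for n, xn in enumerate(x))
    scale = 2.0 / n_samples          # this scaling holds for 0 < m < N/2
    return scale * a, scale * b


N = 64
signal = [3.0 * math.cos(2 * math.pi * 4 * n / N)
          + 1.5 * math.sin(2 * math.pi * 10 * n / N)
          for n in range(N)]

A4, B4 = fourier_coefficients(signal, 4)
A10, B10 = fourier_coefficients(signal, 10)
amplitude4 = math.hypot(A4, B4)      # amplitude of the wave at mode 4
```

The coefficients recover the weights used to build the signal, which is the sense in which the transform moves X[tn] from the time domain to the frequency domain.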
a spectral method in which temporal information contained in a spectrum is maintained
Mexican hat, morlet, meyer; shape and number of oscillations determine good time resolution and frequency resolution.
sine vs wavelet
sine: fixed shape, infinite function, considers global time series, provides an average spectrum. Wavelet: wavelet shape, finite function, local time series, provides a spectrum for each time step
estimate the size of errors in our data and evaluate how they propagate through models
how do we estimate error bars in x and y?
how do the errors in x and y propagate in the model for Z
something wrong with the observer, observing conditions or instrumentation. Tends to give a bias (one-sided) and makes the measurement less accurate.
examples of systematic errors
uncalibrated sensor, blunder. high precision, low accuracy
results from the stochastic nature of the process. adds noise (two-sided errors), makes the measurement less precise, can be quantified.
examples of random errors
fluctuations in the process or due to instrument limitations. high accuracy, low precision
errors due to fluctuation in the process. can be described by statistics as long as the range of possible values can be estimated.
expresses the precision of a number
univariate functions (numerical 1)
±εy = |f(Xhat ± εx) - f(Xhat)|. estimate of εy is exact, but not symmetric.
univariate functions (numerical 2)
εy = 0.5|f(xhat + εx) - f(xhat - εx)|. Symmetric and can be easily extended to multivariate problems.
univariate functions (analytical)
εy = |f'(Xhat)|εx. Analytical expression can easily be extended to multivariate problems
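The three univariate estimates can be compared side by side. This is a minimal sketch, assuming f(X) = e^X with Xhat = 2 and εx = 0.1. The function names and numbers are chosen only for illustration.

```python
import math


def err_one_sided(f, x, ex):
    """Numerical 1: separate upper and lower error estimates (not symmetric)."""
    return abs(f(x + ex) - f(x)), abs(f(x - ex) - f(x))


def err_central(f, x, ex):
    """Numerical 2: symmetric estimate from a central difference."""
    return 0.5 * abs(f(x + ex) - f(x - ex))


def err_analytical(dfdx, x, ex):
    """Analytical: first-order estimate using the derivative at x."""
    return abs(dfdx(x)) * ex


x_hat, eps_x = 2.0, 0.1
upper, lower = err_one_sided(math.exp, x_hat, eps_x)
central = err_central(math.exp, x_hat, eps_x)
analytic = err_analytical(math.exp, x_hat, eps_x)   # derivative of e^X is e^X
```

For this convex function the upper error is larger than the lower one, the central estimate is their average, and the analytical estimate is close to the central one because εx is small.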
(∂f(X)g(X))/∂X = g(X)f'(X) + f(X)g'(X)
(∂f[g(X)])/∂X = f'[g(X)]g'(X)
(∂f(X,Y))/∂X and (∂f(X, Y))/∂Y
f'(X) of e^X
f'(X) of a^X
f'(X) of ln(X)
f'(X) of log_b(X) (logarithm with base b)
uncertainty in X and Y represents an area in the X-Y domain. Error bars enclose an area which marks the uncertainty, and initially we take them independently of each other
bivariate functions (analytical)
εZx = (∂f/∂x)εx, εZy = (∂f/∂y)εy
εZ = sqrt[(εZx)^2 + (εZy)^2]
bivariate functions (numerical)
εZ = sqrt[(0.5(f(X+εx, Y) - f(X-εx, Y)))^2 + (0.5(f(X, Y+εy) - f(X, Y-εy)))^2]
Multivariate function (numerical)
εZ = sqrt[(0.5(f(X1+εx1, X2, ..., Xn) - f(X1-εx1, X2, ..., Xn)))^2 + (0.5(f(X1, X2+εx2, ..., Xn) - f(X1, X2-εx2, ..., Xn)))^2 + ... + (0.5(f(X1, X2, ..., Xn+εxn) - f(X1, X2, ..., Xn-εxn)))^2]
Multivariate function (analytical)
εy = sqrt[((∂f/∂x1)εx1)^2 + ((∂f/∂x2)εx2)^2 + ... + ((∂f/∂xn)εxn)^2]
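A sketch of the numerical multivariate rule, assuming independent errors and a made-up model Z = X * Y. The helper name and the input values are hypothetical. Each input is perturbed in turn and the contributions are added in quadrature.

```python
import math


def propagate_numerical(f, x, eps):
    """Combine the central-difference contribution of each input in quadrature."""
    total = 0.0
    for i, e in enumerate(eps):
        x_plus = list(x)
        x_plus[i] += e                  # perturb only input i upwards
        x_minus = list(x)
        x_minus[i] -= e                 # perturb only input i downwards
        total += (0.5 * (f(*x_plus) - f(*x_minus))) ** 2
    return math.sqrt(total)


def product(x, y):
    return x * y


eps_z = propagate_numerical(product, [3.0, 4.0], [0.1, 0.2])

# Analytical counterpart for Z = X * Y: dZ/dX = Y and dZ/dY = X
eps_z_analytic = math.sqrt((4.0 * 0.1) ** 2 + (3.0 * 0.2) ** 2)
```

Because this model is linear in each input separately, the numerical and analytical results agree. For strongly non-linear models they would differ slightly.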
solution in case of ordinary kriging
which observation gets the largest weight if all the points are equidistant?
the one with full information and no neighbours.
ordinary kriging variance
quantifies kriging interpolation error.
the sum of the ordinary kriging weights can be more or less than one (T/F)
F; (the weights must always sum to one)
checks the validity of the assumption. the points are visited one by one, each time applying a kriging interpolation to the point. predicted value can be compared with the true value at all points
the trend is no longer constant but a function of explanatory variables
steps for regression kriging algorithm
1. select explanatory variables and fit regression model, 2. compute residuals (by subtracting fitted trend from observations) at observation locations and compute a semivariogram from them, 3. apply the regression model to all unobserved locations, 4. krige the residuals, 5. add up the results of steps 3 and 4
Since the trend explains part of the spatial variation, the RK residual semivariogram must lie below the OK semivariogram (T/F)
F; (can be above at close distance)
spatial stochastic simulation
do not compute a prediction but generate a possible reality instead by simulating from the probability distribution using a random number generator.
steps for sequential Gaussian simulation
1. visit a location that was not measured, 2. krige to the location using the available data, 3. draw a value from the probability distribution using a random number generator and assign this value to the location, 4. add the simulated value to the data set, and move to another location, 5. repeat the procedure until there are no locations left
Why is spatial stochastic simulation useful?
more realistic spatial pattern, multiple simulated realities can communicate uncertainty in the map to users
SSS makes a better prediction than kriging (T/F)
F (kriging makes a better prediction than SSS)
kriging produces a smoothed representation of reality (T/F)
T (kriging produces a smoothed representation of reality)
Spatial variability cannot be quantified (T/F)
F (can be quantified with a semivariogram)
The semivariogram can be used in a ___ interpolation to create maps from observations at point locations
given the assumptions of the geostatistical model, the kriged map is optimal in that it is ___ and ___ the expected squared interpolation error
Ordinary kriging can be extended to ___ kriging by including a non-constant ___ that is often taken as a function of spatially exhaustive environmental covariates
Kriging smoothes reality while ___ ___ ___ does not. This technique generates possible realities that are useful in a Monte Carlo uncertainty propagation analysis
spatial stochastic simulation
the total set of elements under examination in a particular study
subset from the population
most basic statistic (count them)
measures center. 1/N * SUM xi
measures spread. 1/(N-1) * SUM (xi-mean)^2
measures spread. sqrt[1/(N-1) * SUM (xi-mean)^2]
shows statistical representation of data. middle line = median (center statistic, does not equal mean). box = 25% and 75% quantiles. Width of the box = interquartile range = spread measure. Outside of whiskers = outliers.
the probability of non-exceedance. data ordered from small to large. quantiles show the value below which a given fraction of the data falls.
plot of the number of data per bin. The peak is at the mode, showing where the largest number of values (e.g. individual grid cells) is found. better to plot it as a density, so that the total area = 1.
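The mean, variance and standard deviation formulas can be written out directly. This sketch uses made-up data and cross-checks the variance against Python's `statistics.variance`, which also divides by N-1.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]    # made-up sample
n = len(data)

mean = sum(data) / n                                        # center
variance = sum((xi - mean) ** 2 for xi in data) / (n - 1)   # spread
sd = math.sqrt(variance)                                    # spread, same units as data

# Cross-check against the standard library
variance_check = statistics.variance(data)
```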
general probability density
will definitely find a value between smallest and largest
standard normal (gaussian) density
mean = 0, sd = 1. Data without a skew and symmetrical around the mean. 68% of values within 1 SD, 95% within 2 SD
gaussian captures skewness and sharpness (T/F)
F; (gaussian does not capture skewness or sharpness)
more than one variable collected, e.g. a, b, c. can apply all univariate statistics to the data.
pairwise statistic for 2 variables (pair spread). COV(A,B) = E[(A-meanA)(B-meanB)]
measures linear dependency. cov/(SDx.SDy)
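A short sketch of covariance and correlation, assuming the sample version with 1/(N-1) normalisation and made-up data. It illustrates that correlation only measures linear dependency: an exactly linear pair gives 1, while a monotone but curved pair gives less than 1.

```python
import math


def covariance(a, b):
    """Sample covariance with 1/(N-1) normalisation (an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def correlation(a, b):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]      # exactly linear in a
c = [1.0, 4.0, 9.0, 16.0, 25.0]     # monotone but not linear in a

r_ab = correlation(a, b)
r_ac = correlation(a, c)
```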
ordered categories. variables have natural, ordered categories and the distances between the categories is not known
simple linear regression
one variable (X is the explanatory variable) should be easily accessible, the other variable (Y is dependent) should be more difficult to access
explanatory, easily accessible
dependent, to be predicted from X
purely linear regression model
yhat = bx (b=slope)
general linear regression model
yhat = a + bx (a=offset, b=slope)
ei = yi - yihat. Prediction is never perfect so they always exist.
Sum of Squares (SS)
measure of badness. as it increases, the fit is worse. Counts the distance between the data points and predicted values, squares, and adds.
regression vs SS
regression uses calculations to find best parameters without counting all SS
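A minimal sketch of the general linear model yhat = a + bx, assuming the closed-form least-squares solution and made-up data. The function names are hypothetical. The closed form finds the best parameters directly, and the sum of squares confirms that a different slope fits worse.

```python
def fit_line(x, y):
    """Least-squares offset a and slope b for yhat = a + b x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_of_squares(x, y, a, b):
    """Sum of squared residuals yi - yhat_i."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

a, b = fit_line(x, y)
ss_best = sum_of_squares(x, y, a, b)
ss_other = sum_of_squares(x, y, a, b + 0.1)    # any other slope gives a larger SS
```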
we will never know the TRUE values. by regression we can get an approximation of the true values
geometrical vs probabilistic
geometrical: fit model line through, calculate y from x, criterion is SS. probabilistic: fit model line through, predict y from x, residuals treated only probabilistically
true + noise
assumption on errors
independent, gaussian distribution, all with mean 0, and all with the same standard deviation
uncertainty on y hat and uncertainty due to residuals. enclose the area that you expect to enclose x% of future data points. (includes the uncertainty in the true position of the curve and accounts for scatter of data around the curve. sample more data, uncertainty smaller)
band of yhat(x) around true line. (variation on a-hat and b-hat compared to true line. enclose the area that you can be x% sure contains the true curve. visual sense of how well your data defines best fit curve. more noise = larger uncertainty on y hat = wider confidence band)
number of variables less than number of points.
adding other information than data to SS (e.g. overfitting)
regression with indicator variables
data for yields for ten fields, 3 soil types (assign categories numbers)
What is the mean yield for sand? clay? loam? intercept = 9.9, soilclay = 1.6, soilloam = 4.4
sand = 9.9, clay = 9.9+ 1.6 = 11.5, loam = 9.9+4.4 = 14.3
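The arithmetic in this card can be reproduced with explicit indicator variables. This sketch assumes sand is the reference category, which is why it has no coefficient of its own. The coefficient values are those given in the card, and the function name is hypothetical.

```python
coef = {"intercept": 9.9, "soilclay": 1.6, "soilloam": 4.4}   # values from the card


def predicted_mean(soil):
    """Mean yield for a soil type; sand is the reference category."""
    indicator_clay = 1 if soil == "clay" else 0
    indicator_loam = 1 if soil == "loam" else 0
    return (coef["intercept"]
            + coef["soilclay"] * indicator_clay
            + coef["soilloam"] * indicator_loam)


# Rounded to one decimal to hide floating-point noise
means = {soil: round(predicted_mean(soil), 1) for soil in ("sand", "clay", "loam")}
```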
when there are 2 or more factors, the explanatory power always comes from the factors separately (T/F)
F; (when there are 2 or more factors, there is always the possibility that explanatory power does not come from the factors added but from their interaction)
univariate time series
every time corresponds to one value, e.g. river discharges
multivariate time series
more than one value for each time
known in advance, as they are tied to natural cycles (e.g. lunar cycle or evaporation)
residuals (time series)
time series - deterministic component
aggregation (time series)
series at a coarser time resolution. trend = deterministic (non-periodic) component
Time series classification. Number, values, source, behaviour, time, availability
Number: univariate/multivariate; values: discrete/continuous/mixed; source: measured/modelled/residuals; behaviour: periodic/trends/regular; time: equidistant/irregular; availability: full/missing values
what can you do with time series?
description (differences in behaviour), prediction (one step ahead predictions), simulate (generate artificial time series with the same properties)
deterministic point of view
can predict with complete certainty if the information is precise
chaos point of view
minor uncertainties can give rise to totally different model
answer to chaos. can translate into probability and statistics
stochastic process vs random process
stochastic: X(0), X(1),..., X(n) random: x(0), x(1),...,x(n)
calculation of probability which assumes statistical stationarity (time shifts do not change statistics). Time averages equal ensemble averages.
can investigate data without their order
method for transforming non-gaussian into gaussian
double square root
mean (time series)
level around which series fluctuates
standard deviation (time series)
measure of size of fluctuations
dependency in time
dependency given by these conditional densities are essential for a good statistical description of a time series, especially for regularity. can discover dependencies.
If P(x,y) = P(x)P(y), then one says that the consecutive days are independent (T/F)
T. (the previous day does not change the probability on dry and wet)
correlation analysis (time series)
measures only linear dependency; works best for gaussian when all dependency is linear
lag 1 correlation analysis
dependency between 2 consecutive values X(n) and X(n+1) (correlation captures regularity, wildness, dependency in time)
negative lag 1 correlation
extremely oscillating events. high values for x(n) followed by low values for x(n+1)
graph obtained by plotting the lags versus corresponding autocorrelation. summarizes all linear dependencies within a time series.
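A sketch of how the values behind such a graph could be computed, assuming one common estimator of the lag-k autocorrelation and a made-up, strongly oscillating series. The function name is hypothetical. High values followed by low values give a negative lag 1 correlation.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation at a given lag (one common estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((xi - mean) ** 2 for xi in x)
    num = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return num / denom


oscillating = [1.0, -1.0] * 10       # high values followed by low values

r0 = autocorrelation(oscillating, 0)
r1 = autocorrelation(oscillating, 1)

# Values to plot against the lags 0..4
autocorrelogram = [autocorrelation(oscillating, k) for k in range(5)]
```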
complete statistical description of a discrete valued time series
complete statistical description of a continuous valued time series
mean + sd + autocorrelogram
simulation of time series
generate artificial time series that have the same statistical characteristics as an original given series
continuous valued time series
infinite and cannot be counted (e.g. age)
discrete time series
can be counted. finite (e.g. age in years)
how do you simulate without dependency?
random number generator
statistical theory investigating discrete-valued time series
series with no time structure at all (autocorrelation is zero). Simulation without dependency.
model for time series with continuous values. focuses on reconstructing the right mean, standard deviation, and autocorrelation (idea 1: construct from white noise, idea 2: bring in dependency by using finite past, idea 3: just take linear function)
time series built on linear regression (Y = a + bX + ce; larger C = larger noise)
series constructed by combining white noise terms to generate dependencies (X[n] = β1 ε[n-1] + β0 ε[n]; even if ε is white noise and ε(n-1) and ε(n) are independent, x(n) and x(n-1) are dependent as they share the same term ε(n-1))
how do you simulate arma?
simulate ε's, choose starting values, just use recursive formula
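The three steps can be sketched for a small ARMA model. The model form X[n] = θ + α1 X[n-1] + β0 ε[n] + β1 ε[n-1], the parameter values, the seed and the function name are all assumptions made for illustration. The notes do not fix a particular form.

```python
import random


def simulate_arma11(n, theta, alpha1, beta0, beta1, seed=0):
    """Simulate X[n] = theta + alpha1*X[n-1] + beta0*eps[n] + beta1*eps[n-1]."""
    rng = random.Random(seed)
    # Step 1: simulate the white noise terms (one extra for eps[0])
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    # Step 2: choose a starting value, here the mean of the assumed process
    x_prev = theta / (1.0 - alpha1)
    # Step 3: apply the recursive formula
    series = []
    for t in range(1, n + 1):
        x_t = theta + alpha1 * x_prev + beta0 * eps[t] + beta1 * eps[t - 1]
        series.append(x_t)
        x_prev = x_t
    return series


series = simulate_arma11(n=500, theta=1.0, alpha1=0.6, beta0=1.0, beta1=0.3, seed=42)
sample_mean = sum(series) / len(series)
```

Fixing the seed makes the artificial series reproducible. Changing it gives another series with the same statistical characteristics.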
the goal of simulating ARMA is to generate series with the same statistics as the original series (T/F)
T; (the goal of simulating ARMA is to generate series with the same statistics as the original series)
what do you need to do to fit ARMA to data?
find θ, α1, ..., β0,... such that ARMA model corresponds best to data
prediction of time series is the final goal of many analyses (T/F)
T; prediction of time series is the final goal of many analyses
θ in ARMA
α in ARMA
autocorrelation (higher = higher correlation)
β in ARMA
calculation of the prediction error now involves matrix multiplications
a technique to combine a time series prediction and a new measurement that comes in
sum of kriging weights
must equal 1