Caret:Documentation:Statistics
From Van Essen Lab
WARNING
THIS DOCUMENT IS IN DEVELOPMENT AND DESCRIBES FUTURE VERSIONS OF CARET
Descriptive Statistics
Descriptive statistics summarize the data with measures such as the mean (average), median (middle value), mode (most common value), standard deviation, and variance. When computing the standard deviation, one must know whether the data values represent the entire population, in which case division is by N (the number of items), or a sample of the population, in which case division is by N − 1.
Population Descriptive Statistics
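- Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_{i}$
- Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (X_{i} - \mu)^{2}}{N}}$ OR $\sigma = \sqrt{\frac{\sum X_{i}^{2} - \frac{(\sum X_{i})^{2}}{N}}{N}}$
- Population Variance = $\sigma^{2}$
- Standard Deviation of the Mean: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$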
Sample Descriptive Statistics
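- Sample Mean: $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_{i}$
- Sample Standard Deviation: $S = \sqrt{\frac{\sum (X_{i} - \bar{X})^{2}}{N - 1}}$ OR $S = \sqrt{\frac{\sum X_{i}^{2} - \frac{(\sum X_{i})^{2}}{N}}{N - 1}}$
- Sample Variance = $S^{2}$
- Standard Error of the Mean: $S_{\bar{X}} = \frac{S}{\sqrt{N}}$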
Miscellaneous Descriptive Statistics
- Z-Score: $z = \frac{X - \mu}{\sigma}$
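The difference between dividing by N and by N − 1 is easiest to see on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the function names `describe` and `z_score` and the sample values are illustrative assumptions and are not part of caret_stats.

```python
import math
import statistics


def describe(values):
    """Return population and sample descriptive statistics for a list of numbers."""
    n = len(values)
    sample_sd = statistics.stdev(values)  # divides by N - 1
    return {
        "mean": statistics.mean(values),
        "population_variance": statistics.pvariance(values),  # divides by N
        "population_sd": statistics.pstdev(values),           # divides by N
        "sample_variance": statistics.variance(values),       # divides by N - 1
        "sample_sd": sample_sd,
        "standard_error": sample_sd / math.sqrt(n),
    }


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = describe(values)
for name, value in summary.items():
    print(name, value)
print("z-score of 9.0:", z_score(9.0, summary["mean"], summary["population_sd"]))
```

For these eight values the population variance is 4 while the sample variance is 32/7, so the sample standard deviation is always the larger of the two.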
Inferential Statistic Tests
Parametric Inferential Tests
For parametric tests, the data are assumed to follow a specific probability distribution, typically the normal (Gaussian) distribution.
ANOVA (Analysis of Variance), One Way
A one-way ANOVA determines if the mean values at each node for two or more groups of subjects are statistically different. The groups being compared are allowed to have a different number of subjects.
K = Number of Groups
N = Total Number of Subjects
N_{i} = Number of Subjects in Group "i"
df_{Total} = N − 1
df_{Treatment} = K − 1
X_{ij} = Measurement for subject "j" in group "i"
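Mean of group i, $\bar{X}_{i} = \frac{1}{N_{i}}\sum_{j=1}^{N_{i}} X_{ij}$
Grand Mean, $\bar{X} = \frac{1}{N}\sum_{i=1}^{K}\sum_{j=1}^{N_{i}} X_{ij}$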
SS_{Total} = SS_{Within} + SS_{Treatment}
If the ANOVA is run with two groups of data, the F-statistic is equivalent to the square of the T-Statistic produced by a Two-Sample T-Test.
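The one-way ANOVA at a single node can be sketched as follows. This is an illustration under stated assumptions, not Caret's implementation: the sums of squares and the F ratio (mean square for treatment divided by mean square within) are the standard textbook definitions, df for the within term is taken as N − K, and the function names are hypothetical. The last lines check the statement above that, with two groups, F equals the square of the pooled-variance two-sample T-Statistic.

```python
import math


def one_way_anova(groups):
    """Return (F, df_treatment, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group (treatment) sum of squares, weighted by group size.
    ss_treatment = sum(len(g) * (m - grand_mean) ** 2
                       for g, m in zip(groups, group_means))
    # Within-group sum of squares around each group's own mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_treatment = k - 1
    df_within = n_total - k
    f = (ss_treatment / df_treatment) / (ss_within / df_within)
    return f, df_treatment, df_within


def pooled_t(x1, x2):
    """Two-sample T-Statistic with pooled variance (used only for the check)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss = sum((x - m1) ** 2 for x in x1) + sum((x - m2) ** 2 for x in x2)
    pooled_variance = ss / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_variance * (1 / n1 + 1 / n2))


group_a = [1.0, 2.0, 3.0, 4.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
f_value, df_treatment, df_within = one_way_anova([group_a, group_b])
t_value = pooled_t(group_a, group_b)
print(f_value, t_value ** 2)  # the two values agree up to rounding
```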
T-Test, One-Sample (Single Sample)
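A one-sample T-Test determines if the mean value at each node is statistically different from a specified value, often zero.
$t = \frac{\bar{X} - \mu_{0}}{S / \sqrt{N}}$, where $\mu_{0}$ is the specified value, $\bar{X}$ is the sample mean, and S is the sample standard deviation.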
df = N − 1
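A minimal sketch of the one-sample T-Test at a single node, assuming the usual definition (sample mean minus the specified value, divided by the standard error of the mean). The function name `one_sample_t` and the data are illustrative only.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return (t, df) for a test of the mean of values against mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)  # S / sqrt(N)
    t = (statistics.mean(values) - mu0) / standard_error
    return t, n - 1


t_one, df_one = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0], mu0=0.0)
print(t_one, df_one)
```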
T-Test, Paired (Dependent Means)
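A paired T-Test determines if the mean at each node is statistically different for two measurements (X and Y) on one group of subjects.
$t = \frac{\bar{D}}{S_{D} / \sqrt{N}}$, where $D_{j} = X_{j} - Y_{j}$ is the difference for subject j, $\bar{D}$ is the mean of the differences, and $S_{D}$ is the sample standard deviation of the differences.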
df = N − 1
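A paired T-Test reduces to a one-sample test on the per-subject differences. The sketch below assumes that standard formulation; `paired_t` and the measurements are hypothetical.

```python
import math
import statistics


def paired_t(x, y):
    """Return (t, df) for paired measurements x[j] and y[j] on the same subjects."""
    if len(x) != len(y):
        raise ValueError("paired data must have one X and one Y per subject")
    differences = [xj - yj for xj, yj in zip(x, y)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


t_paired, df_paired = paired_t([5.0, 7.0, 9.0, 11.0], [4.0, 5.0, 6.0, 7.0])
print(t_paired, df_paired)
```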
T-Test, Two-Sample (Independent Means)
A two-sample T-Test determines if the means at each node for two groups of subjects are statistically different. The groups being compared are allowed to have a different number of subjects.
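Equal (Pooled) Variances
$S_{p}^{2} = \frac{(N_{1} - 1) S_{1}^{2} + (N_{2} - 1) S_{2}^{2}}{N_{1} + N_{2} - 2}$
$t = \frac{\bar{X}_{1} - \bar{X}_{2}}{\sqrt{S_{p}^{2} \left( \frac{1}{N_{1}} + \frac{1}{N_{2}} \right)}}$
df = N_{1} + N_{2} − 2
Unequal (Unpooled) Variances
$t = \frac{\bar{X}_{1} - \bar{X}_{2}}{\sqrt{\frac{S_{1}^{2}}{N_{1}} + \frac{S_{2}^{2}}{N_{2}}}}$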
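Both variants of the two-sample T-Test can be sketched in one function. The pooled branch follows the textbook definition with df = N_{1} + N_{2} − 2. For the unpooled branch, the degrees of freedom use the Welch-Satterthwaite approximation, which is an assumption made here for illustration; this page does not state how caret_stats computes them. The function name `two_sample_t` is hypothetical.

```python
import math
import statistics


def two_sample_t(x1, x2, pooled=True):
    """Return (t, df) for two independent groups of values."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = statistics.mean(x1), statistics.mean(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)  # sample variances
    if pooled:
        pooled_variance = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        standard_error = math.sqrt(v1 / n1 + v2 / n2)
        # ASSUMPTION: Welch-Satterthwaite degrees of freedom (not from this page).
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return (m1 - m2) / standard_error, df


group_1 = [1.0, 2.0, 3.0, 4.0]
group_2 = [2.0, 4.0, 6.0, 8.0, 10.0]
print(two_sample_t(group_1, group_2, pooled=True))
print(two_sample_t(group_1, group_2, pooled=False))
```

When the two groups have the same number of subjects, both variants produce the same T-Statistic and differ only in their degrees of freedom.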
Non-Parametric (Distribution Free) Inferential Statistic Tests
For non-parametric tests, no assumptions are made about the distribution of the data.
caret_stats
caret_stats is a command-line program that performs statistical operations on GIFTI surface data files. The first parameter indicates the operation to perform. Run the command with only the operation to display help information.
The program is written in Java and requires the Java SE Development Kit (JDK) for optimal execution. If you are using a Mac, Java is already installed and you can skip this step. If you are running Linux or Windows, download the JDK from http://java.sun.com/javase/downloads/index.jsp and install it. Then set the "path" environment variable to include the Java installation's "bin" directory so that "java" can be run from the command line.
Note: Do not use the Java Runtime Environment. It does not support Java's "-server" option which reduces the runtime of caret_stats by fifty percent. If you get the error message "No Server JVM" you are using JRE, not JDK.
After Java is installed, download the caret6 distribution. Install it in the desired location, such as "Program Files" on Windows, "/Applications" on a Mac, or "/usr/local" on Linux. When the distribution is unzipped, it will create the subdirectory "caret6". The caret6 directory contains several directories whose names begin with "bin". You must update your PATH environment variable to point to the appropriate "bin" directory so that "caret_stats" can be run from the command line. In addition, Windows users will need to set the environment variable CARET6_HOME to the full path of the caret6 directory (e.g., C:\caret6).
If you have a problem see your System Administrator, and, most importantly, remember that John Harwell is NOT your System Administrator.
Descriptive Statistical Operations
- -descriptive Mean, standard deviation, etc.
Inferential Statistical Operations
The inferential statistical operations take the input files, perform a statistical test at each node, and create a new file containing one or more statistical measurements (F, T, Z, etc.) at each node.
Performing Inferential Statistical Tests in Caret
Inferential statistical tests in Caret are performed on metric or surface shape files. All of the data (metric or shape files) must be on a co-registered surface so that all data files have the same number of nodes and each node number i is "in register" across subjects (i.e., all subjects' surfaces have undergone surface-based registration using Caret, FreeSurfer, CIVET, or other software).
The goal is to find clusters (regions) that are statistically different between the groups of input data. That is, clusters in which one can reject the null hypothesis, which states that the metric/shape values at each node are essentially the same across the groups.
The steps in Caret are:
- Run the input files through an inferential statistical test to produce the statistic file and the randomized statistic file.
- Perform a significance test to assign P-Values to the statistic file.
Each of the inferential tests in Caret produces two files. The statistic file contains the results of the statistical test performed on the input data. The randomized statistic file contains columns with the same statistical test performed on randomly assigned groups of the input data. This randomized file is used during significance testing.
One Sample T-Test
-inferential-t-test-one-sample
Paired T-Test
-inferential-t-test-paired
Two-Sample T-Test
-inferential-t-test-two-sample Two sample T-Test with or without pooled variance.
Interhemispheric Clusters
-inferential-interhemispheric
The interhemispheric clusters test is used to determine asymmetry (and symmetry) between the left and right hemispheres of two groups of subjects. All subjects' left and right hemispheres must be co-registered to an atlas, typically the PALS atlas.
Inputs:
- AL is group A, left hemispheres.
- AR is group A, right hemispheres.
- BL is group B, left hemispheres.
- BR is group B, right hemispheres.
- ITER_LEFT_RIGHT is the number of iterations for T-Statistics of random combinations of left or right subjects.
- ITERATIONS is the number of iterations for the randomized T-Statistic file.
Algorithm:
- Create TL, a T-Statistic metric file comparing the left hemispheres of the two groups, TL = T-Statistic(AL, BL).
- Create TR, a T-Statistic metric file comparing the right hemispheres of the two groups, TR = T-Statistic(AR, BR).
- Create TP, a metric file containing the product of the left and right T-Statistic, TP = TL * TR.
- Create RANDTL, a metric file containing T-Statistics for ITER_LEFT_RIGHT randomized combinations of the left hemispheres from both groups, RANDTL = T-Statistic(RandomCombinations(AL, BL)).
- Create RANDTR, a metric file containing T-Statistics for ITER_LEFT_RIGHT randomized combinations of right hemispheres from both groups RANDTR = T-Statistic(RandomCombinations(AR,BR)).
- Create RANDTP, a metric file containing ITERATIONS random combinations of the product of one column from each of the left and right T-Statistic randomized files, RANDTP = RandomColumn(RANDTL) * RandomColumn(RANDTR).
Output:
- TP is the statistic file for input to the significance testing command.
- RANDTP is the randomized statistic file for input to the significance testing command.
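The interhemispheric algorithm above can be sketched in a few functions. This is a minimal illustration, not Caret's code: each group is represented as a list of subjects, each subject as a list of per-node values, and the per-node T-Statistic is assumed to be the pooled-variance two-sample form. The names `t_map`, `random_regroup`, `interhemispheric`, and `make_group` are hypothetical, and the input data are synthetic.

```python
import math
import random
import statistics


def t_map(group_a, group_b):
    """Pooled-variance two-sample T-Statistic at each node (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    t_values = []
    for node in range(len(group_a[0])):
        a = [subject[node] for subject in group_a]
        b = [subject[node] for subject in group_b]
        pooled = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
        t_values.append((statistics.mean(a) - statistics.mean(b))
                        / math.sqrt(pooled * (1 / n_a + 1 / n_b)))
    return t_values


def random_regroup(group_a, group_b, rng):
    """Pool all subjects and split them into two new groups of the original sizes."""
    pool = group_a + group_b  # new list, so the inputs are not modified
    rng.shuffle(pool)
    return pool[:len(group_a)], pool[len(group_a):]


def interhemispheric(al, ar, bl, br, iter_left_right, iterations, seed=0):
    """Return (TP, RANDTP) as lists of per-node values."""
    rng = random.Random(seed)
    tl = t_map(al, bl)
    tr = t_map(ar, br)
    tp = [left * right for left, right in zip(tl, tr)]
    rand_tl = [t_map(*random_regroup(al, bl, rng)) for _ in range(iter_left_right)]
    rand_tr = [t_map(*random_regroup(ar, br, rng)) for _ in range(iter_left_right)]
    rand_tp = []
    for _ in range(iterations):
        left_column = rng.choice(rand_tl)    # RandomColumn(RANDTL)
        right_column = rng.choice(rand_tr)   # RandomColumn(RANDTR)
        rand_tp.append([l * r for l, r in zip(left_column, right_column)])
    return tp, rand_tp


def make_group(n_subjects, n_nodes, shift, rng):
    """Synthetic data: Gaussian node values centered on shift."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_nodes)]
            for _ in range(n_subjects)]


data_rng = random.Random(1)
al, ar = make_group(4, 3, 0.0, data_rng), make_group(4, 3, 0.0, data_rng)
bl, br = make_group(5, 3, 1.0, data_rng), make_group(5, 3, 1.0, data_rng)
tp, rand_tp = interhemispheric(al, ar, bl, br, iter_left_right=10, iterations=20)
print(len(tp), len(rand_tp))
```

Note that the left and right hemispheres are regrouped independently here, following the algorithm as listed; whether Caret keeps a subject's two hemispheres together during randomization is not stated on this page.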
Coordinate Difference
Formulas:
- D(i,j) The Euclidean distance between two three-dimensional points.
- DEV_x = , where N_{x} is the number of participants in group X and M is the number of nodes.
Inputs:
- A is group A coordinate files.
- B is group B coordinate files.
Algorithm:
- Create A_{avg}, the average coordinate file for group A.
- Create B_{avg}, the average coordinate file for group B.
- Create A_{dev}, the deviations at each node for group A.
- Create B_{dev}, the deviations at each node for group B.
- If the mode is COORD_DIFF, create the statistic-file where the statistic at each node is D(A_{avg},B_{avg}).
- If the mode is TMAP_DIFF, create the statistic-file where the statistic at each node is D(A_{avg},B_{avg}) / (A_{dev} + B_{dev}).
- If the mode is COORD_DIFF, Create CDF, the coordinate difference (statistic) file where the columns are the Euclidean distances between AVG-A and AVG-B and the X, Y, and Z component distances. D(AVG-A, AVG-B).
- If the mode is TMAP_DIFF, Create CDF, the coordinate difference (statistic) file where the column is the Euclidean distance between the average coordinates divided by (DEV-A + DEV-B). The remaining columns are the same as in COORD_DIFF mode.
- Create RAND-AB which is the randomized statistic file. For each column in it, create two coordinate files that are randomized combinations of all of the input coordinate files on which the COORD_DIFF or TMAP_DIFF test is performed.
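The COORD_DIFF statistic can be sketched as follows. This covers only the COORD_DIFF mode, since the deviation formula needed for TMAP_DIFF is not given above. Each group is assumed to be a list of subjects, each subject a list of (x, y, z) tuples with one tuple per node; the function names are hypothetical.

```python
import math


def average_coordinates(group):
    """Average (x, y, z) position of each node across the subjects in a group."""
    n_subjects = len(group)
    n_nodes = len(group[0])
    return [
        tuple(sum(subject[node][axis] for subject in group) / n_subjects
              for axis in range(3))
        for node in range(n_nodes)
    ]


def coord_diff(group_a, group_b):
    """Euclidean distance D(A_avg, B_avg) at each node."""
    a_avg = average_coordinates(group_a)
    b_avg = average_coordinates(group_b)
    return [math.dist(p, q) for p, q in zip(a_avg, b_avg)]


# Two subjects per group, one node each (synthetic coordinates).
group_a = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
group_b = [[(1.0, 3.0, 4.0)], [(1.0, 3.0, 4.0)]]
distances = coord_diff(group_a, group_b)
print(distances)
```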
Significance Testing
Significance testing in Caret is a non-parametric technique based on randomization (permutation testing).
Two data files are required for significance testing. The first is the file containing the test statistic. The second file is the "randomized statistic" file that contains test statistics from many random combinations of the test subjects.
Randomization
Randomization testing is used to determine the P-Values.
Randomization With One Group of Subjects
When there is one group of subjects, such as in a one-sample T-Test, it is not possible to randomize among groups. So, the randomization is performed by randomly flipping the signs of the values for each subject. The statistical test is then run on each of these randomizations and the largest clusters are identified.
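A minimal sketch of sign-flip randomization for one group. It assumes a single random sign is drawn per subject and applied to all of that subject's node values, which is the common convention; this page does not say whether Caret flips per subject or per value. The name `sign_flip` and the data are illustrative.

```python
import random


def sign_flip(subjects, rng):
    """Return a copy of the data with each subject's values multiplied by +1 or -1.

    subjects is a list of subjects; each subject is a list of node values.
    """
    flipped = []
    for values in subjects:
        sign = rng.choice((-1.0, 1.0))  # one sign per subject (an assumption)
        flipped.append([sign * value for value in values])
    return flipped


rng = random.Random(7)
subjects = [[0.5, 1.2, -0.3], [0.8, 0.9, 0.1], [1.1, -0.2, 0.4]]
randomized = sign_flip(subjects, rng)
print(randomized)
```

Each call produces one randomization; the one-sample T-Test would then be run on `randomized` to fill one column of the randomized statistic file.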
Randomization With Multiple Groups of Subjects
With multiple groups of subjects, all of subjects are placed into a pool. Subjects are then randomly drawn from the pool and placed into new groups. The new groups contain the same number of subjects as the original groups. When randomizing subjects, each new randomization of subjects should be unique when compared to any previously generated groups of subjects. Statistical tests are then run on each of these randomizations and the largest clusters are identified.
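The pooling and redrawing described above can be sketched with subject indices. The sketch rejects any regrouping that has already been generated, so every result is unique. It treats the groups as ordered (swapping the contents of two equal-sized groups counts as a different regrouping) and does not exclude the original assignment; both choices are assumptions, and `unique_regroupings` is a hypothetical name.

```python
import math
import random


def unique_regroupings(group_sizes, count, seed=0):
    """Return count distinct assignments of pooled subject indices to groups."""
    n_subjects = sum(group_sizes)
    # Number of distinct ordered regroupings: n! / (n1! * n2! * ...).
    possible = math.factorial(n_subjects) // math.prod(
        math.factorial(size) for size in group_sizes)
    if count > possible:
        raise ValueError(f"only {possible} distinct regroupings exist")
    rng = random.Random(seed)
    seen = set()
    regroupings = []
    while len(regroupings) < count:
        order = list(range(n_subjects))
        rng.shuffle(order)
        groups, start = [], 0
        for size in group_sizes:
            # Sort within a group so that member order does not matter.
            groups.append(tuple(sorted(order[start:start + size])))
            start += size
        key = tuple(groups)
        if key not in seen:
            seen.add(key)
            regroupings.append(groups)
    return regroupings


for groups in unique_regroupings([3, 2], count=5, seed=42):
    print(groups)
```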
Given a group of three subjects, choosing two at a time, there are 3 combinations and 6 permutations. For example, selecting two subjects from {A,B,C} results in the combinations {A,B}, {A,C}, and {B,C} and results in the permutations {A,B}, {A,C}, {B,C}, {B,A}, {C,A}, and {C,B}. With combinations, two groups of elements are equal if they contain the same elements in any order (i.e., {A,B} and {B,A} are equivalent). With permutations, two groups of elements are equal only if they contain the same elements in an identical order (i.e., {A,B} and {B,A} are NOT equivalent).
Mathematical formulas for the number of permutations and combinations when choosing k elements from a total of n elements:
$P(n,k) = \frac{n!}{(n-k)!}$
$C(n,k) = \frac{n!}{k! \, (n-k)!}$
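The {A,B,C} example can be reproduced with Python's standard library, which provides both the enumerations and the counts. The last line is an added illustration of how quickly the number of possible regroupings grows.

```python
import itertools
import math

subjects = ["A", "B", "C"]

combinations = list(itertools.combinations(subjects, 2))
permutations = list(itertools.permutations(subjects, 2))

print(combinations)  # order within a pair does not matter
print(permutations)  # (A, B) and (B, A) are counted separately
print(math.comb(3, 2), math.perm(3, 2))

# Number of ways to choose 10 subjects out of a pool of 20.
print(math.comb(20, 10))
```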
P's and Q's
The significance tests in Caret produce both P and Q values. Q is simply 1 - P. Q is useful for thresholding in Caret. One selects the statistic for viewing and thresholds with Q. Since Caret thresholds by inhibiting the display of data BELOW the threshold, one can threshold with Q and set the threshold to 0.95 to see statistics with a P-Value of 0.05 or less.
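The relationship between P, Q, and the display threshold can be sketched as follows. The function names are hypothetical, and the comparison assumes that values equal to the threshold remain visible.

```python
def q_value(p):
    """Q is simply 1 - P."""
    return 1.0 - p


def displayed(p_values, q_threshold=0.95):
    """True where a statistic stays visible: data BELOW the Q threshold is hidden."""
    return [q_value(p) >= q_threshold for p in p_values]


visible = displayed([0.01, 0.04, 0.20, 0.50])
print(visible)
```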
Cluster Based Thresholding
For cluster-based threshold significance testing use "caret_stats -significance-cluster-threshold".
- The user provides positive and negative thresholds and a desired significance level (P-Value, eg: 0.05).
- Clusters of nodes passing the threshold tests are identified in the statistic file. Note that positive and negative values are processed separately.
- The largest cluster is identified in each column of the randomized statistic file using the thresholds.
- The clusters identified from the randomized statistic file are ranked based upon surface area (possibly corrected for surface distortion).
- The user-provided P-Value is multiplied by the number of columns in the randomized statistic file (e.g., 0.05 * 500 = 25), giving the significant cluster rank. The cluster at this rank is identified and its surface area is noted as the "significant surface area".
- For each cluster in the statistic file, use its surface area and determine how it ranks in the ranked randomized clusters. Set the P-Value for the statistic file's cluster to its ranking divided by the total number of columns in the randomized file. For example if the statistic cluster is ranked 3 out of 100, the cluster receives a P-Value of 0.03.
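The ranking steps above can be sketched as follows, starting from the surface area of the largest cluster in each column of the randomized statistic file. The rank of a statistic-file cluster is taken to be one plus the number of randomized clusters with a strictly larger area, which is an assumption about how ties and the rank origin are handled. The function names and the synthetic areas are illustrative.

```python
def significant_area(randomized_largest_areas, p_value):
    """Surface area of the randomized cluster at the significant rank."""
    ranked = sorted(randomized_largest_areas, reverse=True)
    rank = max(1, round(p_value * len(ranked)))  # e.g., 0.05 * 500 = 25
    return ranked[rank - 1]


def cluster_p_values(statistic_cluster_areas, randomized_largest_areas):
    """P-Value for each statistic-file cluster from its rank among the randomized clusters."""
    n_columns = len(randomized_largest_areas)
    p_values = []
    for area in statistic_cluster_areas:
        # ASSUMPTION: rank = 1 + number of randomized clusters that are larger.
        rank = 1 + sum(1 for r in randomized_largest_areas if r > area)
        p_values.append(min(rank, n_columns) / n_columns)
    return p_values


# Synthetic example: 100 randomized columns with largest-cluster areas 1..100.
randomized = [float(area) for area in range(1, 101)]
print(significant_area(randomized, 0.05))
print(cluster_p_values([98.5, 40.0], randomized))
```

In this example a statistic-file cluster with area 98.5 is exceeded by only two randomized clusters, so it is ranked 3 out of 100 and receives a P-Value of 0.03.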
The difficult part of cluster-based thresholding is selecting the thresholds. There is no "correct" threshold value. In general, smaller thresholds yield more clusters, larger clusters, or both, while larger thresholds yield fewer clusters, smaller clusters, or both.
Threshold-Free Cluster Enhancement (TFCE)
For threshold-free cluster enhancement significance testing use "caret_stats -significance-threshold-free".
The difficulty of selecting a threshold in cluster-based thresholding led to the development of threshold-free cluster enhancement (See Smith and Nichols in the References section at the bottom of this page). With threshold-free cluster enhancement, the user does not need to choose thresholds.
- Apply the TFCE transform to the statistic in the statistic file.
- Apply the TFCE transform to all columns in the randomized statistic file.
- Find the largest TFCE value in each column of the TFCE transformed randomized statistic file and rank them.
- The user-provided P-Value is multiplied by the number of columns in the randomized statistic file (e.g., 0.05 * 500 = 25), giving the significant TFCE rank. The TFCE at this rank is identified and its value is noted as the "significant TFCE value".
- For each node in the statistic file, use its TFCE value and determine how it ranks in the ranked, randomized TFCE values. Set the P-Value for the statistic file's node to its ranking divided by the total number of columns in the randomized file. For example if the statistic node TFCE is ranked 3 out of 100, the node receives a P-Value of 0.03.
$TFCE(j) = \sum_{i=1}^{N} e(h_{i})^{E} \, h_{i}^{H}$
where e(h) is the spatial extent of a node (in Caret, the node's surface area) and h is the value at the node. E and H are constants (0.5 and 2.0). N is the total number of nodes contributing to this node's spatial extent (including the node itself).
The spatial extent is all connected nodes that have a non-zero value with the same sign as the node being evaluated. In addition, as one moves away from the node for which the TFCE score is being calculated, the metric values must be adjusted so that they are no larger than those of the immediate neighbors that are closer to the node being evaluated. That is, as one moves away from the node being processed, the metric values must never increase (see Figure 1 in the TFCE paper, which shows that we are calculating the gray region).
In the figure above, four nodes are labeled A, B, C, and D. If calculating the TFCE score for node C, its supporting section includes nodes A, B, and D. Below each node are boxes labeled a, b, c, and d that correspond to the nodes labeled with uppercase letters. For node C, the four boxes represent its supporting sections. The extent, e, is the horizontal size of the box and in Caret is the surface area (in square millimeters) associated with the node. The height, h, is the vertical size of the box and is the statistical value associated with the node. Furthermore, notice that the box a, for node A, is limited in size vertically to the height of B, a local maximum. As one moves away from node C, the heights (statistical values) are limited so that they never increase.
Flat Surface with Z-Coordinate set to T-Statistic
Flat Surface with Z-Coordinate set to TFCE-Enhanced T-Statistic
The significance testing commands have a parameter named "-number-of-threads". Threads allow a task to be broken down into pieces that may be run in parallel and take advantage of either multiple processors or multi-core processors. Using threads will typically reduce the execution time of the command.
References
Books
- Howell, David C. (2002) Statistical Methods for Psychology. Pacific Grove, CA: Duxbury.
Journal Articles
- Nonparametric Permutation Test For Functional Neuroimaging: A Primer with Examples. Thomas E. Nichols and Andrew P. Holmes. Human Brain Mapping 15:1
- Threshold-Free Cluster Enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. Stephen M. Smith and Thomas E. Nichols. NeuroImage 2009 44(1)
Web Sites
Glossary
- Cluster-based Thresholding - Groups of connected nodes with attribute values greater than a threshold (ie: t > 3.0) are identified.
- Family Wise Error - Probability of making one or more Type I Errors (rejecting the null hypothesis when the null hypothesis is true) across all of the tests performed. A Type I Error is also called an alpha error.
- Gaussian Field Theory
- Non-parametric Statistics - The test contains no requirement that the data fit a probability distribution.
- Permutation Testing - A type of non-parametric test. http://en.wikipedia.org/wiki/Resampling_(statistics)
- Parametric Statistics - The test requires the data to fit a probability distribution, typically the normal distribution.
- ROC (Receiver Operating Characteristic) - A plot that shows the tradeoff between true positives and false positives as the threshold is varied.
- Spatial Smoothing
- Supporting Section - The connected region contributing to a node's TFCE enhanced value.