Data Mining in fMRI Data

Sepehr Rasouli
Data Mining Class, Winter 2018

WHAT IS NEUROSCIENCE? WHY IS IT IMPORTANT?
▪ The study of the brain and the nervous system, and of
their interactions with other physiological
systems
▪ Understand human behavior
▪ Improve human psychological and somatic health
▪ Understand learning & memory

Neuron Anatomy
(Brain Cells)

fMRI (Functional Magnetic Resonance Imaging)
▪ Functional magnetic resonance imaging (functional MRI, fMRI)
measures brain activity by
detecting changes associated with blood flow.
▪ This technique relies on the fact that cerebral
blood flow and neuronal activation are coupled.
When an area of the brain is in use, blood flow
to that region also increases.

fMRI
▪ fMRI is MRI with an added time dimension: the brain volume is scanned repeatedly.
▪ A voxel is the unit of measurement in fMRI: the BOLD signal is measured at each voxel in different regions of the brain, using gradients of the high-frequency signal.

Previous Approaches
The typical analysis of fMRI data uses one or both of two main analysis approaches:
1. The single-voxel approach creates activation maps by testing each voxel
separately (possibly after spatial preprocessing, e.g., smoothing) for
correlation with the experimental paradigm (predictor) and declaring a voxel
active if the p-value is less than some threshold. (The threshold value may be
fixed in advance, or it may be adjusted adaptively from the data, e.g., by controlling the false discovery rate, FDR.)
2. The second common approach is to pre-define a region of interest (ROI),
based on either anatomical or functional data (by an already-established
paradigm known to activate that region), and then to perform statistical
analysis on the ROI time course obtained from the new experiment.
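The two approaches above can be sketched on toy data. This is a minimal illustration, not the method of any particular fMRI package: the block-design `paradigm`, the six synthetic voxels, the ROI indices, and the helper names `pearson` and `approx_p_value` are all assumptions made for the example, and the p-value uses the Fisher z normal approximation rather than an exact test.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-sided p-value for 'no correlation' (Fisher z approximation; an assumption)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(0)
n_scans = 80
# Hypothetical block design: 10 scans off, 10 scans on, repeated.
paradigm = [1.0 if (t // 10) % 2 else 0.0 for t in range(n_scans)]

# Synthetic voxels: 0..2 follow the paradigm plus noise, 3..5 are pure noise.
voxels = []
for v in range(6):
    gain = 1.0 if v < 3 else 0.0
    voxels.append([gain * p + random.gauss(0, 1) for p in paradigm])

# Approach 1: test every voxel separately against the paradigm.
alpha = 0.05  # fixed threshold; an FDR-adjusted one could be used instead
p_values = [approx_p_value(pearson(ts, paradigm), n_scans) for ts in voxels]
active = [p < alpha for p in p_values]

# Approach 2: average a pre-defined ROI and run a single test on its time course.
roi = [0, 1, 2]  # hypothetical ROI membership
roi_ts = [sum(voxels[v][t] for v in roi) / len(roi) for t in range(n_scans)]
roi_p = approx_p_value(pearson(roi_ts, paradigm), n_scans)

print(active, roi_p)
```

Approach 1 yields one test per voxel, while approach 2 yields a single test for the whole ROI.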

Pros and Cons of each approach
1. Activation maps obtained by single-voxel analysis are inherently limited by the
signal-to-noise ratio (SNR) of individual voxel data, which is typically low. Furthermore, the very large
number of statistical tests (a typical acquisition involves tens of thousands of
voxels) requires adjusting the p-values for multiple comparisons, imposing high
statistical thresholds that may reveal only the voxels with the very highest
SNR but mask others that do have real effects. (In short: only high-SNR voxels are detected.)
2. The ROI approach overcomes the low SNR inherent in single-voxel data but
has other shortcomings. The most obvious problem is that it prevents
researchers from discovering effects of the experimental manipulations in
brain regions other than those already hypothesized and pre-defined. In
addition, the chosen ROI itself may be composed of subregions that behave
differently, but current ROI analysis methods do not allow researchers to
discover such microstructure. (In short: the low-SNR problem is addressed, but discovery is restricted.)

CBA Method
▪ The ‘cluster-based analysis’ (CBA) method can be thought of as a
‘hybrid’ between the single-voxel and the ROI analyses, combining some of the
advantages of each while avoiding many of their pitfalls.
▪ Like the single-voxel approach, CBA creates complete activation maps: every
voxel in the acquisition volume has an a priori chance of being ‘discovered’. The
important difference from the single-voxel approach is that the units of analysis
are now contiguous clusters of voxels, taking advantage of the increased SNR
of multi-voxel data, as in the ROI approach.

Advantages of Clustering Method
1. Averaging data from multiple voxels increases the SNR of each statistical
comparison.
2. Because the statistical testing is now performed on clusters, the total number
of tests is reduced.
Note that the procedure is based on two experimental stages, so the clusters are
defined on a different data set than the one used to test for activation under the
paradigm of interest (analogous to a train/test split).
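The first advantage can be made concrete with a short worked derivation. It is a sketch under a simplifying assumption that is not stated in the slides: the n voxels of a cluster share the same signal s(t), and their noise terms are independent with zero mean and equal variance σ². The symbols below are introduced only for this illustration.

```latex
% Assumption: n voxels share the signal s(t); the noise terms
% \varepsilon_i(t) are independent, zero-mean, with variance \sigma^2.
\begin{align*}
y_i(t) &= s(t) + \varepsilon_i(t), \qquad i = 1, \dots, n \\
\bar{y}(t) &= \frac{1}{n}\sum_{i=1}^{n} y_i(t)
            = s(t) + \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i(t) \\
\operatorname{Var}\!\left[\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i(t)\right]
           &= \frac{\sigma^2}{n} \\
\mathrm{SNR}_{\text{cluster}} &= \frac{|s|}{\sigma/\sqrt{n}}
            = \sqrt{n}\,\mathrm{SNR}_{\text{voxel}}
\end{align*}
```

Under these assumptions a cluster of 16 voxels would have four times the SNR of a single voxel. Noise in neighboring voxels is usually correlated to some degree, so the gain in practice is likely smaller than √n.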

Clustering Algorithm
1. For each voxel, the correlation with each of its neighbors is computed.
2. For every voxel, the neighbor with the highest correlation is found.
Note that this relation is not symmetric: given a voxel i that is maximally
correlated with neighbor j, voxel j may be maximally correlated with another of
its neighbors, k.
3. Each voxel and its maximally correlated neighbor define an initial region, and if
the same voxel is in two or more regions these regions are joined together,
iteratively until the process terminates in non-overlapping clusters.
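The three steps can be sketched in Python with a union-find structure that merges each voxel with its maximally correlated neighbor. This is an illustrative reading of the steps above, not the original authors' code: the 2D grid with 4-connected neighbors, the function names, and the toy time series are assumptions (a real acquisition would use a 3D neighborhood).

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def neighbors(idx, shape):
    """4-connected neighbors on a 2D grid (an assumption for this sketch)."""
    r, c = idx
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < shape[0] and 0 <= cc < shape[1]:
            yield (rr, cc)

def cluster_voxels(series, shape):
    """Join every voxel with its maximally correlated neighbor (steps 1 to 3)."""
    parent = {v: v for v in series}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v in series:
        # Steps 1 and 2: correlate with each neighbor, keep the best one.
        best = max(neighbors(v, shape), key=lambda u: pearson(series[v], series[u]))
        # Step 3: regions that share a voxel end up in the same set.
        parent[find(v)] = find(best)

    clusters = {}
    for v in series:
        clusters.setdefault(find(v), []).append(v)
    return sorted(sorted(c) for c in clusters.values())

# Toy 1x5 strip: the first two voxels share one pattern, the last three another.
series = {
    (0, 0): [1, -1, 1, -1, 1, -1],
    (0, 1): [2, -1, 1, -1, 1, -1],
    (0, 2): [1, 1, -1, -1, 1, 1],
    (0, 3): [2, 1, -1, -1, 1, 1],
    (0, 4): [1, 1, -1, -1, 1, 3],
}
clusters = cluster_voxels(series, (1, 5))
print(clusters)  # [[(0, 0), (0, 1)], [(0, 2), (0, 3), (0, 4)]]
```

In the toy data, voxel (0, 4) is maximally correlated with (0, 3), while (0, 3) is maximally correlated with (0, 2), which illustrates the asymmetry noted in step 2. All three still end up in one cluster.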

Testing Clusters Using FDR
▪ Testing clusters rather than voxels reduces the extent of the multiple
hypothesis testing problem, as the number of clusters tested, $m_c$, is smaller than
the number of voxels tested, $m$. In fact, with the clustering algorithm described in the
Clustering Algorithm section, every cluster contains at least two voxels, so $m_c \le m/2$.
▪ The number of tests conducted can be further reduced by restricting the
analysis to clusters within regions of interest (ROI) rather than searching over
the entire brain for activity. Such ROIs can either be pre-defined (e.g.,
anatomically) or extracted from the experiment that is already being used to
define clusters. Thus, CBA in combination with ROI analysis can be viewed as
helping us search for activation within subregions of the ROI.

BH procedure & adaptive procedure for FDR
▪ The BH procedure (Benjamini and Hochberg, 1995) has been adopted in the
fMRI community for controlling the FDR at any desired level q while testing
voxels.
▪ The BH procedure makes use of the m p-values, one calculated for each voxel
to test its activation. Sorting these p-values, we get $P_{(1)} \le \dots \le P_{(j)} \le \dots \le P_{(m)}$.
Then find the largest p-value among all those satisfying $P_{(j)} \le \frac{j}{m} q$; call it $P_{(k)}$, and declare
the k voxels whose p-values are less than or equal to $P_{(k)}$ active.
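A minimal Python sketch of this step-up rule follows, using the standard BH condition P(j) ≤ (j/m)·q. The function name `benjamini_hochberg` and the toy p-values are illustrative assumptions, not taken from the slides.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is declared active."""
    m = len(p_values)
    # Indices of the p-values in ascending order: P(1) <= ... <= P(m).
    order = sorted(range(m), key=lambda i: p_values[i])
    # k is the largest rank j with P(j) <= (j / m) * q; 0 if none qualifies.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k = rank
    # Declare the k smallest p-values active.
    active = [False] * m
    for i in order[:k]:
        active[i] = True
    return active

# Toy p-values for five tests (made up for illustration).
p = [0.028, 0.9, 0.001, 0.5, 0.025]
print(benjamini_hochberg(p, q=0.05))  # [True, False, True, False, True]
```

In this example the second-smallest p-value, 0.025, exceeds its own threshold of 0.02, but it is still declared active because the third-smallest, 0.028, satisfies its threshold of 0.03. Taking the largest qualifying rank is what makes the procedure "step-up". The same function applies unchanged to cluster-level p-values, with m replaced by the number of clusters.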

BH procedure & adaptive procedure for FDR (continued)
▪ In CBA, we use the same procedure on the p-values obtained for the clusters,
substituting the total number of clusters, $m_c$, for the total number of voxels, $m$, used
above. Thus, the procedure controls the expected proportion of falsely
discovered clusters among all clusters declared active.
(Note that a falsely discovered cluster is a cluster that contains no active
voxels, and correspondingly a truly discovered cluster is a cluster that
contains at least one active voxel.)
▪ With CBA we give up control of the FDR on voxels. Thus, the FDR on voxels
may in certain situations be higher than the FDR on clusters, especially if there
are many non-homogeneous clusters that contain both activated and
non-activated voxels. We believe that researchers are interested in these flexible
units of analysis, for which conclusions are drawn, rather than in the artificially
generated voxel units. Thus we emphasize control of the FDR on clusters
rather than the FDR on voxels.
