Patent No. 5,083,571
Use of brain electrophysiological quantitative data to classify and subtype an individual into diagnostic categories by discriminant and cluster analysis (Prichep, Jan. 28, 1992)
Abstract
A system that uses discriminant analysis of EEG data to automatically evaluate the probability that an individual patient belongs to each of more than two specified diagnostic categories, or subtypes within a category, and automatically places the patient into one of those categories or subtypes.
BACKGROUND AND SUMMARY OF THE INVENTION
The invention is in the field of deriving and using brain electrophysiological
data such as EEG (electroencephalographic) and EP (evoked potential) data and
focuses on the use of quantitative data of this nature to classify and subtype
an individual into one or more diagnostic categories on the basis of the probability
of the presence of a specific electro-physiological profile.
It has long been known that time-varying spontaneous electrical potentials (SP)
exist between areas of a person's scalp, and that an SP record, called an electroencephalogram
or EEG for short, can be studied in an effort to evaluate brain activity. The
EEG data can be presented in analog form, as a set of traces of SP amplitude
vs. time from a number of scalp and other electrodes which are electrically
referenced in various ways. These traces can be studied visually but the information
of interest is difficult to extract accurately because it typically is in the
form of low amplitude time-varying signals which tend to be masked by higher
amplitude noise and other perturbations. Accordingly, long and specialized training
is believed to be required to interpret analog EEG and the process tends to
be subjective and time consuming. Another approach is quantitative EEG, which
involves digitizing the EEG electrode outputs and subjecting them to sophisticated
digital processing in an effort to suppress features which are believed to be
less relevant and enhance features which are believed to be more relevant. A
great deal of development has taken place in quantitative EEG in recent years
as evidenced, for example, by the documents which are cited at the end of this
specification and are hereby incorporated by reference in this specification
in their entirety as though fully set forth herein. These documents are referred
to by citation number in the discussion below. It also has long been known that
evoked potential (EP) data can be derived from the same or similar electrodes
when the patient is subjected to stimuli such as auditory and visual stimuli.
The EP data typically appears as potentials superimposed on the normally present
SP signals. See, for example, U.S. Pat. No. 4,493,327, which is hereby incorporated
by reference.
A good example of an advanced quantitative EEG instrument is the system which
is available under the tradename Spectrum 32 from Cadwell Laboratories, Inc.
of Kennewick, Washington. The patents pertaining to such technology include
U.S. Pat. Nos. 3,901,215; 4,171,696; 4,188,956; 4,201,224; 4,216,781; 4,279,258;
4,411,273 and 4,417,592, which are hereby incorporated by reference. In a typical
operation, the instrument collects EEG data from each of the 19 electrodes of
the International 10/20 Placement System. For example, the electrodes are secured
to a flexible cap placed over the patient's head. A sufficiently long interval
of artifact-free, eyes-closed, resting state EEG is gathered from these electrodes,
transmitted to the instrument via a multiconductor cable and recorded therein.
The EEG data can be displayed in analog form for a visual check and verification
and as an initial indication of the nature of the data. EEG data contaminated
by artifacts such as due to muscle or eye movement or environmental noise can
be automatically or manually rejected to leave only valid data which are sufficiently
free from artifacts to be acceptable for subsequent analysis. For example, EEG
data can be collected and screened to extract therefrom 24-48 artifact-free
segments each 2.5 seconds long. If the screening is manual, a 2.5 second window
can be scanned over the analog display until it encloses only data which appear
visually to be artifact-free, the selected segment is marked as acceptable,
and the process is repeated until enough good segments have been collected.
If the screening is automatic, the artifact-free segments can be selected through
the use of artifact-rejection algorithms stored in the instrument.
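As a rough illustration, the automatic screening step can be sketched as follows. The fixed amplitude threshold and the 200 Hz sampling rate are assumptions for illustration; the instrument's actual artifact-rejection algorithms are not disclosed here.

```python
import numpy as np

def extract_clean_segments(eeg, fs=200.0, seg_sec=2.5, max_abs_uv=100.0, n_wanted=48):
    """Scan an EEG record (channels x samples) in consecutive 2.5 s windows
    and keep those whose peak amplitude stays below a rejection threshold,
    stopping once enough artifact-free segments have been collected."""
    seg_len = int(seg_sec * fs)
    segments = []
    for start in range(0, eeg.shape[1] - seg_len + 1, seg_len):
        window = eeg[:, start:start + seg_len]
        # Reject windows with excessive amplitude on any channel
        # (a crude proxy for muscle/eye-movement and noise artifacts).
        if np.max(np.abs(window)) < max_abs_uv:
            segments.append(window)
        if len(segments) == n_wanted:
            break
    return segments
```

A real screen would combine several tests (gradient, eye-channel correlation, line noise); the single threshold stands in for all of them.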
For quantitative EEG analysis, the EEG analog waveforms typically are converted
to digital form so that they can be subjected to various forms of digital processing
in order to extract objective descriptors or features of the EEG data and to
use such features as a diagnostic aid. For example, Fast Fourier Transform (FFT)
analysis is applied to characterize the frequency composition of the EEG data,
typically dividing the EEG spectrum into the four traditional frequency bands:
delta (1.5-3.5 Hz), theta (3.5-7.5 Hz), alpha (7.5-12.5 Hz) and beta (12.5-20
Hz). These features can include characteristics of the EEG data such as absolute
and relative power, symmetry, coherence, etc., for a total of up to hundreds
of features. In this context: absolute power is the average amount of power
in each frequency band and in the total frequency spectrum of the artifact-free
EEG data from each of the 19 electrodes and is a measure of the strength of
the brain electrical activity; relative power is the percentage of the total
power contributed for a respective electrode and a respective frequency band
and is a measure of how brain activity is distributed; symmetry is the ratio
of levels of activity between corresponding regions of the two brain hemispheres
in each frequency band and is a measure of how balanced is the observed activity;
and coherence is the degree of synchronization of electrical events in corresponding
regions of the two hemispheres and is a measure of how coordinated is the observed
brain activity. These four basic categories of univariate features, resulting
from spectral analysis of the EEG data, are believed to characterize independent
aspects of brain activity, and each is believed to be sensitive to a variety of
different clinical conditions and changes of state. In one example, the Spectrum
32 instrument automatically extracts 258 quantitative features representing
these four aspects of the EEG spectrum from the monopolar EEG data provided
by the 19 electrodes and computes an additional 112 features from 8 bipolar
derivations, for a total of 370 features which can be considered univariate
measures (i.e., each feature is a measure of the local EEG at one derivation
or from the comparison of two derivations). It is believed that the measures
of such features should have Gaussian distributions and, if that is not the
case, the instrument can transform the appropriate measures into an adequate
Gaussian distribution, for example as discussed in John, E. R. et al., The Use
of Statistics in Electrophysiology, Handbook of Electroencephalography and Clinical
Neurophysiology, Revised Series, Vol. 1, Methods of Analysis of Brain Electrical
and Magnetic Signals, edited by Gevins, A. S. et al., Elsevier, 1987, pp. 497-540.
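The absolute- and relative-power features described above can be sketched as follows; this is a minimal illustration assuming a 200 Hz sampling rate and a simple FFT power estimate, not the Spectrum 32's actual feature extraction. Symmetry and coherence would be built from the same spectra across homologous electrode pairs.

```python
import numpy as np

# The four traditional frequency bands (Hz).
BANDS = {"delta": (1.5, 3.5), "theta": (3.5, 7.5),
         "alpha": (7.5, 12.5), "beta": (12.5, 20.0)}

def band_features(segment, fs=200.0):
    """Absolute and relative power per band for one artifact-free segment
    (channels x samples), estimated from the FFT power spectrum."""
    n = segment.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(np.fft.rfft(segment, axis=1)) ** 2 / n
    # Total power over the full 1.5-20 Hz analysis range, per channel.
    total = power[:, (freqs >= 1.5) & (freqs < 20.0)].sum(axis=1)
    abs_power = {name: power[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
                 for name, (lo, hi) in BANDS.items()}
    rel_power = {name: p / total for name, p in abs_power.items()}
    return abs_power, rel_power
```

Because the four bands partition the 1.5-20 Hz range, the relative powers sum to 1 for each channel.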
In the case of quantitative EP data, the instrument derives factor Z-scores
Z_ij from the analog EP data derived from the individual, in the manner discussed
below. The analog output EP_i(t) of the i-th EP electrode (i = 1 to N, where
N is the number of EP electrodes, e.g., 19) is a waveform which is a function
of time t and can be approximated by a sum of typically non-sinusoidal factor
waveforms F_j(t):

    EP_i(t) ≈ SUM(j = 1 to K) a_ij F_j(t) + R

where j identifies a particular factor waveform F, K is an integer equal to
the total number of waveforms F required to achieve a desired accuracy in
approximating EP_i(t), a_ij is a coefficient corresponding to the contribution
of factor waveform F_j(t) to that approximation, and R is a constant.
The above expression is similar to that for a Fourier series decomposition of
a waveform, except that the waveforms F_j need not be, and typically are not,
sinusoidal. In effect, for each EP electrode i the first waveform F_1(t) is
derived by least-squares curve fitting the waveform EP_i(t), the second waveform
F_2(t) is derived by least-squares curve fitting the difference between waveforms
EP_i(t) and F_1(t), etc., until the index K in F_K(t) is high enough for the
sum of the waveforms to satisfactorily approximate EP_i(t). Using a historical
database of the EP responses to each of a set of standard stimuli of a large
population of individuals believed to be normal, the distribution of these
responses has been determined for each stimulus and each electrode, i.e., the
mean ā_ij and the standard deviation σ_ij are determined and stored for this
large population of normal individuals. The desired factor Z-scores Z_ij then
are determined in accordance with:

    Z_ij = (a_ij - ā_ij) / σ_ij

where a_ij is the coefficient observed for the individual patient, and ā_ij
and σ_ij are the normative mean and standard deviation. Similarly derived factor
Z-scores for the SP waveforms can be used in the invented process described
below in place of the features discussed in detail, or in combination with
such features.
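Under the assumption that the factor waveforms F_j(t) are already available, the coefficients a_ij and the factor Z-scores can be sketched as follows (the patent derives the factor waveforms themselves by successive least-squares fits, which this sketch takes as given):

```python
import numpy as np

def factor_scores(ep, factors):
    """Least-squares coefficients a_ij approximating each electrode's EP
    waveform EP_i(t) as a weighted sum of factor waveforms F_j(t).

    ep:      (n_electrodes, n_samples) evoked-potential waveforms
    factors: (K, n_samples) factor waveforms, taken as given here
    Returns the coefficient matrix a of shape (n_electrodes, K)."""
    # Solve ep ≈ a @ factors for a in the least-squares sense.
    a, *_ = np.linalg.lstsq(factors.T, ep.T, rcond=None)
    return a.T

def factor_z_scores(a, norm_mean, norm_sd):
    """Z_ij = (a_ij - mean_ij) / sigma_ij against the normative database."""
    return (a - norm_mean) / norm_sd
```

When the EP waveforms are exact linear combinations of the factors, the least-squares fit recovers the coefficients exactly.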
Such features and factor Z-scores can be used to evaluate if and how patients
differ from norms, which in turn are derived by processing similar features
from large populations of subjects believed to be normal in the relevant
context. These norms can be in the form of growth curves representative of the
evolution of brain activity with age and are stored in said Spectrum 32 instrument
in the form of a database of age regression expressions which define the distribution
of every feature of interest as a function of age in a population of subjects
believed to be normal. For a patient of a specified age, the instrument extracts
from this database the mean value and the standard deviation to be expected
for each feature of a group of "normal" subjects the same age as the patient.
The instrument automatically evaluates the difference between the value of each
feature observed in the patient and the age-appropriate value predicted by the
age regression expressions in the database and evaluates the probability that
the observed value in the patient would belong to the "normal" group, taking
into account the distribution of values in that "normal" group. This process
is known as a Z-transformation and yields a Z-score for each feature, derived
by dividing the difference between the observed value and the mean of the expected
"normal" value by the standard deviation of the expected "normal" value. This
process rescales all relevant data into units of probability (or units proportional
to probability), yielding a uniform scale in all dimensions which can simplify
further comparisons and evaluations of relationships between features.
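The Z-transformation step can be sketched as follows; `mean_fn` and `sd_fn` are hypothetical callables standing in for the database's age regression expressions:

```python
def feature_z_score(observed, age, mean_fn, sd_fn):
    """Z-score of one feature: the patient's observed value compared with
    the age-appropriate normative mean, in units of normative standard
    deviations."""
    return (observed - mean_fn(age)) / sd_fn(age)
```

For instance, with an assumed linear age regression mean of 10 + 0.1 * age and a constant standard deviation of 2, an observed value of 14 at age 30 yields a Z-score of 0.5.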
These Z-scores can then be used to distinguish among different clinical conditions
and subgroups within a population through discriminant analysis using discriminant
functions comprised of weighted combinations of subsets of variables each of
which is believed to contribute to an overall discrimination in some significant
way. The distributions of features of two groups of subjects (where the groups
are believed to differ in some way, e.g., to belong to different diagnostic
categories) can be thought of as two clouds of points in a multidimensional
space in which each dimension corresponds to a feature. There may be no significant
differences between the two groups in some dimensions (i.e., in some features)
but there may be significant differences in other dimensions. An identification
problem arises when these clouds of points overlap (i.e., when there is no apparent
significant difference between the two groups with respect to some features).
Discriminant analysis can be thought of as an attempt to define a boundary
through the clouds to create a first zone
which includes as much as practicable of the first group and as little as possible
of the second group and a second zone which includes as much as practicable
of the second group and as little as practicable of the first group. A third
zone can be defined to encompass an overlap region where it is believed that
no reliable classification can be made. In principle, a discriminant function
weights the values of selected features for a new individual and adds these
weighted values to specify a single point in the relevant multidimensional space.
The theory is that this single point then would be in one of the three zones,
and the individual would be classified accordingly. See John, E. R. et al.,
The Use of Statistics in Electrophysiology, Handbook of Electroencephalography
and Clinical Neurophysiology, Revised Series, Vol. 1, Methods of Analysis of
Brain Electrical and Magnetic Signals, edited by Gevins, A. S. et al., Elsevier,
1987, particularly at pages 534-539.
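A minimal two-group sketch of the zone idea follows; the weights, bias, and zone boundaries are hypothetical values for illustration, not the patent's discriminant functions:

```python
def classify_two_group(features, weights, bias, lower, upper):
    """Weighted sum of selected features yields one discriminant score;
    the score falls in the group-1 zone, the group-2 zone, or the overlap
    zone where no reliable classification is made."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    if score < lower:
        return "group 1"
    if score > upper:
        return "group 2"
    return "unclassified"
```

The third, unclassified outcome is the overlap region where the two clouds of points cannot be reliably separated.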
While the mathematical principles of discriminant analysis have been known for
some time and have been used in practice for other purposes, including to classify
groups of individuals, it is believed that there has been no prior art approach
to make such discriminant analysis practical and sufficiently accurate for classifying
an individual into one or more diagnostic category on the basis of the individual's
probability of the presence of a specific brain electrophysiological response.
It is believed that the prior art techniques of this nature were focused on finding
the difference between groups, which optimizes factors typically different from
those which need to be optimized for practical classification of an individual,
and that the prior art processes for discrimination between groups may not allow
a practically accurate classification of an individual. For example, Prichep,
L. S., Neurometric Quantitative EEG Features of Depressive Disorders, Cerebral
Dynamics, Laterality and Psychopathology, Proc. of Third International Symposium
on Cerebral Dynamics, Laterality and Psychopathology held in Hakone, Japan,
Oct. 14-18, 1986, edited by Takahashi, R. et al., Elsevier, 1987, pp. 55-69
discusses at page 59 et seq. the use of prior art discriminant analysis to classify
one subgroup of individuals relative to another but does not disclose what particular
selection of weighted features could be used in the analysis. It is believed
that others have claimed prior art use of discriminant analysis to classify
groups of individuals and have used plots of discriminant values in which the
values for individuals appear as points on the plot but it is believed that
this invention is the first to provide a practical way to classify an individual
patient with respect to specified disorders in the manner disclosed below. See,
for example, Shagass, C. et al., Psychiatric Diagnostic Discriminations with
Combinations of Quantitative EEG Variables, British Journal of Psychiatry (1984),
pp. 581-592, and Ford, M. R. et al., EEG Coherence and Power in the Discrimination
of Psychiatric Disorders and Medication Effects, Biol. Psychiatry, 1986, 21:1175-1188.
In accordance with a nonlimiting embodiment of the invention, a probabilistic
psychiatric classification of an individual patient can be determined using
discriminant functions derived from stepwise discriminant analysis of test populations.
Each discrimination is based on n functions, where n is equal to the number
of groups in that discrimination. Each function is defined as the sum of selected
neurometric variables, each multiplied by a coefficient, and its result is a
single discriminant score s_i. A classification probability P_i that an
individual belongs to group i is calculated according to the following formula:

    P_i = exp(s_i) / SUM(j = 1 to n) exp(s_j)
The group i for which an individual has the highest probability P.sub.i is selected
as a potential classification group.
This probability P_i is then compared to guardband cutoff levels for this
group, a_i < a'_i < a''_i, . . . , which correspond to classification errors
ε_i > ε'_i > ε''_i, . . . . For example, ε_i = 10%, ε'_i = 5%, and ε''_i = 2.5%.
If P_i < a_i, then the individual is not classified. If a_i ≤ P_i < a'_i, then
the individual is classified as group i, with confidence 1 - ε_i. If a'_i ≤
P_i < a''_i, then the individual is classified as group i, with confidence
1 - ε'_i. If a''_i ≤ P_i, then the individual is classified as group i, with
confidence 1 - ε''_i.
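Taken together, the scoring, probability, and guardband steps can be sketched as follows; the exponential (softmax-like) form of P_i and all numeric cutoff and error levels are assumptions for illustration:

```python
import math

def classify_with_guardbands(scores, cutoffs, errors):
    """Turn n discriminant scores into classification probabilities
    P_i = exp(s_i) / sum_j exp(s_j), pick the most probable group, and
    apply that group's guardband cutoffs a_i < a'_i < a''_i with the
    associated classification errors eps_i > eps'_i > eps''_i.

    scores:  list of discriminant scores s_i, one per group
    cutoffs: per-group increasing cutoff levels
    errors:  per-group decreasing error levels, same lengths
    Returns (group_index, probability, confidence); group_index is None
    when the probability falls below the group's lowest cutoff."""
    exp_s = [math.exp(s) for s in scores]
    total = sum(exp_s)
    probs = [e / total for e in exp_s]
    i = max(range(len(probs)), key=lambda k: probs[k])
    p = probs[i]
    confidence = None
    for a, eps in zip(cutoffs[i], errors[i]):
        if p >= a:
            confidence = 1.0 - eps  # the highest guardband passed wins
    if confidence is None:
        return None, p, None  # below a_i: not classified
    return i, p, confidence
```

The operator-facing result is thus a group label qualified by a confidence level, or no classification at all when the probability is too low.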
The selection of the variables included in the discriminant function and of
the cutoff levels is based on classification curves. FIG. 4 illustrates
classification curves for group 1 of a three-group discriminant.
Curve TP_1 shows the percent of correctly classified individuals for the
independent replication of group 1 as a function of probability cutoff level.
Curves FP_21 and FP_31 show the percent of individuals from independent
replications of groups 2 and 3 incorrectly classified into group 1, as
functions of probability cutoff level.
The cutoff level a_i for a given classification error ε_i is selected in such
a way that the maximum percent of FP across all other groups is equal to ε_i.
The additional levels a'_i and a''_i are selected in the same way. The
sensitivity of classification for the group is estimated based on curve TP_1.
Analogous classification curves are constructed for all groups in the discriminant.
The variables for the discriminant function are selected to provide optimal
sensitivity for all groups at the desired level of classification errors.
These levels can be different for different groups.
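The cutoff-selection rule can be sketched as follows, assuming the classification curves have been tabulated at a discrete set of cutoff levels (the rule here takes the first level at which the worst false-positive rate falls to ε or below):

```python
def select_cutoff(levels, fp_curves, eps):
    """Find the lowest cutoff level at which the worst false-positive
    percentage into this group, across all other groups' independent
    replications, has dropped to eps or below.

    levels:    increasing probability cutoff levels
    fp_curves: one FP-percentage sequence per other group, aligned
               with `levels`
    eps:       acceptable classification error, in percent"""
    for level, fps in zip(levels, zip(*fp_curves)):
        if max(fps) <= eps:
            return level
    return None  # no tabulated level achieves the requested error
```

The same routine, run with progressively smaller ε values, yields the additional guardband levels a'_i and a''_i for the group.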
In practice, the invention can be embodied in an instrument such as said Spectrum
32 instrument and can be used interactively by the instrument operator to evaluate
a patient. For example, the operator can ascertain from clinical observation
and from questioning the patient if there are factors which might make the discriminant
analysis, or some aspect of it, less reliable and can modify or discontinue
the discriminant analysis accordingly. For example, certain medication can normalize
an otherwise abnormal profile while other medication can introduce drug-related
abnormal features. For such cases the invented system permits the operator to
bypass some or all of the automated patient classification steps or to manually
force classification into a selected category.
It should be noted that while the exemplary embodiment of the invention has
been described in detail above in terms of the processing of SP data, this is
not a limitation of the invention, and the principles of the invention apply
to similar processing of EP data as well as to combinations of SP and EP data.