Vol. 7 No. 5, September 1998
  Original Contribution

False Positives, False Negatives, and the Validity of the Diagnosis of Major Depression in Primary Care

Michael S. Klinkman, MD, MS; James C. Coyne, PhD; Susan Gallo, PhD; Thomas L. Schwenk, MD

Arch Fam Med. 1998;7:451-461.

ABSTRACT



Objective  To explore the issues of diagnostic specificity and psychiatric "caseness" (ie, whether a patient meets the conditions to qualify as a "case" of a disease or syndrome) for major depression in the primary care setting.

Design  A cross-sectional study comparing the demographic, clinical, and mental health characteristics of patients identified as depressed by their family physicians with those meeting diagnostic criteria for major depression on the criterion standard Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition.

Setting  The offices of 50 family physicians from private and academic practice in southeast Michigan.

Patients  A total of 1580 consecutive adult patients being seen for routine primary care services, from whom a weighted sample of 372 patients completed a set of mental health screening and diagnostic instruments.

Main Outcome Measures  Patients were assigned to 1 of 4 groups (true positive, false positive, false negative, and true negative) based on clinician identification and Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition diagnosis. Differences between the 4 groups in demographic and clinical characteristics, scores on mental health instruments, and mental health history were explored.

Results  Physician identification of depression was strongly associated with increased familiarity with the patient and the presence of suggestive clinical cues, such as history of or treatment for depression, patient distress, and presence of vegetative symptoms. Patients in the false-positive group displayed significantly higher levels of distress and impairment and were significantly more likely to have a history of mental health problems and treatment than were those in the true-negative group. The 2 "misidentified" groups, false positives and false negatives, were indistinguishable in their clinical characteristics (impairment, distress, or mental health history). Both groups' scores occupied the middle ground between true positives and true negatives on most clinical characteristics. Physicians appeared to discriminate between these 2 groups on the basis of their knowledge of the patient's clinical history.

Conclusions  Misidentification of depression in primary care may be in part an artifact of the use of the psychiatric model of caseness in the primary care setting. Our results are most consistent with a chronic disease–based model of depressive disorder, in which patients classified as false positive and false negative occupy a clinical middle ground between clearly depressed and clearly nondepressed patients. Family physicians appear to respond to meaningful clinical cues in assigning the diagnosis of depression to these distressed and impaired patients.



INTRODUCTION



The publication of several landmark studies of depression in primary care in the 1980s established that depressive disorders were a major cause of morbidity in routine primary care practice.1-7 The next wave of publications in this area focused on primary care clinicians' sensitivity in recognizing depressive disorders in routine practice, consistently finding that primary care clinicians recognize depression in fewer than half of their depressed patients.8-10 The third wave, controlled trials in primary care settings, demonstrated that routine feedback of scores on case-finding instruments before office visits can improve recognition rates for major depressive disorder (MDD) and increase treatment rates.11-16 These findings provided the rationale for the screening-detection-treatment-improvement paradigm operationalized in the Agency for Health Care Policy and Research/National Institute of Mental Health clinical guideline for primary care detection and treatment of depression.17-18

To date, however, there is little evidence to support the critical links in the paradigm: that improved detection and treatment lead to improved outcomes. Callahan et al,19 in a prospective randomized trial of treatment of depressed elderly patients, found that intensive screening and individualized clinician feedback resulted in increased detection and treatment rates for MDD but did not result in improved outcomes. Katon et al20 examined the impact of a collaborative management protocol on adherence to treatment, patient satisfaction with care, and reduction in depression symptoms; although billed as a successful trial, the intervention improved outcomes (satisfaction and reduction in symptoms) only for patients with MDD who required adjustment of their medication regimen during the intervention period. In a secondary analysis of data from the collaborative care trial, Simon et al21 compared those receiving "adequate" vs "inadequate" antidepressant treatment and found similar rates of clinical improvement in both groups as measured by either of 2 symptom scales or by the proportion of patients meeting diagnostic criteria at 4-month follow-up. Schulberg et al22 found no difference in outcome at 6 months between patients with recognized and unrecognized depression in an observational study. Ormel et al,23 in another observational study, reported that detection, but not treatment, of depression was associated with improved outcome at 1 year. Tiemens et al,24 in a prospective multisite observational study, found that 3- and 12-month clinical outcomes for patients with recognized and unrecognized depression were indistinguishable: both groups improved in all measurements of psychological and occupational function.

Our failure to link improved detection and treatment to better outcomes may reflect fallacies in the key assumptions underlying the screening-detection-treatment-improvement paradigm. The first assumption, that unrecognized cases of MDD do not differ from recognized cases, is strongly contradicted by recent evidence. Coyne et al,25 while confirming that family physicians detected less than half of their patients with current major depression, found that they detected 73% of severely depressed patients, while "missed" patients barely met minimal criteria of the Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition (DSM-III-R)26 and were not functionally impaired. Simon and Von Korff27 observed that undetected and untreated patients had milder and more self-limited depression than detected patients and a similar rate of recovery over 12 months. Tiemens et al24 found that recognition was associated with higher initial severity of psychopathological conditions and occupational disability.

The second assumption is that the specificity of primary care physician detection is or can be made sufficiently high to make increased detection efforts worthwhile: in other words, that the increase in the "true-positive" rate (patients with MDD correctly diagnosed) will more than offset the increase in the "false-positive" rate (patients without MDD "overdiagnosed" as depressed) when detection is enhanced by screening protocols. There is little evidence available to support or refute this assumption. Klinkman et al28 found that unaided physicians' specificity in detecting MDD was higher than that of case-finding instruments, but that patients labeled as depressed by family physicians still had less than even odds of meeting diagnostic criteria for major depression. Gerber et al9 found that 13% of psychiatric interview–negative (nondepressed) patients and 26% of patients with nondepressive psychiatric diagnoses were identified as depressed by their primary care clinician. These patients in the false-positive group had more vague or amplified chief complaints and were given significantly higher ratings of psychosocial stress by clinicians than were correctly identified nondepressed patients. Although the extent of overdiagnosis in routine practice is not known, it is possible that efforts to increase detection and treatment of MDD will also result in treatment of increased numbers of nondepressed patients, who may be adversely affected by treatment.

This study explores the issues of physician recognition and diagnostic specificity in the primary care setting by examining patient demographic and clinical factors associated with detection of depression in the practices of 50 family physicians across southeast Michigan. We focused on patients "overdiagnosed" or "underdiagnosed" by physicians in comparison with a diagnostic standard, also referred to as "off-diagonal" patients.29 We sought to answer the following questions: (1) What is the epidemiology of detection of depression in primary care? (2) Which factors are associated with physician overdetection of nondepressed patients as depressed ("false positives")? (3) What are the demographic, clinical, and mental health characteristics of these patients in the false-positive group in comparison with those classified as true positive, false negative, and true negative?


PATIENTS AND METHODS



SETTING

Patients included in this study were recruited from the practices of 50 board-certified family physicians in southeastern Michigan during the period from September 1, 1990, through December 31, 1991. The practices represented the full spectrum of family practice in suburban southeastern Michigan. Participating family physicians included clinicians in full-time practice in rural and suburban communities; members of the Michigan Research Network, a primary care research network administered by the Michigan Academy of Family Physicians; and a small number of full-time faculty members of the University of Michigan Department of Family Practice, Ann Arbor. A few patients were seen by third-year family practice residents in training in the participating practices. All subjects were patients receiving routine care at the site where initial screening occurred.

SAMPLING AND DATA COLLECTION

A 2-stage sampling strategy, described in detail in previous publications,30-31 was used to select subjects. In the first stage, 1928 consecutive adult patients waiting for a scheduled visit with their primary physician were asked to complete the Center for Epidemiologic Studies—Depression screening instrument (CES-D),32 along with a number of additional demographic questions and self-rating items, including 7-point Likert-type scales for level of depression, general health, and perceived stress level. The CES-D is a 20-item instrument measuring mood state and vegetative signs suggestive of depression, with possible scores ranging from 0 to 60. The sensitivity and reliability of the CES-D are well established, and it is considered one of the standard case-finding instruments for depressive disorders in medical care settings. Completed patient questionnaires were available for 1580 patients (82.0% of those initially approached), and there were no significant demographic differences between those who did and did not complete initial screening (data not shown). Patient responses to these items were not available to the clinician at the time of the visit.

Immediately after the office visit, physicians completed a Physician Rating Form that contained the same 7-point Likert scales for rating physician perception of patient level of depression, general health, and stress; direct questions regarding physician knowledge of depression history, use of antidepressants or other psychoactive medications, or receipt of any of several types of mental health treatment within the past 6 months; the number of times they had seen the patient in the past year; and a single yes/no item asking whether the patient had clinically significant depression. These measures allowed us to capture the mental health content and context of each encounter from both patient and physician perspectives.

In the second stage, a weighted subsample of the 1580 patients was selected to complete the Structured Clinical Interview for DSM-III-R (SCID)33 and the Hamilton Rating Scale for Depression (HAM-D),34-35 administered by a trained psychiatric social worker or master's-level clinical psychologist. The SCID is an extensively validated instrument that uses a highly structured clinical interview with prompts to elicit DSM-III-R–congruent diagnoses. The HAM-D is one of the most widely used interviewer-administered tools for the assessment of severity of depressive symptoms. Scores on the HAM-D reflect both the intensity and frequency of specific symptoms. We used the Structured Interview Guide for the instrument,35 which has been shown to substantially increase interrater agreement for individual items of the scale. Consistent with the original design of the scale, we derived a total scale score based on 17 of the 21 items. At this time, subjects also completed a comprehensive interview, the Michigan Inventory of Life Events, which included several items assessing the level and severity of comorbid medical illness (copies available from authors upon request).

All interviews occurred within 2 weeks of the office visit. This interval was chosen to match the period covered by the CES-D to that covered by the SCID. The instructions accompanying the CES-D indicate that responses should refer to the past week, while the SCID questions cover the past month: this allows a maximum of 3 weeks between visit and interview to assure that the 2 instruments measure the same time frame. We adopted the more stringent 2-week limitation to ensure overlap between the 2 instruments. This process resulted in the assignment of a criterion standard DSM-III-R psychiatric diagnosis, as well as 2 estimates of the severity of the diagnosis, the Global Assessment of Functioning Scale (GAF) from the SCID and the overall score on the HAM-D.

Interrater reliability was very high for both SCID and HAM-D. Administration of the SCID was audiotaped with patients' consent, and a subset of tapes was recoded by a second interviewer to determine the reliability of interviewer judgment and scoring. Interrater reliability was 97% for SCID symptom level ratings, 98% for summary diagnostic variables, and 93% for specific diagnoses in the mood disorder category. Interrater reliability for the HAM-D was 0.97. No significant differences were seen in either demographic variables or CES-D scores between those who did (84.0%) and did not (16.0%) consent to the second-stage interview (data not shown).

DATA WEIGHTING

The data-weighting procedure was designed to obtain approximately equal numbers of depressed and nondepressed patients for the second-stage interview by oversampling high CES-D scorers. Four hundred twenty-five patients received the second-stage diagnostic interview, 271 of whom scored above the cutoff point of 16 on the CES-D and 154 of whom scored below it. To accurately estimate the prevalence of depression in the sample, a second weighting procedure was performed (overweighting the low CES-D scorers' results) to correct for the oversampling of high CES-D scorers while preserving the number of cases of major depression for analysis. The data-weighting procedure has been described in greater detail elsewhere.30-31
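The general idea of correcting for oversampling can be sketched as follows. This is a minimal illustration of inverse-sampling-fraction weighting, not the authors' actual procedure (which is described in the cited publications and also preserves the number of cases for analysis). The function names and all counts below are hypothetical toy values, not study data.

```python
def stratum_weights(screened, interviewed):
    """Inverse sampling fraction per stratum: screened / interviewed."""
    return {s: screened[s] / interviewed[s] for s in interviewed}


def weighted_prevalence(cases, interviewed, weights):
    """Prevalence after re-weighting each stratum back to its screened size."""
    weighted_cases = sum(weights[s] * cases[s] for s in interviewed)
    weighted_total = sum(weights[s] * interviewed[s] for s in interviewed)
    return weighted_cases / weighted_total


# Toy numbers for illustration only (NOT taken from the study).
screened = {"high_cesd": 400, "low_cesd": 600}     # first-stage stratum sizes
interviewed = {"high_cesd": 200, "low_cesd": 100}  # second-stage interviews
cases = {"high_cesd": 60, "low_cesd": 5}           # interview-positive cases

weights = stratum_weights(screened, interviewed)
unweighted = sum(cases.values()) / sum(interviewed.values())
weighted = weighted_prevalence(cases, interviewed, weights)
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

In this toy example the unweighted prevalence overstates the weighted one because high scorers are overrepresented among those interviewed, which is the distortion a correction of this kind is meant to remove.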

All analyses not dependent on the prevalence of MDD were performed with unweighted data. Physician Rating Forms were unavailable for 53 (12.5%) of the 425 patients receiving the SCID, but no significant differences were seen between cases with or without Physician Rating Form data in demographic or clinical variables or in the relative proportions of individual DSM-III-R diagnoses (data not shown). These analyses show a correspondingly smaller number of cases (n=372).

PATIENT ASSIGNMENT INTO TRUE-POSITIVE, FALSE-POSITIVE, FALSE-NEGATIVE, AND TRUE-NEGATIVE GROUPS

On the basis of the combination of physician identification (defined as a yes response to the question regarding clinically significant depression) and SCID diagnosis of MDD, 4 groups of patients were created: depressed patients identified by physicians (true positives); nondepressed patients misidentified as depressed by physicians (false positives); depressed patients not identified by physicians (false negatives); and nondepressed patients correctly identified as nondepressed by physicians (true negatives).
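The assignment rule can be written as a short function. This is an illustrative sketch only; the function and argument names are hypothetical and are not taken from the study's analysis files.

```python
def assign_group(physician_identified: bool, scid_mdd: bool) -> str:
    """Cross physician identification (yes/no item) with SCID diagnosis of MDD."""
    if physician_identified and scid_mdd:
        return "true positive"    # depressed, identified by physician
    if physician_identified:
        return "false positive"   # not depressed, identified as depressed
    if scid_mdd:
        return "false negative"   # depressed, not identified
    return "true negative"        # not depressed, not identified


print(assign_group(physician_identified=True, scid_mdd=False))
```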

STATISTICAL ANALYSIS

The sensitivity, specificity, and positive predictive value associated with physician identification of depression were calculated from the weighted data set by standard methods. All other analyses were performed with unweighted data. The χ² and t tests were used to compare demographic and clinical characteristics of identified and nonidentified patients. Comparisons between the 4 groups (true positives, false positives, false negatives, and true negatives) were performed with 1 of 3 methods: χ² analysis with Bonferroni correction for multiple 2 × 2 comparisons, 1-way analysis of variance with post hoc Scheffé tests for comparison of individual means, or analysis of covariance (ANCOVA).
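As a rough illustration of how a Bonferroni correction works for pairwise comparisons among 4 groups, the sketch below counts the possible pairs and divides a familywise significance level by that number. The familywise level of .05 and the choice to correct over all 6 pairs are assumptions made for illustration; the text does not state either.

```python
from itertools import combinations

groups = ["true positive", "false positive", "false negative", "true negative"]

# Every unordered pair of groups gives one 2 x 2 comparison.
pairs = list(combinations(groups, 2))

familywise_alpha = 0.05  # assumed, not stated in the article
per_comparison_alpha = familywise_alpha / len(pairs)

print(len(pairs), round(per_comparison_alpha, 4))
```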

Patient scores on the psychiatric instruments (CES-D, HAM-D, and GAF), patient self-rated level of depression, perceived stress and general health, and physician ratings for level of depression, perceived stress, and general health were compared by means of a 2-factor ANCOVA model. Group and sex served as grouping variables in the model, with age and educational level incorporated as covariates. In 2 comparisons, patient self-rated health and level of depression, the assumption of homogeneity of slopes for the covariate term educational level was violated; a recoded binary term for education (through high school/at least some college) was included as a third grouping variable, with age the sole covariate in these ANCOVA models. Marital status, ethnic status, and level of comorbid medical illness were not included in the final ANCOVA models, as they had no effect on any of the comparisons. Full sets of interaction terms were specified for each model, but all except group-by-age were removed from all models because of lack of impact on the analysis and their negative impacts on degrees of freedom and error terms for adjusted means. All reported mean scores were adjusted for possible differences in age and educational level by the least squares means method.36
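One way to write the final model described above is shown below. This is a sketch under stated assumptions, not the authors' exact specification: the symbols are introduced here for illustration, with G_i the effect of group i (4 levels), S_j the effect of sex j, and γ_i a group-specific adjustment to the age slope representing the retained group-by-age term.

```latex
y_{ijk} = \mu + G_i + S_j
        + \beta_{1}\,\mathrm{age}_{ijk}
        + \beta_{2}\,\mathrm{educ}_{ijk}
        + \gamma_{i}\,\mathrm{age}_{ijk}
        + \varepsilon_{ijk}
```

In the 2 comparisons where educational level was recoded as a binary grouping variable, the β₂ covariate term would be replaced by a third factor effect.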

All analyses were carried out with the statistical analysis software packages SPSS (version 6.1; SPSS, Inc, Chicago, Ill) and SuperANOVA (version 1.1; Abacus Concepts, Inc, Berkeley, Calif) on a Macintosh microcomputer (Apple Computer, Cupertino, Calif).


RESULTS



SAMPLE DEMOGRAPHICS AND PREVALENCE OF DEPRESSION

Subjects completing the SCID were predominantly white (92.9%), female (76.7%), and currently married (59.8%), with a mean age of 39.6 years. The majority (62.4%) had some education beyond high school. In the final weighted sample, major depression (including MDD, "double depression," and depressed bipolar disorder) was present in 13.4%, and the estimated prevalence for all depressive disorders (major depression plus dysthymia, adjustment disorder, and bereavement) was 22.0%.

THE EPIDEMIOLOGY OF DETECTION: WHO DID PHYSICIANS IDENTIFY AS DEPRESSED?

Physician diagnostic sensitivity for MDD was low, as measured in the weighted sample. Physicians identified only 35% of SCID-positive patients as depressed (sensitivity, 0.349). Although specificity was high (0.929), less than half of identified patients met criteria for major depression (positive predictive value of physician identification, 0.446).

One hundred fifteen of the 372 patients for whom a complete set of data was available were identified as depressed by either physicians or the SCID. Sixty-five were identified by physicians, while 81 met criteria for MDD on the SCID. As shown in Figure 1, 31 patients (26.9% of the depressed total) were positively identified by both methods, while 34 (29.6%) were in the false-positive group, identified by physicians but SCID negative, and 50 (43.5%) were in the false-negative group, SCID positive but not identified by physicians.



Figure 1. Proportions of patients identified as depressed by physicians, Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition (SCID), both methods, or neither method (N=372).
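The unweighted counts reported above are enough to fill in the 2 × 2 table and compute the standard accuracy measures, as in the sketch below. The true-negative count is derived here by subtraction (372 minus the 115 patients identified by either method) and is not stated directly in the text. Because these are unweighted counts from a sample enriched for high CES-D scorers, the results are expected to differ from the weighted estimates reported in the article (0.349, 0.929, and 0.446).

```python
# Unweighted counts taken from the text (N = 372 with complete data).
TOTAL = 372
tp, fp, fn = 31, 34, 50
tn = TOTAL - (tp + fp + fn)  # derived by subtraction, not reported directly


def sensitivity(tp, fn):
    """Proportion of SCID-positive patients identified by physicians."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of SCID-negative patients not identified by physicians."""
    return tn / (tn + fp)


def positive_predictive_value(tp, fp):
    """Proportion of physician-identified patients who are SCID positive."""
    return tp / (tp + fp)


print(f"TN={tn}")
print(f"sensitivity={sensitivity(tp, fn):.3f}")
print(f"specificity={specificity(tn, fp):.3f}")
print(f"PPV={positive_predictive_value(tp, fp):.3f}")
```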


The demographic and clinical characteristics of those patients identified and not identified by clinicians (independent of DSM-III-R diagnosis) are shown in Table 1. These 2 groups had similar demographic characteristics: the sole exception was age, with identified patients on average older. However, a comparison of clinical characteristics of the 2 groups suggests that clinicians were responding to meaningful clinical cues in assigning the diagnosis of depression. A significantly higher proportion of identified patients had at least 1 previous psychiatric hospitalization and previous treatment for mental health problems as revealed on the SCID. Identified patients had significantly higher scores on the CES-D (30.0 vs 17.0; P<.001) and lower scores on the GAF (61.5 vs 74.3; P<.001). Identified patients also had significantly higher scores on the HAM-D (10.3 vs 6.4; P<.001), although neither group achieved the mean score of 12 or above considered diagnostic of major depression.


Table 1. Demographic and Clinical Characteristics Associated With Physician Identification of Depression*


Identified patients had significantly higher mean self-ratings of depression (4.95 vs 3.45; P<.001) and stress (5.81 vs 5.02; P<.001) and a significantly lower mean self-rating of general health (4.25 vs 4.99; P<.001) on the 7-point Likert-type scales. Identified patients also reported a significantly higher proportion of positive responses for all 4 depressive symptoms included on the patient questionnaire: poor appetite, decreased energy level, sleep disturbance, and feeling "worn out."

Clinicians' knowledge of previous mental health problems and treatment was greater for identified patients for all variables included in the study. Clinician knowledge of previous depressive episodes was significantly higher for identified patients (82.8% vs 15.4%; P<.001), while clinician knowledge of treatment within the past 6 months was higher in identified patients for each treatment modality: counseling by primary physician, individual or group therapy, referral to mental health professional, and marital or family therapy.

Identification was also strongly associated with physician familiarity with patients. Identification almost never occurred in the context of an initial physician-patient contact, with physicians detecting depression in only 2% of patients they had not previously seen. This is significantly lower than the corresponding identification rates of 11% in patients they had personally seen between 1 and 4 times and 16% in those seen more than 4 times in the past year (χ² with 2 df = 9.50; P=.008).
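The reported P value can be checked against the reported test statistic. For a χ² distribution with 2 degrees of freedom, the upper-tail probability has the closed form exp(-x/2), so no statistical library is needed. The function name below is hypothetical, and the check is only a sanity test of the arithmetic, not a reanalysis of the data.

```python
import math


def chi2_upper_tail_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2.0)


p = chi2_upper_tail_df2(9.50)
print(f"P = {p:.4f}")
```

The result is close to the reported P=.008; a small difference in the third decimal place would be consistent with rounding of the published test statistic.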

FACTORS ASSOCIATED WITH PHYSICIAN "OVERDETECTION" OF DEPRESSION

To explore the issue of overdetection, we compared the demographic and clinical characteristics of nondepressed patients "misidentified" as depressed by physicians (false positives) with those correctly seen as not depressed (true negatives). The results of this comparison are displayed in Table 2. Physicians were more likely to misidentify older patients as depressed, but the 2 groups were otherwise comparable on demographic variables.


Table 2. Comparisons Between Identified and Not Identified Patients Without Current Major Depression: False Positives vs True Negatives*


Comparisons of clinical characteristics showed the same pattern as seen for identified vs not-identified patients. Patients in the false-positive group were significantly more likely to have a history of mental health treatment and previous mental health–related hospitalization than were those in the true-negative group: almost three fourths of patients in the false-positive group disclosed a history of mental health treatment on the SCID. Those in the false-positive group also scored significantly higher on the CES-D (28.4 vs 15.4; P<.001) and significantly lower on the GAF (68.4 vs 77.3; P<.001). Patients in the false-positive group scored higher than those in the true-negative group on the HAM-D, but this difference was not statistically significant (6.8 vs 5.9; P=.66), and neither group had scores close to the range usually seen in depressed patients. In the false-positive group, patient self-rated level of depression was significantly higher (5.08 vs 3.33; P<.001), stress level was significantly higher (5.71 vs 4.81; P=.002), and general health was significantly lower (4.27 vs 5.08; P=.001).

Physicians' perceptions were concordant with these differences in clinical appearance. Mean physician ratings of levels of depression (6.01 vs 2.75; P<.001) and stress (5.93 vs 4.22; P<.001) were significantly higher in the false-positive than in the true-negative group. Physician ratings of patients' general health were significantly lower in the false-positive than the true-negative group (4.64 vs 5.23; P=.01). Physicians also reported significantly higher levels of knowledge of previous depression (78.8% vs 13.9%; P<.001) and all types of treatment in the false-positive than the true-negative group.

On closer examination, most false-positive results appeared to occur in patients receiving effective treatment for previously detected MDD. More than half (55%) of the patients in the false-positive group disclosed at least 1 previous episode of MDD on the SCID, and most of those patients were receiving some type of antidepressant medication at the time of the index visit. Few of the patients in the false-positive group met diagnostic criteria for other mood disorders. Four met criteria for dysthymia, 2 for bereavement, and 1 for mixed bipolar disorder. Only 1 patient met criteria for "minor" or "subthreshold" depression after rescoring SCID responses to incorporate this diagnosis.

COMPARISONS BETWEEN TRUE POSITIVES, FALSE POSITIVES, FALSE NEGATIVES, AND TRUE NEGATIVES

The clinical characteristics of both "misidentified" groups (false positives and false negatives) were explored by comparing the demographic and clinical characteristics of all 4 groups of patients. The results of these comparisons are displayed in Table 3 and Table 4.


Table 3. Comparisons of Patient Demographics and Clinical Characteristics Across 4 Groups: True Positives, False Positives, False Negatives, and True Negatives*



Table 4. Comparisons of Physician Knowledge and Rating Across 4 Groups: True Positives, False Positives, False Negatives, and True Negatives*


Demographic comparisons (Table 3) showed a significant difference between the 4 groups only in mean age. Post hoc analysis indicated that patients in the false-positive group were on average older than those in both the false-negative and true-negative groups. This potential confounder was controlled for by adjusting for age in all analyses.

Comparisons of clinical characteristics (Table 3) showed a striking pattern of similarities and differences between groups. With the sole exception of HAM-D scores, values for the 2 "misidentified" groups, false positives and false negatives, were statistically indistinguishable and fell between those of the true positives and true negatives. Although the ANCOVA results for each of these comparisons showed a significant group effect and multiple significant individual group differences, no significant differences between false positives and false negatives were found (Table 3). True positives had the highest proportion of patients with previous psychiatric hospitalization and mental health treatment, followed by false positives and false negatives, then true negatives. Mean CES-D scores were highest for true positives (35.2), followed by false positives and false negatives (28.4 and 24.3, respectively), then true negatives (15.4). True positives had the lowest GAF rating (56.9), followed by false negatives and false positives (62.8 and 68.4, respectively), then true negatives with the highest mean GAF (77.3). Significant differences were noted for all individual group comparisons except the false positive–false negative comparison. The pattern seen for HAM-D scores more closely corresponded to SCID results: both true and false positives scored higher than true and false negatives. This was an expected finding, in that HAM-D items are closely related to the diagnostic criteria operationalized in the SCID.

Mean patient self-rated levels of depression, stress, and general health (Table 3) followed a similar pattern. Mean ratings were similar for true positives, false positives, and false negatives, with the mean rating of the true-negative group significantly "better" than for the other 3 groups. For example, mean self-rated depression was highest for true positives (5.51), followed by false positives (5.08), false negatives (4.36), then true negatives (3.33). Significant differences were seen in all individual group comparisons except the true positive–false positive and false positive–false negative comparisons.

The 4-group comparison of clinicians' knowledge of previous mental health problems and treatment (Table 4) showed a consistent but different pattern of results. For each of these variables, true and false positives were indistinguishable, as were false and true negatives, while significant differences between false positives and false negatives were seen in each comparison. Physicians reported a significantly higher proportion of patients with known history of depression for true positives and false positives (87.1% and 78.8%; P=.39) than for false negatives and true negatives (23.4% and 13.9%; P=.10; χ² with 3 df for 4-group comparison, P<.001). The identical pattern of results was seen for each treatment modality.

Physician ratings of levels of depression and stress (Table 4) followed the same pattern as seen for their knowledge of mental health history. For example, mean physician ratings of level of depression were significantly higher in the true-positive and false-positive groups (5.84 and 6.01; P=.59) than in the false-negative and true-negative groups (2.97 and 2.75; P=.24; ANCOVA for 4-group comparison, P<.001). Mean physician ratings of general health followed a slightly different pattern, with true-positive, false-positive, and false-negative groups having significantly lower ratings than the true-negative group.

The overall pattern of results from the 4-group comparison suggests that the clinical characteristics of misidentified patients (false positives and false negatives) were similar, and that physicians discriminated between these 2 groups on the basis of their perceived knowledge of the patient's clinical history.


COMMENT



In this study, a 2-stage sampling strategy accompanied by data-weighting procedures resulted in an epidemiologically accurate sample of primary care patients with major depression seen in routine practice in southeastern Michigan. Previous work with this data set has shown that family physicians find more severely depressed individuals and miss those with minimal impairment, and that detection rates are lower for patients whose chief complaint at the index visit is a somatic problem.25 This work extended that line of inquiry to examine the clinical and demographic characteristics of those misidentified in both directions, false positives and false negatives. Our cross-sectional, prevalence-based study design did not allow us to distinguish between new episodes (incident cases) and ongoing episodes (prevalent cases) of depression, nor did it allow us to explore how detection might unfold over time and subsequent office visits. Despite these limitations, we believe the study contains 3 robust and important findings.

First, 2 factors emerged as central to physician identification: time and a clinical "picture" suggestive of depression. Identification was strongly associated with increased physician familiarity with the patient, almost never occurring during an initial encounter with a new patient. In assigning the diagnosis of depression, the family physicians in this study also appeared to be responding to suggestive clinical cues, such as history of depression or previous treatment for depression, patient distress, and presence of vegetative symptoms. These findings are consistent with published work.9,37,38

Second, physician misidentification of nondepressed patients as depressed does not appear to be the result of random error. Two distinct groups of nondepressed patients were identified in this study, and physicians effectively discriminated between them. Patients in the false-positive group were very different from those in the true-negative group. They displayed significantly higher levels of distress and impairment and were significantly more likely to have a history of mental health problems and treatment: the majority appeared to be patients undergoing successful treatment for MDD and no longer meeting DSM-III-R diagnostic criteria. Again, physicians appeared to use the suggestive clinical cues described above in assigning the label of "depression" to these distressed and impaired patients.

Third, and perhaps most importantly, we found a striking pattern of similarities and differences in the demographic and clinical characteristics of the 2 misidentified groups, false positives and false negatives. It was not possible to distinguish between these 2 groups on the basis of impairment (GAF ratings), distress (self-reported levels of depression and stress and CES-D scores), or predisposing history (proportions with previous mental health treatment or hospitalization). False positives and false negatives were indistinguishable when viewed from any perspective other than the formal diagnostic criteria of the SCID or the closely related HAM-D. Both groups' scores occupied the middle ground between the clearly depressed true positives and clearly nondepressed true negatives on most clinical characteristics. Although false positives and false negatives appeared clinically indistinguishable, physicians clearly appeared to discriminate between these 2 groups on the basis of their knowledge of the patient's clinical history. Physicians reported a significantly higher proportion of patients with known history of mental health problems and treatment for false positives than for false negatives: here again, physicians appeared to use these clinical cues when assigning the diagnosis of depression.

Taken in the context of recent studies exploring the detection of depression in primary care,38-39 these findings suggest that primary care physicians use nonspecific clinical cues such as distress and impairment, as well as their prior knowledge of the patient, in diagnosing or detecting depression. Using this approach, they find a high proportion of patients with MDD and significant impairment (true positive) and sort out many who are clearly not depressed (true negative). For patients "in the middle," not clearly a part of either group (false positive and false negative), clinical cues have major influence over the decision to diagnose and treat "depression."

These results do not necessarily mean that "all is well" in routine primary care practice. The fact that these cues are nonspecific also implies that they can mislead clinicians. From the perspective of the current model of psychiatric caseness in primary care, physicians' use of cues in this study resulted in both overdiagnosis of distressed patients without MDD (false positive) and underdiagnosis of nondistressed patients with MDD (false negative), with a higher number of misdiagnosed than correctly diagnosed cases. Although the clinical implications of this high level of misdiagnosis are potentially serious, there is no evidence to suggest that accurate identification and treatment of mildly depressed individuals in primary care improves clinical outcomes.19-21,24, 27, 40
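The arithmetic behind "overdiagnosis" and "underdiagnosis" can be made concrete with the usual 2 × 2 accuracy measures. The sketch below is illustrative only: the counts are hypothetical rather than study data, the function name is ours, and "more misdiagnosed than correctly diagnosed cases" is read here as false positives plus false negatives exceeding true positives, which is our interpretation.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures for clinician identification against a criterion diagnosis."""
    return {
        # Share of criterion-positive patients the clinician identified.
        "sensitivity": tp / (tp + fn),
        # Share of criterion-negative patients the clinician did not label.
        "specificity": tn / (tn + fp),
        # Share of clinician-labeled patients who met criteria.
        "positive_predictive_value": tp / (tp + fp),
        "misidentified": fp + fn,
        "correctly_identified_cases": tp,
    }


# HYPOTHETICAL counts (not study data).
metrics = detection_metrics(tp=30, fp=35, fn=45, tn=260)

for name, value in metrics.items():
    print(name, round(value, 3))
```

With these made-up counts, misidentified patients (80) outnumber correctly identified cases (30) even though most nondepressed patients are correctly left unlabeled.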

However, these findings also suggest that the current model of psychiatric caseness in primary care may not accurately represent depressive disorders as they exist in primary care. The current diagnostic standard, the Structured Clinical Interview for DSM-III-R, operationally defines depressive disorder on the basis of a symptom count covering a specified time period. Patients qualify for the diagnosis of MDD in 2-week or 1-month blocks of time. Under current study protocols, patients with MDD who are successfully treated with antidepressants or therapy will not qualify for the diagnosis if screened 1 month later. If these patients are labeled as depressed by primary care clinicians after that 1 month has elapsed, they will be categorized in the false-positive (misidentified) group. Patients in this clinical transition zone, with intermediate levels of distress and moderate severity of symptoms, may only intermittently meet diagnostic criteria and will move from "disease" to "no disease" status on the basis of treatment. These patients become the false-positive and false-negative artifacts of the diagnostic classification, artificially teased apart from one end by diagnostic interview scores and from the other by clinician awareness of their history. More than half of the patients in the false-positive group in this study may fall into this category.
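The way a window-based symptom count can move a treated patient from "true positive" to "false positive" can be sketched as follows. This is a simplified illustration, not the SCID algorithm: the threshold of 5 symptoms is an assumption standing in for the full diagnostic criteria, and the patient trajectory and function names are hypothetical.

```python
SYMPTOM_THRESHOLD = 5  # assumption: simplified stand-in for the full diagnostic criteria


def meets_criteria(symptom_count: int) -> bool:
    """Criterion status for one assessment window, based only on a symptom count."""
    return symptom_count >= SYMPTOM_THRESHOLD


def classify(physician_label: bool, symptom_count: int) -> str:
    """Cross the clinician's label with criterion status for the current window."""
    criterion = meets_criteria(symptom_count)
    if physician_label:
        return "true positive" if criterion else "false positive"
    return "false negative" if criterion else "true negative"


# HYPOTHETICAL treated patient: the physician keeps the depression label at every
# visit while symptom counts fall with treatment (illustrative values only).
symptom_counts_by_month = [7, 6, 4, 3]
trajectory = [classify(True, count) for count in symptom_counts_by_month]
print(trajectory)
```

In this sketch the clinician's judgment never changes; only the window in which the patient is assessed does, which is enough to change the group to which the patient is assigned.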

We propose an alternative conceptual model of psychiatric caseness in primary care that looks at depressive disorder as a subacute or chronic condition marked by exacerbation and improvement over time, with the core parameters of severity, staging, and comorbidity.40-42 At higher levels of severity, depressive symptoms may be present all or most of the time, occur without provocation, and cause significant impairment. At intermediate levels of severity, depression may become symptomatic only under certain conditions and result in minimal or moderate impairment, and individual episodes may be short. At minimal severity, exacerbations may occur only rarely and cause minimal impairment; depressive episodes may be self-limited. Episode staging and measurement of medical and mental health comorbidity may provide the added dimensions of time and clinical context necessary to fully characterize depressive episodes.29

This chronic disease model offers an alternative formulation of primary care caseness that is more closely aligned with the clinical approach taken by physicians in this study. Patients in the true-positive group can be seen as appropriately identified patients with more severe disease. Many of the false positives in this study would have been more accurately identified in the model as patients with MDD in a later stage of an episode, successfully treated or in remission.25, 30 The rest of the patients in the false-positive and false-negative groups can be seen as a single group of distressed and possibly depressed patients with intermediate severity who warrant careful examination over time for the presence of a treatable depressive disorder.42 These patients could be diagnosed as depressed on the basis of the clinical cues of distress and impairment alone, history of major depression and recurrence of similar symptoms, or the results of screening tests. Because individual clinicians are likely to use different heuristics for diagnosing patients in this intermediate group, diagnoses assigned to patients in this group will inconsistently match those assigned by structured clinical interview. Patients with minimal distress or impairment would not require identification or treatment, as outcomes would not be improved by labeling and treating. Patients in the true-negative group can be seen as those without the disease.

The disease model might also help explain the inconsistent results seen in recent clinical trials—improvement despite "inadequate" treatment,21 relapse in the presence of adequate treatment,43 and no difference in outcome between adequate and inadequate or no treatment.24, 27, 43 Patients with depression of intermediate or minimal severity are likely to be inconsistently detected and treated, and when detected may be at either the beginning or the end of a short depressive episode. Unmeasured differences in severity, staging, and comorbidity in this intermediate group may account for both unexpected treatment success and failure. The longitudinal focus of the model should also clarify the positions of "minor" and "subthreshold" depression in the universe of mood disorders. These entities may represent an intermediate stage through which patients pass as they enter and leave major depressive episodes, or they may represent depression of intermediate severity that never reaches the level of symptoms or impairment seen in MDD.

Although clearly in the early stages of development, this alternative model offers a framework that more accurately reflects the dynamic, transitional nature of depression as it is observed in primary care. However, the model could only be suggested, not tested, by this cross-sectional study. Refinements in longitudinal study design, as well as further work to characterize the elements of severity, comorbidity, and temporal staging in depressive episodes, will be required before the model can be operationalized and tested in a real-world setting.


CONCLUSIONS



In this examination of physician detection of depression in the primary care setting, we found that diagnostic specificity is difficult to achieve, but that this difficulty is most likely caused by problems in applying the psychiatric model of caseness to the primary care setting. Our results are most consistent with a chronic disease-based model of depressive disorder, in which patients in the false-positive and false-negative groups occupy an ill-defined clinical middle ground. Many false positives can be redefined as patients with major depression under treatment or in remission, while false negatives can be redefined as depressed patients with minimal impairment. Family physicians appear to be responding to meaningful clinical cues in assigning the diagnosis of depression to these distressed and impaired patients.


AUTHOR INFORMATION



Accepted for publication November 9, 1997.

This work was supported in part by grant R01MH43796 from the National Institute of Mental Health, Bethesda, Md.

Corresponding author: Michael S. Klinkman, MD, MS, Department of Family Practice, University of Michigan, 1018 Fuller St, Ann Arbor, MI 48109-0708 (e-mail: mklinkma@umich.edu).

From the Department of Family Practice, University of Michigan, Ann Arbor.


REFERENCES



1. Robins LN, Helzer JE, Weissman MM, et al. Lifetime prevalence of specific psychiatric disorders in three sites. Arch Gen Psychiatry. 1984;41:949-958.
2. Kessler LG, Cleary PD, Burke JD. Psychiatric disorders in primary care. Arch Gen Psychiatry. 1985;42:583-587.
3. Schulberg HC, Saul M, McClelland M, Ganguli M, Christy W, Frank R. Assessing depression in primary medical and psychiatric practices. Arch Gen Psychiatry. 1985;42:1164-1170.
4. Barrett JE, Barrett JA, Oxman TE, Gerber PD. The prevalence of psychiatric disorders in a primary care practice. Arch Gen Psychiatry. 1988;45:1100-1106.
5. Katon W. Epidemiology of depression in medical care. Int J Psychiatry Med. 1987;17:93-112.
6. Broadhead WE, Blazer D, George LK, Tse CK. Depression, disability days, and days lost from work in a prospective epidemiologic survey. JAMA. 1990;264:2524-2528.
7. Regier DA, Boyd JH, Burke JD Jr, et al. One-month prevalence of mental disorders in the United States: based on five epidemiologic catchment area sites. Arch Gen Psychiatry. 1988;45:977-986.
8. Von Korff M, Shapiro S, Burke JD, et al. Anxiety and depression in a primary care clinic: comparison of Diagnostic Interview Schedule, General Health Questionnaire, and practitioner assessments. Arch Gen Psychiatry. 1987;44:152-156.
9. Gerber PD, Barrett J, Barrett J, Manheimer E, Whiting R, Smith R. Recognition of depression by internists in primary care: a comparison of internist and "gold standard" psychiatric assessments. J Gen Intern Med. 1989;4:7-13.
10. Perez-Stable EJ, Miranda J, Munoz RF, Ying Y. Depression in medical outpatients: underrecognition and misdiagnosis. Arch Intern Med. 1990;150:1083-1088.
11. Johnstone A, Goldberg D. Psychiatric screening in general practice: a controlled trial. Lancet. 1976;1:605-609.
12. Linn LS, Yager J. The effect of screening, sensitization, and feedback on notation of depression. J Med Educ. 1980;55:942-949.
13. German PS, Shapiro S, Skinner EA, et al. Detection and management of mental health problems of older patients by primary care providers. JAMA. 1987;257:489-492.
14. Shapiro S, German PS, Skinner EA, et al. An experiment to change detection and management of mental morbidity in primary care. Med Care. 1987;25:327-339.
15. Rand EH, Badger LW, Coggins DR. Toward a resolution of contradictions: utility of feedback from the GHQ. Gen Hosp Psychiatry. 1988;10:189-196.
16. Magruder-Habib K, Zung WWK, Feussner JR. Improving physicians' recognition and treatment of depression in general medical care. Med Care. 1990;28:239-250.
17. Depression Guideline Panel. Clinical Practice Guideline Number 5: Depression in Primary Care, 1: Detection and Diagnosis. Rockville, Md: US Dept of Health and Human Services, Agency for Health Care Policy and Research; 1993. AHCPR publication 93-0550.
18. Depression Guideline Panel. Clinical Practice Guideline Number 5: Depression in Primary Care, 2: Treatment of Major Depression. Rockville, Md: US Dept of Health and Human Services, Agency for Health Care Policy and Research; 1993. AHCPR publication 93-0551.
19. Callahan CM, Hendrie HC, Dittus RS, Brater DC, Hui SL, Tierney WM. Improving treatment of late life depression in primary care: a randomized clinical trial. J Am Geriatr Soc. 1994;42:839-846.
20. Katon W, Von Korff M, Lin E, et al. Collaborative management to achieve treatment guidelines: impact on depression in primary care. JAMA. 1995;273:1026-1031.
21. Simon GE, Lin EHB, Katon W, et al. Outcomes of "inadequate" antidepressant treatment. J Gen Intern Med. 1995;10:663-670.
22. Schulberg HC, McClelland M, Gooding W. Six-month outcomes for medical patients with major depressive disorders. J Gen Intern Med. 1987;2:312-317.
23. Ormel J, van den Brink W, Koeter MWJ, et al. Recognition, management, and outcome of psychological disorders in primary care: a naturalistic follow-up study. Psychol Med. 1990;20:909-923.
24. Tiemens BG, Ormel J, Simon GE. Occurrence, recognition, and outcome of psychological disorders in primary care. Am J Psychiatry. 1996;153:636-644.
25. Coyne JC, Schwenk TL, Fechner-Bates S. Nondetection of depression by primary care physicians reconsidered. Gen Hosp Psychiatry. 1995;17:3-12.
26. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition. Washington, DC: American Psychiatric Association; 1987.
27. Simon GE, Von Korff M. Recognition, management, and outcomes of depression in primary care. Arch Fam Med. 1995;4:99-105.
28. Klinkman MS, Coyne JC, Gallo SM, Schwenk TL. Can case-finding instruments be used to improve physician detection of depression in primary care? Arch Fam Med. 1997;6:567-573.
29. DeGruy F. Mental health care in the primary care setting. In: Donaldson MS, Yordy KD, Lohr KN, Vanselow NA, eds. Primary Care: America's Health in a New Era. Washington, DC: National Academy Press, Committee on the Future of Primary Care, Division of Health Care Services, Institute of Medicine; 1996:285-311.
30. Coyne JC, Fechner-Bates S, Schwenk TL. Prevalence, nature and comorbidity of depressive disorders in primary care. Gen Hosp Psychiatry. 1994;16:267-276.
31. Fechner-Bates S, Coyne JC, Schwenk TL. The relationship of self-reported distress to depressive disorders and other psychopathology. J Consult Clin Psychol. 1994;62:550-559.
32. Radloff LS. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Meas. 1977;1:385-401.
33. Spitzer RL, Williams JBW, Gibbon M, First M. Structured Clinical Interview for DSM-III-R: Nonpatient Edition (SCID-NP 9/1/89 Version). New York, NY: Biometrics Research Division, New York State Psychiatric Institute; 1989.
34. Hamilton M. A rating scale for depression. J Neurol Neurosurg Psychiatry. 1960;23:56-62.
35. Williams JB. A structured interview guide for the Hamilton Depression Rating Scale. Arch Gen Psychiatry. 1988;45:742-747.
36. Winer BJ. Statistical Principles in Experimental Design. 2nd ed. New York, NY: McGraw-Hill Inc; 1971.
37. Farmer AE, Griffiths H. Labelling and illness in primary care: comparing factors influencing general practitioners' and psychiatrists' decisions regarding patient referral to mental illness services. Psychol Med. 1992;22:717-723.
38. Susman JL, Crabtree BF, Essink G. Depression in rural family practice: easy to recognize, difficult to diagnose. Arch Fam Med. 1995;4:427-431.
39. Main DS, Lutz LJ, Barrett JE, Mathew J, Miller RS. The role of primary care clinician attitudes, beliefs, and training in the diagnosis and treatment of depression: a report from the Ambulatory Sentinel Practice Network. Arch Fam Med. 1993;2:1061-1066.
40. Ormel J, Oldehinkel T, Brilman E, van den Brink W. Outcome of depression and anxiety in primary care: a three-wave 3-year study of psychopathology and disability. Arch Gen Psychiatry. 1993;50:759-766.
41. Callahan CM, Hui SL, Nienaber NA, Musick BS, Tierney WM. Longitudinal study of depression and health services use among elderly primary care patients. J Am Geriatr Soc. 1994;42:833-838.
42. Goldberg D. A classification of psychological distress for use in primary care settings. Soc Sci Med. 1992;35:189-193.
43. Rost K, Zhang M, Fortney J, Smith J, Coyne J, Smith GR. Persistently poor outcomes of undetected major depression in primary care. Gen Hosp Psychiatry. 1998;20:12-20.

© 1998 American Medical Association. All Rights Reserved.
