DSM Disorders Disappear in Statistical Clustering of Psychiatric Symptoms (2024)

“Reconstructing Psychopathology: A data-driven reorganization of the symptoms in DSM-5” by Miri Forbes et al. (available as a preprint at the time of writing this post, later published in Clinical Psychological Science) is a brilliantly designed and innovative study of the quantitative structure of psychopathology with important ramifications for our understanding of psychiatric classification. No one has conducted a study quite like this before, and the results are remarkable. It takes place in the context of the development of the Hierarchical Taxonomy of Psychopathology (HiTOP), a dimensional, hierarchical, and quantitative approach to the classification of mental disorders that relies on identifying patterns of covariation among symptoms.

The study is based on a large online survey, with participants recruited from a variety of sources, resulting in a socio-demographically diverse sample of 14.8K participants. Participants could opt to complete a mini, short, medium, or long version of the questionnaire. The survey consisted of items based on individual symptoms derived from DSM-5. Symptoms were written in first person and past tense, as close to the DSM phrasing as possible but devoid of information about symptom onset, duration, frequency, and severity. Importantly, survey items were presented to participants in a random order. This matters because prior studies did not present questions about symptoms in a random order. They relied either on symptom questionnaires that cluster symptoms together in ways influenced by the diagnostic manuals or on structured clinical interviews that adopt the DSM organization. Asking about symptoms in a random order ensures that their co-occurrence is not artificially influenced by the order in which questions are asked. The survey went through multiple rounds of pilot testing, and in the end, 680 items were included. Participants reported how true each symptom statement was for them in the past 12 months on a five-point scale from Not at all true (Never) to Perfectly true (Always). Participants were told to think about their experiences across a wide variety of contexts.

The responses were subjected to two statistical clustering methods: iclust and Ward’s hierarchical agglomerative clustering. Clusters were accepted for further analysis when both methods agreed. This was intended to ensure that there were no idiosyncrasies arising from reliance on one method. This resulted in 139 clusters (“syndromes”) and 81 solo symptoms. Higher-order constructs were identified using hierarchical principal components analysis and hierarchical clustering. The sample was divided into a primary sample (11.8K) and a hold-out sample (3K) to examine the robustness of results. The final classification was based on points of agreement between samples and methods.
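To make the "accept only where both methods agree" idea concrete, here is a minimal toy sketch in Python. It is not the authors' pipeline: the item names, the cluster labels, and the rule that agreement means an exact match of cluster membership are all my assumptions for illustration, and the paper's actual agreement criterion may well differ. The two label dictionaries simply stand in for the output of any two clustering methods.

```python
from collections import defaultdict


def clusters_from_labels(labels):
    """Turn {item: cluster_label} into a set of frozensets, one per cluster."""
    groups = defaultdict(set)
    for item, label in labels.items():
        groups[label].add(item)
    return {frozenset(members) for members in groups.values()}


def consensus(labels_a, labels_b):
    """Keep only clusters that both methods produce with identical membership.

    Returns (syndromes, solo, disputed):
      syndromes - multi-item clusters found by both methods
      solo      - items that both methods leave on their own
      disputed  - items the two methods group differently
    """
    agreed = clusters_from_labels(labels_a) & clusters_from_labels(labels_b)
    syndromes = sorted(sorted(c) for c in agreed if len(c) > 1)
    solo = sorted(next(iter(c)) for c in agreed if len(c) == 1)
    covered = set().union(*agreed) if agreed else set()
    disputed = sorted(set(labels_a) - covered)
    return syndromes, solo, disputed


# Hypothetical output of two clustering methods on six made-up items.
method_a = {"low_mood": 1, "anhedonia": 1, "worry": 2,
            "restless": 2, "insomnia": 3, "binge": 4}
method_b = {"low_mood": "x", "anhedonia": "x", "worry": "y",
            "restless": "z", "insomnia": "z", "binge": "w"}

syndromes, solo, disputed = consensus(method_a, method_b)
print(syndromes)  # [['anhedonia', 'low_mood']]
print(solo)       # ['binge']
print(disputed)   # ['insomnia', 'restless', 'worry']
```

In this toy version, only groupings that survive under both methods are carried forward, so a cluster that exists because of the quirks of one algorithm drops out.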

The final higher-order structure included 8 spectra: Externalizing, Harmful Substance Use, Mania/Low Detachment, Thought Disorder, Somatoform, Eating Pathology, Internalizing, and Neurodevelopmental and Cognitive Difficulties. Twenty-seven subfactors were identified. As an example, within the Internalizing spectrum, the 4 subfactors were: Distress, Social Withdrawal, Dysregulated Sleep and Trauma, and Fear. Similar to earlier literature, a single overarching dimension also emerged. This has been described before as the “p-factor” (general psychopathology factor), but Forbes et al. chose to call it the “Big Everything” to avoid reifying it.

So here it is, an empirically derived hierarchical clustering of individual symptoms across the range of psychopathology:

[Figure: Forbes et al., 2023.]

The end result converges substantially with the existing HiTOP model, with some points of divergence that will be important for future revisions of HiTOP.

An important thing to note is that many classic DSM disorders do not emerge as identifiable syndromes in these analyses.

Due to the symptom heterogeneity of DSM constructs, they are either broken down into smaller homogeneous syndromes or they merge into higher-order clusters such as subfactors and spectra. (And this is the case not just in this particular study, but has been the case in prior analyses on which the HiTOP model is based, even though those exploratory symptom-level analyses had been conducted using measures based on DSM/ICD.)

There is much to discuss in this paper, but for illustrative purposes, I’ll focus on the case of depression and some other internalizing disorders.
