In the last post of this month's series on research methods, we want to touch on an important aspect of comparative effectiveness research (CER): what happens when a treatment works well for the average patient, but not for certain segments of the population or for particular individuals? This situation, in which patients vary in their response to a treatment, is known as heterogeneity of treatment effect, or simply heterogeneity.
Variation can occur between populations of patients, between studies, between individual patients, or in the optimal treatment for individual patients. Subgroup analysis, a common method for evaluating whether treatment effects differ between defined subgroups of patients, can indicate whether such variation exists.
To determine whether treatment effects differ, a single statistical test (ie, a test for "interaction") is first conducted to determine whether there is any statistically significant difference in treatment effects across levels of a patient characteristic (eg, age groupings). If a statistically significant difference is found, further testing is required to identify which levels of that characteristic (eg, which age groups) are experiencing different treatment effects.
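As a rough illustration of the interaction test described above, the sketch below compares the treatment effect in two subgroups with a simple z-test on the difference between the two estimates. It assumes the subgroup estimates are independent and approximately normally distributed. The function name and the numbers are hypothetical and are not drawn from any study. With more than two levels of a characteristic, a single test across all levels would be needed instead of this two-group shortcut.

```python
import math

def interaction_z_test(effect_a, se_a, effect_b, se_b):
    """Two-sided z-test for a difference between two subgroup treatment effects.

    Assumes the two estimates are independent and roughly normal.
    For ratio measures (odds ratios, hazard ratios), pass log-scale
    estimates and their standard errors.
    """
    diff = effect_a - effect_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)   # SE of the difference
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return z, p_value

# Hypothetical estimates: treatment effect in younger vs. older patients
z, p = interaction_z_test(effect_a=0.50, se_a=0.10, effect_b=0.20, se_b=0.10)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value here only says that the two subgroup effects appear to differ. It does not say that the treatment works in one group and fails in the other.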
Many research studies are designed to evaluate the difference in treatment effectiveness between two main treatment groups, and are not initially designed with subgroup analysis in mind. As a result, most studies have only enough statistical power to detect the overall (main-effect) difference between the treatment groups. A subgroup analysis may therefore fail to detect a statistically significant difference in one or more subgroups when such a difference actually exists, simply because the study is not large enough.
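To make the power problem concrete, here is a minimal sketch using a normal approximation. It assumes a continuous outcome with a known standard deviation, two equal-sized arms, and two equal-sized subgroups. All numbers and names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test for a true effect with standard error se."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability that the test statistic falls in either rejection region
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

# Hypothetical trial: 100 patients per arm, outcome SD of 1, true difference of 0.4
n_per_arm, sd, delta = 100, 1.0, 0.4

# Main effect: difference between the two arm means
se_main = sd * sqrt(2 / n_per_arm)

# Interaction: difference between the treatment effects in two equal-sized
# subgroups, each with n_per_arm / 2 patients per arm
se_interaction = sd * sqrt(8 / n_per_arm)

power_main = power_two_sided(delta, se_main)
power_interaction = power_two_sided(delta, se_interaction)
print(f"power for main effect:  {power_main:.2f}")
print(f"power for interaction:  {power_interaction:.2f}")
```

Under these assumptions the interaction contrast has twice the standard error of the main effect, so a subgroup difference as large as the overall effect would be detected far less often than the overall effect itself.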
Because so many baseline variables could define subgroups (young vs. elderly, males vs. females, those with a genetic marker vs. those without), many subgroup analyses are conducted after the data are gathered rather than specified in an analysis plan ahead of time. Such post hoc analyses should be considered exploratory, and only in exceptional circumstances should they affect the main conclusions drawn from the study.
There are both strengths and limitations in conducting a subgroup analysis. On the positive side, subgroup analysis can assess whether subgroups of patients benefit or experience harm differentially from a treatment, which informs clinical decision-making. On the negative side, when multiple subgroup analyses are conducted, the probability of observing a false positive (finding a significant interaction when one does not exist) is inflated. This could lead researchers to incorrectly conclude that the treatment effect differs across subgroups when it does not.
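The inflation of false positives can be sketched with a short calculation. The example assumes the subgroup tests are independent and each uses the same significance level, which is a simplification since real subgroup tests are often correlated. The function names are hypothetical, and the Bonferroni correction is shown only as one common, conservative adjustment.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / n_tests

for k in (1, 5, 10, 20):
    fwer = familywise_error_rate(0.05, k)
    threshold = bonferroni_threshold(0.05, k)
    print(f"{k:>2} tests: P(at least one false positive) = {fwer:.2f}, "
          f"Bonferroni per-test threshold = {threshold:.4f}")
```

With ten independent tests at the 0.05 level, the chance of at least one spurious "significant" interaction is roughly 40%, which is why unplanned subgroup findings are best treated as exploratory.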
Want to learn more? In this brief video, Dr. Darius Lakdawalla, Director of Research at the Leonard D. Schaeffer Center for Health Policy and Economics at the University of Southern California, discusses the concept of heterogeneity: what it is and why it must be taken into consideration when conducting and analyzing CER.