If you’re like me, you often find yourself struggling to select an outfit for a party. Finding the garments to provide the right balance between comfort and style is a difficult task. If you focus all your attention on looking sharp, you’ll find yourself showing up to the party in a suit and tie, while if you focus all your attention on being comfortable, you’ll find yourself showing up in sweatpants. In the absence of a professional stylist to consult, striking the right balance is tricky. I often find myself settling on a compromise: I’ll wear jeans, but I’ll pick out my nicest pair (what some refer to as “grad student formal”).
The difficulty in this decision-making process lies in the fact that there are two competing goals. While the optimal decision for each goal may be obvious (wear a suit for style and sweatpants for comfort), there’s no decision that is simultaneously optimal for both. The right balance between the two goals isn’t obvious, and the optimal balance may vary across individuals. Even if you could ask an individual directly, most people wouldn’t be able to articulate how they weigh style against comfort. In this case, it might seem that the best solution would be to hire a team of professional stylists and observe how they select outfits for a number of individuals, doing their best to balance style and comfort using their expertise. Then, one could try to emulate the decisions of the professionals.
Similar themes show up in medical decision making. A large body of statistical literature has focused on estimating decision rules for assigning treatment to optimize a single clinical outcome. However, this focus on one outcome is disconnected from what actually happens in the clinic; much as we all have to select outfits to balance the trade-off between comfort and style, physicians often must make treatment decisions to balance the trade-off between multiple outcomes. Suppose you were a mental health professional treating a patient with bipolar disorder. You know that prescribing an antidepressant may help your patient control their symptoms of depression. However, you’ve recently read research articles, like the one by Gabriele Leverich and colleagues, indicating that antidepressants may induce manic episodes (Leverich et al., 2006). The value that each patient places on symptoms of depression and symptoms of mania is unknown and may vary from patient to patient. How can we use data to learn decision rules for treatment that balance two outcomes in a meaningful way?
A recent project that I have worked on approaches the two-outcome problem through the lens of utility functions. We assume that there exists some unknown utility function (a possibly patient-dependent function of the two outcomes) that physicians seek to optimize, perhaps subconsciously, when selecting treatments. The physician will not always be able to assign a patient the best treatment for that patient’s utility function, but we can assume that they are successful with some probability. In observational data, where treatment decisions are not randomized, this assumption allows us to model clinician decisions, estimate a patient-specific utility function, and estimate an optimal decision rule for the estimated utility function.
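To make the idea concrete, here is a toy sketch in Python. It is not the estimator from the project, only an illustration under strong simplifying assumptions: the utility is a single weight shared by all patients (rather than patient-dependent), each patient's expected outcomes under each treatment are treated as known, higher scores are better for both outcomes, and the physician picks the utility-optimal treatment with a known probability `rho`. All function names and numbers are hypothetical.

```python
import math


def composite_utility(depression, mania, weight):
    """Weighted combination of two outcome scores (higher is better for both)."""
    return weight * depression + (1.0 - weight) * mania


def best_treatment(expected, weight):
    """Return the treatment whose expected outcomes maximize the utility.

    `expected` maps a treatment name to a (depression, mania) pair of
    expected outcome scores for one patient (assumed known in this toy).
    """
    def utility_of(treatment):
        depression, mania = expected[treatment]
        return composite_utility(depression, mania, weight)

    return max(expected, key=utility_of)


def log_likelihood(records, weight, rho):
    """Log-likelihood of the observed choices between two treatments, assuming
    the physician picks the utility-optimal one with probability rho."""
    total = 0.0
    for expected, chosen in records:
        agrees = chosen == best_treatment(expected, weight)
        total += math.log(rho if agrees else 1.0 - rho)
    return total


def estimate_weight(records, rho, grid):
    """Pick the candidate weight that best explains the observed choices."""
    return max(grid, key=lambda w: log_likelihood(records, w, rho))


# Two made-up patients: (expected outcomes per treatment, observed choice).
records = [
    ({"antidepressant": (0.8, 0.2), "none": (0.4, 0.6)}, "antidepressant"),
    ({"antidepressant": (0.6, 0.0), "none": (0.5, 0.9)}, "none"),
]

weight_hat = estimate_weight(records, rho=0.9, grid=[0.2, 0.7, 0.95])
new_patient = {"antidepressant": (0.7, 0.3), "none": (0.5, 0.5)}
print(weight_hat, best_treatment(new_patient, weight_hat))
```

On this tiny candidate grid, only the middle weight is consistent with both observed choices, so it has the highest likelihood. The estimated weight then defines a decision rule for a new patient: recommend whichever treatment maximizes the estimated utility.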
This idea represents a new way of thinking about observational data. Randomized controlled trials are widely considered the gold standard for medical research, and many statistical methods are designed to take observational data and apply transformations that allow us to perform analyses as if treatment decisions were randomized. However, the statistical method we’ve been developing for this project handles observational data differently: it recognizes that when treatment decisions are not randomized, there may be information to be gleaned from the decisions themselves. This can be viewed as a form of inverse reinforcement learning, where we observe decisions made by someone with expertise, attempt to discern the goals of the expert, and finally, attempt to learn policies that will achieve the expert’s goals. This idea is similar in spirit to imitation learning, covered in more detail in a previous post on this blog entitled “The Computer is Watching!” by Eric Rose.
We applied our method to data from the observational component of the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD) study. By observing treatment decisions made in the clinic and assuming that physicians are implicitly acting with the intent to balance each patient’s depression symptom score and mania symptom score, we were able to construct a composite “depression-mania” score and estimate a decision rule for determining which patients should receive an antidepressant in order to optimize this composite score. We estimated that applying the resulting decision rule in a population would achieve a 7% improvement in the composite score compared to standard practice. Much like observing the actions of a professional stylist could help us all improve our fashion sense, in the future we may be able to use observed actions of experienced physicians to help us systematically construct better decision rules for assigning treatment.
Leverich, G. S., et al. (2006). “Risk of switch in mood polarity to hypomania or mania in patients with bipolar depression during acute and continuation trials of venlafaxine, sertraline, and bupropion as adjuncts to mood stabilizers.” American Journal of Psychiatry, 163(2), 232-239.
Daniel recently completed his PhD in biostatistics at UNC (congratulations, Daniel!!)! We thought this posting was a great excuse to get to know a little more about him, so we asked him a question!
Provide a list of five do’s and don’ts that apply both to effective teaching in STEM and dealing with a wild bear.