Welcome back. This is Kay Dickersin, and this is Section B of the lecture on framing the question for the Systematic Reviews and Meta-analysis course. Section B covers deciding the type and scope of the question that you're going to be addressing in your systematic review. This is a very important part of your systematic review, but I have to confess it's probably the one where people have the most problems. And so maybe I shouldn't tell you that ahead of time, because I don't want you to be frightened, but I want to say you really need to pay attention to this. It's not as easy as it looks.

So the first thing I'm going to tell you is that there are all different kinds of questions we have about health care, about epidemiology, about interventions. They are all different types of questions, as it turns out, and each type of question requires a different type of research. And don't worry, I'll get into that. But let's start first by just thinking about the different types of questions we might have.

So you might say, what proportion of the population is newly diagnosed with this problem each year? That's what's called an incidence question, and there's a certain type of study design that you would use if you had an incidence question. And that means when you're looking at the literature and doing a systematic review, you don't just take any old paper, or article, or research that says that it's looking at incidence. You want to make sure it's the type of design that actually can address an incidence question properly, with minimum bias. You might have a prevalence question: what proportion of the population is currently living with this problem? Both of those are standard epidemiology questions. The questions that kind of belong together are therapy, screening and prevention, and sometimes harm. Those are all intervention questions. So a therapy question is, what should be done to treat this problem?
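An aside on the incidence and prevalence questions above: both ask for a simple proportion, so a toy calculation looks like this. All the counts here are made up for illustration; none of them come from the lecture.

```python
# Toy numbers, not real data: a hypothetical population of 100,000 people.
def incidence(new_cases, population_at_risk):
    """Proportion newly diagnosed with the problem during the period (e.g., one year)."""
    return new_cases / population_at_risk

def prevalence(existing_cases, total_population):
    """Proportion currently living with the problem."""
    return existing_cases / total_population

# 250 new diagnoses this year; 4,000 people currently living with the condition:
print(incidence(250, 100_000))     # 0.0025, i.e., 250 per 100,000 per year
print(prevalence(4_000, 100_000))  # 0.04, i.e., 4%
```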
A screening question is, will detecting this problem early, before I get symptoms, make a difference in my health? And health is the outcome there for a screening question. Another type of question that's like that, because it's an intervention question, is how can this problem be prevented? That's at the bottom of the slide, but that's also an intervention-type question.

You might have a question, how good is this test at detecting this problem? And that's a diagnostic accuracy question. That's a different type of question from a screening question. So the screening question is saying, will detecting this problem early make a difference in my health? So health is an outcome. In a diagnostic accuracy question, you're just comparing one test to another and seeing how good that test is at detecting the problem. So we're looking at things like sensitivity and specificity.

A prognosis question is a type of question that doctors are interested in, but also people in public health. What is the likely outcome of this problem? If I have a baby two months early, what's the likely outcome for that baby? A harm question is, will there be any negative effects of this intervention? And so often when we're looking at the effectiveness of a therapy, we also want to look at harms and possible safety questions. And then finally, an etiology question, which you see in epidemiology all the time: what causes this problem? If I have glaucoma, is it because I have a family history of glaucoma? Or how likely is it that the family history contributed?

So think about your question and then try classifying it. Because it's not until you classify it that you actually can decide what type of research is the best research to examine the question. Let's look at some examples. I've already given you a few, but here they are written down so you can think about them in that context. Incidence and prevalence. Here are some examples.
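Before the examples, one more aside on the sensitivity and specificity just mentioned for diagnostic accuracy questions: both come from a 2×2 table comparing test results against a reference standard. A minimal sketch, with hypothetical counts of my own:

```python
# Hypothetical 2x2 table: test results versus a reference standard.
def sensitivity(tp, fn):
    """Of the people who truly have the problem, what proportion does the test detect?"""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Of the people who truly don't have the problem, what proportion does the test clear?"""
    return tn / (tn + fp)

# Suppose 80 of 100 diseased people test positive (20 false negatives),
# and 180 of 200 disease-free people test negative (20 false positives):
print(sensitivity(tp=80, fn=20))   # 0.8
print(specificity(tn=180, fp=20))  # 0.9
```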
What's the incidence of low birth weight in minority populations compared to the white population? A therapy question: is exercise effective in improving quality of life in persons with COPD? A screening question: is PSA to detect prostate cancer effective? And that means effective in terms of saving lives or some sort of health outcome; in this case, we've said, is it effective in reducing mortality? Diagnostic accuracy: how effective is MRI at detecting new breast cancers in follow-up of women with breast cancer who had lumpectomy? Here's a prognosis question: what is the effect of pregnancy on exacerbating the symptoms of multiple sclerosis? Here's a harm question: what proportion of postmenopausal women receiving calcium and vitamin D can expect to have kidney stones? That's a common side effect of this intervention. And etiology: is coffee consumption causally associated with developing pancreatic cancer? Probably not, I'll say as an aside, but we're always worried about coffee consumption, and I'm happy to tell you that, so far as we know, it's not that bad for us.

So now you can see there are different types of questions and the types of things that are being addressed with those questions. Let's look now at the study designs that you will use to study those questions. So if we have an incidence question or a prevalence question, we look for surveys or cohort studies. If we have a therapy or a screening question, we're looking for clinical trials, randomized clinical trials. Because it's only by comparing one intervention to another that we'll be able to tell if they have similar outcomes, or if one is superior to the other. For questions of diagnostic accuracy, we would love to have randomized clinical trials, but you hardly ever find them. More likely you'll find cross-sectional studies. And we say okay, it would be great if those studies were randomized or had a random start, but we rarely see them.
For prognosis, most commonly you'll see a cohort study, but a clinical trial can also tell you about prognosis. That would be great. For harm, also, clinical trials would be wonderful, randomized clinical trials. But usually they aren't big enough to be able to detect rare outcomes, nor are they conducted long enough. And so we have to look instead at cohort studies or case-control studies when we have questions of harm. For an etiology question, the epidemiology type of question, we typically use observational studies, cohort studies, or case-control studies.

So now you see why it's so necessary that we classify the question before we begin our systematic review. That's because each type of question requires a different type of study to minimize bias, or at least that's what we would expect. So if we have a question about how well an intervention works, we probably won't look at case-control studies, because they have too high a likelihood of bias associated with them. Instead, we'll look for randomized clinical trials. However, when it comes to looking at harm, we probably will have to turn to case-control or cohort studies, because the randomized clinical trials just either aren't long enough or big enough to be able to detect harm reliably.

Now, many of you have probably seen this, what's called levels of evidence, or the pyramid of evidence. It's called a hierarchy of evidence. And often if I'm asked to talk to a group, they'll say, oh please, would you review the hierarchy of evidence with us, because we want to understand it. The first thing I want to emphasize, and most importantly, is that this hierarchy of evidence really only works for those intervention questions that I mentioned before: questions of therapy; questions of harm, although I want to put a caveat on there, because we know from the beginning that we are unlikely to find this type of study, intervention studies, for harm; studies of screening; and also studies of prevention.
So now we can look at this hierarchy of evidence, which doesn't work so well when we're talking about studies of etiology, prognosis, incidence, or prevalence. If you look at this hierarchy of evidence, what you can see is that at the top, the highest form of evidence is a systematic review of randomized clinical trials. There aren't as many of these as there are unsystematic clinical observations, but they are the highest form of evidence for determining whether an intervention works, whether it's treatment, prevention, screening, or detecting harm. A single randomized trial, that is, where fewer than two randomized trials are available for a systematic review, is better than a systematic review of observational studies in terms of minimizing bias. But if you have to, you may be turning to observational studies, for example, as I mentioned, for detecting harm. And so you can see what this hierarchy of evidence represents: the studies you find least often, in most cases the systematic reviews of randomized trials, are also the highest form of evidence. What you'll find a lot of is very low evidence, evidence we probably would not consider, in a typical situation, a sufficient basis for giving standard-of-care treatment. There are exceptions, however, in very rare diseases, where it's a desperate situation, but we're not going to talk about that here.

So John Tukey, who was a well-known statistician at Princeton, said this many years ago, in the 1960s, and I think it's a really great maxim for those of us who are doing systematic reviews: "The most important maxim for data analysis to heed, and one which many statisticians have shunned, is this: Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise." Now, this is really a good beginning for us to talk about how to formulate that question that we're going to try to address in this class.
That is, in doing systematic reviews, people tend to ask pretty broad questions rather than very narrow ones. I'll give an example. One might ask whether it prevents another heart attack if you take aspirin after your first heart attack. So a very broad question might be, is secondary prevention of heart attack with aspirin effective? Now, one might say, well, when would you start taking that aspirin? Should I start taking it as soon as I have the first heart attack? What if I start taking it ten years later? How much aspirin should I take? Should I take a baby aspirin every day? Should I take two aspirin every four hours? What's the right amount? And for how long should I take it? Do I take it the rest of my life? Do I take it a short amount of time?

So you can see that that question I mentioned, is aspirin effective in secondary prevention of heart attack, can be made very precise. But the trouble is, when you make that question very precise, there are going to be fewer and fewer studies that can address the question. And you'll be making it so precise that, in fact, maybe there are no studies that answer the question. And in fact, maybe it's okay if the question was very broad. Maybe any amount of aspirin is helpful. Maybe taking it anytime after that first heart attack is helpful. Maybe both men and women can be helped by taking aspirin. So you want to be careful when you narrow your question down a lot, because maybe that's not how systematic reviews are best conducted.

Now, studies that are out there, and we've already touched on this, can differ in many, many different ways: the types of population that they're studying, the inclusion and exclusion criteria. If we use the heart attack example, maybe some studies say you can't have had diabetes, and other studies say you can. Maybe some say you can only have had one heart attack; some say you can have had an infinite number. So they can differ in all kinds of things: how you define how much aspirin is taken.
What the comparison group is: is it taking no aspirin, or maybe a vitamin pill? What you do if people want to take other types of drugs, for example, that might help prevent heart attack. How you define the outcome: did you measure heart attack at 2 years, at 4 years, at 10 years, at 20 years? How did you define heart attack? And finally, the quality of the study design and how the study was conducted, how the analysis was conducted, whether missing data influenced that analysis and how it was done. So there are all different ways similar studies can differ, and this affects how you define your systematic review and what your question is. I am not going to go into a lot of detail here on ways that studies can differ; you're about to find it out in your own systematic reviews.

However, I've already touched on the fact that you're going to make your question, and you have to decide whether it's a broad or a narrow question. The trouble with narrow questions is that they may not be applicable to multiple groups or populations. So if I say I'm only going to look at aspirin studies to prevent heart attacks in men ages 35 to 40, whatever results I get, they're only going to apply to men 35 to 40. And that may not be the question you really have in mind. You can also get spurious findings, because you'll have fewer studies, and that can be a problem when you have a smaller sample size. There are many examples of spurious findings in situations like that: for example, the efficacy of aspirin in preventing strokes in women. There was an association that was seen, incorrectly, between dysfunctional uterine bleeding and BMI in African American women. And this is something that just ended up not being true, because the questions were too narrow. If you want to learn more, or see a comparison of the pros and cons of broad and narrow questions, I'll refer you to the Cochrane Handbook, table 5.6.a, and you can look at extensive tables there.
Here we're just going to talk about it in general terms: what's the downside or the upside of looking at narrow questions or broad questions? When you have very broad questions, such as is aspirin effective in preventing a second heart attack, you might get, as I mentioned, all different kinds of studies. And you have the criticism I'm sure you've heard before, that systematic reviews and meta-analyses compare apples and oranges. And this can happen. So your goal is really to compare different kinds of apples, not apples and oranges, because that really doesn't work. So you have to make the decision whether men and women are similar enough that you're going to include them both in your systematic review and meta-analysis. I would tend to do it, and then if you think there's likely to be a difference, you might plan ahead of time to do a subgroup analysis. But I think men and women are probably similar enough that, in the example I gave you, you could use both of them. However, the dose might be something where you would draw the line, or the timing since that first heart attack. So you really have to decide how much of a difference it makes in terms of the validity of the answer that you're likely to get.

Another problem with broad questions is, how do you search the literature? It could take you forever, because you're likely to find a large number of hits when you do searches of multiple databases. Now, that's just real life, and that's one of the side effects of a systematic review. But you do have to think about that ahead of time. Certainly, if you find more studies, it'll make your synthesis more difficult. But again, you might be willing to take that on, because that's the question people really have. So again, if you want to compare the pros and cons of broad or narrow questions, do look at the Cochrane Handbook, where there's a very nice table doing this. And we'll just talk about it in generalities here. So that finishes up Section B.
And we'll move on next to Section C, Elements of the Question.
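As a study aid, the question-type-to-design pairings and the evidence hierarchy from Section B can be sketched as a small lookup table. This sketch is my own, not a slide from the lecture: the labels and structure are assumptions, but the pairings and ranking follow the discussion above.

```python
# Recap sketch of Section B. Labels are mine; pairings follow the lecture.

# Which study designs best address each type of question, with minimum bias:
PREFERRED_DESIGNS = {
    "incidence":           ["survey", "cohort study"],
    "prevalence":          ["survey", "cohort study"],
    "therapy":             ["randomized clinical trial"],
    "screening":           ["randomized clinical trial"],
    "prevention":          ["randomized clinical trial"],
    "diagnostic accuracy": ["cross-sectional study"],  # RCTs would be ideal but are rarely found
    "prognosis":           ["cohort study", "clinical trial"],
    "harm":                ["cohort study", "case-control study"],  # RCTs usually too small/short
    "etiology":            ["cohort study", "case-control study"],
}

# The hierarchy of evidence for intervention questions, highest first:
HIERARCHY = [
    "systematic review of randomized trials",
    "single randomized trial",
    "systematic review of observational studies",
    "single observational study",
    "unsystematic clinical observation",
]

def stronger(a, b):
    """Return whichever of two evidence forms ranks higher for an intervention question."""
    return min(a, b, key=HIERARCHY.index)

print(PREFERRED_DESIGNS["harm"])
# ['cohort study', 'case-control study']
print(stronger("single randomized trial",
               "systematic review of observational studies"))
# single randomized trial
```

Remember the lecture's caveat: the hierarchy applies only to intervention questions (therapy, screening, prevention, and with reservations harm), not to etiology, prognosis, incidence, or prevalence questions.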