Welcome back to the lecture on clinical trial designs. In section B we are going to talk about some extensions of the parallel design. The extensions we're going to cover here are the factorial design and the large, simple design.

In the factorial design we are testing two, or sometimes more, experimental interventions simultaneously. So we test treatment A versus the control for treatment A, and we test treatment B versus the control for treatment B. We test the treatments simultaneously either because it's economical to test the two treatments together or because the design can be used to test for an interaction between treatments A and B. Usually we use the factorial design for economic reasons. It's actually very rare to design a factorial trial for the express purpose of testing for an interaction. In fact, we usually assume that there is no interaction between the two treatments and that they have independent modes of action. This is often more plausible if we are actually testing two separate outcomes.

An example of a factorial trial with two different outcomes is the Physicians' Health Study. The Physicians' Health Study was a primary prevention trial among physicians in which beta carotene was compared to placebo for the outcome of cancer, and aspirin was compared to placebo for the outcome of coronary artery disease.

The factorial design can be graphically represented by a two by two table. The top row represents participants who were randomized to receive treatment A, and the bottom row represents participants who were randomized to not receive treatment A, that is, to receive the control for treatment A. Similarly, the left column is the people randomized to receive treatment B, and the right column is the people randomized to receive the control for treatment B. The cells in the two by two table represent the four different combinations that are possible with two treatments, each having its own control. So in the top left, we have people who are receiving both A and B. In the top right we have people who are receiving A and the control for B. In the bottom left we have people who are receiving B and the control for A, and in the bottom right we have people who are receiving the control for A and the control for B.

As I mentioned before, when we do a factorial design we are usually interested in the estimation of the main effects, assuming that the treatments do not have an interaction. To make these comparisons, we use the responses of the people in the margins of the table, and we'll go back to the graph in a moment to review these responses. In the case where we are interested in the interaction, we have to compare the responses in the cells instead of in the margins, and it's important to note that the test for interaction is usually not a powerful test. Unless the sample size is very large, we are likely to have difficulty reliably detecting an interaction between treatments A and B.

So let's go back to the two by two graph of the factorial design. To assess the main effect of A, we compare the response of the people in the margin on the far right. That is, we compare the response of those assigned to A, regardless of their assignment to B, to the response of those assigned to not A, regardless of their assignment to B. We do a similar comparison across the margin at the bottom for those assigned to treatment B versus those assigned to not B.
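Here is a minimal numerical sketch of that idea. The cell counts below are hypothetical, made up purely for illustration; the point is only that the main effects are read off the margins of the two by two table, while the interaction contrast is read off the cells.

```python
# Hypothetical 2x2 factorial results: each cell holds (events, n) for one
# of the four treatment combinations. These numbers are illustrative only.
cells = {
    ("A", "B"):         (30, 500),   # received A and B
    ("A", "not B"):     (35, 500),   # received A and the control for B
    ("not A", "B"):     (45, 500),   # received B and the control for A
    ("not A", "not B"): (50, 500),   # received the control for A and for B
}

def rate(*keys):
    """Pooled event rate across the listed cells."""
    events = sum(cells[k][0] for k in keys)
    n = sum(cells[k][1] for k in keys)
    return events / n

# Main effect of A: compare the row margins, ignoring assignment to B.
rate_A     = rate(("A", "B"), ("A", "not B"))
rate_not_A = rate(("not A", "B"), ("not A", "not B"))
print(f"Main effect of A (risk difference): {rate_A - rate_not_A:+.3f}")

# Main effect of B: compare the column margins, ignoring assignment to A.
rate_B     = rate(("A", "B"), ("not A", "B"))
rate_not_B = rate(("A", "not B"), ("not A", "not B"))
print(f"Main effect of B (risk difference): {rate_B - rate_not_B:+.3f}")

# Interaction: compare the effect of A among those given B with the effect
# of A among those given the control for B, using the cells, not the margins.
effect_A_given_B     = rate(("A", "B")) - rate(("not A", "B"))
effect_A_given_not_B = rate(("A", "not B")) - rate(("not A", "not B"))
print(f"Interaction contrast: {effect_A_given_B - effect_A_given_not_B:+.3f}")
```

With these made-up counts the interaction contrast comes out to zero, which is exactly the no-interaction situation in which the marginal comparisons are the natural summary.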
If we are indeed interested in assessing the interaction, we have to compare the effect of A versus not A among those with B to the effect of A versus not A among those with not B. So we are comparing the cell responses instead of the margin responses.

An example of a factorial design is ISIS-3, the International Study of Infarct Survival-3. ISIS-3 was designed as a three by two factorial. ISIS-3 was testing aspirin plus heparin versus aspirin alone; those treatments were presumed to act through an antithrombotic mechanism on the outcomes of myocardial infarction, stroke, and cardiovascular death. ISIS-3 was also testing streptokinase versus tPA versus APSAC, and those three treatments were presumed to act through a fibrinolytic mechanism. So you see in this example of ISIS-3, we had a common primary outcome, but we had two different proposed mechanisms of action on the outcome. This factorial design was also considered a large, simple design, and we'll discuss this type of trial more in a moment. There were more than 41,000 patients in ISIS-3, recruited from 914 participating hospitals in 20 different countries.

Here is the representation of the ISIS-3 three by two factorial table. In the top row of the table we have the patients who were allocated to aspirin plus heparin. In the bottom row, we have the patients who were allocated to aspirin alone. In the first column we have patients allocated to streptokinase, the middle column is patients allocated to tPA, and the rightmost column is patients allocated to APSAC. The cells within the table represent the combinations of the two treatments. So in the top left cell we have patients allocated to aspirin plus heparin and allocated to streptokinase, and similarly for the other cells in the table.

Large, simple designs are exactly as they sound: they are large, and they are simple. They are characterized by a very large number of patients, usually numbering in the tens of thousands, recruited from many, many centers (in ISIS-3 we saw over 900 centers), and they require minimal data collection on each participant.

The rationale for a large, simple trial is that it takes large sample sizes to detect modest benefit. We don't find many penicillins anymore; we don't find treatments whose effects are so large that we can see them by observing only a small number of people. If we are looking for a treatment that provides a small survival benefit for those with heart disease, for instance, we need a large sample size to detect it. Why would we be interested in a small clinical effect? Well, heart disease is a very common condition, so small advancements in treatment that improve survival by even a small amount can correspond to a large public health benefit.

Another premise of a large, simple design is that there are unlikely to be many treatment interactions. We aren't likely to have a well-defined subgroup of people who respond well to therapy when others don't. Given the assumption that treatment interactions are unlikely, it's not that important to collect a lot of information about baseline characteristics and interim response variables. We are not expecting to look at the treatment effect in lots of different subgroups or to examine the mechanism, because we aren't expecting one group of people to respond very differently from another.
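To make the point about modest benefits concrete, here is a rough calculation using the standard two-proportion sample size approximation. The event rates below are made-up illustrative numbers, not taken from ISIS-3 or any other trial.

```python
# Rough sketch: how many patients does it take to detect a one percentage
# point absolute reduction in a short-term mortality rate?
from math import ceil
from statistics import NormalDist

def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.90):
    """Approximate sample size per arm for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2)

# Suppose (hypothetically) 35-day mortality is 10% on control and a new
# treatment reduces it to 9%: a small but clinically meaningful benefit.
n = n_per_arm(0.10, 0.09)
print(f"About {n:,} patients per arm, roughly {2 * n:,} in total.")
```

Running this gives on the order of 18,000 patients per arm, that is, well into the tens of thousands overall, which is why trials chasing modest benefits in common diseases end up looking like ISIS-3.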
In a large, simple design, we tolerate less precision in estimation. With such a heterogeneous group of participants and study personnel, we'll have a wide variety of people enrolled, and we'll have less control over training and over the standardization of treatment administration and outcome assessment. So we have to expect more error, or increased variance, in the estimation of the outcome measures, and we counter this increase in error with a large sample size.

I'm sure you can already anticipate some of the requirements needed to make it practical to implement a large, simple design. You have to have an easily administered intervention. Typically, this means the intervention does not require long-term adherence; in other words, it must be something that can be administered in just one or two contacts with the participant. The treatment shouldn't require any adjustments to the dose or the timing of administration, and it also shouldn't need any ongoing monitoring for adverse events. These are major limitations of the large, simple design, since a large number of interventions do require one if not all three of these activities.

In order to use this design, we also need an easily ascertained outcome. For example, in ISIS-3, the outcomes were 35-day mortality, which was assessed through government records, and other clinical events that occurred while the patient was in the hospital. You must use an outcome that does not require a complicated follow-up for diagnosis. For example, in order to diagnose Alzheimer's disease for a study that I work on, we require extensive neuropsychological testing and a detailed medical history. We do interviews with friends and family about changes in the participant's cognitive functioning, and we also do lab work and imaging. It's a rather complicated outcome, and it would not be appropriate for a large, simple study where you have thousands and thousands of patients for whom you have to readily assess the outcome.

Large, simple trials also tend to have very limited data collection at baseline, because of the assumption we've already discussed that treatment interactions are unlikely. And finally, for a large, simple design we need to have confidence that simple data will be persuasive enough.

In ISIS-3, there was no baseline form to document eligibility. The randomization was completed through a phone call to a 24-hour service provided by the coordinating center. The treatments were conveniently packaged for ease of use, and they were already available at the recruiting centers. There was no restriction on the use of ancillary treatments, so the study treatments could be used in addition to whatever other clinical management was ongoing for any particular patient. At discharge, a single-page form was completed to document events that had occurred during the hospital stay, and the only follow-up after discharge was a search of government records for vital status. These are both extremely simple methods of data collection. If you are interested in the results of the ISIS-3 trial, you can refer to the reference that I've listed again at the bottom of this slide.

Okay, that's the end of section B, where we have covered two extensions of the parallel design. When we come back, we're going to go into section C, which covers testing for hypotheses other than superiority.