In this section we'll be covering research data planning. This is a very important topic for any clinical and translational research study that you're considering. Specifically, we'll cover the basic concepts behind the planning and collection of research data. We'll talk about the importance of thinking things through in real-world practice before starting the collection of data. And finally, we'll talk about the importance of good recordkeeping and documentation for ongoing and shared research.

The way we'll structure this section is we'll talk through a number of important concepts and best practices that we've picked up over the years supporting and investing in many, many research teams. These are very common things that we see, and by conveying and teaching them in our own local environment, we find that we increase the capacity as well as the value of the research done here. So we'll run through a number of examples as we go.

The first example that I always give is the importance of good data planning up front: taking a lot of time up front to make sure that you are planning appropriately, so that at the end of the study you get good results out. We have a phrase, "Garbage in equals garbage out." And I've seen this play out many times in research teams that are eager to get started. They've got a great idea and they want to move ahead with the study very quickly. And in many cases they'll forget important things because they didn't take the time required up front to make sure they weren't missing anything. A lot of times this really doesn't become evident until late in the study, when you're analyzing the data. And so I always stress: even when you think you've got it right, run it through another iteration and make sure that everybody on the team thinks it's right before you move forward. This really becomes an ethics issue.
No matter what type of study you're doing, you're always spending resources, and you're always putting individuals at some inconvenience, or even risk, by doing clinical and translational studies. And so if you don't get the details right on the front end, you compromise your ability to leverage the data that you've collected. And if you're not able to publish or make definitive statements on that data at the end of the trial or study, as you would if you had done it right, then you've really done the people you're working with, the individuals who are contributing to your study as volunteers, a great disservice, because you haven't gotten the scientific benefit that goes along with the risk or the inconvenience that they put into the study. So I'd stress that it's an ethics thing as well as a data and study reliability thing.

One of the things that I see a lot here is someone who has a great idea. We will meet with that individual and start fleshing out that idea. And a lot of times they haven't really thought things through. What they really had an idea on is, hey, let's just start collecting a lot of data. And this might come down from a department chair, or maybe a well-meaning individual who's getting their interest piqued around data and being able to collect and manage large data sets, and the promise that if we do this, then great things are going to happen. A lot of times, if you don't put in the time up front to really dial things down and think very hard about a primary hypothesis for what you are doing, and maybe secondary outcome measurements, but instead just try to do everything at once, bad things happen. A good example of how this happens in practice: I'll meet with research teams from time to time, and rather than an individual research team, I'm meeting with the department.
And the department chair will say, hey, I've got this great idea. What we want to do is create this great big registry. And I'll meet with the group and say, well, what are you going to do with that registry? And the answer to that is very short. It's just, well, if we had a registry and we were able to collect data from all of our individuals that are seeing patients, and the patients might be entering data themselves through surveys, we could just do everything. And I'll usually say, well, that's great, but who is going to be entering the data? Well, everyone is going to be entering the data. And we'll get it from all of the systems around campus as well. And then I'll ask, well, what are you going to do with that data? What kind of reporting are you going to do with that data? And the answer will be, you know, everything. If we had this, we could do recruitment, we could do quality improvement studies, all sorts of things are possible here.

And usually what I try to do next is say, well, let's think about that, because one thing that we've learned over time is that by caring about everything, you really care about nothing when it comes to data. Because there's no way you're going to be able to invest all of the resources it's going to take to have really solid, good-quality data unless you really, really think it through, unless you quantify the benefits and the return on investment on these things, so that you're really getting good-quality oversight of the data. So rather than thinking in terms of everything, why don't we think about it in terms of, what would be the thing that we could do in two months that would really change the world here?
And if we could come up with something small and build up and around that, then we've got something to build from. Or, in the case where we're doing a clinical study or trial, I would typically say, "Hey, this is bad policy. We really don't want to go fishing when we're doing a clinical study or trial." You really need to rein things in and think about that primary hypothesis. You need to be able to think about what Figure 1 in the study is going to look like when you publish it, or Table 1. And you need to think very carefully about collecting data around the one or two things that you're going to be testing definitively, rather than coming at it with a full-fledged, just-collect-everything-and-decide-later approach. So it sounds simple, but in practice it's really hard to practice self-control and go through the process of making sure that you're collecting the primary data that you need for your clinical study or trial.

Going the other way, I've seen a lot of studies come to me that basically say, you know, this is a really simple study. All we need to do is collect a few things, then we run the analysis, and then we publish in the New England Journal of Medicine, etcetera. I remember specifically a study team coming to me a number of years ago, and they used that phrase: you know, Paul, this is a real simple study. We're going to have people come in for two visits, and we're really just going to do two things. We're going to do a blood draw, and from that a metabolic panel. And we're going to shoot an MRI image. We're going to do the study over two visits. Maybe there was a little bit of randomization with the patients, etcetera. But from the standpoint of the data it was very, very simple: two visits, we're going to be measuring a couple of things, and then we're done.
So I started asking questions at that point. I said, well, I understand a little bit about imaging, and I understand enough about it to know that you can't just feed an MRI image into a statistics package. So let's think a little bit about what you're planning on doing there. Is it the tumor size? Is it a diameter? Is it a volume-type measurement? Think about the things that are going to be coming out of that high-density image data that you're going to be using for your quantitative analysis. And from that, maybe we came up with five or six measurements that were going to be important at each of those imaging events.

And then we started talking about the metabolic panel. That's not just one thing. That's really a number of things: that might be the cholesterol, the glucose level. And so we started thinking about that, and particularly putting things in terms of units. It really helped the study team start thinking very discretely. Oh yeah, I guess we do need a field for glucose. I guess we do need one for blood pressure. Oh, and while we're collecting blood pressure, maybe we could split that up into systolic and diastolic blood pressure.

So I'd say by the time that study team left my office that day, we probably had them up to about 60 measurements that they were going to be collecting: the things that I just mentioned, as well as things like, do you really need some sort of identifier for the patient, or do you need to know a name, an address, a phone number, maybe gender or ethnicity? Those are things that might really play a factor in helping you analyze this data later on. So they left with about 60 measurements that they were going to be collecting rather than two. But I also left them with this exercise. I said, go home and think about your primary and secondary hypotheses. [INAUDIBLE].
Let's look at those outcome measurements. Think about how you're going to organize that, and then come back to me with an even richer set of information and data that you're going to be collecting. And I stressed, as I always do, thinking about this in terms of what exactly would be stored in an Excel-type field or a data-table-type field, and what units would belong to that particular measurement. If they do that, what I've found is that it really is hard to start thinking about MRI images in one of those Excel spreadsheet cells. And so I say, go do this exercise and really think about these things in a discrete way, and then get back to me. By the time we launched that study, I think we had about 400 variables that we were collecting on every patient. So we started with two and ended with 400, and I would say that's pretty common. When you start out thinking things are simple, a lot of times they become more complex later on. And that's good, because we really want to make sure that we are collecting the right things that we're going to analyze later.

Another thing I always try to stress when individuals are putting a data collection strategy together is, in addition to thinking about each measurement, think about the right type of measurement for each of those concepts. In basic statistics classes, individuals learn about the differences between nominal, ordinal, and continuous variables. Nominal are things like sex and race, where there's no ordering effect. Ordinal is where there's some clustering or categorization of data that has an ordering effect, but it's not in tiny, discrete increments like a continuous measurement, maybe like a blood pressure, which would have one increment for every millimeter of mercury.
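
As a rough illustration of the exercise described here (one field per measurement, each with a unit and a measurement type), a data dictionary could be sketched in Python along the following lines. This is only a sketch under stated assumptions: the field names, units, and category lists are hypothetical examples, `pain_severity` is an invented ordinal field added purely for illustration, and none of it comes from the study being discussed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Level(Enum):
    NOMINAL = "nominal"        # categories with no ordering (e.g. sex, race)
    ORDINAL = "ordinal"        # categories with an ordering
    CONTINUOUS = "continuous"  # numeric values measured on a scale


@dataclass(frozen=True)
class Field:
    name: str
    level: Level
    unit: Optional[str] = None  # expected for continuous fields
    categories: tuple = ()      # allowed values for nominal/ordinal fields


def problems(field):
    """Return a list of planning problems found for one field."""
    issues = []
    if field.level is Level.CONTINUOUS and not field.unit:
        issues.append(f"{field.name}: continuous field needs a unit")
    if field.level is not Level.CONTINUOUS and not field.categories:
        issues.append(f"{field.name}: categorical field needs its allowed values")
    return issues


# Hypothetical entries; a real study would define its own fields and units.
DATA_DICTIONARY = [
    Field("glucose", Level.CONTINUOUS, unit="mg/dL"),
    Field("systolic_bp", Level.CONTINUOUS, unit="mmHg"),
    Field("diastolic_bp", Level.CONTINUOUS, unit="mmHg"),
    Field("tumor_diameter", Level.CONTINUOUS, unit="mm"),
    Field("sex", Level.NOMINAL, categories=("female", "male", "unknown")),
    Field("pain_severity", Level.ORDINAL,
          categories=("none", "mild", "moderate", "severe")),
    # An "MRI image" is not a single value with a unit, so it gets flagged.
    Field("mri_image", Level.CONTINUOUS),
]

flagged = [msg for f in DATA_DICTIONARY for msg in problems(f)]
for msg in flagged:
    print(msg)
```

The only entry flagged is `mri_image`, which mirrors the point about spreadsheet cells: it has to be broken down into discrete measurements such as a diameter or a volume before it can sit in a field with a unit.
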
So I always stress, again, that while we're thinking about the collection of this, let's think about the analysis phase, and let's make sure that we're choosing the right type of variable for each of the different entities that we're collecting. As well, I always stress that unless there's some really good reason on the front end, don't ever collapse data before you need to. Take continuous variables. If you've got a temperature measurement, you can always calculate later whether they had a fever by some rule, but you can't ever go the other way. Just by knowing yes or no on fever, we can't go backwards and calculate the continuous variable. So don't collapse variables before they need to be collapsed.

The other thing that comes up a lot in the discussion around measurement types is, what do I do with that high-density data? Things like an ECG waveform, or an MRI or a CT or an X-ray scan. And again, we talked about that on the last slide. I always stress: keep the high-density data intact. You don't want to throw away the MRI image. You don't want to throw away the ECG waveform. But typically, when you get ready to analyze those data, and let's choose the ECG as an example in this case, you're not going to be analyzing the whole ECG and comparing that against everyone else's ECG. What you're probably going to be doing is looking for the average heart rate, or maybe the longest QT interval during a segment of an exercise profile, etcetera. So I always recommend that we keep the high-density data, keep it on file. But in terms of the structured data that we're going to be collecting for the study and analyzing later, think about the post-processed values: that long QT interval, the tumor diameter as measured on that MRI.
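
The two ideas above (store the continuous value and derive the yes/no later; keep the raw high-density data and analyze post-processed values) might be sketched in Python as follows. This is a minimal sketch, not a recommended implementation: the 38.0 °C cutoff is just one assumed rule, the function names and sample numbers are made up, and it assumes the per-beat RR and QT intervals have already been extracted from the waveform by some separate tool.

```python
from statistics import mean

# Assumption: one possible cutoff. The study protocol would define the real rule.
FEVER_THRESHOLD_C = 38.0


def has_fever(temp_c, threshold_c=FEVER_THRESHOLD_C):
    """Derive the yes/no value from the stored continuous temperature."""
    return temp_c >= threshold_c


def summarize_ecg(rr_intervals_s, qt_intervals_ms):
    """Reduce per-beat measurements to values a study might analyze."""
    return {
        # Mean heart rate in beats per minute from the mean beat-to-beat interval.
        "mean_heart_rate_bpm": 60.0 / mean(rr_intervals_s),
        "longest_qt_ms": max(qt_intervals_ms),
    }


# Made-up sample values: the raw temperatures are what gets stored.
temperatures_c = [36.8, 38.4, 37.9]

# The collapsed variable can be recomputed at any time, under any rule.
fever_default = [has_fever(t) for t in temperatures_c]
fever_stricter = [has_fever(t, threshold_c=37.5) for t in temperatures_c]

# Made-up per-beat values; the waveform itself stays on file untouched.
ecg_summary = summarize_ecg(
    rr_intervals_s=[0.75, 0.75, 0.75, 0.75],
    qt_intervals_ms=[380, 402, 395, 410],
)

print(fever_default, fever_stricter, ecg_summary)
```

Because the temperatures themselves are stored, changing the fever rule is a one-line change in the analysis. If only the yes/no flag had been collected, the second list could not be computed at all.
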