Welcome back. I hope that rigorous content on regression and related techniques makes sense to you. I hope it's pretty clear now how we can take a bunch of data from, let's say, period 1, whether it's past behavior, marketing activities, competition, whatever, to predict something about period 2, whether it's the number of purchases or whether someone stays with us or not. It's really, really important to be able to do that. Fortunately, those techniques are common and very accessible. You don't necessarily have to have special software. You can do it in something as simple as Microsoft Excel. In fact, I want to talk about an example where people were doing that kind of thing long before they had the kind of computational power that we have today, or even the rich data that we have today.
I want to take you back to the late 1960s, the early 1970s. It was the dawn of what today we would know as direct marketing. It really was when a lot of these ideas of customer analytics were born. It was the first time that we really had any kind of granularity about what particular customers were doing, and a desire to know what each and every one of those customers would be doing next, for how long, and for how much money. So it became very important for companies to come up with what we like to call KPIs, key performance indicators. Can we look at some indicators of what people had been doing in the past in order to make some accurate statements about what they're likely to do in the future? This is just a natural area to run something like a regression model, and indeed, regression models were used for this kind of purpose. But it wasn't a matter of, let's just throw in tons and tons and tons of data. Part of it is that the data was limited; part of it, as I said, is that our computational power was limited. So we had to think very carefully.
It was very, very important for us to come up with just a few measures that would be fairly predictive of what customers would be worth in the future. So our forefathers in direct marketing basically did the kinds of things we've been talking about here. Let's take our dataset, let's chop it into two pieces, and let's collect some data from period 1 to see which elements of that period 1 data would be most predictive of what people did in period 2. In period 2, we'd be looking at how many purchases they made, or what the dollar value of those customers was. They ran lots of models to try to find out which bits of data were most predictive, and they'd do it over and over and over again on lots of different datasets, for lots of different products, lots of different geographies, lots of different customer segments, because we wanted to find a few of those explanatory variables that were pretty robust, that time and time again would prove to be predictive.
This is where our forefathers in direct marketing came up with the idea of R-F-M: recency, frequency, monetary value. What they found time and time again back in the '60s and early '70s, and what we still see to be true today here in the 21st century, is that you can give me these three summary metrics. You give me recency, frequency, monetary value. You tell me the last time that someone made a purchase with me or did some other kind of economically valuable activity. Maybe they took a sales call. Maybe they visited the website. So they did something that suggests that they're going to become a more valuable customer. Generally, we're talking about a purchase. So that's R, that's recency. Now tell me about frequency. Tell me how many purchases they made, or how many economically beneficial activities they did, over a set period of time, let's say the last year or two. And third would be monetary value. I think that's pretty much self-explanatory.
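To make the three metrics concrete, here is a minimal sketch of how R, F, and M might be computed from a purchase log. Everything in it is an assumption for illustration: the `transactions` data is made up, `rfm_summary` is a hypothetical helper, recency is measured in days since the last purchase as of the end of period 1, and monetary value is taken to be the average amount per purchase.

```python
from datetime import date

# Hypothetical purchase log: (customer_id, purchase_date, amount).
# These records are invented purely for illustration.
transactions = [
    ("A", date(2023, 1, 15), 40.0),
    ("A", date(2023, 6, 2), 60.0),
    ("A", date(2023, 11, 20), 50.0),
    ("B", date(2023, 3, 9), 200.0),
    ("C", date(2023, 2, 1), 20.0),
    ("C", date(2023, 2, 28), 30.0),
]


def rfm_summary(transactions, period_end):
    """Return {customer_id: (recency_days, frequency, avg_monetary_value)}.

    Assumed definitions (one common choice, not the only one):
      recency   = days from the last purchase to period_end
      frequency = number of purchases up to period_end
      monetary  = average amount per purchase
    """
    by_customer = {}
    for customer_id, purchase_date, amount in transactions:
        if purchase_date <= period_end:  # use period 1 data only
            by_customer.setdefault(customer_id, []).append((purchase_date, amount))

    summary = {}
    for customer_id, purchases in by_customer.items():
        last_purchase = max(d for d, _ in purchases)
        recency_days = (period_end - last_purchase).days
        frequency = len(purchases)
        monetary = sum(a for _, a in purchases) / frequency
        summary[customer_id] = (recency_days, frequency, monetary)
    return summary


rfm = rfm_summary(transactions, period_end=date(2023, 12, 31))
for customer_id, (r, f, m) in sorted(rfm.items()):
    print(customer_id, r, f, round(m, 2))
```

Each customer collapses to just three numbers, which is the whole point: these could then serve as the explanatory variables in a regression on period 2 behavior.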
So when they did those economically beneficial activities, what was the overall or the average monetary value of each and every one of them? So if you can give me R-F-M, recency, frequency, monetary value, I can make a very accurate statement about what that customer's going to be worth in period 2. Again, this was one of the first areas where regression analysis was used in marketing. It was one of the first ways for folks in marketing to say, "You know what, all of that data that we've been collecting, not really sure what to do with it, there's real value there. We can really predict stuff and then we can start to change our business to take advantage of these insights about what's likely to happen in the future, not just what happened in the past." So I just want to put RFM out there as just one very nice example of an application of the kinds of things that [inaudible] was talking about. Now I want to go one step further. So we can run these regression models and we can take whatever data we have. Again, we could start with something as simple as RFM. We can bring in many many more kinds of measures, much more complicated, much more interesting, and make statements about what's likely to happen in period 2. Again, if all you're interested in is making statements about period two, how many purchases are going to happen in the next year? Who's going to churn or not? Then regression type models are fine. In fact, you can't do better than regression type models, or different kinds of data mining that might be out there. But what happens when you want to go beyond period two? What happens when you want to make statements about period three, or period four? Or what happens if you want to talk about something like customer lifetime value? But we don't want to limit our statements just to what a particular customer is going to do over the next year. 
But if we want to go out there and acquire customers, if we want to figure out the maximum amount that we should be willing to spend on a customer, we can't limit ourselves just to how much they're going to pay us, how much profit we'll get from them in the next period. We need to project that out way into the future. The problem is, regression type models are fairly limited in their ability to do that kind of thing. Let me try to explain why.
Let's go back to the timeline that I described before. We get all of this data in period one to make a statement about what we see in period two. We run a regression model to predict sales as a function of visits to the website, usage of social media, marketing activities, everything under the sun. That's great. But what happens if we want to make statements about period three? Well, if all you want to do is make statements about period three, that's not so bad. You'll say, "Wait a minute. I have this data on period two. Instead of using period two as my dependent variable, the thing that I want to explain in my regression, why don't I look at period two and get my explanatory variables from it? Why don't I look at the visits to the website, the marketing touches, the RFM? I have period two, so let me take all the period two data now to try to make a prediction about what will happen in period three. Hey, I already ran my regression. So I have my regression coefficients, I have all the outputs and everything else that was covered on regression. So let me just plug my period two data into that regression and make statements about period three." You see, I can predict the future, and that's great. If you want to go just one period further out, terrific. But what happens if you want to go to period four? We don't have any data beyond period two. We don't have any x variables from period three in order to predict period four. What are we going to do there? How far out into the future can we go?
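The timeline argument above can be sketched in a few lines. This is only an illustration under stated assumptions: the purchase counts are invented, there is a single explanatory variable, and `fit_simple_ols`, `predict`, and `forecast_next_period` are hypothetical helpers, not anything from the lecture. The model is fit on period 1 inputs and period 2 outcomes, reused with period 2 inputs to forecast period 3, and then has nothing to work with for period 4.

```python
# Hypothetical per-customer purchase counts; the numbers are invented.
period1_purchases = [1, 2, 3, 4, 5]  # explanatory variable (x)
period2_purchases = [2, 3, 5, 6, 9]  # dependent variable (y)


def fit_simple_ols(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Step 1: learn the period 1 -> period 2 relationship.
intercept, slope = fit_simple_ols(period1_purchases, period2_purchases)


def predict(x_values):
    """Apply the fitted coefficients to a list of observed inputs."""
    return [intercept + slope * x for x in x_values]


# Step 2: period 2 has now been observed, so it can serve as the input
# for a period 3 forecast using the same coefficients.
period3_forecast = predict(period2_purchases)

# Step 3: a period 4 forecast would need observed period 3 inputs,
# and those do not exist yet.
observed_inputs = {1: period1_purchases, 2: period2_purchases}


def forecast_next_period(input_period):
    """Forecast period input_period + 1 from observed inputs, if any exist."""
    if input_period not in observed_inputs:
        raise ValueError(
            f"no observed data for period {input_period}; "
            f"cannot forecast period {input_period + 1}"
        )
    return predict(observed_inputs[input_period])


print("period 3 forecast:", [round(v, 2) for v in forecast_next_period(2)])
try:
    forecast_next_period(3)
except ValueError as err:
    print("period 4 forecast failed:", err)
```

The failure in the last step is not a coding limitation. It is the same wall described above: no inputs, no output.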
The problem with regression type models is that they're limited. If you don't have any data to use as inputs into the model, then you can't get the output. So no matter how long your observation period might be, you're limited as to how far into the future you can make statements. Now, in many cases this isn't a problem. For many kinds of decisions that companies want to make, simply being able to make statements about one, maybe two, periods out is perfectly fine. In fact, you might say that for most decisions regression is perfectly adequate and these limitations aren't going to be a problem, and I agree. But there are times when they are, especially when we want to ask "when" type questions, or long run type questions. I mentioned customer lifetime value already. It's one thing if we want to make a statement about, "Is this customer going to churn in the next period or not?" Regression models are going to be great for that kind of thing. But suppose we want to ask a question instead like, "When will this customer churn?" If they survive through the next period, how many more periods will they survive? Regression won't really work well when we're projecting way outside of the range of data that we had in the first place to run the original model.
That matters if we want to make these longer run projections, and I'm going to keep coming back to customer lifetime value as one very nice, very practical example of something that we're going to want to do over a longer period of time. Today, as firms start talking much more about customer centricity, the idea that we want to figure out who the right customers are and that we're willing to invest in them because they're going to be so worth it in the long run, we need to have some visibility into the long run. We need to be able to make these predictive statements about the long run in order to see if those investments are justified. So there's much more interest than ever in being able to make statements beyond period two.
So I want to talk about a very different kind of modeling approach. It's not nearly as popular as regression models are, but it's not necessarily any more complicated. As our view to the future goes further and further out, it becomes more and more important to add this other kind of modeling approach to your toolkit. That's what we'll do next.