Welcome to the Fundamentals of Quantitative Modeling, Module 3: Probabilistic Models. In another module we talked about deterministic models. Deterministic models are those in which there's no uncertainty in either the inputs or the outputs of the model. In this module, we're going to address the situation that often arises in practice where there's uncertainty in the picture. And in particular, we're going to talk about probabilistic models. So in terms of content, we'll define a probabilistic model. We're going to talk about random variables and probability distributions, which are the building blocks of these probabilistic models. We'll look at a set of examples, so you can see instances and applications of these probabilistic models. And then, once we have some motivation and examples under our belt, we're going to have a look at some specific probability distributions and summaries of those probability distributions. In particular, means, variances, and standard deviations. We're going to look at random variables that are termed Bernoulli, Binomial, and Normal. These three types of random variables are very foundational in terms of probabilistic modeling. They're not the only ones that are out there, but they're certainly core to the process. And then, we'll finish off by looking at a rule that is termed the Empirical Rule, which is suitable when your data or underlying model is approximately normally distributed. Now, in terms of a definition of a probabilistic model, these are models that incorporate random variables and probability distributions. Now, we often have an intuitive sense of what a random variable means, but formally a random variable represents the potential outcomes of an uncertain event. So an easy way of thinking about a random variable is as the outcome of an event that has not yet happened but that you know is going to.
So for example, I have a die in my hand, I'm going to throw it, a number is going to come up, one through six, but I don't know what it is prior to having thrown the die. So prior to throwing the die we'd call the outcome a random variable. Now, along with a random variable comes a probability distribution. And the probability distribution is used to assign probabilities to the potential outcomes. And so, if we're talking about a die, and it's a fair die, there are going to be six potential outcomes, and fair means that each outcome would have probability one-sixth. So that's what we mean by the probability distribution. And we use these probabilistic models in practice because to be realistic in our decision making we often have to acknowledge that we don't have absolute certainty in the inputs, and consequently there's going to be uncertainty associated with the outputs, as well. And so, being able to formulate and use probabilistic models will allow you to be more relevant and more appropriate to many business situations that you'll find yourself in. So the key feature of the probabilistic model is that it incorporates uncertainty explicitly in the model. And because we have incorporated that uncertainty explicitly, we are able to propagate that uncertainty through the model, so that when we get an output from the model, we are able to understand the uncertainty in the output, as well. So models are often used to create forecasts, but typically it's much more useful to have a range of potential outcomes rather than a single best guess. And a probabilistic model will often allow us to give a range of potential outcomes, and that's just a more realistic way to forecast. Another aspect of probabilistic models is that uncertainty, and hence probability, is typically synonymous with risk in a business setting. And businesses, if they're going to operate well, need to understand the risk of the environment that they operate in.
If it's a financial firm, they need to understand risk associated with the stock market. If it's an insurance company, they'll need to understand risk perhaps associated with weather events. And so, if you're interested in risk then you are absolutely interested in uncertainty and hence probability. So anyone who's dealing with risk is going to have to be able to formulate and use probabilistic models. Now, I wanted to start off with a couple of examples, so you can see how probabilistic thinking can ultimately be useful in the decision making process. So the first example that I have is to think about a company that is very energy intensive, in the sense that it uses a lot of energy resources. So an example of such a company would be an airline. For the sorts of passenger airlines that we typically fly on, the cost of jet fuel is something like 20 to 25% of their operating expenses. So fuel, and ultimately oil, which the jet fuel comes from, is a key component of the entire business. And if an airline wanted to plan for the future, not tomorrow perhaps, but medium term and long term planning, which would typically involve the purchase or the leasing of new planes, then clearly the anticipated future price of oil, and ultimately jet fuel, is going to be very, very important to them. If we believe that the price of oil is going to remain low, then certain types of aircraft might well still be profitable to the company, whereas if the price of oil becomes very, very high, then it might well change the mix of aircraft that they would want to fly if they want to remain profitable. But immediately we're faced with a big problem, because we're trying to do medium or long term planning, and it would be a very bold person who would get up and say, yes, I know what the price of oil is going to be in five years' or ten years' time. So how do we deal with this situation?
We know that this quantity, the price of oil, is a key component of our decision-making process, but at the same time we don't know what it's going to be. So the probabilistic approach to this sort of problem is to acknowledge that we don't know exactly what the price of oil is going to be, and to use our expert knowledge and models to try to create some realistic probability distribution that captures the likelihood of the price of oil taking on certain values in the future. So it's the creation of this probability distribution, which is attempting to model the potential prices of oil in the future, that is key to incorporating the energy component into the future planning. And so, if one is able to do that, by which I mean create a realistic probability distribution and incorporate that into the decision making process, then hopefully the company will be making more informed decisions and will certainly have a better understanding of the risk associated with the decisions that they're making. So that's an example just thinking about energy prices in the future. Here's a second example. Imagine you are an investment company, and you're considering whether or not to invest in a drug company. Drug companies typically have many compounds, potential drugs, under development, and the one that I'm thinking about has ten drugs in a development portfolio. Now, remember, these are under development, so one doesn't know whether they're going to be approved by the regulator, or whether they're even going to be successful drugs. But it's not unreasonable to be able to estimate what sort of revenue they might generate if they were approved. And you could do that by looking at similar drugs within the therapeutic category. So one could say, if this drug gets approved, I anticipate its revenue is going to be a certain amount.
But of course whether or not a particular drug is approved in the future is an uncertain event, in other words a random variable. Now, if we can estimate the probability of a drug being approved, and again, we might be able to do that by looking at similar drugs or the history of the company, then we're starting to build up the elements of a probabilistic model that will allow us to make a more informed investment decision. So let's say we only want to invest in the company if the expected total revenue from this portfolio of 10 drugs is greater than $10 billion in 5 years. So that might be our investment criterion. Of course, we don't know for sure whether or not these drugs are going to get past the regulatory hurdle. But if we've got a probability estimate for whether or not they can get past, and we've also got an estimate of the revenue that they're going to be able to generate, then we have the building blocks in place to create a probability distribution for the total revenue. And if we're able to create that probability distribution, then we can use that as a part of our decision making process. So for example, we could work out the probability that the portfolio creates more than $10 billion in revenue in 5 years' time. So there's a second example, and these are realistic examples. They are activities that companies really do go through, and they are examples of incorporating probabilistic models.
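To make the drug portfolio example a little more concrete, here is a minimal sketch in Python of how those building blocks could be combined. Everything numeric in it is an assumption made up for illustration and does not come from the lecture: the approval probabilities, the revenue estimates, and the simplification that each drug is approved independently of the others. The function names are hypothetical as well. The sketch lists every approve-or-reject combination for the 10 drugs, builds the probability distribution of total revenue, and then reads off the expected total revenue and the probability of exceeding $10 billion.

```python
from itertools import product

# Hypothetical inputs, invented for illustration (not from the lecture).
# Assumed probability that each of the 10 drugs is approved.
APPROVAL_PROBS = [0.6, 0.5, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
# Assumed 5-year revenue, in $ billions, that each drug generates IF approved.
REVENUES_BN = [4.0, 3.0, 5.0, 2.0, 6.0, 3.0, 8.0, 2.0, 10.0, 4.0]


def revenue_distribution(probs, revenues):
    """Return {total revenue: probability} for the whole portfolio.

    Assumes approvals are independent, so the probability of any one
    approve/reject combination is the product of the per-drug probabilities.
    """
    dist = {}
    # Each outcome is a tuple of 0/1 flags, one per drug (1 = approved).
    for outcome in product([0, 1], repeat=len(probs)):
        prob = 1.0
        total = 0.0
        for approved, p, revenue in zip(outcome, probs, revenues):
            prob *= p if approved else (1.0 - p)
            total += revenue * approved
        # Different combinations can give the same total; add their probabilities.
        dist[total] = dist.get(total, 0.0) + prob
    return dist


def expected_revenue(dist):
    """Mean of the total-revenue distribution."""
    return sum(total * prob for total, prob in dist.items())


def prob_exceeds(dist, threshold):
    """Probability that total revenue is strictly greater than the threshold."""
    return sum(prob for total, prob in dist.items() if total > threshold)


if __name__ == "__main__":
    dist = revenue_distribution(APPROVAL_PROBS, REVENUES_BN)
    print(f"Expected total revenue ($bn): {expected_revenue(dist):.2f}")
    print(f"P(total revenue > $10bn):     {prob_exceeds(dist, 10.0):.3f}")
```

With only 10 drugs there are just 1,024 combinations, so the distribution can be listed exactly. For a larger portfolio, or if approvals were not independent, one would more likely simulate outcomes instead. Notice also that the two summaries answer different questions: the expected revenue can sit above $10 billion even when there is a substantial chance that the actual revenue falls short of it.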