Let's now speak briefly about stochastic games. This is a topic that lends itself to a very long discussion, and it's quite a complicated one, but we'll touch on the main points and position it in the landscape of topics we're discussing.

The starting point is repeated games. As we know, a repeated game is simply a game, in normal form for example, that we repeat over and over again. For example, we play Prisoner's Dilemma once, twice, three times, maybe a finite number of times, maybe an infinite number of times, and we accumulate the rewards from all the stages into an overall reward. A stochastic game is really a generalization of this, where we again play a game repeatedly, but not necessarily the same game. Depending on how we play the current game, say Prisoner's Dilemma, we each get some payoff, but we also probabilistically transition to some other game, play that in turn, and the process continues.

A graphical way to look at it is shown here. This is a repeated game, where you play the same game over and over again. Here, in contrast, you play a game, and if you happen to play one action profile you transition to this other game; if you happen to play another, maybe you transition back to the same game; and so on. From each game, you transition probabilistically to other games. This is a stochastic game.

Formally speaking, it's the following tuple. It's a lot of notation, but the concept is exactly what we just saw. We have a finite set of states Q. We have a set of players. We have a set of actions, where A sub i is the set of actions available to player i. And then we have two functions. We have the transition probability function: depending on the state we're in and the action profile that was taken, we move to any of the other states, or stay in the very same state, with a certain probability, as governed by this probability distribution. And similarly, we have the reward function, which tells us that if in a certain state a certain action profile was taken by the agents, then this is the reward to each of the agents; r sub i is the reward to agent i. That's the formal definition. Notice that we implicitly assume the same action spaces in all the states, but you could define it otherwise; it would simply involve more notation. There's nothing inherently important about the action spaces being the same in the different games within the stochastic game.

Just a few final comments on it. First of all, as we saw, this obviously generalizes the notion of a repeated game. But it also generalizes the notion of an MDP, or Markov Decision Process. If a repeated game is a stochastic game with only one game, a Markov Decision Process, or MDP, is a stochastic game with only one player. There, too, you have states in which the agent takes an action, receives a reward, and probabilistically moves to some other state; the only difference is that it is the only actor in the setting.
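To make the tuple a bit more concrete, here is a minimal Python sketch of a stochastic game. The class name, the state and action labels, and the example payoffs are purely illustrative assumptions on my part, not the lecture's notation or any library's API; the sketch just encodes a finite state set, a shared action set, a per-state transition distribution, and a per-state reward profile.

```python
import random

# Illustrative sketch of a stochastic game: states (one "game" per state),
# a shared action set, a transition function, and a per-player reward function.
class StochasticGame:
    def __init__(self, states, n_players, actions, P, R):
        self.states = states          # finite set of states
        self.n_players = n_players    # number of players
        self.actions = actions        # shared action set (same in every state, as assumed above)
        self.P = P                    # P[state][action_profile] -> dict of next-state probabilities
        self.R = R                    # R[state][action_profile] -> tuple of per-player rewards

    def step(self, state, action_profile):
        """Return the reward profile and a sampled next state."""
        rewards = self.R[state][action_profile]
        dist = self.P[state][action_profile]
        next_state = random.choices(list(dist), weights=list(dist.values()))[0]
        return rewards, next_state

# Two-state example: state "PD" is a Prisoner's Dilemma, "OTHER" is some other 2x2 game.
C, D = "C", "D"
R = {
    "PD":    {(C, C): (-1, -1), (C, D): (-4, 0), (D, C): (0, -4), (D, D): (-3, -3)},
    "OTHER": {(C, C): (2, 2),   (C, D): (0, 0),  (D, C): (0, 0),  (D, D): (1, 1)},
}
# If both players defect in the PD, we probabilistically move to the other game.
P = {
    "PD":    {a: ({"PD": 0.2, "OTHER": 0.8} if a == (D, D) else {"PD": 1.0}) for a in R["PD"]},
    "OTHER": {a: {"PD": 0.5, "OTHER": 0.5} for a in R["OTHER"]},
}

game = StochasticGame({"PD", "OTHER"}, 2, {C, D}, P, R)
rewards, next_state = game.step("PD", (D, D))   # e.g. (-3, -3) and a sampled next game
```

With only one state, the transition function always returns that state and the sketch collapses to a repeated game; with only one player, it collapses to an MDP, which mirrors the two generalizations described above.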
I mention this because MDPs have been studied substantially in a variety of disciplines, from optimization to Computer Science to pure Math and beyond, and these two perspectives, stochastic games as a generalization of repeated games and as a generalization of MDPs, give you a sense of the theory and the investigations into stochastic games. From repeated games, we inherit the definitions of different ways of aggregating rewards over time: you can have limit-average rewards or future discounted rewards. From the literature on optimization and on MDPs, we get notions such as stationary and Markovian strategies, as well as notions such as reachability, which concern the structure of the underlying transition probabilities. These are involved issues that we won't get into further in this lecture, but at least we have flagged their existence.
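As a small sketch of the two aggregation schemes inherited from repeated games, the functions below (names are my own, purely illustrative) compute a future discounted sum and a finite-horizon average of a stream of stage rewards; the limit-average reward is the limit of the latter as the horizon grows.

```python
def discounted_reward(rewards, beta=0.9):
    """Future discounted reward: sum over t of beta**t * r_t, for 0 < beta < 1."""
    return sum((beta ** t) * r for t, r in enumerate(rewards))

def average_reward(rewards):
    """Finite-horizon average reward; its limit over long horizons is the limit-average reward."""
    return sum(rewards) / len(rewards)

stage_rewards = [1, 0, 3, 1, 1]
print(discounted_reward(stage_rewards, beta=0.9))  # weights earlier rewards more heavily
print(average_reward(stage_rewards))               # 1.2
```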