Strided convolutions are another piece of the basic building block of convolutions as used in convolutional neural networks. Let me show you an example. Let's say you want to convolve this seven by seven image with this three by three filter, except that instead of doing it the usual way, we're going to do it with a stride of two. What that means is you take the element-wise product as usual in this upper-left three by three region, then multiply and add, and that gives you 91. But then instead of stepping the blue box over by one step, we're going to step it over by two steps. We're going to make it hop over two steps, like so. Notice how the upper left-hand corner has gone from this point to this point, jumping over one position. Then you do the usual element-wise product and summing, and that gives you 100. Now we do that again and make the blue box jump over by two steps. You end up there, and that gives you 83.

Now, when you go to the next row, you again take two steps instead of one step, so we move the blue box over there. Notice how we're skipping over one of the positions. That gives you 69, and now you again step over two steps. This gives you 91, and so on: 127. Then for the final row: 44, 72, and 74.

In this example, we convolved this seven by seven matrix with a three by three filter, and we got a three by three output. The input and output dimensions turn out to be governed by the following formula. If you have an n by n image convolved with an f by f filter, and you use padding p and stride s (in this example, s is equal to two), then you end up with an output that is n plus 2p minus f; and because you're stepping s steps at a time instead of just one step at a time, you divide by s and add one. So the output is (n + 2p − f)/s + 1 by the same thing. In our example, we have (7 + 0 − 3) divided by 2 (that's the stride), plus one. Let's see, that's 4/2 + 1 = 3, which is why we wind up with this three by three output.

Now, just one last detail: what if this fraction is not an integer? In that case, we round it down. This notation, ⌊z⌋, is called the floor of z; it means taking z and rounding down to the nearest integer. The way this is implemented is that you take this type of blue box multiplication only if the blue box is fully contained within the image, or the image plus the padding. If any part of the blue box hangs outside, then you just do not do that computation. It turns out the convention is that your filter must lie entirely within your image, or the image plus the padding region, before a corresponding output is generated. That's the convention. Then the right thing to do to compute the output dimension is to round down, in case (n + 2p − f)/s is not an integer.

Just to summarize the dimensions: if you have an n by n matrix or n by n image that you convolve with an f by f matrix or f by f filter, with padding p and stride s, then the output size will be ⌊(n + 2p − f)/s⌋ + 1 by ⌊(n + 2p − f)/s⌋ + 1. It's nice if you can choose all of these numbers so that this is an integer, although sometimes you don't have to do that, and rounding down works just fine as well. But please feel free to work through a few examples of values of n, f, p, and s for yourself to convince yourself, if you want, that this formula is correct for the output size.
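To make the stride arithmetic concrete, here is a minimal numpy sketch of a strided convolution, under some assumptions: the helper name conv2d_strided and the sample image and filter values are illustrative placeholders, not code from the course, and the specific outputs in the video (91, 100, 83, and so on) come from image values not reproduced here.

```python
import numpy as np

def conv2d_strided(image, filt, p=0, s=1):
    """Strided 2D cross-correlation (what deep learning calls 'convolution').

    The filter window (the blue box) is applied only where it fits entirely
    inside the padded image, so the output size is floor((n + 2p - f)/s) + 1.
    """
    n = image.shape[0]          # assume a square n x n image
    f = filt.shape[0]           # assume a square f x f filter
    padded = np.pad(image, p)   # zero padding of width p on every side
    out_dim = (n + 2 * p - f) // s + 1   # // rounds down, i.e. the floor
    out = np.zeros((out_dim, out_dim))
    for i in range(out_dim):
        for j in range(out_dim):
            window = padded[i * s : i * s + f, j * s : j * s + f]
            out[i, j] = np.sum(window * filt)  # element-wise product, then sum
    return out

# The video's example: n = 7, p = 0, f = 3, s = 2 gives (7 + 0 - 3)/2 + 1 = 3,
# so a 7 x 7 image convolved this way yields a 3 x 3 output.
image = np.arange(49).reshape(7, 7)   # placeholder values
filt = np.ones((3, 3))                # placeholder values
print(conv2d_strided(image, filt, p=0, s=2).shape)  # (3, 3)
```

Because out_dim uses integer (floor) division, windows that would hang off the edge of the padded image are simply never visited, which is exactly the "filter must lie entirely within the image" convention described above.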
Now, before moving on, there is a technical comment I want to make about cross-correlation versus convolution. This won't affect what you have to do to implement convolutional neural networks, but depending on which math textbook or signal processing textbook you read, there is one other possible inconsistency in the notation. If you look at a typical math textbook, the way convolution is defined, before doing the element-wise product and summing, there's actually one other step that you would first take. To convolve this six by six matrix with this three by three filter, you first take the three by three filter and flip it on the horizontal as well as the vertical axis. So the filter 3, 4, 5, 1, 0, 2, −1, 9, 7 will become its mirror image: three goes here, four goes there, five goes there, and similarly for the second and third rows. This is really taking the three by three filter and mirroring it both on the vertical and the horizontal axis (the code sketch at the end of this section shows what this double flip looks like). Then it is this flipped matrix that you copy over here. To compute the output, you would take 2 × 7 and so on; you multiply out the elements of this flipped filter matrix in order to compute the upper left-most element of the four by four output, as follows. Then you take those nine numbers, shift them over by one, and so on.

The way we've defined the convolution operation in these videos is that we've skipped this mirroring operation. Technically, what we're actually doing, the operation we've been using for the last few videos, is sometimes called cross-correlation instead of convolution. But in the deep learning literature, by convention, we just call this the convolution operation.

Just to summarize: by convention, in machine learning, we usually do not bother with this flipping operation. Technically, this operation is maybe better called cross-correlation, but most of the deep learning literature just calls it the convolution operator. I'm going to use that convention in these videos as well, and if you read a lot of the machine learning literature, you'll find most people just call this the convolution operator without bothering to use these flips. It turns out that in signal processing, or in certain branches of mathematics, doing the flipping in the definition of convolution causes the convolution operator to enjoy the property that (A convolved with B) convolved with C is equal to A convolved with (B convolved with C). This is called associativity in mathematics. It is nice for some signal processing applications, but for deep neural networks it really doesn't matter, so omitting this double mirroring operation simplifies the code, and the neural networks work just as well.

By convention, most of us just call this convolution, even though mathematicians sometimes prefer to call it cross-correlation. But this should not affect anything you have to implement in the programming exercises, and should not affect your ability to read and understand the deep learning literature.

You've now seen how to carry out convolutions, and you've seen how to use padding as well as strides for convolutions. But so far, all we've been using are convolutions over matrices, like over a six by six matrix. In the next video, you'll see how to carry out convolutions over volumes, and this will make what you can do with convolutions suddenly much more powerful. Let's go on to the next video.
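As promised above, here is a minimal numpy sketch of the difference between the two conventions. The helper names cross_correlate and true_convolution are illustrative assumptions; the filter values are the ones from the video's example.

```python
import numpy as np

def cross_correlate(image, filt):
    """Slide the filter with no flipping -- what deep learning calls convolution."""
    n, f = image.shape[0], filt.shape[0]
    out_dim = n - f + 1
    out = np.zeros((out_dim, out_dim))
    for i in range(out_dim):
        for j in range(out_dim):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * filt)
    return out

def true_convolution(image, filt):
    """The math/signal-processing definition: mirror the filter on both the
    horizontal and vertical axes, then slide it as in cross-correlation."""
    return cross_correlate(image, np.flip(filt))  # np.flip with no axis flips both

# Flipping the filter from the video on both axes:
filt = np.array([[ 3, 4, 5],
                 [ 1, 0, 2],
                 [-1, 9, 7]])
print(np.flip(filt))
# [[ 7  9 -1]
#  [ 2  0  1]
#  [ 5  4  3]]
```

The double mirror is the only difference between the two operations; it is what gives the mathematician's convolution its associativity, and omitting it changes nothing about what a network can learn, since the filter values are learned from data anyway.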