I've found the U-Net architecture useful for many applications I've worked on, and it's one of the most important and foundational neural network architectures of computer vision today. So in this video, let's dig into the details of exactly how the U-Net works. This is what a U-Net looks like. I know there's a lot going on in this picture, but that's okay, we'll break it down and step through it one step at a time. But I wanted to quickly show you the final outcome of what a U-Net architecture looks like. And this, by the way, is also why it's called a U-Net: because when you draw it like this, it looks a lot like a U. But let's break it down and build this back up one piece at a time, so we're going to make sure we know what all of the things going on in this diagram are actually doing. These ideas are due to Olaf Ronneberger, Philipp Fischer, and Thomas Brox. Fun fact: when they wrote the original U-Net paper, they were thinking of the application of biomedical image segmentation, really segmenting medical images. But these ideas turned out to be useful for many other computer vision semantic segmentation applications as well. So the input to the U-Net is an image, let's say h by w by 3, for the three RGB channels. I'm going to visualize this image as a thin layer, like that. I know that previously we had taken neural network layers and drawn them as 3D blocks, where this might be of size h by w by 3. But in order to make the U-Net diagram look simpler, I'm going to imagine looking at this block edge-on, so all you see is this thin edge, and the rest of it is hidden behind this dark blue rectangle that I've drawn. So that's what this is: the height of this rectangle is the image height h, and the depth is 3, the number of channels, which is why it looks very thin. So to simplify the U-Net diagram, I'm going to use these rectangles rather than 3D shapes to illustrate the algorithm.
Now, the first part of the U-Net uses normal feed-forward neural network convolutional layers. So I'm going to use a black arrow to denote a convolutional layer followed by a ReLU activation function. So in the next layer, let's say we've increased the number of channels a little bit, but the dimension is still height by width by a slightly larger number of channels, and then there's another convolutional layer with a ReLU activation function. We're still in the first half of the neural network, and next we're going to use max pooling to reduce the height and width. So you end up with a set of activations where the height and width are smaller, but the volume is thicker, because the number of channels is increasing. Then we have two more layers of normal feed-forward convolutions with a ReLU activation function, and then we apply max pooling again, and so you get that. And then you repeat again, and you end up with this. So far, these are the normal convolutional layers with activation functions that you've been used to from earlier videos, with occasional max pooling layers. Notice that the height of this layer in the middle is now very small. So we're going to start to apply transpose convolution layers, which I'm going to denote by a green arrow, in order to build the dimensions of this neural network back up. With the first transpose convolutional layer, or trans conv layer, you're going to get a set of activations that looks like that. In this example, we did not increase the height and width, but we did decrease the number of channels. But there's one more thing you need to do to build a U-Net, which is to add in the skip connection, which I'm going to denote with this grey arrow. What the skip connection does is take this set of activations and just copy it over to the right. And so the set of activations you end up with looks like this: the light blue part comes from the transpose convolution, and the dark blue part is just copied over from the left.
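To make the shapes in this step concrete, here's a small NumPy sketch with made-up toy sizes (these dimensions are illustrative, not the ones from the original U-Net paper). It shows the two shape-changing operations just described: max pooling, which halves the height and width while leaving the channel count alone, and the skip connection, which concatenates the copied encoder activations with the transpose convolution's output along the channel axis.

```python
import numpy as np

# Toy activations: height 4, width 4, 2 channels (illustrative sizes only).
x = np.arange(32, dtype=float).reshape(4, 4, 2)

# 2x2 max pooling with stride 2: halves the height and width,
# keeps the number of channels the same.
pooled = x.reshape(2, 2, 2, 2, 2).max(axis=(1, 3))
print(pooled.shape)                # (2, 2, 2)

# Skip connection: suppose a transpose convolution produced `upsampled`,
# and `copied` is the encoder activation map carried over by the grey arrow.
# Both have the same height and width; concatenating along the channel axis
# stacks the light blue and dark blue parts of the diagram.
upsampled = np.zeros((8, 8, 64))   # hypothetical trans conv output
copied = np.ones((8, 8, 64))       # hypothetical copied encoder activations
merged = np.concatenate([upsampled, copied], axis=-1)
print(merged.shape)                # (8, 8, 128)
```

The next convolutional layers then operate on `merged`, so they see both the upsampled activations and the higher-resolution detail carried over from the encoder.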
To keep on building up the U-Net, we're going to then apply a couple more layers of regular convolutions followed by a ReLU activation function, denoted by the black arrows, like so, and then we apply another transpose convolutional layer. So with the green arrow, here we're going to start to increase the dimension, to increase the height and width of this image. And so now the height is getting bigger. But here, too, we're going to apply a skip connection, so there's a grey arrow again, where we take this set of activations and just copy it right there, over to the right. Then more convolutional layers, another transpose convolution, skip connection: once again, we're going to take this set of activations and copy it over to the right. And then more convolutional layers, followed by another transpose convolution, skip connection, copy that over. And now we're back to a set of activations that has the original input image's height and width. We're going to have a couple more layers of normal feed-forward convolutions, and then finally, to take this and map it to our segmentation map, we're going to use a one-by-one convolution, which I'm going to denote with that magenta arrow, to finally give us this, which is going to be our output. The dimensions of this output layer are going to be h by w, the same dimensions as our original input, by num_classes. So if you have three classes you're trying to recognize, this will be three. If you have ten different classes you're trying to recognize in your segmentation task, then that last number will be ten. And so what this does is, for every one of your h by w pixels, you have essentially a vector of num_classes numbers that tells you, for that pixel, how likely it is to come from each of the different classes. And if you take an argmax over these num_classes values, then that's how you classify each of the pixels into one of the classes, and you can visualize it like the segmentation map shown on the right.
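The last step above, going from the h by w by num_classes output volume to a per-pixel class label, can be sketched like this in NumPy. The sizes are toy values, and random scores stand in for what the trained network would actually output:

```python
import numpy as np

h, w, num_classes = 4, 6, 3           # toy sizes; real images are much larger

# Stand-in for the U-Net's output volume: one score per class per pixel.
rng = np.random.default_rng(0)
scores = rng.standard_normal((h, w, num_classes))

# Taking the argmax over the class axis classifies each pixel,
# giving the h-by-w segmentation map you can visualize.
seg_map = scores.argmax(axis=-1)
print(seg_map.shape)                  # (4, 6)
```

Each entry of `seg_map` is an integer between 0 and num_classes minus 1, which is exactly the per-pixel label you color in to draw the segmentation map.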
So that's it. You've learned about the transpose convolution and the U-Net architecture. Congrats on getting to the end of this week's videos. I hope you also enjoy working through these ideas in the programming exercise. Next week, we'll come back and talk about some specialized architectures for face recognition and for neural style transfer, where you get to create some really interesting artwork using neural networks. Have fun with the programming exercise, and I look forward to seeing you next week.