- [Morgan] As more people visit our application, the demand on our two web servers is going to increase. At some point, our two instances aren't going to be able to handle that demand, and we're going to need more EC2 instances. Instead of launching these instances manually, we want to do it automatically with EC2 Auto Scaling. Auto scaling is what allows us to provision more capacity on demand, based on thresholds that we set, and we can set those thresholds in CloudWatch. Okay, so we're going to draw a high-level example of how this works, and then Alana is going to build it out for us. So looking at our application, traffic coming in from the outside can come down to either EC2 instance. In this video, these EC2 instances will be part of an auto scaling group. Each of the EC2 instances that we launch will be completely identical. Then we're going to run code to simulate stress on our employee directory application that will make the instances think that they're overloaded. When this happens, the instances will report to CloudWatch and say that their CPUs are spiking. CloudWatch will then go into an alarm state and tell Auto Scaling, "Hey, give me more EC2 instances." As each instance comes online, it will pass the ALB health checks and start receiving traffic, giving us the horizontal scalability that we need. As each instance comes online and accepts traffic, we should see the CPU utilization go down across the entire fleet. All right, Alana, did you hear that? Do you mind building this out for us? - [Alana] Let's make it happen. So to get this built out, the first thing we need to create is a launch template. This is going to define what to launch, so it's all about setting the configurations for the instances we want to launch. Morgan said we want our instances to be identical, and that's what we'll be configuring here. The first thing we're going to do in the EC2 dashboard is find Launch Templates on the side panel.
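The control loop Morgan describes can be sketched as a small simulation: instances report CPU, CloudWatch goes into an alarm state when the average breaches a threshold, and Auto Scaling adds capacity until the alarm clears. This is only an illustration of the feedback loop, not AWS's actual implementation; the load numbers and thresholds here are made up for the example.

```python
def alarm_state(avg_cpu: float, threshold: float) -> str:
    """CloudWatch-style alarm: ALARM while the metric breaches the threshold."""
    return "ALARM" if avg_cpu > threshold else "OK"

def scale_out(fleet_size: int, total_load: float, threshold: float, max_size: int) -> int:
    """Add one instance at a time while the alarm fires and we're under max capacity."""
    while fleet_size < max_size and alarm_state(total_load / fleet_size, threshold) == "ALARM":
        fleet_size += 1  # Auto Scaling launches an identical instance from the launch template
    return fleet_size

# Two instances sharing 180 "units" of load run at 90 each; against a
# 60 threshold the alarm fires, and the fleet grows until per-instance
# load is back at or under 60.
print(scale_out(2, 180.0, 60.0, 4))  # -> 3
```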
And then click Create launch template. You'll first provide a name and description. I'll call this app-launch-template, and the description will be "a web server for the employee directory application." Then you'll notice this handy checkbox here that asks if you're going to use this launch template with EC2 Auto Scaling. We are, so we'll check it and then scroll down. We'll then choose the AMI. Again, the launch template is all about what we launch, so what we want to do is create mirrors of the web servers hosting our app that we already have running. That way, whenever we scale out, we just have more instances with the exact same configuration. When we launched the instance hosting our app earlier, we used the Amazon Linux 2 AMI and a t2.micro instance type, so we'll select both of those options. We'll also select the same key pair that we used earlier, called the app key pair. Next, we choose the security group for our new instances. We'll use the same security group we created earlier in the course, which in this case is web-security-group. Then we can scroll down and expand the Advanced details section. In the instance profile dropdown, we'll choose the same role we used previously, which is this one. Once we do that, we'll scroll all the way down and paste in our user data. This is what grabs our source code and unzips it so that our application runs on EC2. Now we're done, and we can click Create. Now that we've configured our launch template, which, again, defines what we launch, we have to define an auto scaling group, which tells us where, when, and how many instances to launch. To create an auto scaling group, I'll go ahead and back out of this, select Auto Scaling Groups on the side panel, and click Create Auto Scaling group. Then we'll enter a name, such as app-asg, choose the launch template we just created, and click Next. Then we'll select our network.
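The console form above maps directly onto the EC2 API. A rough sketch of the same launch template expressed as `boto3` parameters follows; the AMI ID, security group ID, and instance profile name are placeholders you'd replace with the values from your own account, and the user data script here is just a stand-in for the real one from the course.

```python
import base64

# Stand-in for the course's actual user data script, which downloads
# and unzips the application source at boot.
USER_DATA = """#!/bin/bash
# fetch and unzip the employee directory app source here
"""

launch_template = {
    "LaunchTemplateName": "app-launch-template",
    "VersionDescription": "a web server for the employee directory application",
    "LaunchTemplateData": {
        "ImageId": "ami-EXAMPLE",            # placeholder: Amazon Linux 2 AMI in your region
        "InstanceType": "t2.micro",
        "KeyName": "app-key-pair",           # the key pair created earlier
        "SecurityGroupIds": ["sg-EXAMPLE"],  # placeholder: web-security-group's ID
        "IamInstanceProfile": {"Name": "app-instance-role"},  # placeholder role name
        # User data must be base64-encoded when sent through the API.
        "UserData": base64.b64encode(USER_DATA.encode()).decode(),
    },
}

# To actually create it (requires AWS credentials):
# import boto3
# boto3.client("ec2").create_launch_template(**launch_template)
```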
We'll choose the same VPC that we created earlier in the course, app VPC. And then we'll select both of the public subnets, Public Subnet 1 and Public Subnet 2, and go ahead and click Next. We then need to select Attach to an existing load balancer to receive traffic from the load balancer we created earlier, and then choose our target group, app-target-group-2. Then we can click Enable ELB health checks, so that the auto scaling group also uses the load balancer's health checks to decide whether instances are healthy, and then we can go ahead and click Next. Now we'll choose the minimum, maximum, and desired capacity. The minimum we'll say is two. That means that our auto scaling group will always have at least two instances, one in each Availability Zone. The maximum we'll say is four, which means that our fleet can scale up to four instances. And the desired capacity, which is the number of instances you want running, we'll also say is two. Next, we can configure scaling policies. With scaling policies, you define how to scale the capacity of your auto scaling group in response to changing demand. For example, you could set up a scaling policy with CloudWatch so that whenever your instances' CPU utilization, or any other metric you'd like, crosses a certain threshold, more EC2 instances are deployed into your auto scaling group. So what we want to do is use auto scaling policies to adjust the capacity of this group. Earlier, Morgan created a CloudWatch alarm that resulted in the action of sending out an email. Here, we're going to create a CloudWatch alarm, much like Morgan did, but this time it will result in the action of triggering an auto scaling event. So we'll go ahead and enable target tracking. We'll name this CPU Utilization, and leave the metric type as CPU utilization. And then we'll set the target value to 60%, meaning we want a new instance added to our fleet whenever the fleet averages above 60% CPU. We'll also say that instances will need 300 seconds to warm up.
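To get a feel for what target tracking does with those numbers, here's a simplified sketch of the sizing calculation: capacity grows roughly in proportion to how far the metric is from the target, then gets clamped between the group's minimum and maximum. The real CloudWatch algorithm also accounts for warm-up time and cooldowns; this is only the core proportion.

```python
import math

def target_tracking_capacity(current: int, avg_cpu: float, target: float,
                             min_size: int, max_size: int) -> int:
    """Simplified target tracking: scale capacity in proportion to how far
    the metric is from the target, clamped to the group's min/max."""
    desired = math.ceil(current * avg_cpu / target)
    return max(min_size, min(max_size, desired))

# Two instances averaging 90% CPU against a 60% target suggests
# three instances (2 * 90 / 60 = 3), within our min of 2 and max of 4:
print(target_tracking_capacity(2, 90.0, 60.0, 2, 4))  # -> 3
```

Note how the clamp matters: even if CPU fell to almost nothing, the group would never shrink below the minimum of two.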
Then we'll click Next to configure notifications when a scaling event happens. This is optional, so for now we're going to skip past it, but you'll do it in the exercise coming up. You can also configure tags, but we'll skip past that too. All right, so we're here, and we can review this and then click Create Auto Scaling group. Now all that's left is to stress our application and make sure that it actually scales up to meet the demand. To do that, I'll open up a new tab and paste in the endpoint for our Elastic Load Balancer. So here's our application. I'm going to go to the /info page by appending it to the URL. And you'll notice here that we built in a stress CPU feature. This is going to mimic load coming onto our servers. In the real world, you would probably use a load testing tool, but instead we built a stress CPU button as a quick and easy way to test this out. Then we can watch our servers scale with that auto scaling policy. So as the CPU utilization goes up, our auto scaling policy will be triggered and our server group will grow in size. So I'm going to go ahead and select five minutes for the stress test. All right, some time has passed. We've stressed our application. And if we take a look at CloudWatch, we can see what happened. So let's find CloudWatch. And if we go to our alarms and click on the high-CPU alarm, we can see our alarm summary. As you can see, we went over 60% CPU utilization across our instances, and that was our threshold for our auto scaling group to launch new instances. So what should have happened is that Auto Scaling launched more instances to handle this load. Because we launched more instances into our auto scaling group, there were more hosts to accept traffic from our Elastic Load Balancer, and the average CPU utilization then went down across our servers; you can see that here in the alarm summary.
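What the alarm summary is showing can be sketched in a few lines: CloudWatch samples the metric in periods and transitions to ALARM once enough consecutive datapoints breach the threshold. This is a simplified model of that evaluation, with made-up sample values standing in for the five-minute stress test.

```python
def breaches(datapoints, threshold, evaluation_periods):
    """Simplified CloudWatch alarm evaluation: return True once
    `evaluation_periods` consecutive datapoints breach the threshold."""
    streak = 0
    for value in datapoints:
        streak = streak + 1 if value > threshold else 0
        if streak >= evaluation_periods:
            return True
    return False

# Hypothetical fleet-average CPU sampled each minute during the stress test:
# it climbs past 60%, then falls back as the new instance takes traffic.
samples = [35, 55, 72, 88, 90, 61, 58]
print(breaches(samples, 60, 2))  # -> True
```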
All right, so let's go ahead and look at the EC2 instances that were launched into the auto scaling group. I'm going to go to the EC2 dashboard, scroll down to Target Groups, select the app-target-group, and click on Targets. And as you can see here, we launched one more instance to help handle the load, so we have three healthy targets for our auto scaling group. So now we have an environment with an auto scaling group and our launch template, and we've set up an alarm in CloudWatch so that whenever our CPU utilization goes over 60%, we launch more instances into that group. Okay, Morgan, back to you. - [Morgan] Thanks, Alana. Now, if we wait around long enough, what we would see is that as the CPU load dropped off, each instance would get terminated one by one because we no longer need them, bringing our state all the way back down to the basic two servers that we need, which is the minimum number for this particular auto scaling group. All of that is done without any human intervention. If you are doing this in your AWS account, don't forget to delete the auto scaling group instead of just the instances. Otherwise, guess what will happen? The auto scaling group is going to say, "Hey, we need a minimum of two instances." So if you go in and just try to terminate the instances in that auto scaling group, it's going to notice that you're not meeting the minimum, and it's going to spin up two more in their place. So in order to clean up in your account, make sure that you delete the auto scaling group, or set the minimum to zero, so that it doesn't go recreating resources that you're trying to delete.
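Morgan's cleanup warning boils down to the group's self-healing rule, which can be sketched as a tiny reconciliation step. The function below is an illustration of that behavior, not AWS code; the commented-out `boto3` call is the real deletion path.

```python
def reconcile(running: int, min_size: int, desired: int) -> int:
    """Sketch of Auto Scaling's self-healing: if running instances fall
    below the minimum, the group launches replacements back up to the
    desired count. Returns the resulting instance count."""
    if running < min_size:
        return max(desired, min_size)
    return running

# Terminate both instances while the minimum is 2: the group brings them back.
print(reconcile(0, 2, 2))  # -> 2
# Set the minimum (and desired) capacity to 0 first, and the termination sticks.
print(reconcile(0, 0, 0))  # -> 0

# Or delete the group itself, instances and all (requires AWS credentials):
# import boto3
# boto3.client("autoscaling").delete_auto_scaling_group(
#     AutoScalingGroupName="app-asg", ForceDelete=True)
```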