Keeping your Amazon EC2 instances healthy and available is essential for providing uninterrupted service to your users. As the foundation of your infrastructure, EC2 instances need constant monitoring and prompt intervention to prevent downtime and performance problems. In this blog post, we’ll look at how to use AWS Systems Manager (SSM) runbooks and Amazon EC2 status checks to monitor instance health proactively and restart an instance automatically when a status check fails.
This configuration uses the following AWS services: IAM, CloudWatch, EventBridge, and SSM.
Step 1:
Select the EC2 instance that you want to restart when its status check fails. First, configure SSM on the instance. Then open the Status checks tab, open the Actions dropdown, and choose the option to create a status check alarm. You can also add an SNS topic for notifications.
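If you prefer to script this step, the same alarm can be sketched with boto3. This is a minimal sketch rather than the exact alarm the console creates: the alarm name, the 60-second period, and the two evaluation periods are assumptions chosen for illustration, and `build_status_check_alarm` and `create_alarm` are hypothetical helper names.

```python
def build_status_check_alarm(instance_id, sns_topic_arn=None):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm watches the StatusCheckFailed metric of one instance.
    Name, period and evaluation periods are illustrative assumptions.
    """
    params = {
        "AlarmName": f"status-check-failed-{instance_id}",  # assumed naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,                # seconds per datapoint (assumption)
        "EvaluationPeriods": 2,      # consecutive failing datapoints (assumption)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
    if sns_topic_arn:
        # Optional SNS notification when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_alarm(params):
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(build_status_check_alarm("i-0123456789abcdef0"))` would create the alarm; the instance ID shown is a placeholder.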
Step 2:
Once the alarm is created, create an EventBridge rule that runs the SSM runbook when the alarm enters the ALARM state. For that, you can use the event pattern below.
When you choose the target, select Systems Manager Automation as the target and AWS-RestartEC2Instance as the document, then provide the instance ID in the InstanceId field. See the screenshot below.
We also need to specify an execution role; we can either create a new role or use an existing one. Attach two policies to the role: CloudWatchFullAccess and AmazonEC2FullAccess.
That completes the setup. Whenever the EC2 instance status check fails, the CloudWatch alarm changes to the ALARM state and the event rule triggers the runbook.
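As a rough illustration, an event pattern that matches a CloudWatch alarm moving into the ALARM state can be generated as follows. The alarm ARN is a placeholder and `build_alarm_event_pattern` is a hypothetical helper; the printed JSON is what you would paste into the rule's event pattern editor.

```python
import json


def build_alarm_event_pattern(alarm_arn):
    """Return an EventBridge pattern for one alarm entering the ALARM state."""
    return {
        "source": ["aws.cloudwatch"],
        "detail-type": ["CloudWatch Alarm State Change"],
        "resources": [alarm_arn],                 # limit the rule to this alarm
        "detail": {"state": {"value": ["ALARM"]}},
    }


# Placeholder ARN; replace with the ARN of the alarm created in Step 1.
example_arn = "arn:aws:cloudwatch:us-east-1:123456789012:alarm:status-check-failed"
pattern_json = json.dumps(build_alarm_event_pattern(example_arn), indent=2)
print(pattern_json)
```

Restricting `resources` to a single alarm ARN keeps the rule from firing for unrelated alarms in the same account.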
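For readers who script the rule instead of using the console, the target could be described roughly like this. Treat it as a sketch under assumptions: the `automation-definition/...:$DEFAULT` ARN form and the list-valued `InstanceId` input reflect how EventBridge targets for Automation runbooks are commonly written, so verify them against the target the console generates. The target ID, region, account ID, and helper name are all hypothetical.

```python
import json


def build_automation_target(region, account_id, instance_id, role_arn):
    """Describe an EventBridge target that starts AWS-RestartEC2Instance.

    The ARN layout is an assumption to verify against your own account.
    """
    document_arn = (
        f"arn:aws:ssm:{region}:{account_id}:"
        "automation-definition/AWS-RestartEC2Instance:$DEFAULT"
    )
    return {
        "Id": "restart-ec2-instance",   # hypothetical target ID
        "Arn": document_arn,
        "RoleArn": role_arn,            # execution role EventBridge assumes
        # Automation parameters are passed as lists of strings.
        "Input": json.dumps({"InstanceId": [instance_id]}),
    }


def attach_target(rule_name, target):
    """Attach the target to an existing rule (needs boto3 and AWS credentials)."""
    import boto3

    boto3.client("events").put_targets(Rule=rule_name, Targets=[target])
```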
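If you create a new role, it needs a trust policy that lets EventBridge assume it. The sketch below builds that trust policy and the ARNs of the two managed policies named above; it assumes the rule's execution role is assumed by the `events.amazonaws.com` service principal, and the variable and function names are hypothetical.

```python
import json

# Managed policies named in this post, in the standard AWS-managed ARN form.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/CloudWatchFullAccess",
    "arn:aws:iam::aws:policy/AmazonEC2FullAccess",
]


def build_trust_policy():
    """Trust policy allowing EventBridge to assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "events.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_execution_role(role_name):
    """Create the role and attach both policies (needs boto3 and AWS credentials)."""
    import boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_trust_policy()),
    )
    for policy_arn in MANAGED_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
```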
You can view the runbook execution under AWS Systems Manager > Automation.
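The same information is available programmatically. As a small sketch, the function below takes an SSM client (for example `boto3.client("ssm")`) and lists recent executions of the runbook; `summarize_restart_executions` is a hypothetical helper, and only the first page of results is read.

```python
def summarize_restart_executions(ssm_client):
    """Return (execution ID, status) pairs for AWS-RestartEC2Instance runs."""
    response = ssm_client.describe_automation_executions(
        Filters=[
            {"Key": "DocumentNamePrefix", "Values": ["AWS-RestartEC2Instance"]}
        ]
    )
    return [
        (item["AutomationExecutionId"], item["AutomationExecutionStatus"])
        for item in response.get("AutomationExecutionMetadataList", [])
    ]
```

With credentials configured, `summarize_restart_executions(boto3.client("ssm"))` returns a list you can print or feed into your own reporting.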