Introduction

After spending any time building resilient environments on AWS, it becomes clear that you need to fully automate the launching and terminating of instances. Traditionally, you might pre-configure your AMI and configure auto-scaling groups to launch and terminate instances based on metrics such as CPU utilisation, memory consumption, or I/O performance.

Another option is to implement a configuration management framework such as Puppet, and build a single base AMI for all of your instances, with Puppet performing the final configuration. There is one major problem with this approach: AWS will reuse IP addresses and hostnames. When a new instance comes up with a hostname for which the puppet master already holds a signed certificate, its puppet agent will fail to register with the puppet master, and the final configuration will fail.

One option commonly taken is to run a script via a cron job to periodically check your active instances and clean up the certificates on the puppet master for those instances that have not contacted the puppet master in a certain time period. This is not ideal, but can work. A better solution is to have AWS notify your puppet master when it has launched or terminated an instance so that you can properly whitelist or revoke certificates in response to those events.

This article describes the steps necessary to integrate this level of automation into your AWS-based puppet infrastructure. We will make use of SQS message queues, SNS notifications, and some scripting on the puppet master to enable this functionality, in addition to the autoscaling you have already configured.

Configure the message queue

The message queue will be polled by the puppet master in order to identify certificates that need to be revoked and cleaned up, or new certificate names that need to be whitelisted for auto-signing.

In your AWS console, navigate to the SQS page

Click “Create new queue”

Enter the queue name, configure any options you need, and click “Create Queue”

Configure the notification service

The auto-scaling groups will use SNS to post events to the message queue ready for the puppet master to act on.

On your AWS console, navigate to the SNS page

Choose “Topics”, and “Create new topic”:

Enter a topic name, and optional display name:

Configure auto-scaling groups to send notifications

Navigate to your auto-scaling group and select the “Notifications” tab:

Choose the SNS topic you created earlier as the destination to send notifications to:

Navigate back to the message queue, select it and choose “Subscribe Queue to SNS Topic” in the “Queue Actions” dropdown:

Select the SNS Topic and click “Subscribe”:

The autoscaling groups are now configured to post a message to the SQS queue when an instance is launched or terminated.

Configure the puppet master

The puppet master must be configured to autosign whitelisted certificates only. In this mode, certificate names present in /etc/puppet/autosign.conf will be automatically signed and accepted by the puppet master when presented by a puppet agent. All other certificate signing requests will require manual signing using the puppet cert sign command.
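The whitelist described above is a plain text file with one certificate name per line, so maintaining it amounts to adding and removing lines. The following is a minimal sketch of that bookkeeping, not the original script: the helper names are hypothetical, and it assumes the file should stay world-readable so the puppet master process can read it.

```python
import os
import tempfile

AUTOSIGN_PATH = "/etc/puppet/autosign.conf"  # path named in the article


def read_whitelist(path):
    """Return the non-empty lines of the whitelist, in file order."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def write_whitelist(path, names):
    """Write to a temp file and rename it, so readers never see a partial list."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write("".join(name + "\n" for name in names))
    # mkstemp creates the file with mode 0600; assumption: 0644 is wanted
    # so that the puppet master user can still read the whitelist.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)


def add_to_whitelist(path, certname):
    """Add certname unless it is already listed."""
    names = read_whitelist(path)
    if certname not in names:
        write_whitelist(path, names + [certname])


def remove_from_whitelist(path, certname):
    """Drop certname if it is listed; do nothing otherwise."""
    names = read_whitelist(path)
    if certname in names:
        write_whitelist(path, [n for n in names if n != certname])
```

For example, `add_to_whitelist(AUTOSIGN_PATH, "i-0123abcd-eu-west-1.amazonaws.com")` would whitelist a single (made-up) instance, and calling it twice leaves only one entry.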

A script is required to monitor the SQS message queue in order to pick up instance launches and terminations. In this case, a Python script is used: on receipt of an instance termination event it cleans up the agent certificate, and on receipt of an instance launch event from the configured autoscaling groups it adds the certificate name to the whitelist. This script can be placed anywhere (for example /usr/local/bin/puppet-autoscale.py). The script takes two arguments: the AWS region and the SQS queue name.
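The original script listing is not reproduced here. As a rough sketch of what such a poller could look like, the code below unwraps the SNS envelope that SQS delivers, maps the Auto Scaling event to a certificate name, and dispatches to a launch or terminate handler. It assumes boto3 is installed with credentials configured, that raw message delivery is disabled on the subscription, and that `puppet cert clean` is the desired cleanup; the function names are hypothetical, and removing the whitelist entry on termination is left out for brevity.

```python
#!/usr/bin/env python3
import json
import subprocess
import sys

LAUNCH = "autoscaling:EC2_INSTANCE_LAUNCH"
TERMINATE = "autoscaling:EC2_INSTANCE_TERMINATE"
AUTOSIGN_PATH = "/etc/puppet/autosign.conf"  # path named in the article


def parse_body(body):
    """Unwrap the SNS envelope and return (event, instance_id).

    Returns (None, None) for anything that does not carry both fields,
    such as the test notification sent when notifications are first set up.
    """
    try:
        inner = json.loads(json.loads(body)["Message"])
        return inner["Event"], inner["EC2InstanceId"]
    except (ValueError, KeyError, TypeError):
        return None, None


def certname_for(instance_id, region):
    """Build the <INSTANCE_ID>-<REGION>.amazonaws.com name used in this article."""
    return "{0}-{1}.amazonaws.com".format(instance_id, region)


def dispatch(body, region, on_launch, on_terminate):
    """Call the matching handler with the certname; return the event handled."""
    event, instance_id = parse_body(body)
    if event == LAUNCH:
        on_launch(certname_for(instance_id, region))
    elif event == TERMINATE:
        on_terminate(certname_for(instance_id, region))
    else:
        return None
    return event


def whitelist(certname):
    """Append the name to the autosign whitelist."""
    with open(AUTOSIGN_PATH, "a") as handle:
        handle.write(certname + "\n")


def revoke(certname):
    """Revoke and remove the agent certificate on the master."""
    subprocess.call(["puppet", "cert", "clean", certname])


def main(region, queue_name):
    import boto3  # assumption: boto3 is installed and AWS credentials are configured

    queue = boto3.resource("sqs", region_name=region).get_queue_by_name(
        QueueName=queue_name)
    while True:
        messages = queue.receive_messages(MaxNumberOfMessages=10,
                                          WaitTimeSeconds=1)
        if not messages:
            break  # queue drained; cron will start the next poll
        for message in messages:
            dispatch(message.body, region, whitelist, revoke)
            message.delete()


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Passing the handlers into `dispatch` keeps the parsing logic testable without touching the puppet master or AWS.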

The script can be scheduled to poll the message queue via cron.
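The original cron entry is not reproduced here. A minimal sketch, assuming the script lives at the example path given earlier, runs as root, and polls once a minute, might look like this; the file name under /etc/cron.d is hypothetical, and the two placeholders stand for your own region and queue name.

```
# /etc/cron.d/puppet-autoscale (hypothetical file name)
* * * * * root /usr/local/bin/puppet-autoscale.py <REGION> <QUEUE_NAME>
```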

Configure the puppet agents

Due to the limited information available from termination events, it is necessary to modify the agents to include the instance ID and region in their certificate names. The format used in this example is <INSTANCE_ID>-<REGION>.amazonaws.com. You are free to use your own domain name component, but it must remain consistent throughout the configuration.

In the instance user-data script, the agents are configured to use a certificate name in the same form as described above.
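The original user-data listing is not reproduced here. As an illustration of how an instance could derive its own certificate name, the sketch below reads the instance ID and availability zone from the EC2 instance metadata service and assembles the name. It assumes the metadata service accepts unauthenticated requests (IMDSv1) and that the zone name is the region followed by a single letter; the function names are hypothetical.

```python
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def fetch_metadata(path, timeout=2):
    """Read one value from the instance metadata service (only works on EC2)."""
    with urllib.request.urlopen(METADATA_URL + path, timeout=timeout) as response:
        return response.read().decode("ascii").strip()


def region_from_zone(zone):
    """Strip the trailing zone letter, e.g. 'eu-west-1a' becomes 'eu-west-1'."""
    return zone[:-1]


def build_certname(instance_id, region, domain="amazonaws.com"):
    """Assemble <INSTANCE_ID>-<REGION>.<domain>, the format used in this article."""
    return "{0}-{1}.{2}".format(instance_id, region, domain)


def local_certname():
    """Certname for the instance this code is running on."""
    zone = fetch_metadata("placement/availability-zone")
    return build_certname(fetch_metadata("instance-id"), region_from_zone(zone))
```

The resulting value would then be set as the agent's `certname` in puppet.conf before the first puppet run, so that it matches the name the puppet master whitelists.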

Testing

Now, when an autoscaling group launches or terminates an instance, it sends a notification to the message queue. Every minute, the puppet master polls the queue for new messages. For each message found, it either cleans up the agent certificate and removes its whitelist entry (if the instance was terminated) or adds a whitelist entry (if the instance was launched).

To test, I usually have a few shell sessions on the puppet master, one to monitor certificates (watch -d ls -al /var/lib/puppet/ssl/ca/signed/), and one to monitor the whitelist (watch -d cat /etc/puppet/autosign.conf).

You can now terminate an auto-scaled instance and watch for the certificate to be removed, for the certificate for the new instance to be whitelisted, and finally for the new certificate to be installed. You can then verify on the new instance that puppet ran by running puppet agent -t --noop.