To generate certificates, we need to initialize a new CA on the lb machine. This can be done easily using the puppet cert subcommand, as shown here:
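Running any puppet cert subcommand will create the CA if one does not already exist; listing certificates, for instance, is a safe way to trigger this:

    # on the lb machine; creates the CA as a side effect
    puppet cert list -a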
With the CA certificate generated, we can now create a new certificate for the master. When nodes connect to Puppet, they will search for a machine named puppet. Since the name of my test machine is lb, I will alter the Puppet configuration to make Puppet believe that the name of the machine is puppet. This is done by adding the following to the puppet.conf file, in either the [main] or [master] section. The file is located at /etc/puppetlabs/puppet/puppet.conf:
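A minimal sketch, assuming the certname setting is the one intended here, and using the puppet.example.com name from the rest of this section:

    [main]
        certname = puppet.example.com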
The domain of my test machine is example.com and I will generate the certificate for lb with the example.com domain defined. To generate this new certificate, we will use the puppet certificate generate subcommand, as shown here:
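The command should look something like the following; the --dns-alt-names value adds the bare puppet name that agents search for:

    puppet certificate generate puppet.example.com --dns-alt-names puppet --ca-location local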
Now that the certificate has been generated, we need to sign it, as shown here:
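Because the certificate carries a DNS alternative name, the sign command needs the --allow-dns-alt-names flag:

    puppet cert sign puppet.example.com --allow-dns-alt-names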
The signed certificate will be placed into the /etc/puppetlabs/puppet/ssl/ca/signed directory; we need to place the certificate in the /etc/puppetlabs/puppet/ssl/certs directory. This can be done with the puppet certificate find command, as shown here:
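For example:

    puppet certificate find puppet.example.com --ca-location local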
In addition to displaying the certificate, the puppet certificate find command will also place the certificate into the correct directory.
With the certificate in place, we are ready to start the puppetserver process.
Enterprise Linux 7 (EL7) based distributions now use systemd to control the starting and stopping of processes. EL7 distributions still support the service command to start and stop services. However, using the equivalent systemd commands is the preferred method and will be used in this book. systemd is a complete rewrite of the System V init process and includes many changes from traditional UNIX init systems. More information on systemd can be found on the freedesktop website at http://www.freedesktop.org/wiki/Software/systemd/.
To start the puppetserver service using systemd, use the systemctl command, as shown here:
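    # start the puppetserver service
    systemctl start puppetserver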
puppetserver will take a while to start, as it has to create a JVM and spin up its JRuby instances. To verify that puppetserver is running, check that the Puppet master port (TCP port 8140) is listening for connections with the following command:
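One way to check this is with the ss utility:

    # list listening TCP sockets and look for the Puppet master port
    ss -ntl | grep 8140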
At this point, your server will be ready to accept connections from Puppet agents. To ensure that the puppetserver service is started when our machine is rebooted, use the enable option with systemctl, as shown here:
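    # start puppetserver automatically at boot
    systemctl enable puppetserver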
With Puppet master running, we can now begin to configure a load balancer for our workload.
At this point, the lb machine is acting as a Puppet master running the puppetserver service, but Puppet agents will not yet be able to connect to it: by default, EL7 machines are configured with a firewall service that will prevent access to port 8140. You can either configure the firewall using firewalld to allow the connection, or disable the firewall.
Note
Host-based firewalls can be useful; if we disable the firewall, any service that is started on our server will be accessible from outside machines. This may expose services we do not wish to expose from our server.
To disable the firewall, issue the following commands:
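    # stop the firewall and prevent it from starting at boot
    systemctl stop firewalld
    systemctl disable firewalld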
Alternatively, to allow access to port 8140, issue the following commands:
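    # permanently open TCP port 8140, then reload the firewall rules
    firewall-cmd --permanent --add-port=8140/tcp
    firewall-cmd --reload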
We will now create a load balancing configuration with three servers: our existing lb machine and two machines running puppetserver and acting as Puppet masters. I will name these puppetmaster1 and puppetmaster2.
To configure the lb machine as a load balancer, we need to reconfigure puppetserver in order to listen on an alternate port. We will configure Apache to listen on the default Puppet master port of 8140. To make this change, edit the webserver.conf file in the /etc/puppetlabs/puppetserver/conf.d directory, so that its contents are the following:
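A minimal sketch of the file; your distribution's default webserver.conf may contain additional settings, which can be left in place:

    webserver: {
        # TLS-encrypted traffic moves to 8141, freeing 8140 for Apache
        ssl-host: 0.0.0.0
        ssl-port: 8141
        # unencrypted traffic from the load balancer arrives on 18140
        host: 0.0.0.0
        port: 18140
    }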
This will configure puppetserver to listen on port 8141 for TLS encrypted traffic and port 18140 for unencrypted traffic. After making this change, we need to restart the puppetserver service using systemctl, as follows:
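    systemctl restart puppetserver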
Next, we will configure Apache to listen on the master port and act as a proxy to the puppetserver process.
To configure Apache to act as a proxy service for our load balancer, we will need to install httpd, the Apache server. We will also need the mod_ssl package to support encryption on our load balancer. To install both packages, issue the following yum command:
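    yum install httpd mod_ssl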
Next, create a configuration file for the load balancer that uses the puppet.example.com certificates, which we created earlier. Create a file named puppet_lb.conf in the /etc/httpd/conf.d directory with the following contents:
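The following is a sketch of such a configuration. The certificate paths follow the standard layout under /etc/puppetlabs/puppet/ssl, and the BalancerMember addresses are the private IP addresses assigned to the masters later in this section:

    <VirtualHost *:8140>
        ServerName puppet.example.com
        SSLEngine on
        SSLProtocol ALL -SSLv2 -SSLv3
        SSLCertificateFile /etc/puppetlabs/puppet/ssl/certs/puppet.example.com.pem
        SSLCertificateKeyFile /etc/puppetlabs/puppet/ssl/private_keys/puppet.example.com.pem
        SSLCertificateChainFile /etc/puppetlabs/puppet/ssl/certs/ca.pem
        SSLCACertificateFile /etc/puppetlabs/puppet/ssl/certs/ca.pem
        # optional: check client certificates against the CA's revocation list
        SSLCARevocationFile /etc/puppetlabs/puppet/ssl/ca/ca_crl.pem
        SSLCARevocationCheck chain
        # optional so that new agents without certificates can submit requests
        SSLVerifyClient optional
        SSLVerifyDepth 1
        SSLOptions +StdEnvVars +ExportCertData

        # forward client-certificate details to the back-end puppetservers
        RequestHeader set X-SSL-Subject %{SSL_CLIENT_S_DN}e
        RequestHeader set X-Client-DN %{SSL_CLIENT_S_DN}e
        RequestHeader set X-Client-Verify %{SSL_CLIENT_VERIFY}e

        # CA traffic stays on the local puppetserver
        <Proxy balancer://puppetca>
            BalancerMember http://127.0.0.1:18140
        </Proxy>

        # everything else is split between the two masters
        <Proxy balancer://puppetworker>
            BalancerMember http://192.168.0.100:18140
            BalancerMember http://192.168.0.101:18140
        </Proxy>

        ProxyPassMatch ^/(puppet-ca/.*)$ balancer://puppetca/$1
        ProxyPass / balancer://puppetworker/
        ProxyPassReverse / balancer://puppetworker/
    </VirtualHost>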
This configuration creates an Apache VirtualHost that will listen for connections on port 8140 and redirect traffic to one of the three puppetserver instances. One puppetserver instance is the instance running on the load balancer machine lb. The other two are Puppet master servers, which we have not built yet. To continue with our configuration, create two new machines and install puppetserver on each, as we did on the lb machine; name these servers puppetmaster1 and puppetmaster2.
In our load balancing configuration, communication between the lb machine and the Puppet masters will be unencrypted. To maintain security, a private network should be established between the lb machine and the Puppet masters. In my configuration, I gave the two Puppet masters IP addresses 192.168.0.100 and 192.168.0.101, respectively. The lb machine was given the IP address 192.168.0.110.
The following lines in the Apache configuration are used to create two proxy balancer locations, using Apache's built-in proxying engine:
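These are the <Proxy> stanzas from the sketch above:

    <Proxy balancer://puppetca>
        BalancerMember http://127.0.0.1:18140
    </Proxy>
    <Proxy balancer://puppetworker>
        BalancerMember http://192.168.0.100:18140
        BalancerMember http://192.168.0.101:18140
    </Proxy>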
The puppetca balancer points to the local puppetserver running on lb. The puppetworker balancer points to both puppetmaster1 and puppetmaster2 and will alternate between the two machines in round-robin fashion.
The following ProxyPass and ProxyPassMatch configuration lines direct traffic between the two balancer endpoints:
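From the sketch above:

    ProxyPassMatch ^/(puppet-ca/.*)$ balancer://puppetca/$1
    ProxyPass / balancer://puppetworker/
    ProxyPassReverse / balancer://puppetworker/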
These lines take advantage of the API redesign in Puppet 4. In previous versions of Puppet, the Puppet REST API defined the endpoints using the following syntax:
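In outline, the Puppet 3 URLs took this form:

    https://puppet:8140/{environment}/{endpoint}/{name}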
The first part of the path is the environment used by the node. The second part is the endpoint. The endpoint may be one of certificate, file, or catalog (there are other endpoints, but these are the important ones here). All traffic concerned with certificate signing and retrieval has an endpoint beginning with the word certificate. To redirect all certificate-related traffic to a specific machine, the following ProxyPassMatch directive can be used:
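A directive along these lines matches any request whose second path component begins with certificate and sends it to the puppetca balancer:

    ProxyPassMatch ^/([^/]+/certificate.*)$ balancer://puppetca/$1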
Indeed, this was the ProxyPassMatch line that I used when working with Puppet 3 in the previous version of this book. Starting with Puppet 4, the REST API URLs have been changed, such that all certificate or
certificate authority (CA) traffic is directed to the puppet-ca endpoint. In Puppet 4, the API endpoints are defined as follows:
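    https://puppet:8140/puppet/v3/{endpoint}/{name}?environment={environment}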
Or, as follows:
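    https://puppet:8140/puppet-ca/v1/{endpoint}/{name}?environment={environment}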
The environment is now passed as a query parameter, after the ? in the URL. All CA-related traffic is directed to the /puppet-ca URL and everything else to the /puppet URL.
To take advantage of this, we use the following ProxyPassMatch directive:
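    ProxyPassMatch ^/(puppet-ca/.*)$ balancer://puppetca/$1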
With this configuration in place, all certificate traffic is directed to the puppetca balancer.
In the next section, we will discuss how TLS encryption information is handled by our load balancer.
When a Puppet agent connects to a Puppet master, the communication is authenticated with X.509 certificates. In our load balancing configuration, we are interjecting ourselves between the nodes and the puppetserver processes on the Puppet master servers. To allow the TLS communication to flow, we configure Apache to place the TLS information into headers, as shown in the following configuration lines:
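These are the RequestHeader lines from the VirtualHost sketch; they copy the client certificate's subject DN and verification status from mod_ssl's environment variables into request headers:

    RequestHeader set X-SSL-Subject %{SSL_CLIENT_S_DN}e
    RequestHeader set X-Client-DN %{SSL_CLIENT_S_DN}e
    RequestHeader set X-Client-Verify %{SSL_CLIENT_VERIFY}e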
These lines take information from the connecting nodes and place it into HTTP headers that are then passed to the puppetserver processes. We can now start Apache and begin answering requests on port 8140.
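    # start Apache now and at every boot
    systemctl start httpd
    systemctl enable httpd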
Security-Enhanced Linux (SELinux) is a system for Linux that provides support for
mandatory access controls (MAC). If your servers are running with SELinux enabled, great! You will need to enable an SELinux Boolean to allow Apache to connect to the puppetserver servers on port 18140. This Boolean is httpd_can_network_connect. To set this Boolean, use the setsebool command, as shown here:
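    # -P makes the change persist across reboots
    setsebool -P httpd_can_network_connect 1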
SELinux provides an extra level of security. For this load balancer configuration, the Boolean is the only SELinux configuration change that was required. If you have unexplained errors, you can check for SELinux AVC denial messages in /var/log/audit/audit.log. To temporarily allow any access that SELinux is denying, you can switch SELinux into permissive mode using the setenforce command, as shown here:
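    # permissive mode: denials are logged but allowed
    setenforce 0
    # return to enforcing mode afterwards
    setenforce 1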
More information on SELinux is available at http://selinuxproject.org/page/Main_Page.
Now, a configuration change must be made for the puppetserver processes to accept the certificate information passed in the headers. Create a master.conf file in the /etc/puppetlabs/puppetserver/conf.d directory with the following content:
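A sketch of the file, using the allow-header-cert-info setting, which tells puppetserver to trust the identity headers set by the load balancer:

    master: {
        allow-header-cert-info: true
    }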
After making this change, puppetserver must be restarted.
At this point, there will be three puppetserver processes running: one on each of the Puppet masters and one on the lb machine.
Before we can use the new master servers, we need to copy the certificate information from the lb machine. The quickest way to do this is to copy the entire /etc/puppetlabs/puppet/ssl directory to the masters. I did this by creating a TAR file of the directory and copying the TAR file using the following commands:
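Something like the following, run on the lb machine and repeated for puppetmaster2:

    tar cf ssl.tar -C /etc/puppetlabs/puppet ssl
    scp ssl.tar puppetmaster1:/tmp
    ssh puppetmaster1 "tar xf /tmp/ssl.tar -C /etc/puppetlabs/puppet"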
With the certificates in place, the next step is to configure Puppet on the Puppet masters.
To test the configuration of the load balancer, create site.pp manifests in the production manifest directory, /etc/puppetlabs/code/environments/production/manifests, on the Puppet masters. On puppetmaster1, use the following content:
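A minimal sketch, using a notify resource so that each master identifies itself in the agent output:

    # site.pp on puppetmaster1
    node default {
      notify { 'puppetmaster1': }
    }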
Create the corresponding file on puppetmaster2:
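    # site.pp on puppetmaster2
    node default {
      notify { 'puppetmaster2': }
    }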
With these files in place and the puppetserver processes running on all three machines, we can now test our infrastructure. You can begin by creating a client node and installing the puppetlabs release package and then the puppet-agent package. With Puppet installed, you will need to either configure DNS, such that the lb machine is known as puppet or add the IP address of the lb machine to /etc/hosts as the puppet machine, as shown here:
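    # /etc/hosts on the client; 192.168.0.110 is the lb machine's address
    192.168.0.110 puppet puppet.example.com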
Next, start the Puppet agent on the client machine. This will create a certificate for the machine on the lb machine, as shown here:
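    # the first run generates a key and submits a signing request to lb
    puppet agent -t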
On the lb machine, list the unsigned certificates with the puppet cert list command, as shown here:
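    puppet cert list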
Now sign the certificate using the puppet cert sign command, as shown:
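Here, client.example.com stands in for whatever certname appeared in the listing above:

    puppet cert sign client.example.com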
With the certificate signed, we can run puppet agent again on the client machine and verify the output:
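With the site.pp sketches above, the agent output should include a notify message naming the master that compiled the catalog, along these lines:

    Notice: puppetmaster1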
If we run the agent again, we might see another message from the other Puppet master:
An important thing to note here is that the certificate for our client machine is only available on the lb machine. When we list all the certificates available on puppetmaster1, we only see the puppet.localdomain certificate, as shown in the following output:
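    # on puppetmaster1
    puppet cert list --all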
However, running the same command on the lb machine returns the certificate we were expecting:
So at this point, when the nodes connect to our lb machine, all the certificate traffic is directed to the puppetserver process running locally on the lb machine. The catalog requests will be shared between puppetmaster1 and puppetmaster2, using the Apache proxy module. We now have a load-balanced Puppet infrastructure. To scale out by adding more Puppet masters, we only need to add them to the proxy balancer in the Apache configuration. In the next section, we'll discuss how to keep the code on the various machines up to date.
Keeping the code consistent
At this point, we can scale out our catalog compilation to as many servers as we need. However, we've neglected one important thing: we need to make sure that the Puppet code on all the workers remains in sync. There are a few ways in which we can do this, and when we cover integration with Git in Chapter 3, Git and Environments, we will see how to use Git to distribute the code.
A simple method to distribute the code is with rsync. This isn't the best solution, but it works as an example; you will need to run rsync whenever you change the code. It will also require changing the Puppet user's shell from /sbin/nologin to /bin/bash or /bin/rbash, which is a potential security risk.
First, we create an SSH key for rsync to use when connecting from the load balancer to the Puppet master nodes. We then copy the public key into the authorized_keys file of the Puppet user on each Puppet master, using the ssh-copy-id command. We start by generating the key on the load balancer, as shown here:
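A minimal sketch; the key pair is written to puppet_rsync and puppet_rsync.pub in the current directory, with no passphrase:

    ssh-keygen -t rsa -b 4096 -f puppet_rsync -N ''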
This creates puppet_rsync.pub and puppet_rsync. Now, on the Puppet masters, configure the Puppet user on those machines to allow access using this key using the following commands:
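One way to do this, run as root on each master (if the puppet user has a password set, ssh-copy-id from the lb machine achieves the same result):

    usermod -s /bin/bash puppet
    mkdir -p ~puppet/.ssh
    # append puppet_rsync.pub, copied over from the lb machine
    cat puppet_rsync.pub >> ~puppet/.ssh/authorized_keys
    chmod 700 ~puppet/.ssh
    chmod 600 ~puppet/.ssh/authorized_keys
    chown -R puppet:puppet ~puppet/.ssh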
The changes made here allow us to access the Puppet master server from the load balancer machine, using the SSH key. We can now use rsync to copy our code from the load balancer machine to each of the Puppet masters, as shown here:
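For example, from the load balancer (note the trailing slash on the source path; see the note below):

    rsync -e 'ssh -i puppet_rsync' -az /etc/puppetlabs/code/ puppet@puppetmaster1:/etc/puppetlabs/code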
Note
Creating SSH keys and using rsync
The trailing slash in the first part /etc/puppetlabs/code/ and the absence of the slash in the second part puppet@puppetmaster1:/etc/puppetlabs/code is by design. In this manner, we get the contents of /etc/puppetlabs/code on the load balancer placed into /etc/puppetlabs/code on the Puppet master.
Using rsync is not a good enterprise solution. The concept of using the SSH keys and transferring the files as the Puppet user is useful. In Chapter 2, Organizing Your Nodes and Data, we will use this same concept when triggering code updates via Git.
A second option to keep the code consistent is to use NFS. If you already have a NAS appliance, then using the NAS to share the Puppet code may be the simplest solution. If not, using the Puppet master as an NFS server is another option; however, this makes your Puppet master a single point of failure. NFS is not the best solution for this sort of problem.
Using a clustered filesystem, such as gfs2 or glusterfs, is a good way to maintain consistency between nodes. It also removes the single point of failure problem seen with NFS: a cluster of three machines makes it far less likely that the failure of a single machine will render the Puppet code unavailable.
The third option is to have your version control system keep the files in sync with a post-commit hook or scripts that call Git directly such as r10k or puppet-sync. We will cover how to configure Git to do some housekeeping for us in Chapter 2, Organizing Your Nodes and Data. Using Git to distribute the code is a popular solution, since it only updates the code when a commit is made. This is the continuous delivery model. If your organization would rather push code at certain points (not automatically), then I would suggest using the scripts mentioned earlier on a routine basis.