TSM - Puppet-based automation

Claudiu Demian - Systems Administrator

Automation is a very important component of IT, whether it is used in software development (continuous integration, for example) or in the administration of systems and infrastructure. In big, dynamic environments, some form of automation is one of the most basic needs for managing resources efficiently.

Puppet is a configuration management system that allows system administrators to describe the desired state of the IT infrastructure. Any change that has to be performed is expressed as a change in the Puppet configuration of the respective resource (file, package, node, group of nodes etc.) and is then applied automatically to all the servers (nodes) that the change concerns.

This state is described in the Puppet language, which is declarative. The general configuration is found in the file /etc/puppet/manifests/site.pp, which refers to the modules, classes and resources defined under /etc/puppet/modules.

Puppet works in a client-server architecture as well as in a stand-alone scenario. In the first case, the Puppet server is called the puppetmaster. The configuration of the infrastructure is defined on the puppetmaster machine and is retrieved by the clients at regular intervals (30 minutes by default), which the administrator can adjust through the runinterval setting.

The following lines show an example of how this language can be used:

/etc/puppet/modules/lighttpd/manifests/init.pp:

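class lighttpd {
  # install the web server package
  package { 'lighttpd':
    ensure => installed,
  }

  # generate the configuration file from an ERB template, after the
  # package is installed; restart the service whenever the file changes
  file { '/etc/lighttpd/lighttpd.conf':
    content => template('lighttpd/lighttpd.conf.erb'),
    require => Package['lighttpd'],
    notify  => Service['lighttpd'],
  }

  # keep the service running and enabled at boot
  service { 'lighttpd':
    ensure => running,
    enable => true,
  }
}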

This class describes the state of an installation of the lighttpd web server, one necessary component at a time. From it we can identify the first concepts that Puppet works with (modules, classes and resources), which are probably also the most used.

A module is a set of classes, definitions, templates and files that, taken together, serve a single goal. This brings us to the first recommendation for writing a module: it should provide only one piece of functionality. For example, a LAMP server could be managed by a single Puppet module that installs Apache, the MySQL server, PHP and all the other related Linux services (authentication, NTP etc.). The problem is the size of such a module: it can become too big and therefore hard to manage, and it also loses portability. A more elegant solution is to divide the configuration into four separate modules, thereby reducing their complexity.
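The four-way split could look roughly as follows when the modules are put together for a node. This is only a sketch: the module names base, apache, mysql and php, as well as the node name pattern, are hypothetical and were chosen for illustration.

```puppet
# Sketch only: the four module names below are illustrative assumptions,
# each one standing for a small module with a single responsibility.
node /lamp\d+/ {
  include base    # common Linux services (authentication, NTP etc.)
  include apache  # web server
  include mysql   # database server
  include php     # PHP runtime
}
```

Each of these modules stays small and can be reused on its own, for instance on a node that only needs the database server.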

A class is a named block of code that can be instantiated. The main class of a module is defined in that module's manifests/init.pp file. To instantiate a class for a node, so that the configuration it defines is applied, we use the include function.

/etc/puppet/manifests/site.pp:

node /web\d+/ {
  include lighttpd
}

Classes support inheritance and can be included any number of times in the general Puppet configuration found in site.pp; for a given node, each class is still applied only once. Classes can also take parameters.
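As an illustration of class parameters, here is a minimal sketch of a parameterized variant of the lighttpd class. The $port parameter and its default value are assumptions made for this example; they are not part of the class shown earlier.

```puppet
# Sketch: a parameterized variant of the lighttpd class.
# The $port parameter and its default are hypothetical.
class lighttpd ($port = 80) {
  package { 'lighttpd':
    ensure => installed,
  }

  file { '/etc/lighttpd/lighttpd.conf':
    # inside the ERB template the parameter is available as @port
    content => template('lighttpd/lighttpd.conf.erb'),
    require => Package['lighttpd'],
    notify  => Service['lighttpd'],
  }

  service { 'lighttpd':
    ensure => running,
    enable => true,
  }
}

# In site.pp: declare the class with an explicit parameter value.
node /web\d+/ {
  class { 'lighttpd':
    port => 8080,
  }
}
```

Unlike include, this resource-like declaration may appear only once per node, because the parameter values have to be unambiguous.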

Within a class, we define the desired state of the system using predefined or administrator-defined resources. In the lighttpd class above, we can identify the following resources: the lighttpd package, the /etc/lighttpd/lighttpd.conf file and the lighttpd service. Each resource has a type (package, file and service, respectively), a name and one or more attributes (ensure, content, notify etc.). Puppet ships with a good number of predefined resource types, which are well covered in the official documentation.
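Administrator-defined resources are written with the define keyword. The sketch below assumes a hypothetical lighttpd::vhost type, stored in manifests/vhost.pp of the lighttpd module; the $docroot parameter and the conf.d path are also assumptions, and the lighttpd class is assumed to be included on the same node so that the service exists.

```puppet
# Sketch of an administrator-defined resource type (a "defined type").
# lighttpd::vhost, $docroot and the conf.d path are hypothetical.
define lighttpd::vhost ($docroot) {
  file { "/etc/lighttpd/conf.d/${name}.conf":
    ensure  => file,
    # $name holds the title given when the resource is declared
    content => "\$HTTP[\"host\"] == \"${name}\" {\n  server.document-root = \"${docroot}\"\n}\n",
    notify  => Service['lighttpd'],
  }
}

# A defined type can be declared many times, once per title.
lighttpd::vhost { 'www.example.com':
  docroot => '/var/www/example',
}
```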

Another very useful feature of Puppet is templating. Using the ERB (Embedded Ruby) language, Puppet can generate files according to the parameters given by the administrator when instantiating the class, or according to the state of the system, using facts. Along with Puppet, a utility called facter is also installed. This utility gathers information about the system and exposes it as facts (for example: ipaddress, fqdn, operatingsystem etc.). The set of facts can be extended with facts defined by the administrator, which, in turn, can be used in templates.
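To show how facts reach a template, here is a minimal sketch that uses the inline_template function, so that the ERB code sits in the manifest itself. The target file /etc/motd is an assumption chosen for illustration, and the fact names are the ones mentioned above; newer Puppet releases may expose them under different names.

```puppet
# Sketch: generating a file from facts gathered by facter.
# The path /etc/motd is an illustrative assumption.
file { '/etc/motd':
  ensure  => file,
  # inline_template evaluates an ERB string; facts are visible
  # inside it as instance variables such as @fqdn and @ipaddress
  content => inline_template("Host <%= @fqdn %> (<%= @ipaddress %>) runs <%= @operatingsystem %>\n"),
}
```

The template function used in the lighttpd class works the same way, except that the ERB code is read from a file in the module's templates directory.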

Because modules are independent and reusable, Puppet offers the Puppet Forge service, through which users can publish their own modules for free and download modules written by others.

Besides its main function of describing the state of the system, Puppet can be extended to gather data about the infrastructure through the puppetdb service. This service is backed by a database in which the facts of all the systems in the infrastructure are aggregated. This information can later be used within classes to generate resources dynamically.

One example of using puppetdb is a module that manages a Nagios instance. Using information about the systems stored in puppetdb (hostname, IP address, the hostgroups a host belongs to), it automatically generates a configuration file for each new host introduced in the infrastructure. The administrator's only remaining duty is to define commands, checks and hostgroups.
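One common way to build such a module is the exported resources pattern, which relies on puppetdb. The sketch below assumes this pattern; it is not necessarily how the module mentioned above works. The class names, the generic-host template, the linux-servers hostgroup, the nagios3 service name and the target paths are all hypothetical, and depending on the Puppet version the nagios_host type may have to be installed separately.

```puppet
# Sketch of the exported resources pattern (requires puppetdb).
# All names and paths below are illustrative assumptions.

# Applied on every monitored node: export a description of this host.
class nagios::target {
  @@nagios_host { $::fqdn:
    ensure     => present,
    address    => $::ipaddress,
    use        => 'generic-host',
    hostgroups => 'linux-servers',
    target     => "/etc/nagios3/conf.d/host_${::fqdn}.cfg",
  }
}

# Applied on the Nagios server: collect what all the nodes exported
# and reload the monitoring service when a host file changes.
class nagios::server {
  service { 'nagios3':
    ensure => running,
  }

  Nagios_host <<| |>> {
    notify => Service['nagios3'],
  }
}
```

With this arrangement, a new node only has to include the first class in order to show up in Nagios after the next runs of the node and of the Nagios server.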

Another way to extend Puppet is the puppet-dashboard. This service offers a web interface through which we can access more information and statistics about the infrastructure. Each client sends a report about its last run to the server on which puppet-dashboard runs: whether it was successful, whether there were any changes and whether something failed. We can view these reports in the application, as well as the statistics generated from them.

Puppet-dashboard also offers another service, called Inventory Service, through which we can query the state of the systems, again based on facts. While puppetdb makes this information available inside classes, Inventory Service gives access to it from the outside. Using its API, we can build tools and applications that consume this information.

This article is meant as an introduction to Puppet for those who have not yet added it to their environment. We addressed only this audience because, for those who have already adopted it, it is hard to imagine that they did not immediately fall in love with Puppet's power, flexibility and capacity to ease the system administrator's work. We do not claim that Puppet is a solution to every problem or that it is the best configuration management system (there are others, such as Chef and CFEngine, as well as many commercial systems), but we do claim that any system administrator should use such a system in the infrastructure that he or she manages.