An adventure in agile administration: Postinstall scripts are not configuration management

This post is a reaction to reviewing a Puppet manifest at ${WORK} that installs a package and then has an exec resource run a script that modifies a configuration file for the package. Something in me screams that this approach is wrong, but I was looking for a way to explain why. Rather than rant and rave at length, which I am known to do on occasion, I wanted to give a succinct set of rules the team could follow. Fortunately, people aren't robots, so they usually want to understand the reasons behind those rules. This post is my way of fleshing out those reasons.

It should be noted that the author of said manifest is a junior member of staff learning Puppet and so this is not a critique of their work by any means. I wanted to take the time to explain to them and provide guidance to the rest of the team about how various components should be handled.

First of all, let's start with what this person did right:

They fully automated the installation of this agent using Puppet
They deployed the binaries and scripts using a package

Based on that alone, this person is already heading in the right direction and is leagues ahead of a large percentage of system administrators. If I asked "How would you install X on Y systems?" in an interview and that was their answer, they would be hired based purely on their potential. Furthermore, given that the manual way to install X was to install the package and run the configuration script, it is understandable why the person who wrote the manifest would do it that way.

The issue is that using scripts to modify configuration files or settings causes a number of problems:

Once you "shell out" of Puppet to perform an action or create configuration, Puppet has no way of knowing what was done.
RPM has the same issue, especially if you are using the script to modify files delivered by the package. RPM does have the %config directive to help with this problem, but ultimately, once a script changes the contents of a file from its packaged state, RPM verification (rpm -V) will start reporting discrepancies.
Whatever the script is doing should be handled explicitly within Puppet. In the example, a configuration file was modified to specify that a particular user should be used to execute the agent. This should have been delivered as an ERB template with the user as a variable. Other examples would be enabling a service, and so on.
Scripts, whether external to the package or embedded as preinstall or postinstall scripts, are rarely idempotent. Unless they are coded specifically to detect the state of what they are modifying, running them more than once will generally produce a different result each time. An example of this would be adding a line to xinetd.conf for a service: to make the script idempotent, you would first need to check whether the line exists, and only add it if it does not.
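To make that last point concrete, here is a minimal sketch of what the check-before-append guard looks like when it is written as a Puppet exec. The line being added and the resource title are hypothetical, invented for illustration; they are not from the manifest I reviewed.

```puppet
# Hypothetical example: the line being added is purely illustrative.
$xinetd_line = 'includedir /etc/exampleagent/xinetd.d'

exec { 'add-exampleagent-line-to-xinetd':
  # Append the line to the end of the file...
  command  => "echo '${xinetd_line}' >> /etc/xinetd.conf",
  # ...but only when an identical line is not already present.
  unless   => "grep -qxF -- '${xinetd_line}' /etc/xinetd.conf",
  path     => ['/bin', '/usr/bin'],
  # The shell provider is used so that the >> redirection is interpreted.
  provider => shell,
}
```

Without the unless guard, every Puppet run would append another copy of the line. Even with it, Puppet only knows that a command did or did not run; it still has no model of what the rest of the file should contain.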

General rules

Based on the above, what general rules can be stated about packages and configuration?

Note: These general rules assume that you have a configuration management tool

All binaries and scripts should be packaged. If it won't change once you install it on the system, then it belongs in a package.
All configuration dependencies should be expressed in Puppet (users, services, etc.).
All configuration files should be delivered as ERB templates as part of the Puppet module, and any host-specific configuration should come from Puppet variables or system facts. The corollary to this is that configuration files should not be packaged.
Packages should contain no preinstall or postinstall scripts. Any configuration requirements should be expressed within Puppet.
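Put together, the rules above might look something like the following sketch. Everything named here (the exampleagent class, package, service, user, file path and template) is hypothetical and stands in for whatever your software actually needs. It also assumes the package creates the /etc/exampleagent directory.

```puppet
# Hypothetical module: every name here (exampleagent, paths, user) is illustrative.
class exampleagent (
  $agent_user = 'exampleagent'
) {
  # Configuration dependency: the account the agent runs as.
  user { $agent_user:
    ensure => present,
    system => true,
  }

  # Binaries and scripts come from the package, with no install-time scripts.
  package { 'exampleagent':
    ensure => installed,
  }

  # The configuration file is rendered from an ERB template in the module,
  # so Puppet knows its expected content on every run.
  # templates/agent.conf.erb would contain a line such as:
  #   run_as_user = <%= @agent_user %>
  file { '/etc/exampleagent/agent.conf':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => template('exampleagent/agent.conf.erb'),
    require => [Package['exampleagent'], User[$agent_user]],
    notify  => Service['exampleagent'],
  }

  # Enabling and starting the service is declared, not scripted.
  service { 'exampleagent':
    ensure => running,
    enable => true,
  }
}
```

Changing the user then becomes a matter of passing a different agent_user value to the class, and the notify relationship restarts the service when the rendered file changes.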

This thinking lines up with how the IPS packaging system works in Solaris 11, as described in Stephen Hahn's paper "pkg(5): a no scripting zone". While the logic of IPS is sound, there is an implicit reliance on another tool (Puppet, Chef, etc.) to handle the configuration. There is a hack to use SMF to launch a script to configure the package, but that just seems out of place and awkward. The flaw in the logic that Sun had and Oracle inherited is that Solaris had no native configuration management tool to handle this, and still does not.

Benefits of this approach

If your packages only contain binaries and scripts that do not change, then package verification checks such as Red Hat's RPM verify or Solaris' pkgchk(1m) come back with no errors, because the expected contents of the package match what is actually on the system.
There are no conflicts between your package management system and your configuration management system. If package scripts modify files delivered by packages, or packages modify files controlled by Puppet, then a discrepancy will arise. Because package scripts are one-shot and run only at installation time, Puppet will simply overwrite the configuration with what it expects on its next run.
It is very clear what requirements a particular piece of software has. For example, if a piece of software needs a user defined, this should be clear from the manifest. Should the user need to change, this can be easily handled within Puppet.

Other considerations

One problem with this approach is that you have information about a particular component spread across two different areas: the packages and the Puppet code. It is important that both are kept under source control, preferably within the same repository, because there is a dependency between them. These items should also cross-reference each other in the source repository.

The other downside arises if you have to manage both systems that are under Puppet control and systems that are not. In that case, I think you need to allow configuration files to be delivered in packages and modified by scripts so that your legacy systems will continue to work. For your newer systems, you should also express that configuration within Puppet and live with the expectation that Puppet will redo the work of the package.

Closing Thoughts

This was a general introduction to how I believe configuration files should be handled. What I would really like to see is a series of patterns that can be used to explain these sorts of concepts. The Limoncelli, Hogan and Chalup book The Practice of System and Network Administration provides a good overview, but could be updated, as the second edition is 5 years old as I write this.

Saturday, July 14, 2012
