I should note that the author of the manifest is a junior member of staff who is learning Puppet, so this is not a critique of their work by any means. I wanted to take the time to explain to them, and to the rest of the team, how various components should be handled.
First of all, let's start with what this person did right:
- They fully automated the installation of this agent using Puppet
- They deployed the binaries and scripts using a package
Based on that alone, this person is already heading in the right direction and is leagues ahead of a large percentage of system administrators. If I asked "How would you install X on Y systems?" in an interview and that was their answer, they would be hired based purely on their potential. Furthermore, given that the manual way to install X was to install the package and run the configuration script, it is understandable why the person who wrote the manifest would do it that way.
The issue is that using scripts to modify configuration files or settings causes a number of problems:
- Once you "shell out" of Puppet to perform an action or create configuration, Puppet has no way of knowing what was done.
- RPM has the same issue, especially if you are using the script to modify the state of files delivered by the package. RPM does have the %config directive to help with this problem, but ultimately, once you use a script to change the contents of a file from its original state, rpm --verify will start reporting errors.
- Whatever the script is doing should be handled explicitly within Puppet. In the example, a configuration file was modified to specify that a particular user should be used to execute the agent. This should have been delivered as an ERB template with the user as a variable. Other examples would be enabling a service, etc.
- Scripts, whether external to the package or built in as preinstall or postinstall scripts, are rarely idempotent. Unless they are coded specifically to detect the state of what they are modifying, running them more than once will generally produce different results. An example of this would be adding a line to xinetd.conf for a service: to make the script idempotent, you would first need to check whether the line exists, and only add it if it does not.
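As a rough illustration of the xinetd.conf case, the "check first, then add" logic can be left to Puppet rather than hand-coded in a script. The sketch below assumes the puppetlabs-stdlib module is available (it provides the file_line resource type); the resource title and the line being managed are made up for the example and are not from the original manifest.

```puppet
# Assumption: puppetlabs-stdlib is installed, which provides file_line.
# The title and the line content are illustrative only.
file_line { 'xinetd-includedir':
  ensure => present,
  path   => '/etc/xinetd.conf',
  # Added only if this exact line is not already in the file,
  # so repeated Puppet runs leave the file unchanged.
  line   => 'includedir /etc/xinetd.d',
}
```

If shelling out really cannot be avoided, the exec resource's unless or onlyif parameters can play the same guarding role, although Puppet still will not know what the command actually changed.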
General rules
Based on the above, what general rules can be stated about packages and configuration?
Note: These general rules assume that you have a configuration management tool.
- All binaries and scripts should be packaged. If it won't change once you install it on the system, then it belongs in a package.
- All configuration dependencies should be expressed in Puppet (users, services, etc.).
- All configuration files should be delivered as ERB templates as part of the Puppet module, and any modifications should be made either by including Puppet variables or by using system facts for any host-specific configuration. The corollary to this is that configuration files should not be packaged.
- Packages should contain no pre-install or post-install scripts. Any configuration requirements should be expressed within Puppet.
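Put together, the rules above might look something like the following class. This is only a sketch under stated assumptions: the module name myagent, the agent_user parameter, the file paths and the template name are all hypothetical, and it assumes the package itself creates /etc/myagent and ships no install scripts.

```puppet
# Hypothetical module: modules/myagent/manifests/init.pp
class myagent (
  $agent_user = 'myagent',
) {
  # Binaries and scripts come from the package and never change after install.
  package { 'myagent':
    ensure => installed,
  }

  # The account the agent runs as is declared here,
  # not created by a pre-install script.
  user { $agent_user:
    ensure => present,
    system => true,
  }

  # Rendered from modules/myagent/templates/agent.conf.erb (hypothetical),
  # which could contain a line such as:  run_as = <%= @agent_user %>
  # Host-specific values could come from facts, e.g. <%= @fqdn %>.
  file { '/etc/myagent/agent.conf':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => template('myagent/agent.conf.erb'),
    require => [Package['myagent'], User[$agent_user]],
    notify  => Service['myagent'],
  }

  # Enabling and starting the service is also Puppet's job.
  service { 'myagent':
    ensure => running,
    enable => true,
  }
}
```

With this layout, changing the user the agent runs as is a one-parameter change in Puppet, and the package never needs to be rebuilt for a configuration change.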
Benefits of this approach
- If your packages only contain binaries and scripts that do not change, then package verification checks such as Red Hat's rpm --verify or Solaris' pkgchk(1M) come back with no errors, because the expected contents of the package match what is actually on the system.
- There are no conflicts between your package management system and your configuration management system. If package scripts modify files delivered by packages, or packages modify files controlled by Puppet, then a discrepancy will arise. Because package scripts run only once, at installation time, Puppet will simply overwrite the configuration with what it expects on its next run.
- It is very clear what requirements a particular piece of software has. For example, if a piece of software needs a user defined, this should be clear from the manifest. Should the user need to change, this can be easily handled within Puppet.
Other considerations
One problem with this approach is that information about a particular component lives in two different areas: the packages and the Puppet code. It is important that both are kept under source control, preferably within the same repository, because there is a dependency between them. These items should also cross-reference each other in the source repository.
The other downside arises if you have to manage both systems that are under Puppet control and systems that are not. In that case, I think you need to allow configuration files to be delivered in packages and modified by scripts so that your legacy systems continue to work. For your newer systems, you should also express that configuration within Puppet and live with the expectation that Puppet will redo the work of the package.
Closing Thoughts
This was a general introduction to how I believe configuration files should be handled. What I would really like to see is a series of patterns that can be used to explain these sorts of concepts. The Limoncelli, Hogan and Chalup book The Practice of System and Network Administration provides a good overview, but it could do with an update, as the second edition is 5 years old as I write this.