Sunday, January 10, 2010

Reload That Config File

It is common for applications to load a configuration file on startup to control various options. Some applications can also reload their configuration file while running, allowing you to modify the application configuration without having to restart the application.

Goals

Configuration files (or config files) are useful because they let us change the behavior of an application with a mechanism that is much simpler and faster than modifying the application source and recompiling it. Being able to reload the configuration of a running application allows us to take that concept a bit further, as generally we can make reloading the configuration operationally simpler and faster than shutting down and restarting the application.

When reloading the configuration, we have the following goals:
  • Reloading a configuration should be a simple operation for the operator to trigger.
  • It should not be possible to load an invalid configuration. If the operator tries to do so, the application should continue running with the old configuration.
  • When reloading a configuration, the application should smoothly switch from the old configuration to the new configuration, ensuring that it is always operating with a consistent configuration. More precisely, an operational sequence that requires a consistent set of configuration parameters for the entire sequence should complete its sequence with the same set of configuration parameters as were active when the sequence started.
  • The application should provide feedback so that the operator knows what the application is doing. Logging, notification or statistics about configuration reloads should be available.

Dependency Injection

Dependency injection (DI) is a form of structural configuration in which different suppliers of a service are wired into an application based on the contents of a configuration file. In the typical case, these configurations are unlikely to change once an application has started. Although in principle it is possible to reload a DI configuration, and thus all of the discussion below could apply, in practice you might want to separate out the kind of relatively static structural configuration that is typically done with DI from the more dynamic parametric configuration that you might want to change while the application is running, and use different mechanisms to implement those two sets of configurations.

Alternatively, you can selectively disallow (as part of your validation step) configuration changes that are too much work to implement, requiring the user who wants to make such changes to restart the application.

Config Contents

If you think of a config file as being a set of late-binding commands for controlling the behavior of an application program, it should be clear that the most flexible config file is one that is itself a program. Applications that already have a built-in interpreter, such as emacs and applications written in Lisp, often simply feed their config files to their interpreter, giving them the full power of a Turing-complete language in which to express site-specific program behavior.

If you have an interpreter available, this can be a reasonable option: it is simple to implement, takes little work to document (assuming you already have to provide documentation for the interpreted language anyway), and provides a great deal of flexibility. One potential downside is that you might not want all of the power of the language to be available in a config file; in particular, if you are using the language internally, your program may have made available certain functions that you don't want a user to call from a config file. If your application already has a security framework built into it, restricting access to those functions may be easy enough, or you may not be concerned about it. In any case, you should at least be aware of this potential pitfall if you choose to use a language as your config file syntax.

At the other end of the spectrum, you could choose a standard name=value format, such as a Windows INI file or a Java Properties file. If you have a relatively simple application with just a few config parameters to set, this is probably a reasonable option.

You can treat all of your config data as strings and let the application deal with each individually, or you can define a set of datatypes that can be uniformly represented in a config file. This might include lists of data or other compound types.

One of the typical capabilities implemented in config systems is the ability to group config parameters into logical groupings. The standard Windows INI file does this with its [section] prefixes. You can simulate this in a Properties file by selecting a character to be a name separator (typically a period), then using that separator character to define names for your parameters that indicate their grouping. This can easily be extended to multiple levels to allow a hierarchy of grouped parameters.
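As a minimal sketch of the name-separator technique, the helper below pulls one logical group out of a flat Properties object. The class name PropertyGroups and parameter names such as db.host are my own illustrative assumptions, not part of the Properties API.

```java
import java.util.Properties;

// Hypothetical helper (not part of java.util): extracts one logical group
// from a flat Properties object that uses "." as the name separator.
final class PropertyGroups {
    private PropertyGroups() {}

    /** Returns the entries whose names start with "group.", with that prefix removed. */
    static Properties group(Properties all, String group) {
        String prefix = group + ".";
        Properties result = new Properties();
        for (String name : all.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                // Strip the prefix so nested groups can be extracted the same way.
                result.setProperty(name.substring(prefix.length()), all.getProperty(name));
            }
        }
        return result;
    }
}
```

Because the prefix is stripped, calling group again on the result walks one level further down the hierarchy: with db.pool.size=5 in the file, group(group(props, "db"), "pool") contains size=5.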

Once you have groups of config parameters, you might want to implement some kind of inheritance mechanism, whereby you can declare a set of names and values in group A, then declare that group B has the same item values as group A, possibly with some specified exceptions. Or perhaps you would like to be able to set the value of a parameter to be the same as the value of some other parameter, or some combination or transformation of other parameters.
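One way to get a simple form of this inheritance in Java, sketched here under the assumption that each group is held in its own Properties object, is to use the defaults mechanism that Properties already provides. The class name GroupInheritance is hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch: group inheritance built on the Properties defaults mechanism.
final class GroupInheritance {
    private GroupInheritance() {}

    /** Creates a group that inherits every value from parent until it is overridden. */
    static Properties childOf(Properties parent) {
        // The constructor argument supplies defaults; getProperty falls back
        // to them whenever a name has not been set on the child itself.
        return new Properties(parent);
    }
}
```

If group A sets timeout=30 and retries=3, then a group B created with childOf(groupA) and given timeout=60 reports 60 for timeout and 3 for retries, while group A is unchanged. Note that only getProperty consults the defaults; the get method inherited from Hashtable does not.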

You can continue to add more capabilities to your config file, but once you start getting too complex, you probably want to adopt an existing language syntax to avoid creating something that is complicated to implement and maintain, tedious to document, and difficult to learn and use.

If you do use a language for your config file, you may need to modify your approach in order to be able to implement all of the steps given below. In particular, you should not directly modify your operational objects from the config file, as this violates the separation of config data from the application and makes it more difficult to validate the entire config before activating it. One solution is to make your config file code only set data into the new Config objects that are being created for the reload process. Other solutions are possible, such as setting up a mock execution environment in which code can be validated before being applied, but a detailed discussion of such techniques is outside the scope of this post.

When choosing a format, you might consider whether you plan on maintaining config files through a program (either the application being configured or a separate config maintenance application), or if editing config files with a text editor is sufficient. Some applications maintain their config files in XML format for this reason, as there are many packages that can easily read and write XML files, as well as do basic syntax checking outside of the application being configured. Properties files can also be easily written, but there are many other formats that could be used. This can get tricky if you are trying to use an application to maintain config files when you are using a general purpose language for those files.

No matter what format you settle on for your config files, the same concerns discussed below apply regarding reloading the config.

Config Objects

In the approach described here we store in-memory configuration information in special Config objects that are separate from the operational objects that they configure. Defining separate Config objects gives us these benefits:
  • It allows us to represent multiple configurations simultaneously. In particular, it allows us to load and operate on a configuration that is separate from the currently active configuration.
  • It provides a convenient location to collect the methods that manipulate or otherwise access the configuration parameters.
There should be a set of Config objects that correspond to the different operational objects that can be configured. Each different class of operational object to be configured should have a different custom class of Config object associated with it. An operational class with multiple instances should have a separate instance of its Config class associated with each operational instance.

The various Config objects should be related to each other in the same way as the operational objects are related to each other; for example, if operational object A can have multiple children of type B, then ConfigA should be able to have multiple children of type ConfigB. There should be a single Config object which serves as the root Config object from which all other Config objects can be reached.

If the application is written such that there is a single application-wide active configuration, then the application should have a singleton which is the active root Config object. In the discussion below, I assume that such a singleton exists; if your application has multiple contexts, each with a different set of config info, you should interpret the word "singleton" to refer to the single active root config for the context whose config is being updated.

All of the Config classes can inherit from a standard base Config class that provides implementations of common useful methods such as type-safe calls to get integer and date parameters.
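A base class along these lines might look like the following sketch. The names BaseConfig and ServerConfig, the parameter names, and the choice of ISO-8601 dates are all assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Properties;

// Hypothetical base class; method and parameter names are illustrative.
abstract class BaseConfig {
    private final Properties props;

    protected BaseConfig(Properties props) {
        this.props = props;
    }

    protected String getString(String name) {
        String value = props.getProperty(name);
        if (value == null) {
            throw new IllegalArgumentException("Missing config parameter: " + name);
        }
        return value.trim();
    }

    protected int getInt(String name) {
        try {
            return Integer.parseInt(getString(name));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not an integer: " + name, e);
        }
    }

    /** Assumes dates are written in ISO-8601 form, e.g. 2010-01-10. */
    protected LocalDate getDate(String name) {
        try {
            return LocalDate.parse(getString(name));
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Not a date: " + name, e);
        }
    }
}

// Hypothetical Config class for one operational class.
final class ServerConfig extends BaseConfig {
    ServerConfig(Properties props) {
        super(props);
    }

    int port() {
        return getInt("server.port");
    }

    LocalDate licenseExpiry() {
        return getDate("license.expires");
    }
}
```

Callers of ServerConfig never see a parameter name or a String value, so a mistyped name or a wrong type shows up as a compile error rather than at run time.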

Seven Steps

There are seven steps involved in loading or reloading configuration data: Trigger, Locate, Load, Validate, Activate, Report, and Use. Each of these steps can be considered independently of the others. Each step has its own design decisions and implementation choices. In the approach we are using, the Config objects mentioned above are the common data shared by all but the first two steps.

Trigger

If your application is going to reload its configuration information, it needs to know when to do that. There are a number of options:
  • Your app can check for changes on a regular interval and reload if the source has changed. This is a typical approach used with logging configuration files such as for log4j, in which you can specify automatic reloading with a call to the static configureAndWatch method of DOMConfigurator or PropertyConfigurator.
  • If your app has a command line interface (CLI), you can add a command that reloads the config info.
  • If your app has a web interface, you can add a web page that controls config reloads. This can be a full web page with a form and feedback, or a simple URL that triggers a reload.
  • On a Unix system, a standalone app such as a daemon can be written such that a reload is triggered on receipt of a signal. You can then use the kill command to send the process that signal. SIGHUP (signal 1) is often used by Unix daemon programs for this purpose, some examples being acpid, dnsmasq, postgres, smartd, smbd, winbindd, and ypbind.
  • For a Java app, you can enable JMX and use that to send commands to your application with a JMX console app such as jconsole or MC4J. JBoss uses this technique, allowing you to reload its log4j config using the JBoss jmx-console.
  • For many apps, you can pretty easily add a web interface, such as by using Jetty for Java apps, for the purpose of allowing control and status feedback.
You may want to limit how often a reload can be triggered to prevent a denial-of-service (DoS) attack (or the same effect caused by a bug in whatever is producing the trigger).
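For the first option in the list above (checking for changes on a regular interval), a minimal sketch of the check itself could compare the file's modification time against the last value seen. The class name FileChangeTrigger is hypothetical, and I am assuming that the modification time is a good enough indicator of change for your config source.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical polling trigger: reports when the file's modification time changes.
final class FileChangeTrigger {
    private final Path file;
    private FileTime lastSeen;

    FileChangeTrigger(Path file) {
        this.file = file;
        this.lastSeen = readTime();   // remember the state at startup
    }

    /** Returns true once for each observed change in modification time. */
    synchronized boolean changed() {
        FileTime current = readTime();
        if (current.equals(lastSeen)) {
            return false;
        }
        lastSeen = current;
        return true;
    }

    private FileTime readTime() {
        try {
            return Files.getLastModifiedTime(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A timer, for example one created with ScheduledExecutorService.scheduleWithFixedDelay, can call changed() and start a reload when it returns true. The polling interval then also serves as an upper bound on how often a reload can be triggered.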

Locate

Once the app has been triggered to reload the config info, it needs to locate that info. Some options:
  • Assume the data is in the same location as before and reopen that location, such as is often done for a log4j config file.
  • Provide the location of the data along with the trigger. This is easy to do if you have a CLI, web form, or web URL, not so easy if you are using a timer or a Unix signal.
Some applications (such as one that uses the standard Props class in Lift, including the way Lift handles its log4j configuration) have more sophisticated file lookup mechanisms that allow configuration information to be split among multiple files or segregated according to the runtime environment to be used. If you are using a package that looks for one or more out of a set of possible files, and you want to be able to add or remove a config file and then reload, you should check to make sure the package is able to reload files and that it will rescan its set of possible files and not just assume that the same config files should be used as when they were first loaded.

Load

Once the data has been located, it needs to be loaded into memory where it can be manipulated. Note that you should load the data into a new Config object or set of objects so that you can do the validation checks on it before activating it.

You should not have to write the code that actually loads the data, as there are a number of usable options available. As an example, you can store your config data in the standard Java Properties format, then load that data using Properties.load. After reading the data into a Properties object, you can create your custom Config objects from the data in the Properties object.
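The load step with Properties might be sketched as follows. MailConfig and the parameter names mail.host and mail.port are hypothetical; the point is that loading always produces a brand-new Config object and never touches the active one.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical Config class; the parameter names mail.host and mail.port are illustrative.
final class MailConfig {
    final String host;
    final int port;

    private MailConfig(Properties props) {
        this.host = props.getProperty("mail.host", "localhost");
        // parseInt rejects malformed numbers, so part of the syntax check happens here.
        this.port = Integer.parseInt(props.getProperty("mail.port", "25").trim());
    }

    /** Loads into a new object; the active configuration is not touched. */
    static MailConfig load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return new MailConfig(props);
    }

    /** Convenience overload that opens the file as UTF-8 and closes it afterwards. */
    static MailConfig load(Path file) throws IOException {
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return load(in);
        }
    }
}
```

Properties.load accepts either a Reader or an InputStream; the InputStream variant assumes ISO 8859-1 encoding, which is why this sketch opens the file with an explicit charset.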

Validate

Once the config data is loaded into your Config objects, you are ready to validate the new configuration. You should make the following checks:
  1. Ensure that the syntax of all configuration values is correct. Depending on how you loaded the data and converted it to your Config objects, some of these checks may already have been done. If there are any values which have not yet been checked for correct syntax, those values should be checked now.
  2. Perform semantic checks on individual parameters. This includes things such as checking that numbers are within allowable ranges, or that each selection parameter has a value that is one of the allowable selections for that parameter.
  3. Perform validity checks on multiple parameters. This includes situations in which you have two or more parameters that are related and which thus must have values consistent with each other.
  4. Compare the new set of Config objects against the current set to ensure that all proposed changes are allowed. You may decide that some changes are too much work to bother to implement; you can disallow those changes in this step.
With a Config class that corresponds to each configurable operational class, we can put the validation code directly in those classes rather than in the operational classes.

After completing the above validation steps, and assuming there were no errors, you have done all error checking and know that you will be able to switch to the new config without errors, but you have not yet done so.

Errors in any of these steps should be collected so that they are available for Reporting.
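A minimal sketch of checks 2 through 4, with errors collected rather than thrown, is shown below. PoolConfig and its parameters are hypothetical, and I assume that the syntax checks of step 1 were already done while the object was being built.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Config class for a connection pool; parameter names are illustrative.
final class PoolConfig {
    final String name;
    final int minSize;
    final int maxSize;

    PoolConfig(String name, int minSize, int maxSize) {
        this.name = name;
        this.minSize = minSize;
        this.maxSize = maxSize;
    }

    /**
     * Returns every validation error found; an empty list means this config
     * may be activated. Pass null for current on the initial load.
     */
    List<String> validate(PoolConfig current) {
        List<String> errors = new ArrayList<>();
        // Semantic checks on individual parameters.
        if (minSize < 0) {
            errors.add("pool.min must be >= 0, was " + minSize);
        }
        if (maxSize < 1) {
            errors.add("pool.max must be >= 1, was " + maxSize);
        }
        // Consistency check across related parameters.
        if (minSize > maxSize) {
            errors.add("pool.min (" + minSize + ") exceeds pool.max (" + maxSize + ")");
        }
        // Comparison against the current config: a change we choose not to support.
        if (current != null && !current.name.equals(name)) {
            errors.add("pool.name cannot be changed without a restart");
        }
        return errors;
    }
}
```

Collecting all of the errors in one pass, instead of stopping at the first, lets the operator fix the whole file before trying again.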

Activate

Assuming that the loaded Config objects pass all of your validation tests, it is time to activate the new Config. While conceptually simple, this is the trickiest step.

The key issue here is ensuring that the application works properly in the presence of concurrent access to the config data. You want to make sure that the application cleanly switches from using the old configuration to using the new one, without the possibility that some operations will be performed with part of the old configuration and part of the new one.

There are two basic updates you need to make, which correspond to the two basic approaches to using the data:
  1. Update the active root Config singleton.
  2. Update all operational objects that contain configuration state.
Handling the first approach is pretty easy: inside a synchronized block, update the active root Config singleton. When another thread begins an operational sequence that relies on any config parameters, it reads the current root Config singleton (in a synchronized block) and keeps it in a local variable for the duration of the operational sequence. All queries for config parameters during that sequence are done against the local Config variable, ensuring that the entire sequence uses a single Config even if the Config singleton is updated in the middle of that operational sequence.
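The first approach might be sketched like this. RootConfig, ActiveConfig, and OperationalSequence are hypothetical names, and the midpoint callback is only a stand-in for the work during which a reload could occur.

```java
// Hypothetical immutable root Config; the fields are illustrative.
final class RootConfig {
    final String target;
    final int batchSize;

    RootConfig(String target, int batchSize) {
        this.target = target;
        this.batchSize = batchSize;
    }
}

// Holder for the active root Config singleton.
final class ActiveConfig {
    private static RootConfig active;

    private ActiveConfig() {}

    static synchronized RootConfig get() {
        return active;
    }

    /** Called by the activation thread once validation has passed. */
    static synchronized void activate(RootConfig validated) {
        active = validated;
    }
}

final class OperationalSequence {
    private OperationalSequence() {}

    /** Runs one sequence; midpoint stands in for work during which a reload may happen. */
    static String run(Runnable midpoint) {
        RootConfig config = ActiveConfig.get();   // capture once, at the start
        String target = config.target;
        midpoint.run();
        int batchSize = config.batchSize;         // still read from the captured Config
        return target + ":" + batchSize;
    }
}
```

If the midpoint activates a new RootConfig, the sequence already in progress still finishes with the values it started with, and the next call to run picks up the new ones. This relies on the Config objects being immutable once activated. For a single reference like this, a volatile field or an AtomicReference would be a reasonable alternative to the synchronized methods.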

If you are using the second approach, updates are a bit trickier. It would be simple if the activation thread could just update the state in the operational objects, but another thread may currently be running and using those operational objects in an active operation. You can't just update the state in all of the operational objects from the activation thread because the operational thread might then pick up the new state in the middle of one operational sequence, and we assume that starting an operational sequence with one state and finishing it with another state will cause problems.

The key to handling changes when using this second approach is to build on the solution for the first approach: capturing the value of the active root Config singleton at the start of the operational sequence. That starting point is the point at which we know (by definition) that it is safe to change over to a new config. When we start our operational sequence, we capture the currently active Config into a local variable, as described above. We then check to see if the config has changed since the last time we started the sequence. We do this by comparing our newly captured Config against the Config that we used the previous time we executed our sequence, which means we need a second variable that stores that previous Config. If the newly captured Config is not the same as the previous Config we used, then we reconfigure our operational objects according to the newly captured Config, then save it as the most recently used Config for the next execution.
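Here is a sketch of that capture-and-compare logic. EndpointConfig and Connector are hypothetical, the active Config is reached through a Supplier purely to keep the example self-contained, and the Connector is assumed to be used by a single operational thread.

```java
import java.util.function.Supplier;

// Hypothetical immutable Config for a TCP endpoint.
final class EndpointConfig {
    final String host;
    final int port;

    EndpointConfig(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

// Hypothetical operational object, assumed to be used by one operational thread.
final class Connector {
    private final Supplier<EndpointConfig> activeConfig; // reads the active Config
    private EndpointConfig lastApplied;                  // Config used by the previous sequence
    private String endpoint;                             // operational state derived from the Config
    private int reconfigurations;

    Connector(Supplier<EndpointConfig> activeConfig) {
        this.activeConfig = activeConfig;
    }

    /** Runs one operational sequence and returns the endpoint it used. */
    String runSequence() {
        EndpointConfig captured = activeConfig.get();    // capture at the start
        if (captured != lastApplied) {
            // The Config changed since the last sequence: reconfigure now,
            // which is the one point where it is known to be safe.
            endpoint = captured.host + ":" + captured.port;
            reconfigurations++;
            lastApplied = captured;
        }
        return endpoint;                                 // the rest of the sequence uses this state
    }

    int reconfigurations() {
        return reconfigurations;
    }
}
```

The comparison is by identity, which is enough when every reload creates a new Config object. A real Connector would do more expensive work at the reconfiguration point, such as closing and reopening a connection.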

When using the above solution, if the only time you update the operational state is when a thread starts an operational sequence, and that thread waits for a long time before beginning execution of the sequence, then the switch of the operational state to the new config may not happen for a long time. Despite having validated our new config, it is possible that, due to a bug, the new config will fail when we attempt to apply it to our operational objects, and it is generally better to have that happen immediately when the config is activated rather than much later, when it might not be obvious that the problem is due to the new config. In order to avoid this situation, you should add code to make your threads wake up and apply the new config immediately after it is activated, even if there is no other work for them to do.

If you have multiple independent operational sequences you should separately capture a copy of the active Config at the start of each sequence. However, you need to make sure that each sequence is in fact independent of the others as far as the config parameters that each uses, since when using the above approach you may end up with two threads executing different sequences at the same time with one using the old config and the other using the new config.

If the different threads are related, such that it is not acceptable for one thread to be using the new config while another is still using the old config, then you will have to use a different approach. In this case, you will probably need to write some code to ensure that no thread starts using a new config until all threads have stopped using the old config.

You can do this with a counter and a flag, properly synchronized:
  1. config-in-use (a count of the operational sequences currently using the config)
  2. ok-to-use-config (a flag)
The activation thread turns off ok-to-use-config, then waits until config-in-use is zero. At that point it updates the root Config singleton and turns on ok-to-use-config.

The operational threads check ok-to-use-config before capturing the current config. If it is turned off, they wait until it is turned on. They then increment config-in-use, use the config, and decrement config-in-use when done. Synchronized and try/finally blocks should be used to avoid race conditions and ensure the config-in-use count doesn't get stuck above zero.
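One possible shape for this coordination, as a sketch rather than a tuned implementation, uses a single lock with wait and notifyAll. ConfigGate and its method names are hypothetical.

```java
// Hypothetical gate combining the ok-to-use-config flag and config-in-use count.
final class ConfigGate<C> {
    private final Object lock = new Object();
    private C active;
    private boolean okToUseConfig = true;
    private int configInUse = 0;

    ConfigGate(C initial) {
        this.active = initial;
    }

    /** Called by an operational thread at the start of a sequence. */
    C acquire() throws InterruptedException {
        synchronized (lock) {
            while (!okToUseConfig) {
                lock.wait();              // an activation is pending
            }
            configInUse++;
            return active;
        }
    }

    /** Called by an operational thread in a finally block when the sequence ends. */
    void release() {
        synchronized (lock) {
            configInUse--;
            if (configInUse == 0) {
                lock.notifyAll();         // wake a waiting activation thread
            }
        }
    }

    /** Called by the activation thread; the method-level lock allows one activation at a time. */
    synchronized void activate(C next) throws InterruptedException {
        synchronized (lock) {
            okToUseConfig = false;        // stop new sequences from starting
            try {
                while (configInUse > 0) {
                    lock.wait();          // wait for running sequences to finish
                }
                active = next;
            } finally {
                okToUseConfig = true;     // reopen the gate even if interrupted
                lock.notifyAll();
            }
        }
    }
}
```

An operational thread would call acquire, run its sequence inside a try block, and call release in the matching finally block. Note that a long-running sequence delays activation, and while activation is waiting no new sequence can start, so this approach trades throughput for consistency.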

Report

Feedback is important. Ideally, the user should get the following feedback:
  • When loading of an updated config is triggered, the user should get feedback on whether or not the new configuration was activated.
  • If the new configuration was not activated, the user should get feedback on why the new configuration was rejected (i.e., they should see a list of config errors).
  • Ideally, at a later point in time it should be possible for the user to determine what configuration is currently being used and how long it has been active. This is useful in situations where an on-disk config was changed at some point in the past but not loaded into the application.
Generally the reporting feedback channel is related to the trigger mechanism:
  • If you use a CLI command to trigger the reload, that command can print out the feedback.
  • If you use a web page to trigger the reload, the web response page can display the feedback.
  • If you use JMX, the feedback can be returned through that protocol.
  • If you use a web URL, the HTTP response can include the feedback.
If your application does logging, the feedback can be logged to the log file. This can be done in addition to any of the above feedback mechanisms.
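Whatever the channel, it can help to gather the feedback into one object that any of these mechanisms can print, return, or log. The sketch below is one hypothetical shape for such an object; ReloadReport and its fields are assumptions for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical result object produced by a reload attempt.
final class ReloadReport {
    final boolean activated;
    final List<String> errors;
    final Instant activeSince;   // when the config now in use was activated

    ReloadReport(List<String> errors, Instant activeSince) {
        this.errors = Collections.unmodifiableList(new ArrayList<>(errors));
        this.activated = this.errors.isEmpty();
        this.activeSince = activeSince;
    }

    /** Builds the text shown to the operator, whatever the feedback channel. */
    String summary(Instant now) {
        StringBuilder sb = new StringBuilder();
        if (activated) {
            sb.append("New configuration activated.");
        } else {
            sb.append("New configuration rejected; ").append(errors.size()).append(" error(s):");
            for (String error : errors) {
                sb.append("\n  ").append(error);
            }
        }
        long seconds = Duration.between(activeSince, now).getSeconds();
        sb.append("\nCurrent configuration active for ").append(seconds).append(" s.");
        return sb.toString();
    }
}
```

When a reload is rejected, activeSince refers to the old configuration that is still in use, which covers the case where the on-disk file has changed but was never successfully loaded.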

Use

It is important to ensure that the config parameters used are consistent throughout an operational sequence, even when a config reload occurs while that sequence is executing, as discussed in the Activate section above. Once you have handled that, you can move on to other usage aspects.

There are two basic approaches to using the active config parameters:
  1. Use the current Config object directly each time a config value is needed. This is suitable for simple options that are tested each time a specific behavior or feature is desired.
  2. Load data from the current Config object into operational objects. This is necessary when some of the config info refers to state that is managed by an operational object, such as the endpoint for a TCP connection.
The first approach provides for simpler updating of the config data, but sometimes the second approach is necessary for performance reasons or due to how state information is stored in other objects. The timing for when to update config state in operational objects is discussed in the Activate section above.

A minimal implementation of the Config object would provide just a single method to retrieve any parameter by name, such as is provided by the Properties.getProperty method. While this is easy to implement, it does not provide as much protection against programming errors as other approaches described below.

For type safety, you should implement (or use a package that provides) a set of methods with specific return types that match the types of your config parameters. You can then pass in the name of each parameter and not have to type-cast the result.

For maximum type safety your Config object should provide methods specific to each config parameter being retrieved. This ensures not only that you have the correct return type for the parameter, but that you have not accidentally mistyped a parameter name in a call to retrieve its value. (Of course, your unit tests should also catch this error, but you will catch it sooner and more surely with compile-time checks.)

Unit Testing

Keeping the configuration management code in separate Config objects improves the testability of your code. You can write unit tests for your Config objects to test that they properly locate, load (or reload), and validate config files, and you can create a set of mock Config objects that you can use to test how your application responds to different configurations.

A more complete test suite will include tests that verify proper functionality when a reload operation is performed by one thread while one or more other threads are in the middle of processing and using config data. However, a detailed discussion of this kind of multi-thread testing is beyond the scope of this post.

Implementation Options

You can write all of your own config code from scratch, or you can leverage an existing package. Whatever approach you take, you will want to ensure that your application handles all seven of the steps discussed above.

A few packages are listed below, with a discussion of the steps for which they provide support. For bullet items marked no support you will have to write your own code. None of the packages provides support for all of the steps. Even if a package did provide that support, you must still provide application-specific code for validation, activation, and use.

Caveat: Except for Properties, I have not used the packages listed below. My evaluation of their capabilities is based entirely on reading the documentation and examining the source code, so it is possible that I have made some mistakes in that evaluation.

Properties (Java)

The standard Java library includes the Properties class, which can be used for simple applications that require only a few parameters.
  • Trigger: no support.
  • Locate: no support.
  • Load: You can load a properties file with a single call to Properties.load, where you pass in an InputStream or Reader opened on the file to load.
  • Validate: no support.
  • Activate: no support.
  • Report: no support.
  • Use: The Properties.getProperty method will return the value of a property as a String. You can use this directly as a generic call to retrieve config parameters by name, or you can layer your type-safe methods on top of this.

JavaConfig (Java)

JavaConfig (not to be confused with Spring JavaConfig, which is used for Dependency Injection configuration) reads config files using the standard Properties file format. The package provides a generic Config class, which you subclass to create your application-specific config class. It handles a defined set of data types.

JavaConfig specifically does not include any logging, so that it can be used to read the configuration for another logging package.
  • Trigger: no support.
  • Locate: no support.
  • Load: You pass the name of a properties file to the Config constructor, which loads the properties file.
  • Validate: After instantiating your config object, you call the validateConfiguration method on it, which returns a ConfigValidationResult object that contains the validation results. This validateConfiguration method calls all of your getter methods. For each of your methods that throws an exception, the message is collected and made available through the ConfigValidationResult object.
  • Activate: no support.
  • Report: The ConfigValidationResult class collects the error messages from all of your getter methods that throw exceptions, and makes them available.
  • Use: The base Config class provides type-safe methods such as getInt and getBoolean that accept a parameter name. In your config class that extends that class, you define a getter method for each of your config parameters. Each of your methods should call one of the underlying type-safe methods, passing it a config parameter name, and return that result. Your method should also perform any validation checks and throw an exception if there are any validation errors.

Apache Commons Configuration (Java)

Apache Commons Configuration provides a mechanism to allow config info to be loaded from a variety of sources, such as files or databases. You can mix config info from multiple different sources, such as reading some info from a database and some from system properties, and access it all through a single config object. It supports file includes and value substitution.
  • Trigger: The package org.apache.commons.configuration.reloading provides a mechanism for defining a reload strategy when using file-based configuration, such as reloading on access to a config element if the file has changed, and some support for using JMX to trigger a reload.
  • Locate: You can pass in a relative filename, and the package will look in various locations for a config file of that name to load.
  • Load: You can create a configuration object for a specific data source, such as a file, or you can create a composite configuration object from multiple other configuration objects.
  • Validate: no support.
  • Activate: no support.
  • Report: no support.
  • Use: There is a set of type-safe methods to which you pass an item name and receive back its value.

Configgy (Scala)

Configgy includes logging as well as configuration, so it can use its own config files to configure logging. Its config files look like a cross between an XML file and a Properties file, with hierarchy represented by XML syntax, and individual parameters looking more like Properties. It handles a defined set of data types and has the ability to represent lists of values for a single parameter.

Configgy supports a lot of options when defining parameters, including hierarchy, inheritance, includes, variable substitution (including system properties), and conditional assignment.

Note that Configgy allows the application to set values in the config after it has been loaded into memory. As with any situation in which one data structure might be shared among multiple threads, you should be very cautious with this capability. In particular, if you have a thread which has read the config and used that data to set state in its own application objects, setting the value in the config object alone may not have the desired effect. Your application can set up a subscriber for changes, which will be called when there are runtime changes to a config value, but you need to handle synchronization of these runtime changes in the same manner as when reloading the entire config. And your application code that is calling the set method must be prepared to handle a thrown exception if a subscriber decides the change is invalid.

The examples given in the Configgy documentation use a Scala object (as opposed to a class) and do not discuss reloading, but there is a reload method available on the main object, which should work if you use the approach described above (using the ok-to-use-config flag and the config-in-use count) for the case when all threads are related. Also, there are separate config objects being used under the covers, so it should be possible to use those directly, rather than the main object, if you want to be able to switch some threads over to your new config while some other threads continue to use the old config.

If you are writing a Scala application, Configgy is probably your best option.
  • Trigger: There is some JMX support built in; reload is not one of the methods available from the JMX interface, but it should not be too difficult to add it.
  • Locate: Configgy has calls to allow you to set the location of the config file to load. You can call this before calling reload() to control the source for the reload.
  • Load: You call Configgy with a filename and it loads that file and any file referenced with an include statement.
  • Validate: Configgy uses a subscription/callback model to let the application know when data has been changed. Your callback is called with an argument that tells you whether Configgy is doing a validation pass or an activation ("commit") pass. On the validation pass, your callback can throw an exception to indicate that the new value fails validation.
  • Activate: The validate/commit subscription model provides hooks to allow you to write your own validation and activation, but you still need to consider synchronization when using multiple threads.
  • Report: no support.
  • Use: There is a set of type-safe methods to which you pass a parameter name, which can be hierarchical.