It’s tempting to believe that having a flexible file format for a system’s configuration is enough to ensure forward-compatibility and flexibility as the system grows. The thought process is that with a suitable data format—for example, XML or JSON—extra information can be added without breaking existing systems. For adding new fields into the data, this is true. However, if you don’t engineer additional flexibility into the fields themselves then you may end up needing to make breaking changes anyway or, worse, end up with a much more complicated data file.

Imagine you are implementing an ecommerce server that exposes products. The spec for the server stipulates that there will only be a very small number of products. It also stipulates that rebooting the server to refresh the system with new products is fine. With that in mind, you add a products field to the server’s configuration:

port: 8080
products:
 - name: Acme Explosive
 - name: Fake Backdrop 

One day—and this day always comes—a change to the specification comes: some products should be hidden on the site. To implement that feature, you add a hidden: field to the product. It is assumed that if a hidden: field is not defined for a product then it should default to false. This is backwards-compatible and easy to implement given the extensible YAML format being used:

- name: Fake Backdrop
  hidden: true

Later on, it emerges that product-hiding should not be binary; it should depend on the group that the user belongs to. To implement that, you add a hidden_to: field. However, to preserve backwards compatibility, hidden_to entries are defined to take priority over the hidden state. Side-rules are beginning to creep in:

- name: Fake Backdrop
  hidden: false
  hidden_to:
    - roadrunners
    - unregistered-customers
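
The side-rule above has to live somewhere in the server code. A minimal sketch of one possible reading follows, assuming that a non-empty hidden_to list decides visibility on its own and that hidden is only a fallback. The Product record, the isVisibleTo method, and the "coyotes" group are hypothetical names used for illustration.

```java
import java.util.List;

final class ProductVisibility {
    // Hypothetical in-memory form of one entry under `products:`.
    record Product(String name, boolean hidden, List<String> hiddenTo) {}

    // Assumed reading of the side-rule: a non-empty hidden_to list decides
    // visibility by itself; the hidden flag is consulted only as a fallback.
    static boolean isVisibleTo(Product product, String userGroup) {
        if (!product.hiddenTo().isEmpty()) {
            return !product.hiddenTo().contains(userGroup);
        }
        return !product.hidden();
    }

    public static void main(String[] args) {
        Product backdrop = new Product("Fake Backdrop", false,
                List.of("roadrunners", "unregistered-customers"));
        System.out.println(isVisibleTo(backdrop, "roadrunners")); // false
        System.out.println(isVisibleTo(backdrop, "coyotes"));     // true
    }
}
```

Even in this small form, the meaning of hidden depends on whether a sibling field happens to be present, which is the kind of rule that is easy to forget later.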

As the system gains popularity, it becomes increasingly necessary to be able to add products to a running system. It is decided that products should be held in a database rather than being loaded from the configuration file. The configuration file, therefore, should support providing database details as an alternative to products. However, backwards compatibility should still be maintained to prevent legacy deployments and tests from breaking:

port: 8080

use_database: true
database:
  url: //localhost/database
  user: productapp
  pw: password123

A heuristic is adopted that if use_database has a value of true then the server should load products via the database defined in database; otherwise, it should load them via products.
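
In code, this heuristic might look something like the sketch below. It assumes the configuration has already been parsed into a plain map (the article does not say how the file is read), and the class and method names are hypothetical.

```java
import java.util.Map;

final class LegacyProductSource {
    // Returns the name of the top-level key the server should load products from.
    // `config` stands in for the parsed top-level YAML mapping (an assumption).
    static String chooseSource(Map<String, Object> config) {
        boolean useDatabase = Boolean.TRUE.equals(config.get("use_database"));
        if (useDatabase) {
            // The flag and the details live in different sibling keys,
            // so their consistency has to be checked by hand.
            if (!config.containsKey("database")) {
                throw new IllegalStateException(
                        "use_database is true but database is missing");
            }
            return "database";
        }
        return "products";
    }

    public static void main(String[] args) {
        System.out.println(chooseSource(Map.of(
                "port", 8080,
                "use_database", true,
                "database", Map.of("url", "//localhost/database")))); // database
        System.out.println(chooseSource(Map.of("port", 8080)));      // products
    }
}
```

Note that the function needs the whole top-level mapping, because the decision is spread over three sibling keys: use_database, database, and products.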

The sysadmin then mentions that it would be useful to allow the server to listen on multiple ports, some of them being HTTP and some of them being HTTPS. Marketing also wants to integrate email updates into the server. However, one week into developing email updates, they also mention that some systems may need to send tweets instead of emails. Your implementation strategy follows the same logic as previous development efforts:

port: 8080

other_port_configurations:
  - port: 8081
    protocol: http
  - port: 8082
    protocol: https
	
use_database: true

database:
  url: //localhost/database
  user: productapp
  pw: password123

send_tweets_instead_of_email: false

email:
  host: smtp.app.com
  port: 442
  username: productapp-email
  password: thisisgettingsillynow

This process of introducing additional data that is handled by special-case heuristics is a common pattern for extending configurations without breaking existing fields. However, it can become quite difficult to keep track of all the different rules that arise out of this gradual growth in complexity. Not only that, the resulting configuration file can be difficult to work with.

Extending the fields themselves would make this process much easier. However, products, for example, is defined to contain an array of products, so the only actions that can be carried out on it are adding or removing products.

The solution to this is to ensure that there are extension points at major parts of the configuration hierarchy when it’s initially designed. Relegate primitive values (which can’t be extended later) deeper into the configuration hierarchy and have clear extension points.

An example of adding extension points would be:

products:
  product_data:
    - name:
        value: Acme Explosive

In this design, products, each product (within product_data), and name could all be extended later with type switches or other information that lets them be loaded or handled differently. Applying this nesting concept to the configuration at each step of the above scenario would result in something like this:

server:
  application:
    port: 8080
    protocol: https
  admin:
    port: 8081
    protocol: http
  internal:
    port: 8082
    protocol: http
	
products:
  source: database
  database:
    url: //localhost/database
    user: productapp
    pw: password123

product_notifications:
  type: email
  settings:
    host: smtp.app.com
    port: 442
    username: productapp-email
    password: thisisgettingsillynow

There would still be some heuristics involved with handling missing keys, defaults, type switches, etc. However, the hierarchy between features is maintained and there are now no sibling heuristics. As a result, the config-handling code can be modularized so that each key in the configuration is read in isolation. The products in this improved example could be handed to handleProductsConfiguration(ProductsConfiguration conf), a function that no longer needs to receive the sibling use_database and database keys.
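
As a rough illustration of reading each key in isolation, the sketch below routes each subtree to its own handler. Plain maps stand in for the parsed YAML, and the class and handler names are hypothetical; only the key names come from the configuration above.

```java
import java.util.Map;

final class SectionHandlers {
    // Each handler receives only its own subtree of the parsed configuration.
    static String handleProducts(Map<?, ?> products) {
        Object source = products.get("source");
        if ("database".equals(source)) {
            Map<?, ?> database = (Map<?, ?>) products.get("database");
            return "products from database " + database.get("url");
        }
        throw new IllegalArgumentException("unknown products source: " + source);
    }

    static String handleNotifications(Map<?, ?> notifications) {
        Object type = notifications.get("type");
        Map<?, ?> settings = (Map<?, ?>) notifications.get("settings");
        if ("email".equals(type)) {
            return "email via " + settings.get("host") + ":" + settings.get("port");
        }
        throw new IllegalArgumentException("unknown notification type: " + type);
    }

    public static void main(String[] args) {
        Map<String, Object> root = Map.of(
                "products", Map.of(
                        "source", "database",
                        "database", Map.of("url", "//localhost/database")),
                "product_notifications", Map.of(
                        "type", "email",
                        "settings", Map.of("host", "smtp.app.com", "port", 442)));

        // The top level only routes subtrees; no handler reads a sibling key.
        System.out.println(handleProducts((Map<?, ?>) root.get("products")));
        System.out.println(handleNotifications((Map<?, ?>) root.get("product_notifications")));
    }
}
```

The type switches are still conditionals, but each one is confined to a single subtree, so a handler can be tested with nothing more than its own fragment of the file.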

A strict tree hierarchy maps cleanly onto polymorphic systems and is surprisingly easy to implement using libraries such as Jackson (Java):

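import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

// Id.NAME lets the file use a short logical name ("source: database");
// Id.CLASS would instead require a fully qualified class name as the value.
@JsonTypeInfo(
  use = JsonTypeInfo.Id.NAME,
  include = JsonTypeInfo.As.PROPERTY,
  property = "source")
// Each new source is registered here under its logical name.
@JsonSubTypes({
  @JsonSubTypes.Type(value = DatabaseProductsConfiguration.class, name = "database")
})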
interface ProductsConfiguration {
  <T> T accept(ProductsConfigurationVisitor<T> visitor);
}

class DatabaseProductsConfiguration implements ProductsConfiguration {
  @JsonProperty
  private String url;
  @JsonProperty
  private String user;
  @JsonProperty
  private String pw;

  @Override
  public <T> T accept(ProductsConfigurationVisitor<T> visitor) {
    return visitor.visit(this);
  }
}

The above is compile-time verified, type-safe, and involves no conditional logic. Completely new product sources can be configured by defining a configuration class that implements ProductsConfiguration and updating the relevant visitor. Clearly, this implementation requires more up-front engineering than the original design. However, for projects with changing specifications (i.e. most projects), allowing for flexibility pays for itself.
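
The visitor itself is not shown above. One possible shape is sketched below, without Jackson so that it stands alone. The visit methods, the InlineProductsConfiguration class (standing in for the original in-file product list), and the DescribeLoader visitor are all assumptions made for illustration.

```java
import java.util.List;

final class ProductsVisitorSketch {
    interface ProductsConfiguration {
        <T> T accept(ProductsConfigurationVisitor<T> visitor);
    }

    // Hypothetical visitor: one visit method per configuration type, so adding
    // a new source forces every visitor to handle it at compile time.
    interface ProductsConfigurationVisitor<T> {
        T visit(DatabaseProductsConfiguration conf);
        T visit(InlineProductsConfiguration conf);
    }

    static final class DatabaseProductsConfiguration implements ProductsConfiguration {
        final String url;
        DatabaseProductsConfiguration(String url) { this.url = url; }
        @Override
        public <T> T accept(ProductsConfigurationVisitor<T> visitor) {
            return visitor.visit(this);
        }
    }

    // Hypothetical second source: products listed directly in the file.
    static final class InlineProductsConfiguration implements ProductsConfiguration {
        final List<String> names;
        InlineProductsConfiguration(List<String> names) { this.names = names; }
        @Override
        public <T> T accept(ProductsConfigurationVisitor<T> visitor) {
            return visitor.visit(this);
        }
    }

    // Example visitor that describes where products would be loaded from.
    static final class DescribeLoader implements ProductsConfigurationVisitor<String> {
        @Override
        public String visit(DatabaseProductsConfiguration conf) {
            return "database at " + conf.url;
        }
        @Override
        public String visit(InlineProductsConfiguration conf) {
            return conf.names.size() + " inline products";
        }
    }

    static String handleProductsConfiguration(ProductsConfiguration conf) {
        return conf.accept(new DescribeLoader());
    }

    public static void main(String[] args) {
        System.out.println(handleProductsConfiguration(
                new DatabaseProductsConfiguration("//localhost/database")));
        // prints: database at //localhost/database
        System.out.println(handleProductsConfiguration(
                new InlineProductsConfiguration(List.of("Acme Explosive", "Fake Backdrop"))));
        // prints: 2 inline products
    }
}
```

The dispatch happens through overload resolution inside each accept method, so handleProductsConfiguration never inspects a type tag itself.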