Object Oriented Programming: Utilities

Mon, May 16, 2016 ❝On the use of utilities and utility methods in the series on Object Oriented Programming.❞

In a previous article we discussed how we should design objects, or rather what restrictions we should impose when designing an object. The proposed guideline is strict and will leave a lot of methods “homeless”, as these do not satisfy the given restrictions. Utility methods and other utilities provide a home for these methods, which contain the usage logic.

‘Utilities’

First, let’s quickly define what I consider ‘utilities’. In the article Implementation and usage we distinguish between two types of logic. One is logic that implements the behavior for the corresponding internal state of an object: the implementation logic. The other represents usage patterns for a particular concept (possibly a single object): the usage logic. Utilities are the places where we put usage logic.

Of the utilities, the so-called “utility methods” are the best known. Given the above definition, “utility methods” are also considered utilities, as you would expect.

Utility methods

Let’s start with the basic form first, the utility methods. In the article on generalized code for implementation and usage we discuss that we can “interpret” usage logic as a usage pattern. The exact implementation of the object we use is irrelevant, as long as we can trust that the guarantees that we expect this object to have are valid. That is why we consider usage patterns to operate on the concept rather than a concrete type and why an interface is sufficient information for a programming language to ensure that guarantees are met.

As remarked before, usage logic that we add to a single object as a separate method (as opposed to in-line in a method with a different goal) is a failed attempt to preserve reusability. Usage logic does not rely on internal state, hence there is no reason to position it with implementation logic, which is tied to a single internal state and thus to a single implementation. There are no advantages to doing so (except maybe “convenience to the programmer”), and we already pointed out a significant disadvantage.

Characteristics of utility methods

In order to write a successful utility method, there are a number of characteristics it should have:

The method should be “pure”.
A pure method does not leave side-effects, and its result depends only on its parameters. For example, every time the method is called with the same arguments, it should return the same value as it did the first time. This is sometimes loosely called idempotence, although referential transparency is the more precise term. It is important because utility methods can be used anywhere at any time: introducing a new call to the utility method somewhere earlier in the execution must not influence the result of the calls that follow it.
The method should be context-free.
The utility method should make no assumptions about any particular context. Utility methods are the “embodiment” of reusable usage patterns. Therefore, to stimulate actual reuse, we need to be able to use the pattern in any kind of context. That means there can be no implicit assumptions: anything the method depends on must be made explicit through parameters.

With the above requirements, it is sufficient for utility methods to be static. There is no need to tie these utility methods to an object, as they should not contain state anyway. By keeping utility methods away from the state-holding object, we create a separation of the usage pattern from the concept on which it (currently) relies, such as an interface or a concrete type. As static methods they are very easily accessible, as you would expect for context-free logic.
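
As a minimal sketch of these characteristics, consider the following static method. The class name CharSequences and the method count are hypothetical and chosen only for illustration. Everything the method needs arrives through its parameters, it modifies nothing, and it accepts the CharSequence interface rather than a concrete type.

```java
/** Hypothetical utility class for the CharSequence concept (illustration only). */
final class CharSequences {

  private CharSequences() {
    // no instances: the class only holds static, stateless methods
  }

  /**
   * Counts how often symbol occurs in text.
   * Pure: reads its parameters, changes nothing, same input gives same output.
   */
  public static int count(final CharSequence text, final char symbol) {
    int occurrences = 0;
    for (int i = 0; i < text.length(); i++) {
      if (text.charAt(i) == symbol) {
        occurrences++;
      }
    }
    return occurrences;
  }
}
```

Because the parameter is a CharSequence, the same call works for a String, a StringBuilder or any other implementation of that interface.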

In Object Oriented Programming: Expectations, we make a distinction between internal state and expectations that are defined by the “outside context”, the application. As utility methods cannot contain state, the most prevalent form of expectations, those provided at runtime by the calling code, must be passed in as method parameters. The other kind of expectations, i.e. fixed values defined in specifications, can still exist: constants are still accepted. The reason for this is quite simple. The utility method must be able to operate according to the specifications of the concept for which it provides utilities. Specifications for this concept are fixed and therefore context-independent. Constants do not interfere with the purity of the method.
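
The difference between the two kinds of expectations could look roughly like this. The class Ports and both methods are hypothetical. The only fixed fact assumed here is that TCP/UDP port numbers are 16-bit unsigned values, which makes the upper bound a specification constant. The allowed range, on the other hand, is an expectation of the calling code and is therefore a parameter.

```java
/** Hypothetical utility class for port numbers (illustration only). */
final class Ports {

  /** Fixed by specification: port numbers are 16-bit unsigned values. */
  private static final int MAX_PORT = 65535;

  private Ports() {
    // no instances
  }

  /** Relies only on the specification constant, so it needs no context. */
  public static boolean isValid(final int port) {
    return port >= 0 && port <= MAX_PORT;
  }

  /** The caller's expectation (the allowed range) is passed in explicitly. */
  public static boolean isAllowed(final int port, final int lowest, final int highest) {
    return isValid(port) && port >= lowest && port <= highest;
  }
}
```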

The location of utility methods

So what do you name utility classes? And where do you put them?

There is no real convention for this, but there are a few often-used patterns. Let’s look at Oracle and OpenJDK first:

Package: java.util (primarily)
Naming convention: plural form of the name of the type of object for which utilities are provided.

For example: java.util.Objects for utility methods that are applicable for all types with the conceptual characteristics of an Object. (And yes, that does mean every possible instance.) These utility methods provide some basic utilities for the most general concept available, that of an object.

Or the pattern used by Apache Commons:

Package: with type of object in the same package
Naming convention: TypeUtils. (Name of type, suffixed with ‘Utils’.)

For example: Apache Commons Collections 4’s Closure with its utility methods in ClosureUtils.

Google Guava uses a comparable pattern to Oracle/OpenJDK with:

Package: with a package name chosen to represent the category of code. It contains both types and utility classes of this category.
Naming convention: plural form of the name of the type of object for which the utilities are provided.

See for example the utility class Iterables.

We position utility methods with the most generalized form to which the usage pattern is applicable. That means that if your utility method only requires the capability to iterate over the contents of some type of collection, then Iterable is sufficient. If, on the other hand, your utility method relies on a definite order, then List is as general as you can go without losing the ordering guarantee. Similarly, utility methods for your domain types - which have very specific meaning and assumptions - will likely only apply to your domain type. Generalization may only go as far as the most general of your defined domain types.

By providing the method for the most general type, we ensure maximal reusability and it provides predictability for where to find a certain utility.
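
A small sketch of this positioning, with hypothetical class and method names: counting elements only needs iteration, so the method is defined for Iterable. Picking the middle element relies on positional order, so that method asks for a List and lives in the utility class for lists.

```java
import java.util.List;

/** Hypothetical utilities for anything that can be iterated. */
final class ExampleIterables {

  private ExampleIterables() {
    // no instances
  }

  /** Only needs the ability to iterate, so Iterable is sufficient. */
  public static int count(final Iterable<?> elements) {
    int total = 0;
    for (final Object ignored : elements) {
      total++;
    }
    return total;
  }
}

/** Hypothetical utilities that rely on the ordering guarantee of List. */
final class ExampleLists {

  private ExampleLists() {
    // no instances
  }

  /** Needs positional access, so List is as general as we can go. */
  public static <E> E middle(final List<E> list) {
    if (list.isEmpty()) {
      throw new IllegalArgumentException("cannot retrieve middle element from empty list");
    }
    return list.get(list.size() / 2);
  }
}
```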

One may also find TypeUtil (Name of type, suffixed with ‘Util’, singular) and there may be other variations. The two variations mentioned above seem to be most common, though.

Most general application

As we already mentioned implicitly, utility methods are designed to be applicable to the most general type for which the usage pattern is applicable. This also means that utility methods are often defined for interfaces rather than individual concrete types. The utility method, which represents a usage pattern, is applicable to any concrete implementation of the same concept, because the concept (not the concrete type implementing the concept) defines the guarantees that the usage pattern relies on.

The existence of a utility method makes its usage pattern easily “repeatable” and the use of most general types makes the usage pattern widely applicable.

When to create utility methods

In general we use utility methods to have easy access to commonly used usage patterns. This simplifies code and improves readability because single method calls represent more elaborate patterns.

A utility method is the “embodiment” of a usage pattern. Therefore, the prime prerequisite is that the logic we extract as a utility method must be usage logic, and adhere to its characteristics: pure & context-free.
A very obscure usage pattern that is so elaborate that it is only useful in a very specific situation is not an attractive candidate. General patterns, which might be so common that you hardly think about them anymore, might be perfect candidates.
The pattern must be extractable. That is, any assumptions or context-dependent variables must be cleared from the logic and may be provided as parameters of the method. Complex patterns may not be suitable as utility methods because of their complex interweaving with implementation logic. Complex patterns may contain multiple suitable basic usage patterns, though.

Given the rules as defined in Designing objects, we only add implementation logic if we require privileged access. Any other logic is simply usage logic that leverages the public API. To make this logic reusable, we create utility methods.

By extracting and using utility methods for concepts (possibly concrete types), we allow ourselves to write more concise implementation logic using larger “building blocks”. Implementation logic will contain fewer lines of code and each line of code expresses more elaborate operations.

Identifying “hidden” usage patterns

There are a few pointers that hint at a utility method. These are just hints and do not guarantee success. Even when a hint applies, the usefulness of a utility method is determined by the need for repetition and the need to separate out certain usage patterns.

Identified hints:

A sequence of statements without references to internal state.
This “in-line” usage logic pattern can be extracted and replaced with a single method call to a utility method.
(Private) Methods that access no internal state and (“should”) return an interface type result.
That is, the concrete type is not relevant as the result is only used. Furthermore, as we only use - as opposed to manage - the result, we are not expected to have knowledge of its concrete type.
Logic that requires the introduction of new fields into the object.
These fields are not required for representing the original concept, the original, single concern of the object. They are, however, necessary for the logic to work.
Note that adding fields to the internal state in order to support a method is the inverse of what you would expect. This contradicts the rules defined earlier. Implementation logic manages the internal state. The need for internal state to support a method, indicates that there is an elaborate usage pattern that requires some persistence, i.e. an internal state.

The first case is the typical in-line usage logic. To repeat an earlier example: Objects.requireNonNull, a basic pattern you will typically find in-line with other code in a method. Trivial to extract for reuse, yet the utility method offers both the happy flow and the exceptional case in which a programming error is detected and a suitable exception thrown.
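
To illustrate the first case, the sketch below uses a hypothetical class Greeter. The in-line form of the null check is shown as a comment, and the constructor uses Objects.requireNonNull to express the same pattern in a single call.

```java
import java.util.Objects;

/** Hypothetical class, used only to show the in-line pattern being replaced. */
final class Greeter {

  private final String name;

  Greeter(final String name) {
    // In-line usage logic would read:
    //   if (name == null) {
    //     throw new NullPointerException("name");
    //   }
    //   this.name = name;
    // The utility method covers both the happy flow and the error case.
    this.name = Objects.requireNonNull(name, "name");
  }

  String greet() {
    return "Hello, " + this.name;
  }
}
```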

The second case typically identifies a utility method that is already extracted as a private method on an object. To improve reusability we should move this to a utility class. This also reduces the amount of logic bound to a single object.
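
A sketch of the second case, with hypothetical names throughout: withoutBlanks started out as a private helper inside Inbox, but it never touched the fields of Inbox and it returns an interface type. Moved to a utility class, it becomes available to any other code that handles strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical object: it keeps only the logic that manages its own state. */
final class Inbox {

  private final List<String> messages = new ArrayList<>();

  void addAll(final Collection<String> incoming) {
    // the filtering is usage logic, so it is delegated to the utility class
    this.messages.addAll(ExampleStrings.withoutBlanks(incoming));
  }

  int size() {
    return this.messages.size();
  }
}

/** Hypothetical utility class that now holds the former private helper. */
final class ExampleStrings {

  private ExampleStrings() {
    // no instances
  }

  /** Returns a new list with all null and blank entries left out. */
  static List<String> withoutBlanks(final Iterable<String> items) {
    final List<String> result = new ArrayList<>();
    for (final String item : items) {
      if (item != null && !item.trim().isEmpty()) {
        result.add(item);
      }
    }
    return result;
  }
}
```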

The third case identifies more elaborate utilities. These are not simply static stateless utility methods, but rather a more complex, possibly stateful, composition. We discuss these utilities in the following sections.

Why separate from the concrete type?

We discussed before how a utility method is a usage pattern and therefore it applies to the general concept rather than an individual concrete type. In Designing objects we discussed how we should reduce the number of methods on the object to the necessary minimum. There is no alternative for methods that require privileged access to the internal state, but for all other logic, i.e. usage logic, the public API suffices.

In the same line of reasoning we can distinguish between static methods that require privileged access and static methods, such as utility methods, that do not. In ‘Designing objects’ we discuss this approach with examples of Java’s Optional.of and Objects.requireNonNull. Optional.of relies on the constructor and thus is tied to an individual concrete type, while Objects.requireNonNull is a generalized usage pattern and therefore relies only on public API.
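
The same distinction can be sketched with a hypothetical type Box, loosely modeled on the Optional.of example and not taken from any library. The factory method of needs the private constructor and therefore has to live on the type itself. The method getOrDefault uses only the public API and is placed in a separate utility class.

```java
import java.util.Objects;

/** Hypothetical container type with a private constructor. */
final class Box<T> {

  private final T value;

  private Box(final T value) {
    this.value = value;
  }

  /** Requires privileged access (the private constructor), so it stays on Box. */
  static <T> Box<T> of(final T value) {
    return new Box<>(Objects.requireNonNull(value, "value"));
  }

  T get() {
    return this.value;
  }
}

/** Hypothetical utility class: everything here works through Box's public API. */
final class Boxes {

  private Boxes() {
    // no instances
  }

  /** Returns the content of box, or fallback if there is no box at all. */
  static <T> T getOrDefault(final Box<T> box, final T fallback) {
    return box == null ? fallback : box.get();
  }
}
```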

In Designing objects we define a strict criterion for this separation in order to avoid (unintended) “scope creep”, as it would blur the single concern for which we originally designed the object. In Evolving code we discuss how utility methods fit in the evolution of code.

Tips and tricks

Utility methods operate on a very general capability assumption. In many cases, the only guarantees we have are those defined by the (most general) interface. Any usage logic inside a utility method will therefore not be able to use any “short-cuts” that are available to a specialized type only, such as a concrete type.

The “base” usage logic of a utility method will always contain the generally applicable usage logic. This is its base case, its reason for existing. This is the most plain implementation that requires no assumptions.

However, more specialized types provide a more specialized API. This may help to execute your goal more efficiently. A utility method may use this knowledge if it encounters a type that it knows to have this option. The utility method tests for a specific type in order to then choose to execute an alternative logic that leverages this “short-cut”. There are of course downsides. For one, a dependency on that type is then introduced. This trick is most useful if multiple implementations are provided within the same package or a related package upon which a dependency already exists.

import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class CollectionUtils {

  public static <E> E getLast(final Collection<E> col) {
    if (col.isEmpty()) {
      throw new IllegalArgumentException("cannot retrieve last element from empty collection");
    }
    if (col instanceof LinkedList) {
      // short-cut: LinkedList has direct access to its last element
      return ((LinkedList<E>) col).getLast();
    } else if (col instanceof List) {
      // short-cut: any List can retrieve an element by index
      return ((List<E>) col).get(col.size() - 1);
    } else {
      // base case: skip all but last element and return last element
      final Iterator<E> it = col.iterator();
      E last = null;
      while (it.hasNext()) {
        last = it.next();
      }
      return last;
    }
  }
}

This is a constructed example. It serves the purpose of demonstrating the trick. For one, not all collections guarantee a deterministic order, as this is not defined by the Collection interface. Secondly, a LinkedList already starts traversing from the end of the list whenever an element at an index past the middle of the list is requested, so the separate LinkedList branch gains little over the List branch.

Tricks like this do get used in practice, though. Google Guava, a Java support library by Google, uses such tricks, e.g. in Collections2.filter(…), to give you an arbitrary example.

The strength of such utility methods is in being able to construct an isolated, easily testable piece of usage logic that includes error handling, and with the possibility to introduce intelligence for advantageous situations. This eliminates the need to recreate this complex logic in-line with implementation logic. The latter is what unnecessarily complicates implementation logic, makes it hard to read and understand. And apart from that, trying to reproduce such logic in-line will often lead to incomplete reproductions due to non-obvious edge cases and insufficient error handling.

In the next article, Utilities: Next Generation, we look into some advanced constructs for utility methods.

This post is part of the Object Oriented Programming series.
Other posts in this series:

Timelessness