Error handling in modern languages

Sat, Feb 13, 2016 ❝On variations of error handling mechanisms in modern languages.❞

This is a sequel to an earlier post, “Why I prefer error values over exceptions”.

One of the more prominent differences between recent languages and the more established ones is the way errors are handled. Let’s look at the error handling mechanisms available in established languages, and at the new mechanisms that more recent languages provide.

Ye olde ways

First we look at a number of well established / “old style” error handling techniques.

Return codes: sentinel return values
A function performs a computation and returns the result. Values that are considered impossible as a computation result are repurposed as error indicators. This is probably the simplest form of error reporting apart from doing no error reporting at all.
Because the sentinel has the same type as a regular result, and the type system therefore cannot tell the two apart, we can accidentally use a return value that indicates an error as if it were a computation result.
A well-known example: a process is forked and the resulting pid (process ID) is stored for later use. At some later point in time, when the forked process needs to be killed, kill is called with the stored process ID. Because sentinel values can unfortunately coincide, this can lead to interesting effects:
1. fork returns -1 in case of an error.
2. kill, when given process ID -1, signals every process the caller has permission to signal (on Linux, all except pid 1, init).
3. As a consequence, if the result of fork is not checked, a failed fork can end up killing every process within reach: the -1 is misinterpreted as a “computation” result, namely the process ID of a child that never came into existence, and is passed on to kill.
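The hazard can be sketched without touching any real process. In the Go snippet below, startWorker and describeKill are hypothetical stand-ins for fork and kill (they are not real APIs and signal nothing). The only point is that the sentinel has the same type as a valid process ID, so nothing stops it from being passed along unchecked.

```go
package main

import "fmt"

// startWorker is a hypothetical stand-in for fork: it returns a process ID,
// or the sentinel -1 when the worker could not be started.
func startWorker(fail bool) int {
	if fail {
		return -1
	}
	return 4242
}

// describeKill is a hypothetical stand-in for kill. It does not signal
// anything; it only reports which processes would be targeted.
func describeKill(pid int) string {
	if pid == -1 {
		return "all processes"
	}
	return fmt.Sprintf("process %d", pid)
}

func main() {
	pid := startWorker(true) // the start fails, but nothing forces a check

	// The sentinel is an ordinary int, so it is accepted as a process ID.
	fmt.Println("would signal:", describeKill(pid))
}
```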
Return codes: error indicator as return value
Alternatively, one can dedicate the return value to error indication. Typically 0 is returned if no error occurred, and a non-zero value otherwise, where the value identifies the kind of error. If a function needs to return another result, this has to be provided through one of the function arguments. For example, the caller might pass a pointer to a result structure to which the computation result is written.
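A minimal sketch of this style, written in Go for consistency with the rest of this post even though idiomatic Go would not do it this way. The function divide and the status constants are made up for illustration.

```go
package main

import "fmt"

// Hypothetical status codes; 0 means success.
const (
	statusOK        = 0
	statusDivByZero = 1
)

// divide returns only a status code. The quotient is written through the
// out pointer, and only when the call succeeds.
func divide(a, b int, out *int) int {
	if b == 0 {
		return statusDivByZero
	}
	*out = a / b
	return statusOK
}

func main() {
	var q int
	if rc := divide(10, 2, &q); rc != statusOK {
		fmt.Println("error code:", rc)
		return
	}
	fmt.Println("quotient:", q)
}
```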
Global error variable
A dedicated global variable is used to store any error information; errno in C is the classic example. After calling a function that uses this method of error reporting, the caller is expected to check the global variable for possible error information.
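As a rough illustration, here is the same idea transplanted to Go. The variable lastErr and the function parseDigit are hypothetical. In this sketch the variable is cleared at the start of every call, which is a simplification of my own.

```go
package main

import (
	"errors"
	"fmt"
)

// lastErr is a hypothetical package-level error slot shared by all callers.
var lastErr error

// parseDigit returns the value of a single decimal digit. Failures are not
// returned; they are recorded in lastErr.
func parseDigit(c byte) int {
	lastErr = nil // cleared on every call in this sketch
	if c < '0' || c > '9' {
		lastErr = errors.New("not a digit")
		return 0
	}
	return int(c - '0')
}

func main() {
	n := parseDigit('x')
	// Nothing forces this check, and the shared variable is not safe to use
	// from multiple goroutines.
	if lastErr != nil {
		fmt.Println("error:", lastErr)
		return
	}
	fmt.Println(n)
}
```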
Pointer to error struct
Functions define a parameter that points to an error structure. The function’s regular return type is used for the computation result only. In case of an error, the function writes the error information to the structure behind the pointer, so that the caller can subsequently inspect it to see whether an error occurred during execution.
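A small sketch of the idea, again in Go purely for illustration. ErrorInfo and sqrtInt are made-up names, not part of any library.

```go
package main

import "fmt"

// ErrorInfo is a hypothetical error structure that the caller allocates.
type ErrorInfo struct {
	Code    int
	Message string
}

// sqrtInt returns the integer square root of n. The return value carries the
// result only; on failure the error details are written to *e.
func sqrtInt(n int, e *ErrorInfo) int {
	*e = ErrorInfo{} // clear any stale error from an earlier call
	if n < 0 {
		*e = ErrorInfo{Code: 1, Message: "negative input"}
		return 0
	}
	r := 0
	for (r+1)*(r+1) <= n {
		r++
	}
	return r
}

func main() {
	var e ErrorInfo
	r := sqrtInt(-4, &e)
	if e.Code != 0 {
		fmt.Println("error:", e.Message)
		return
	}
	fmt.Println(r)
}
```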
Exceptions
In this day and age, exceptions are probably the most well-known mechanism for error handling. An exception is created and thrown, and it traverses up the call stack until it is caught. Exceptions do not go through the predefined exit points (the function’s return value) but instead have their own “emergency” exits. When an exception is thrown, the function ends prematurely and no result is returned. Given that this mechanism uses a separate “exit”, we can dedicate the return type to the computation result of a function.
Unchecked: Unchecked exceptions need not be declared in the function signature. They can be thrown at any time and thus are unannounced.
Checked: Java has a special type of exception known as a checked exception. These exceptions behave the same as other exceptions. However, checked exceptions are declared in the function signature. Furthermore, the type system forces authors to either catch or redeclare the exception as potentially thrown. Checked exceptions are therefore more predictable and more easily controlled.

Error handling in recent languages

More recently developed languages provide different mechanisms for handling errors. We’re going to look at error handling in the languages Rust, Go and …

Multiple return values + dedicated error type: (Go)
Go combines existing features to provide a simple and effective error signaling method: multiple return values, a static type system and a dedicated error type.
- Multiple return values ensure that we do not reduce expressivity of computation result in order to squeeze in error handling. We dedicate one or more “return paths” for results and one (or more) “return path” for error indication.
- The static type system and the dedicated error type ensure that we do not interpret an error as a value (for example, an integer result). We receive errors as values of the built-in type error, so any use of such a value outside of what error defines will not (implicitly) be allowed.
- Errors are evaluated locally and case by case (as opposed to in a large catch-block with very general error handling), so known error cases can be predeclared in (error) variables. Upon encountering such an error case, the function returns the predeclared error variable. Error handling code can compare against the same variable (provided that it is publicly accessible) in order to distinguish between the various error cases.
  This is a standard way of working. Predeclared variables can serve as error instances because they do not need to carry information on the context: the caller who receives the error already knows the context.
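The three ingredients can be seen together in a minimal sketch. ErrNotFound and lookup are hypothetical names invented for this example; io.EOF in the standard library is a real predeclared error value of the same kind.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a predeclared error value (hypothetical, for illustration).
// It carries no context; the caller already knows which key it asked for.
var ErrNotFound = errors.New("key not found")

// lookup returns the result and the error on separate return paths.
func lookup(m map[string]int, key string) (int, error) {
	v, ok := m[key]
	if !ok {
		return 0, ErrNotFound
	}
	return v, nil
}

func main() {
	ages := map[string]int{"alice": 30}

	// The error has type error, so it cannot be mistaken for an int result.
	if _, err := lookup(ages, "bob"); err == ErrNotFound {
		fmt.Println("no entry for bob")
	}
}
```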
‘Panic’ for programmer mistakes: (Go)
Panic is most closely related to the notion of exceptions. A panic creates an exception-like event that traverses up the call stack until “recovered”. If a panic is not recovered, it will halt the program. Panics are meant to be used only for programming errors, so a panic should always indicate a faulty program. Any reactive control flow handling should be performed using error return values, which are your common day-to-day mechanism of error handling.
When exactly we should use panic is not clear to everyone. An innocent question on the reasoning behind the choice for panic can lead to a significant amount of discussion. This is mainly due to the large impact of panic. The unexpected interruption of control flow at any moment is considered something we fall back on only in very rare cases. (Which, ironically, is considered the default error handling mechanism in some other languages.)
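To make the distinction concrete, here is a small sketch. mustIndex and safely are hypothetical helpers: the first panics on what can only be a caller bug, the second shows how a panic is recovered at a boundary and turned back into an ordinary error value.

```go
package main

import "fmt"

// mustIndex panics on an out-of-range index. Passing a bad index is a bug in
// the caller, not a situation the program is expected to handle.
func mustIndex(xs []int, i int) int {
	if i < 0 || i >= len(xs) {
		panic(fmt.Sprintf("index %d out of range [0,%d)", i, len(xs)))
	}
	return xs[i]
}

// safely runs f. If f panics, the deferred function recovers the panic and
// stores it in the named return value err.
func safely(f func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	f()
	return nil
}

func main() {
	err := safely(func() { _ = mustIndex([]int{1, 2, 3}, 5) })
	fmt.Println(err)
}
```

Without the call to recover, the panic would travel all the way up the stack and halt the program.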
“Monadic error handling”: (Rust)
In Rust, (other) existing features of the programming language are composed in order to provide a mechanism for error handling. Pattern matching and a generic type Result, containing a specified result value type, are used. See the article Error Handling in Rust by Andrew Gallant for a very good description on Rust’s error handling mechanisms.
- Result is either Ok or Err. These are the two variants of the Result type: Ok contains the result value of a successful execution, and Err contains the error that is provided upon detecting an error.
- Using pattern matching, one can immediately handle the returned value by matching on these two patterns: Ok and Err. Consequently one can split up control flow logic into a “successful execution” case and an “error” case.

Error handling patterns

With new mechanisms for error handling come new patterns for the actual handling of errors. As expected, these patterns depend on the underlying mechanisms. We will look at how errors are handled in Go and Rust.

Go error handling tactics

In Go, one handles errors by checking the returned error value. If it is nil, no error occurred and we can continue with the acquired result. However, with repeated function calls this means that we need to check the returned error value on every call. This can be quite bothersome, but at the same time it is the approach’s strength, because every error is handled explicitly at the place where it occurs.

Whether the error is of immediate use or needs to be returned to a higher level depends on the semantics of the error value. There are a number of different tactics that can be applied, each of which treats returned error values slightly differently.

- Immediately check error value and act appropriately, such as halting further execution of the method and immediately returning an error value. This is the most obvious one.
- First process the (partial) result value that is acquired. Then check the error value and if non-nil stop processing.
  This use case is common with IO operations. For example, a read operation may get interrupted because of a lost connection, however it will still return as much of the data as it was able to acquire. The error is still significant, of course. After processing the remaining data, we continue to handle the error.
- Keep executing but preserve the first/last error value that is acquired and non-nil. This is useful in cases where it does not make sense to stop half-way through the processing cycle. We return the first/last acquired error value after finishing processing.
- Process everything and gather all non-nil errors in a list. At the end of processing, a list of error values is returned for the caller to use/report.

Repeated behavior can be simplified by creating a closure function at the start of execution of such a method. Instead of repeating the same code at every function call, we can wrap the function call inside the closure and let it handle the error values appropriately, whatever your error handling tactic might be.
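As a sketch of this idea, the hypothetical function parseDate below wraps strconv.Atoi in a closure that implements the “preserve the first error” tactic. The call sites stay free of repeated checks, and the error is inspected once at the end.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseDate converts three text fields to integers. The closure atoi
// remembers the first error and turns every later call into a no-op.
func parseDate(year, month, day string) ([3]int, error) {
	var firstErr error
	atoi := func(s string) int {
		if firstErr != nil {
			return 0 // an earlier call already failed
		}
		n, err := strconv.Atoi(s)
		if err != nil {
			firstErr = err
			return 0
		}
		return n
	}

	date := [3]int{atoi(year), atoi(month), atoi(day)}
	if firstErr != nil {
		return [3]int{}, firstErr
	}
	return date, nil
}

func main() {
	fmt.Println(parseDate("2016", "02", "13"))
	fmt.Println(parseDate("2016", "Feb", "13"))
}
```

A different tactic, such as collecting all errors, would only require changing the body of the closure.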

Go: verifying that all errors are handled

In Go, it is possible to simply not assign the error result to any variable. The error is implicitly silenced. To discover accidentally silenced errors, we can run errcheck. errcheck reports on all function calls where the returned built-in error type is not assigned to any variable. Errors that are of no value can be explicitly silenced by assigning them to ‘_’ (underscore). errcheck considers this an explicitly silenced error and does not report on such function calls by default.
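The difference between the three cases looks like this in code. flush is a hypothetical function that always fails. The comments restate the errcheck behavior described above and are not output of the tool.

```go
package main

import (
	"errors"
	"fmt"
)

// flush is a hypothetical operation that can fail.
func flush() error {
	return errors.New("disk full")
}

func main() {
	flush() // compiles, but the error is implicitly silenced; errcheck reports this call

	_ = flush() // explicitly silenced; not reported by errcheck by default

	if err := flush(); err != nil { // handled
		fmt.Println("flush failed:", err)
	}
}
```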

Rust’s `try!` macro

Rust relies quite a bit on macros. Macros are expanded at compile time and save the programmer from writing out repetitive code. One of them is the try! macro. (Macros can be recognized by the ! suffix.) It is a short-hand for checking a Result: on successful execution it evaluates to the contained result value, and on unsuccessful execution it immediately returns the error value from the enclosing function.

This approach is similar to Go’s manual checking, except that the try! macro is a short-hand that can be used only when the function in which it is executed has a compatible signature: it needs to return a Result whose error type the propagated error can be converted into.

For cases where this standard pattern cannot be applied, Rust relies on the programmer to correctly process the returned Result.

References

If you are interested in error handling, safe and predictable programs, and systems programming, be sure to read The Error Model by Joe Duffy. It is about a programming language called Midori. In this post, he discusses error handling in great detail, covering the various known models and their advantages and disadvantages, from the perspective of both effectiveness of use and efficiency of execution.

Timelessness