Features vs. Requirements - I/O Operations

In a previous article in this series, we noted that synchronous calls are mostly convenient but have some drawbacks. That article discussed asynchronous execution. This article focuses on blocking and non-blocking semantics. Non-blocking semantics may be implemented using asynchronous execution, but there are alternatives. Specifically, for non-blocking I/O we can leverage what we know about the underlying implementation.

Preconceptions of blocking I/O

First of all, let’s get one important preconception out of the way. Blocking I/O implies that you wait for an I/O operation to complete. However, that is not the full story. As you know, an operating system is an advanced piece of software. It turns out that the operating system is often able to predict the I/O requests it will receive and may have the data readily available in memory, such that a blocking operation does not incur any noticeable overhead at all. Another example of this is the pair of send and receive buffers. Sending data, for example over the network, is typically considered a lengthy operation; however, the only action the operating system has to take immediately is to copy the data to the network interface’s send buffer. A consequence of this is that some blocking operations that are considered to be “lengthy” and expensive may turn out to be nothing more than a copy of data from one memory location to another.
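
To make the send-buffer point concrete, here is a minimal Python sketch. It assumes a small payload that fits in the buffer, and the helper name timed_blocking_send is made up for illustration. It sends on an ordinary blocking socket while nobody is reading yet, and measures how long the "blocking" call takes.

```python
import socket
import time

def timed_blocking_send(payload):
    """Send payload on a blocking socket; return (seconds taken, bytes read back)."""
    # socketpair() returns two connected sockets, both in blocking mode.
    sender, receiver = socket.socketpair()
    try:
        start = time.perf_counter()
        sender.sendall(payload)  # blocking call, yet nobody is reading
        elapsed = time.perf_counter() - start
        # Only now do we read: the payload was waiting in the OS buffer.
        received = b""
        while len(received) < len(payload):
            received += receiver.recv(len(payload) - len(received))
        return elapsed, received
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    elapsed, _ = timed_blocking_send(b"x" * 1024)
    print(f"blocking send returned after {elapsed * 1e6:.0f} microseconds")
```

The exact timing depends on the machine, but the call returns as soon as the data has been copied, long before any receiver has looked at it. A payload larger than the buffer would behave differently: sendall would then really block until the other side starts reading.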

Any extra effort you and your program put into initiating an asynchronous I/O operation, doing intermediate “filler” code - as you would sometimes do to update e.g. a progress bar - and getting the result only adds to the source code complexity and execution overhead for what would otherwise be a near instantaneous operation.

In an interesting talk on the Python GIL (David Beazley - Understanding the Python GIL), David Beazley found this out while going into the design of a cooperative scheduler that relied on blocking (I/O) operations of various kinds as opportunities for switching tasks.

Mechanisms that avoid blocking

As we have seen earlier, there is a need for multi-processing and/or multi-threading in order to do multiple actions at the same time. This means that you definitely cannot wait for a response when you request to read from or write to some I/O device, such as a hard disk or a network interface, as these devices can be slow and hence force you to wait until the operation finishes. There are a number of options for avoiding this blocking behaviour.

Asynchronous I/O

Blocking I/O is very convenient. As soon as the I/O operation has finished, and thus the data is either read or persisted, execution will resume. If you need to do multiple related operations, you can call these operations sequentially, i.e. as one statement following another, in the same thread of execution. You can be confident that as soon as one statement finishes, the next one will be called. There is nothing more convenient than having a sequential process that executes from top to bottom - for readability, understandability, predictability and determinism. There is no need to think of anything other than writing each of the operations that should execute after the previous. It reads like a story.

Asynchronous I/O, on the other hand, starts you off with a problem: you initiated I/O in some other “far away” thread or process, and now you need to synchronize at the right moment. (Presumably at some later moment, since otherwise you would simply have a blocking I/O operation.) This means that you have to determine a suitable point in your code to merge the asynchronously processed I/O result back into your thread. This is not trivial, as the duration of some I/O operations is hard to predict: either you join too early and still need to wait for the result, or you join too late and waste valuable time on a result that you could have obtained earlier. The intermediate activity may be useful in itself, or it may just be a “filler” to keep your application responsive. Furthermore, you immediately introduce additional steps and additional complexity, as you need to keep track of any running asynchronous operations.
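
The bookkeeping described above could look roughly like this in Python. This is only a sketch: slow_read is a hypothetical stand-in for a real I/O call, and the sleep intervals are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_read():
    """Hypothetical stand-in for a slow I/O operation."""
    time.sleep(0.2)
    return b"payload"

def read_with_filler():
    """Start the read in another thread and do filler work until it is done."""
    filler_steps = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_read)  # initiate the I/O "far away"
        while not future.done():         # tracking state we now have to own
            filler_steps += 1            # e.g. update a progress bar
            time.sleep(0.05)
        data = future.result()           # the synchronization point
    return data, filler_steps

if __name__ == "__main__":
    data, steps = read_with_filler()
    print(data, "after", steps, "filler steps")
```

Compare this with the blocking version, which is a single line: data = slow_read(). Everything else in read_with_filler exists only to manage the asynchronous operation.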

Non-blocking I/O operations

Another solution to avoid the inconvenience of blocking in synchronous I/O calls is non-blocking I/O. Non-blocking I/O leverages known implementation details of the I/O operations in order to do just as much as can be done without calling expensive and/or blocking operations, while preserving synchronous calling semantics.

For example, in the case of networking this means you pass on just enough data to be put in the network buffer without the need to start sending data. Copying data into the network buffer is essentially a copy operation in memory, while actually sending the data will depend on hardware and the state of the network.

After having done as much I/O as possible without blocking, we can initiate or need to wait for the expensive I/O operation in another thread. We need not wait for that operation to complete. At least, until we need I/O again. Or, depending on available infrastructure, there may be other buffers that we can fill, before needing to fall back to the first buffer - which we just filled - again.

Additionally, given the nature of I/O, specifically that it is typically managed by the operating system, we get some of these facilities for free. For example, the networking buffer is already available; we simply use the implementation to our advantage. The drawback to non-blocking I/O is that in the case where a large message or a lot of data needs to be sent, we only get so much advantage from filling up the buffer. At some point we are still dependent on the buffer being flushed such that new buffer space becomes available.
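
A small Python sketch shows both the benefit and the limit. The helper name fill_send_buffer and the chunk size are assumptions made for illustration. The socket is switched to non-blocking mode and written to until the operating system refuses to take more.

```python
import socket

def fill_send_buffer(chunk_size=64 * 1024, max_chunks=10_000):
    """Write to a non-blocking socket until the OS buffer is full.

    Returns the number of bytes the OS accepted without blocking.
    """
    sender, receiver = socket.socketpair()
    sender.setblocking(False)  # sockets start out in blocking mode
    chunk = b"x" * chunk_size
    accepted = 0
    try:
        for _ in range(max_chunks):
            try:
                # send() copies as much as fits and returns that count.
                accepted += sender.send(chunk)
            except BlockingIOError:
                # Buffer full: a blocking socket would have waited here.
                break
        return accepted
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print("accepted without blocking:", fill_send_buffer(), "bytes")
```

The number printed differs per operating system and configuration. Whatever it is, that is all the data you can hand off before you depend on the buffer being drained again.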

Deciding on the mechanism

In order to decide on the right mechanism, you need to be clear on the required functionality. In many cases, non-blocking I/O is not really required, especially not in a library implementation. In that case, blocking behaviour is perfectly fine. As discussed earlier in the article on Threading, users are supposed to initiate a thread themselves whenever it is required, so there is no need to write the library implementation to avoid blocking I/O in advance.

Non-blocking is convenient only if you prioritize predictability and strict execution times over the amount of data that is sent. You can trade off the number of bytes sent for the preservation of control over execution and not end up waiting unexpectedly long for the OS to process the full buffer at the cost of other plans being delayed. Non-blocking I/O, because of its predictability, may be useful for low latency, high speed systems, such as trading systems. For most use cases, it only adds complexity and overhead.

Some food for thought: Java provides the package java.nio (nio officially stands for New I/O, although it is often read as Non-blocking I/O). The SocketChannel, found in java.nio.channels, is the nio-alternative for java.net.Socket. Even though the SocketChannel lives under java.nio, instances get created in blocking mode by default. You need to explicitly switch to non-blocking mode in order to take advantage of the non-blocking behaviour.

Interrupting I/O operations (Time-outs)

How long will you wait for an I/O operation (network, disk, etc.) to complete? What would be your trigger to interrupt an operation and continue execution? By far the most commonly available method for interrupting a (blocking) I/O operation is the timeout parameter. The timeout parameter is a duration: a relative amount of time to wait for a result before the operation is aborted.
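
In Python, for example, the timeout is set on the socket itself. The sketch below uses a hypothetical helper, recv_with_timeout, that waits for one byte that never arrives and gets control back after the given duration.

```python
import socket

def recv_with_timeout(timeout_seconds):
    """Try to read one byte; return None if nothing arrives in time."""
    reader, writer = socket.socketpair()
    reader.settimeout(timeout_seconds)  # relative duration, in seconds
    try:
        return reader.recv(1)           # blocks for at most timeout_seconds
    except socket.timeout:              # alias of TimeoutError since Python 3.10
        return None
    finally:
        reader.close()
        writer.close()

if __name__ == "__main__":
    print(recv_with_timeout(0.05))
```

Note that the timeout is relative to the moment each call starts. It says nothing about an absolute deadline for a whole exchange consisting of several calls.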

Functional relevance of an interruption

Essentially, the timeout parameter is just an arbitrary choice of a mechanism for interrupting an operation. Another one could be a signal from another system, a signal at a specific time (an absolute moment in time instead of the relative waiting time of a timeout), a sufficient number of results from a set of concurrent operations, the event of receiving an email message, the loss of a network connection, etc. The timeout for I/O operations is just a common interruption trigger, but by no means the only relevant one. It is a choice of a solution to an unknown problem, as the library author does not know beforehand how the library will be used.

From a functional perspective, it is very important to choose the right interruption mechanisms. Note that often there are multiple relevant interruptions that should be handled, such as a time-out when a response takes too long, but also the fastest server returning an anticipated result and a state change of your application that would render your task irrelevant.
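
As a sketch of handling several triggers at once, the Python code below waits for whichever comes first: the fastest answer, a timeout, or a state change signalled through an event. The names query_server and first_answer are hypothetical, and the "servers" are simulated with delays rather than real network calls.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def query_server(name, delay, cancelled):
    """Hypothetical request: answers after `delay` seconds unless cancelled."""
    if cancelled.wait(delay):  # Event.wait() returns True once the event is set
        return None
    return f"answer from {name}"

def first_answer(delays, timeout, cancelled):
    """Return (trigger, value) for whichever trigger fires first."""
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(query_server, name, delay, cancelled)
                   for name, delay in delays.items()]
        done, _pending = wait(futures, timeout=timeout,
                              return_when=FIRST_COMPLETED)
        try:
            if not done:
                return ("timeout", None)    # trigger 1: took too long
            result = next(iter(done)).result()
            if result is None:
                return ("cancelled", None)  # trigger 2: task became irrelevant
            return ("answer", result)       # trigger 3: fastest server won
        finally:
            cancelled.set()                 # tell the slower requests to stop

if __name__ == "__main__":
    servers = {"fast": 0.05, "slow": 1.0}
    print(first_answer(servers, timeout=0.5, cancelled=threading.Event()))
```

The timeout is only one of the three outcomes here, and the calling code can tell them apart. The cancellation works because the simulated request cooperates by checking the event; a real request would need its own way of being told to stop.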

As you get closer to the hardware itself, you may encounter operations that cannot be interrupted. In that case, timeouts (and any other interruption mechanisms) are just a way to give you back control, while the operation itself continues in the background. If the operation finishes and control is already returned, then the value will be discarded, as it is now irrelevant and often cannot be returned anymore.
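
The same effect can be imitated in Python with a worker thread, since a running thread cannot be interrupted from outside either. This is a sketch: uninterruptible_write is a hypothetical stand-in for such an operation, and the completed list exists only to show that the work still finishes.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

completed = []  # records that the operation did finish in the background

def uninterruptible_write():
    """Hypothetical operation that cannot be stopped once it has started."""
    time.sleep(0.2)
    completed.append("written")
    return "written"

def write_with_deadline(timeout):
    """Return the result, or None if we stopped waiting before it was ready."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(uninterruptible_write)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return None                # control is back; the operation still runs
    finally:
        pool.shutdown(wait=False)  # do not block on the unfinished work

if __name__ == "__main__":
    print(write_with_deadline(0.05))  # None: we gave up waiting
    time.sleep(0.4)
    print(completed)                  # the write happened anyway
```

The caller only regains control. The write itself still takes place, and its return value has nowhere to go.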

Timeouts as technical workaround for a technical issue

There is one case where timeouts solve a technical issue. When attempting to connect to a server, or when a connection is not yet fully established but communication is failing, having to wait “indefinitely” (or, in practice, until a default such as 60 seconds has passed) may force your network connection to stay open for much longer than is acceptable.

The timeout is a mechanism that can enforce - on the operating system level - that a connection gets cut off after the set boundary is passed. This ensures that connections are recycled within reasonable time and you do not run out of ports, threads or any other limited resource. However, as noted, this is a purely technical reason and should not influence the choice and flexibility of interruption mechanisms you use for your functional purposes.
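
For this defensive use, Python's socket.create_connection accepts a timeout for the connection attempt. A minimal sketch, with can_connect as a hypothetical helper name:

```python
import socket

def can_connect(host, port, timeout_seconds):
    """Return True if a TCP connection could be established in time."""
    try:
        # The timeout bounds the connection attempt itself.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:  # covers timeouts as well as refused connections
        return False

if __name__ == "__main__":
    # Demonstration against a listener we start ourselves on a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    print(can_connect("127.0.0.1", listener.getsockname()[1], 1.0))
    listener.close()
```

The timeout here guards a limited resource. It is not meant to double as the functional trigger for the rest of your logic.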

The preferred solution

The preferred solution for I/O - unless directed otherwise by specific requirements - should be based on synchronous calling and blocking semantics. Whether you run I/O together with the handling control flow in a different thread - which means you can keep the I/O operations themselves blocking - or rely on non-blocking I/O depends largely on your use case. Each has its advantages and drawbacks.

Blocking I/O need not be slow, especially when an operating system predicts user requirements, but it can be. If it is relevant to the implementation of the library (not just as a service to a potential user) to be “fast” in some manner, then you need to work around blocking behaviour. Non-blocking I/O may work for you if you need to keep latency low and predictability high, but the amount of data that can be sent at one time may be an issue. Initiating an I/O operation asynchronously, i.e. simply launching an I/O operation in the background and getting back to it later, is not as convenient as you might think and may only add to the complexity.

Pick your triggers based on the required behaviour of your library. There are many different types of events that may trigger your application’s behaviour. Do not limit yourself to the provided relative timeout while attempting to bend it into the desired behaviour. Define and implement the triggers relevant to the use case. Use the timeout as a defensive measure for a technical limitation, or in case your trigger requirements match it exactly. Make sure you recognize all relevant events that need to act as a trigger, not just one obvious, convenient one.

As we’re wrapping up, I’ll leave you with a blog post by Martin Fowler on a Java-based project for high-performance business logic processing: The LMAX disruptor architecture project. You can find the source code for the LMAX architecture on GitHub. It has very nice descriptions of decisions made in the interest of performance.

Disclaimer: One should know that (non-)blocking is not the right term to use for all cases. There are two variants. One is the principle of blocking. The other is the principle of waiting. These are both variants of synchronous calls.
I do not make a clear distinction between them in the article itself, so I will briefly go into it here. In the case of I/O operations it is called blocking, as you are blocked until the (operating) system finishes some operation for you, such as completing an I/O operation. With the use of locking primitives, where we need to wait for some other thread to finish its work, we call it waiting. As the name suggests, the execution is waiting idly for the signal to continue. Often enough no distinction is made between blocking and waiting, and for this particular case, it is sufficient to go with the less strict version.

This post is part of the Features vs Requirements series.