# Java signed integer types and alternative representations

❝On signed integer types and other representations.❞

Java offers several notations for writing an integer literal. The decimal notation (e.g. `13`) is by far the best known. The same value can also be written in binary (e.g. `0b1101`), octal (e.g. `015`) or hexadecimal (e.g. `0xD`) notation.
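As a quick illustration, the sketch below writes the value 13 in all four notations. The class and method names are made up for this example.

```java
class LiteralNotations {
    // Returns the same value written in four different notations.
    static int[] thirteen() {
        int decimal = 13;
        int binary = 0b1101; // binary literals are available since Java 7
        int octal = 015;     // a leading zero marks an octal literal
        int hex = 0xD;
        return new int[] {decimal, binary, octal, hex};
    }

    public static void main(String[] args) {
        for (int value : thirteen()) {
            System.out.println(value); // prints 13 each time
        }
    }
}
```

Note the octal case: a leading zero silently changes the meaning of what looks like a decimal number.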

Now, when you use and reason about decimal values, Java behaves exactly as expected. Types are signed, i.e. both positive and negative values can be stored, 0 is a valid value and you can compare values as you would expect. Things become less intuitive when you use other notations. Binary, octal and hexadecimal literals are usually read as raw bit patterns, without any notion of a sign. Hexadecimal is often used as a convenient shorthand for writing bit patterns. For example, `0xff` is the bit pattern `0b11111111` (255), in which the first `f` represents the upper 4 bits and the second `f` represents the lower 4 bits. This notation is especially useful when you are doing bitwise operations, implementing low level protocols or writing data directly to the wire.
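The relation between hexadecimal digits and groups of 4 bits can be sketched as follows. The helper names `upperNibble` and `lowerNibble` are hypothetical, chosen only for this example.

```java
class HexNibbles {
    // Upper 4 bits of the low byte: shift them down, then mask.
    static int upperNibble(int value) {
        return (value >> 4) & 0xF;
    }

    // Lower 4 bits of the low byte.
    static int lowerNibble(int value) {
        return value & 0xF;
    }

    public static void main(String[] args) {
        System.out.println(0xff == 0b11111111); // true
        System.out.println(0xff);               // 255
        System.out.println(upperNibble(0xA5));  // 10, i.e. 0xA
        System.out.println(lowerNibble(0xA5));  // 5
    }
}
```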

However, even though the notation differs from decimal, you do need to take into account that Java still interprets these literals as signed integers, i.e. with the highest bit representing the sign, indicating whether the value is positive or negative. An `int` literal such as `0x80000000` has its highest bit set and therefore denotes a negative number. (See Two’s complement for details on how integers are stored in memory.) This extends to number comparisons, even when two hexadecimal literals are compared.

Consider the following comparisons:

• `0x00000000 < 0x7fffffff`: `true`
or in decimals: `0 < 2147483647`
• `0x7fffffff < 0x80000000`: `false`
or in decimals: `2147483647 < -2147483648`
• `0x80000000 < 0xffffffff`: `true`
or in decimals: `-2147483648 < -1`
• `0xffffffff < 0x00000000`: `true`
or in decimals: `-1 < 0`

You will see that in the decimal representation this makes sense. Read as raw bit patterns, the binary, octal and hexadecimal representations have no notion of signedness, and as such have this “flip-over” point where the highest bit starts to be used and comparisons no longer match the order you would expect from the notation.
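The comparisons above can be reproduced with a small sketch. It also shows `Integer.compareUnsigned`, available since Java 8, which compares the same bit patterns as if they were unsigned. The class and helper names are made up for illustration.

```java
class SignedComparisons {
    // Plain comparison: both operands are interpreted as signed ints.
    static boolean signedLess(int a, int b) {
        return a < b;
    }

    // Same bit patterns, compared as unsigned 32-bit values (Java 8+).
    static boolean unsignedLess(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        System.out.println(signedLess(0x00000000, 0x7fffffff));   // true
        System.out.println(signedLess(0x7fffffff, 0x80000000));   // false
        System.out.println(signedLess(0x80000000, 0xffffffff));   // true
        System.out.println(signedLess(0xffffffff, 0x00000000));   // true

        // Across the "flip-over" point the unsigned view gives the opposite answer.
        System.out.println(unsignedLess(0x7fffffff, 0x80000000)); // true
        System.out.println(unsignedLess(0xffffffff, 0x00000000)); // false
    }
}
```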

Furthermore, because Java interprets values as signed, certain conversions produce unexpected results even though the values technically still fit in memory.

A “safe” integer (sign bit untouched) value will convert as expected:

• `0x7fffffffL` = `2147483647`
• `(int)0x7fffffffL` = `2147483647`
• `(long)(int)0x7fffffffL` = `2147483647`
or in hexadecimal representation: `0x7fffffff`

… while a value that depends on the sign bit does not survive the round trip:

• `0xffffffffL` = `4294967295`
• `(int)0xffffffffL` = `-1`
• `(long)(int)0xffffffffL` = `-1`
or in hexadecimal representation: `0xffffffffffffffffL`

So, what happened in the last case? The value `0xffffffffL`, a value of type `long`, was cast to `int`. This narrowing cast keeps the lowest 32 bits and discards the rest. All 32 remaining bits are set, which is the value `-1` in signed (two’s complement) representation. Then we cast back to `long`. Now we cast to a larger type, so the value can always be converted without losing information. Java preserves the numeric value, not the bit pattern: it converts the `int` value `-1` to the `long` value `-1` by copying the sign bit into the 32 new upper bits. In the case of `long` this is `0xffffffffffffffffL`, since the size of type `long` is 64 bits, instead of `int`’s 32 bits.
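The round trip can be sketched in code. The same sketch shows one common way to recover the original unsigned value: masking with `0xffffffffL`, which is equivalent to `Integer.toUnsignedLong` on Java 8 and later. The class and method names are hypothetical.

```java
class CastRoundTrip {
    // Narrow to int, then widen again: the sign bit is copied into the upper 32 bits.
    static long signExtended(long value) {
        return (long) (int) value;
    }

    // Narrow to int, then widen while clearing the upper 32 bits.
    static long zeroExtended(long value) {
        return ((int) value) & 0xffffffffL;
    }

    public static void main(String[] args) {
        System.out.println(signExtended(0x7fffffffL)); // 2147483647
        System.out.println(signExtended(0xffffffffL)); // -1
        System.out.println(Long.toHexString(signExtended(0xffffffffL))); // ffffffffffffffff

        System.out.println(zeroExtended(0xffffffffL)); // 4294967295
        System.out.println(Integer.toUnsignedLong((int) 0xffffffffL)); // 4294967295
    }
}
```

The mask works because the `int` operand is first widened to `long` (with sign extension), after which the bitwise AND clears the upper 32 bits again.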

To be clear, none of this behaviour is erroneous. Java provides signed integer types and behaves as such. It is merely good practice to be aware of this behaviour when using alternative representations, hexadecimal in particular, as this format is often used in protocol specifications for (low level) control values.

Java has no unsigned variants of its `int` and `long` types. In general, the advice is to use a larger type which fits the whole value in the signed part of the type. In my particular case I needed a type that can store a 32-bit value as defined in the specification. These values are used as identifiers and as such only equality is relevant. There is no meaning to `x < y` and similar comparisons for this specific purpose, and we know that a signed `int` still has 32 bits of memory available for use.
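A minimal sketch of that approach: store the 32-bit identifier in a plain `int`, compare with `==`, and only convert at the edges when reading or printing. The class name, method names and the identifier value `0xcafebabe` are invented for illustration. `Integer.parseUnsignedInt` and `Integer.toHexString` are standard library methods, the former available since Java 8.

```java
class Identifiers {
    // Parses 8 hex digits into an int, accepting values with the highest bit set (Java 8+).
    static int parse(String hex) {
        return Integer.parseUnsignedInt(hex, 16);
    }

    // Formats the raw 32-bit pattern; toHexString treats the int as unsigned.
    static String format(int id) {
        return Integer.toHexString(id);
    }

    // Equality only depends on the bits, so the sign interpretation is irrelevant.
    static boolean same(int a, int b) {
        return a == b;
    }

    public static void main(String[] args) {
        int id = parse("cafebabe");
        System.out.println(id < 0);                // true: the highest bit is set
        System.out.println(same(id, 0xcafebabe));  // true
        System.out.println(format(id));            // cafebabe
    }
}
```

All 32 bits survive the round trip from text to `int` and back, even though the intermediate `int` value is negative.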