# The symbiotic relationship of hash and AEAD

Thu, Jan 30, 2020 ❝Observations on combining a cryptographic hash function and AEAD.❞

AEAD and hash functions are each useful cryptographic concepts on their own. A few days ago, I was looking into an interesting and somewhat curious concept called “time-lock encryption”, and started pondering the added benefits of combining hash and AEAD, i.e. using them as a single construct. There does not seem to be much information on this, at least based on a quick search, most likely because it is too trivial in nature. However, being curious, I decided to look into it myself.

*Disclaimer*: I am not a cryptographer. I am curious about cryptographic concepts. I might make seemingly-obvious mistakes, derive wrong conclusions, etc. So be critical as you read this.

## SHA-256 (Hash) + AES-256-GCM (AEAD)

The initial combination I looked at was `SHA256` + `AES-256-GCM`. This choice is purely based on convenience as both primitives are readily available in the Go standard library.

The construction looks as follows:

```
password → [SHA-256] ↴ (key)
│ plaintext → [AES-256-GCM] → ciphertext
└ (associated data) ⬏
```

Or stated differently, given:

`h = sha256(m)`

`ciphertext = aes256gcm(plaintext, associated, key, nonce)`

Combined as: `ciphertext = aes256gcm(plaintext, password, sha256(password), nonce)`

`password` is the input for the hash function. The hash function's output is used as the encryption key. *In addition*, `password` is added as *associated data* to the AEAD.
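The construction can be sketched in Go using only the standard library. This is a minimal sketch, not a vetted implementation: the helper names `newAEAD`, `seal` and `open` are my own, and prepending a random nonce to the ciphertext is an assumption about how the nonce reaches the receiver.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// newAEAD derives the key as sha256(password) and wraps AES-256 in GCM.
func newAEAD(password []byte) (cipher.AEAD, error) {
	key := sha256.Sum256(password) // 32 bytes, which selects AES-256
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext; the password doubles as associated data.
// The random nonce is prepended to the returned bytes (an assumption).
func seal(password, plaintext []byte) ([]byte, error) {
	aead, err := newAEAD(password)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, password), nil
}

// open reverses seal: it splits off the nonce, then decrypts and verifies.
func open(password, sealed []byte) ([]byte, error) {
	aead, err := newAEAD(password)
	if err != nil {
		return nil, err
	}
	if len(sealed) < aead.NonceSize() {
		return nil, fmt.Errorf("sealed message too short")
	}
	nonce, ciphertext := sealed[:aead.NonceSize()], sealed[aead.NonceSize():]
	return aead.Open(nil, nonce, ciphertext, password)
}

func main() {
	sealed, err := seal([]byte("correct horse"), []byte("hello"))
	if err != nil {
		panic(err)
	}
	plaintext, err := open([]byte("correct horse"), sealed)
	fmt.Printf("right password: %q, err=%v\n", plaintext, err)

	_, err = open([]byte("wrong password"), sealed)
	fmt.Println("wrong password rejected:", err != nil)
}
```

Note that the receiver has to supply `password` twice over, in a sense: once to derive the key and once as associated data.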

*Authenticated-Encryption with Associated-Data* describes the following property, among others: (emphasis mine)

> (2) It is outside of the model how the associated-data `H` is made known to the receiver. We do not consider the associated-data to be part of the ciphertext, *though the receiver will need it in order to decrypt.* The same comments apply to the nonce `N`.

Both hash and AEAD benefit from this construction:

- AEAD is bound to the context, in this case the *hash function* that is part of the construction.
- The hash function, used independently, suffers from the possibility of *second pre-images* given a sufficiently large input space. By using the *hash function’s* input as *associated data*, *second pre-images* can no longer usefully contribute. That is, the hash value may be correct, but the necessary *associated data* is still unavailable.

Hash function and AEAD become co-dependent.

## AEAD

An AEAD provides several guarantees: *integrity*, *confidentiality* and *authenticity*. The *ciphertext* is guaranteed all three properties.

The *associated data* is part of the *authentication* guarantee of the ciphertext. The *associated data* is turned into a necessary condition for decrypting the ciphertext. If the ciphertext is taken out of this context, decryption becomes infeasible. As an added benefit, the *associated data* itself is guaranteed *integrity*.

It is the *authenticity* and *integrity* properties that ensure that the original *associated data* is provided.
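These guarantees are easy to observe with Go's `crypto/cipher` package. The sketch below uses an all-zero demo key and nonce (assumptions for reproducibility only) and hypothetical helpers `demoAEAD` and `verify`. It checks that the associated data adds nothing to the ciphertext length, and that changing either the ciphertext or the associated data makes verification fail.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// demoAEAD returns AES-256-GCM under an all-zero key. Demo only.
func demoAEAD() (cipher.AEAD, error) {
	key := make([]byte, 32)
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// verify reports whether (ciphertext, associated) authenticates under aead.
func verify(aead cipher.AEAD, nonce, ciphertext, associated []byte) bool {
	_, err := aead.Open(nil, nonce, ciphertext, associated)
	return err == nil
}

func main() {
	aead, err := demoAEAD()
	if err != nil {
		panic(err)
	}
	nonce := make([]byte, aead.NonceSize()) // all-zero nonce, demo only
	plaintext := []byte("attack at dawn")
	associated := []byte("header")
	ciphertext := aead.Seal(nil, nonce, plaintext, associated)

	// The output is plaintext plus tag; the associated data is not included.
	fmt.Println("length is plaintext+tag:", len(ciphertext) == len(plaintext)+aead.Overhead())
	fmt.Println("untouched:", verify(aead, nonce, ciphertext, associated))

	// Flip one bit of the ciphertext.
	tampered := append([]byte(nil), ciphertext...)
	tampered[0] ^= 0x01
	fmt.Println("tampered ciphertext:", verify(aead, nonce, tampered, associated))

	// Change one character of the associated data.
	fmt.Println("tampered associated data:", verify(aead, nonce, ciphertext, []byte("Header")))
}
```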

## Hash-function

A quick reminder: the properties of cryptographic hash functions, according to Wikipedia.

- Pre-image resistance: Given a hash value `h` it should be difficult to find any message `m` such that `h = hash(m)`. This concept is related to that of a one-way function. Functions that lack this property are vulnerable to preimage attacks.
- Second pre-image resistance: Given an input `m1`, it should be difficult to find a different input `m2` such that `hash(m1) = hash(m2)`. This property is sometimes referred to as weak collision resistance. Functions that lack this property are vulnerable to second-preimage attacks.
- Collision resistance: It should be difficult to find two different messages `m1` and `m2` such that `hash(m1) = hash(m2)`. Such a pair is called a cryptographic hash collision. This property is sometimes referred to as strong collision resistance. It requires a hash value at least twice as long as that required for pre-image resistance; otherwise collisions may be found by a birthday attack.
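The birthday attack mentioned in the last property is easy to try out on a deliberately weakened hash. The sketch below truncates SHA-256 to 16 bits (a toy of my own, not anything from the construction above) and hashes consecutive counters until two of them collide. With 2^16 possible values a collision is guaranteed within 65537 hashes, and I would expect it after a few hundred, which is roughly the square root of the output space.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// encode turns a counter into a 4-byte big-endian message.
func encode(i uint32) []byte {
	var m [4]byte
	binary.BigEndian.PutUint32(m[:], i)
	return m[:]
}

// truncatedHash returns the first 16 bits of SHA-256(m): a toy, weak hash.
func truncatedHash(m []byte) uint16 {
	sum := sha256.Sum256(m)
	return binary.BigEndian.Uint16(sum[:2])
}

// findCollision hashes the counters 0, 1, 2, ... until two of them share a
// truncated hash. It returns both counters and the number of hashes computed.
func findCollision() (a, b uint32, tries int) {
	seen := make(map[uint16]uint32)
	for i := uint32(0); ; i++ {
		h := truncatedHash(encode(i))
		if prev, ok := seen[h]; ok {
			return prev, i, int(i) + 1
		}
		seen[h] = i
	}
}

func main() {
	a, b, tries := findCollision()
	fmt.Printf("counters %d and %d collide on %#04x after %d hashes\n",
		a, b, truncatedHash(encode(a)), tries)
}
```

Finding a pre-image for one specific 16-bit value by the same brute force would take on the order of 2^16 hashes instead, which is why collision resistance needs the longer output.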

Now to evaluate the properties of the *hash function* as part of the construction:

- ~~collision resistance~~: this was never an issue as one of the two hashes is predetermined.
- ~~second pre-image resistance~~: binding the hash input to the ciphertext through *associated data* means that a *second pre-image* is no longer sufficient.
- *pre-image resistance*: circumstances have not changed. We rely on the *pre-image resistance* guarantee as we did before.

Effectively, we have mitigated one limitation of hash functions.

## Conclusions

To conclude, the *associated data* of an AEAD can be used to “cement” a construct in place, such that it cannot be taken apart. The pair `(hash, AEAD)` is a trivial example which demonstrates a symbiotic relationship.

## Further investigation

Although this construction gives a nice benefit, it is important to look at the implications of its application. Hash functions are *not ideal* for key derivation as hash functions are *designed to be fast*. This property primarily benefits the attacker.

The broader class of Key Derivation Functions provides options that are more attractive for deriving keys from passwords. It would be interesting to see if those functions can benefit in the same way.

There are a few further questions to look into:

- Does the combined construct indeed bring (only) benefits? Or is the combined construction redundant/meaningless/risky?
- How far can this be generalized? A typical use case is KDFs (key derivation functions). Not all KDFs are hash-based, so will it bring similar benefits for other types?
- Do AEADs indeed internally use a similar co-dependency for the *symmetric key* and the header `H` used in its *Authenticated Encryption*?