2019-06-04 04:11:33 -04:00
|
|
|
// SPDX-License-Identifier: GPL-2.0-only
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
/*
|
|
|
|
* Scalar fixed time AES core transform
|
|
|
|
*
|
|
|
|
* Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <crypto/aes.h>
|
2022-11-24 23:36:28 -05:00
|
|
|
#include <crypto/algapi.h>
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
#include <linux/module.h>
|
|
|
|
|
|
|
|
static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
|
|
|
|
unsigned int key_len)
|
|
|
|
{
|
|
|
|
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
|
|
|
|
|
2019-07-02 15:41:22 -04:00
|
|
|
return aes_expandkey(ctx, in_key, key_len);
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
|
|
|
|
{
|
|
|
|
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
|
2018-10-18 00:37:58 -04:00
|
|
|
unsigned long flags;
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
|
2018-10-18 00:37:58 -04:00
|
|
|
/*
|
|
|
|
* Temporarily disable interrupts to avoid races where cachelines are
|
|
|
|
* evicted when the CPU is interrupted to do something else.
|
|
|
|
*/
|
|
|
|
local_irq_save(flags);
|
|
|
|
|
2019-07-02 15:41:22 -04:00
|
|
|
aes_encrypt(ctx, out, in);
|
2018-10-18 00:37:58 -04:00
|
|
|
|
|
|
|
local_irq_restore(flags);
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
|
|
|
|
{
|
|
|
|
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
|
2018-10-18 00:37:58 -04:00
|
|
|
unsigned long flags;
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
|
2018-10-18 00:37:58 -04:00
|
|
|
/*
|
|
|
|
* Temporarily disable interrupts to avoid races where cachelines are
|
|
|
|
* evicted when the CPU is interrupted to do something else.
|
|
|
|
*/
|
|
|
|
local_irq_save(flags);
|
|
|
|
|
2019-07-02 15:41:22 -04:00
|
|
|
aes_decrypt(ctx, out, in);
|
2018-10-18 00:37:58 -04:00
|
|
|
|
|
|
|
local_irq_restore(flags);
|
crypto: aes - add generic time invariant AES cipher
Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.
For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.
For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.
This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02 11:37:40 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
static struct crypto_alg aes_alg = {
|
|
|
|
.cra_name = "aes",
|
|
|
|
.cra_driver_name = "aes-fixed-time",
|
|
|
|
.cra_priority = 100 + 1,
|
|
|
|
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
|
|
|
|
.cra_blocksize = AES_BLOCK_SIZE,
|
|
|
|
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
|
|
|
|
.cra_module = THIS_MODULE,
|
|
|
|
|
|
|
|
.cra_cipher.cia_min_keysize = AES_MIN_KEY_SIZE,
|
|
|
|
.cra_cipher.cia_max_keysize = AES_MAX_KEY_SIZE,
|
|
|
|
.cra_cipher.cia_setkey = aesti_set_key,
|
|
|
|
.cra_cipher.cia_encrypt = aesti_encrypt,
|
|
|
|
.cra_cipher.cia_decrypt = aesti_decrypt
|
|
|
|
};
|
|
|
|
|
|
|
|
static int __init aes_init(void)
|
|
|
|
{
|
|
|
|
return crypto_register_alg(&aes_alg);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void __exit aes_fini(void)
|
|
|
|
{
|
|
|
|
crypto_unregister_alg(&aes_alg);
|
|
|
|
}
|
|
|
|
|
|
|
|
module_init(aes_init);
|
|
|
|
module_exit(aes_fini);
|
|
|
|
|
|
|
|
MODULE_DESCRIPTION("Generic fixed time AES");
|
|
|
|
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
|
|
|
|
MODULE_LICENSE("GPL v2");
|