Commit 5bcbe22c authored by Linus Torvalds

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu:
 "API:
   - Try to catch hash output overrun in testmgr
   - Introduce walksize attribute for batched walking
   - Make crypto_xor() and crypto_inc() alignment agnostic

  Algorithms:
   - Add time-invariant AES algorithm
   - Add standalone CBCMAC algorithm

  Drivers:
   - Add NEON accelerated ChaCha20 on ARM/ARM64
   - Expose AES-CTR as synchronous skcipher on ARM64
   - Add scalar AES implementation on ARM64
   - Improve scalar AES implementation on ARM
   - Improve NEON AES implementation on ARM/ARM64
   - Merge CRC32 and PMULL instruction based drivers on ARM64
   - Add NEON accelerated CBCMAC/CMAC/XCBC AES on ARM64
   - Add IPsec AUTHENC implementation in atmel
   - Add support for the Octeon-TX CPT engine
   - Add Broadcom SPU driver
   - Add MediaTek driver"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (142 commits)
  crypto: xts - Add ECB dependency
  crypto: cavium - switch to pci_alloc_irq_vectors
  crypto: cavium - switch to pci_alloc_irq_vectors
  crypto: cavium - remove dead MSI-X related define
  crypto: brcm - Avoid double free in ahash_finup()
  crypto: cavium - fix Kconfig dependencies
  crypto: cavium - cpt_bind_vq_to_grp could return an error code
  crypto: doc - fix typo
  hwrng: omap - update Kconfig help description
  crypto: ccm - drop unnecessary minimum 32-bit alignment
  crypto: ccm - honour alignmask of subordinate MAC cipher
  crypto: caam - fix state buffer DMA (un)mapping
  crypto: caam - abstract ahash request double buffering
  crypto: caam - fix error path for ctx_dma mapping failure
  crypto: caam - fix DMA API leaks for multiple setkey() calls
  crypto: caam - don't dma_map key for hash algorithms
  crypto: caam - use dma_map_sg() return code
  crypto: caam - replace sg_count() with sg_nents_for_len()
  crypto: caam - check sg_count() return value
  crypto: caam - fix HW S/G in ablkcipher_giv_edesc_alloc()
  ..
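The crypto_xor()/crypto_inc() alignment-agnostic change called out in the API section above removes the old requirement that callers pass suitably aligned buffers. As a minimal sketch only (not taken from any one driver in this pull, helper name illustrative), the partial-block pattern this enables is roughly what the CTR glue changes further down do: encrypt the counter into a keystream buffer, copy the partial source block across, then XOR the keystream over it.
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/algapi.h>	/* crypto_xor() */

/*
 * Illustrative helper (hypothetical, not part of this pull): XOR a
 * partial-block keystream over the source data. With the alignment
 * agnostic crypto_xor(), no special casing of unaligned or short
 * tails is needed here.
 */
static void ctr_xor_tail(u8 *dst, const u8 *src, const u8 *keystream,
			 unsigned int nbytes)
{
	if (dst != src)
		memcpy(dst, src, nbytes);	/* bring the partial block across */
	crypto_xor(dst, keystream, nbytes);	/* XOR keystream, any alignment */
}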
@@ -14,7 +14,7 @@ Asynchronous Message Digest API
:doc: Asynchronous Message Digest API
.. kernel-doc:: include/crypto/hash.h
-   :functions: crypto_alloc_ahash crypto_free_ahash crypto_ahash_init crypto_ahash_digestsize crypto_ahash_reqtfm crypto_ahash_reqsize crypto_ahash_setkey crypto_ahash_finup crypto_ahash_final crypto_ahash_digest crypto_ahash_export crypto_ahash_import
+   :functions: crypto_alloc_ahash crypto_free_ahash crypto_ahash_init crypto_ahash_digestsize crypto_ahash_reqtfm crypto_ahash_reqsize crypto_ahash_statesize crypto_ahash_setkey crypto_ahash_finup crypto_ahash_final crypto_ahash_digest crypto_ahash_export crypto_ahash_import
Asynchronous Hash Request Handle
--------------------------------
......
@@ -59,4 +59,4 @@ Synchronous Block Cipher API - Deprecated
:doc: Synchronous Block Cipher API
.. kernel-doc:: include/linux/crypto.h
-   :functions: crypto_alloc_blkcipher rypto_free_blkcipher crypto_has_blkcipher crypto_blkcipher_name crypto_blkcipher_ivsize crypto_blkcipher_blocksize crypto_blkcipher_setkey crypto_blkcipher_encrypt crypto_blkcipher_encrypt_iv crypto_blkcipher_decrypt crypto_blkcipher_decrypt_iv crypto_blkcipher_set_iv crypto_blkcipher_get_iv
+   :functions: crypto_alloc_blkcipher crypto_free_blkcipher crypto_has_blkcipher crypto_blkcipher_name crypto_blkcipher_ivsize crypto_blkcipher_blocksize crypto_blkcipher_setkey crypto_blkcipher_encrypt crypto_blkcipher_encrypt_iv crypto_blkcipher_decrypt crypto_blkcipher_decrypt_iv crypto_blkcipher_set_iv crypto_blkcipher_get_iv
The Broadcom Secure Processing Unit (SPU) hardware supports symmetric
cryptographic offload for Broadcom SoCs. A SoC may have multiple SPU hardware
blocks.
Required properties:
- compatible: Should be one of the following:
brcm,spum-crypto - for devices with SPU-M hardware
brcm,spu2-crypto - for devices with SPU2 hardware
brcm,spu2-v2-crypto - for devices with enhanced SPU2 hardware features like SHA3
and Rabin Fingerprint support
brcm,spum-nsp-crypto - for the Northstar Plus variant of the SPU-M hardware
- reg: Should contain SPU registers location and length.
- mboxes: The mailbox channel to be used to communicate with the SPU.
Mailbox channels correspond to DMA rings on the device.
Example:
crypto@612d0000 {
compatible = "brcm,spum-crypto";
reg = <0 0x612d0000 0 0x900>;
mboxes = <&pdc0 0>;
};
MediaTek cryptographic accelerators
Required properties:
- compatible: Should be "mediatek,eip97-crypto"
- reg: Address and length of the register set for the device
- interrupts: Should contain the five crypto engine interrupts in numeric
order: the global system interrupt followed by the four descriptor ring interrupts.
- clocks: the clocks used by the core
- clock-names: the names of the clocks listed in the clocks property. These are
"ethif", "cryp"
- power-domains: Must contain a reference to the PM domain.
Example:
crypto: crypto@1b240000 {
compatible = "mediatek,eip97-crypto";
reg = <0 0x1b240000 0 0x20000>;
interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_LOW>,
<GIC_SPI 83 IRQ_TYPE_LEVEL_LOW>,
<GIC_SPI 84 IRQ_TYPE_LEVEL_LOW>,
<GIC_SPI 91 IRQ_TYPE_LEVEL_LOW>,
<GIC_SPI 97 IRQ_TYPE_LEVEL_LOW>;
clocks = <&topckgen CLK_TOP_ETHIF_SEL>,
<&ethsys CLK_ETHSYS_CRYPTO>;
clock-names = "ethif","cryp";
power-domains = <&scpsys MT2701_POWER_DOMAIN_ETH>;
};
@@ -3031,6 +3031,13 @@ W: http://www.cavium.com
S: Supported
F: drivers/net/ethernet/cavium/liquidio/
+CAVIUM OCTEON-TX CRYPTO DRIVER
+M: George Cherian <george.cherian@cavium.com>
+L: linux-crypto@vger.kernel.org
+W: http://www.cavium.com
+S: Supported
+F: drivers/crypto/cavium/cpt/
CC2520 IEEE-802.15.4 RADIO DRIVER
M: Varka Bhadram <varkabhadram@gmail.com>
L: linux-wpan@vger.kernel.org
......
@@ -62,35 +62,18 @@ config CRYPTO_SHA512_ARM
using optimized ARM assembler and NEON, when available.
config CRYPTO_AES_ARM
-tristate "AES cipher algorithms (ARM-asm)"
+tristate "Scalar AES cipher for ARM"
-depends on ARM
select CRYPTO_ALGAPI
select CRYPTO_AES
help
Use optimized AES assembler routines for ARM platforms.
-AES cipher algorithms (FIPS-197). AES uses the Rijndael
-algorithm.
-Rijndael appears to be consistently a very good performer in
-both hardware and software across a wide range of computing
-environments regardless of its use in feedback or non-feedback
-modes. Its key setup time is excellent, and its key agility is
-good. Rijndael's very low memory requirements make it very well
-suited for restricted-space environments, in which it also
-demonstrates excellent performance. Rijndael's operations are
-among the easiest to defend against power and timing attacks.
-The AES specifies three key sizes: 128, 192 and 256 bits
-See <http://csrc.nist.gov/encryption/aes/> for more information.
config CRYPTO_AES_ARM_BS
tristate "Bit sliced AES using NEON instructions"
depends on KERNEL_MODE_NEON
-select CRYPTO_AES_ARM
select CRYPTO_BLKCIPHER
select CRYPTO_SIMD
+select CRYPTO_AES_ARM
help
Use a faster and more secure NEON based implementation of AES in CBC,
CTR and XTS modes
@@ -130,4 +113,10 @@ config CRYPTO_CRC32_ARM_CE
depends on KERNEL_MODE_NEON && CRC32
select CRYPTO_HASH
+config CRYPTO_CHACHA20_NEON
+tristate "NEON accelerated ChaCha20 symmetric cipher"
+depends on KERNEL_MODE_NEON
+select CRYPTO_BLKCIPHER
+select CRYPTO_CHACHA20
endif
@@ -8,6 +8,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -26,8 +27,8 @@ $(warning $(ce-obj-y) $(ce-obj-m))
endif
endif
-aes-arm-y := aes-armv4.o aes_glue.o
+aes-arm-y := aes-cipher-core.o aes-cipher-glue.o
-aes-arm-bs-y := aesbs-core.o aesbs-glue.o
+aes-arm-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
sha1-arm-y := sha1-armv4-large.o sha1_glue.o
sha1-arm-neon-y := sha1-armv7-neon.o sha1_neon_glue.o
sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
@@ -40,17 +41,15 @@ aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
+chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@)
-$(src)/aesbs-core.S_shipped: $(src)/bsaes-armv7.pl
-$(call cmd,perl)
$(src)/sha256-core.S_shipped: $(src)/sha256-armv4.pl
$(call cmd,perl)
$(src)/sha512-core.S_shipped: $(src)/sha512-armv4.pl
$(call cmd,perl)
-.PRECIOUS: $(obj)/aesbs-core.S $(obj)/sha256-core.S $(obj)/sha512-core.S
+.PRECIOUS: $(obj)/sha256-core.S $(obj)/sha512-core.S
This diff is collapsed.
@@ -169,19 +169,19 @@ ENTRY(ce_aes_ecb_encrypt)
.Lecbencloop3x:
subs r4, r4, #3
bmi .Lecbenc1x
-vld1.8 {q0-q1}, [r1, :64]!
+vld1.8 {q0-q1}, [r1]!
-vld1.8 {q2}, [r1, :64]!
+vld1.8 {q2}, [r1]!
bl aes_encrypt_3x
-vst1.8 {q0-q1}, [r0, :64]!
+vst1.8 {q0-q1}, [r0]!
-vst1.8 {q2}, [r0, :64]!
+vst1.8 {q2}, [r0]!
b .Lecbencloop3x
.Lecbenc1x:
adds r4, r4, #3
beq .Lecbencout
.Lecbencloop:
-vld1.8 {q0}, [r1, :64]!
+vld1.8 {q0}, [r1]!
bl aes_encrypt
-vst1.8 {q0}, [r0, :64]!
+vst1.8 {q0}, [r0]!
subs r4, r4, #1
bne .Lecbencloop
.Lecbencout:
@@ -195,19 +195,19 @@ ENTRY(ce_aes_ecb_decrypt)
.Lecbdecloop3x:
subs r4, r4, #3
bmi .Lecbdec1x
-vld1.8 {q0-q1}, [r1, :64]!
+vld1.8 {q0-q1}, [r1]!
-vld1.8 {q2}, [r1, :64]!
+vld1.8 {q2}, [r1]!
bl aes_decrypt_3x
-vst1.8 {q0-q1}, [r0, :64]!
+vst1.8 {q0-q1}, [r0]!
-vst1.8 {q2}, [r0, :64]!
+vst1.8 {q2}, [r0]!
b .Lecbdecloop3x
.Lecbdec1x:
adds r4, r4, #3
beq .Lecbdecout
.Lecbdecloop:
-vld1.8 {q0}, [r1, :64]!
+vld1.8 {q0}, [r1]!
bl aes_decrypt
-vst1.8 {q0}, [r0, :64]!
+vst1.8 {q0}, [r0]!
subs r4, r4, #1
bne .Lecbdecloop
.Lecbdecout:
@@ -226,10 +226,10 @@ ENTRY(ce_aes_cbc_encrypt)
vld1.8 {q0}, [r5]
prepare_key r2, r3
.Lcbcencloop:
-vld1.8 {q1}, [r1, :64]! @ get next pt block
+vld1.8 {q1}, [r1]! @ get next pt block
veor q0, q0, q1 @ ..and xor with iv
bl aes_encrypt
-vst1.8 {q0}, [r0, :64]!
+vst1.8 {q0}, [r0]!
subs r4, r4, #1
bne .Lcbcencloop
vst1.8 {q0}, [r5]
@@ -244,8 +244,8 @@ ENTRY(ce_aes_cbc_decrypt)
.Lcbcdecloop3x:
subs r4, r4, #3
bmi .Lcbcdec1x
-vld1.8 {q0-q1}, [r1, :64]!
+vld1.8 {q0-q1}, [r1]!
-vld1.8 {q2}, [r1, :64]!
+vld1.8 {q2}, [r1]!
vmov q3, q0
vmov q4, q1
vmov q5, q2
@@ -254,19 +254,19 @@ ENTRY(ce_aes_cbc_decrypt)
veor q1, q1, q3
veor q2, q2, q4
vmov q6, q5
-vst1.8 {q0-q1}, [r0, :64]!
+vst1.8 {q0-q1}, [r0]!
-vst1.8 {q2}, [r0, :64]!
+vst1.8 {q2}, [r0]!
b .Lcbcdecloop3x
.Lcbcdec1x:
adds r4, r4, #3
beq .Lcbcdecout
vmov q15, q14 @ preserve last round key
.Lcbcdecloop:
-vld1.8 {q0}, [r1, :64]! @ get next ct block
+vld1.8 {q0}, [r1]! @ get next ct block
veor q14, q15, q6 @ combine prev ct with last key
vmov q6, q0
bl aes_decrypt
-vst1.8 {q0}, [r0, :64]!
+vst1.8 {q0}, [r0]!
subs r4, r4, #1
bne .Lcbcdecloop
.Lcbcdecout:
@@ -300,15 +300,15 @@ ENTRY(ce_aes_ctr_encrypt)
rev ip, r6
add r6, r6, #1
vmov s11, ip
-vld1.8 {q3-q4}, [r1, :64]!
+vld1.8 {q3-q4}, [r1]!
-vld1.8 {q5}, [r1, :64]!
+vld1.8 {q5}, [r1]!
bl aes_encrypt_3x
veor q0, q0, q3
veor q1, q1, q4
veor q2, q2, q5
rev ip, r6
-vst1.8 {q0-q1}, [r0, :64]!
+vst1.8 {q0-q1}, [r0]!
-vst1.8 {q2}, [r0, :64]!
+vst1.8 {q2}, [r0]!
vmov s27, ip
b .Lctrloop3x
.Lctr1x:
@@ -318,10 +318,10 @@ ENTRY(ce_aes_ctr_encrypt)
vmov q0, q6
bl aes_encrypt
subs r4, r4, #1
-bmi .Lctrhalfblock @ blocks < 0 means 1/2 block
+bmi .Lctrtailblock @ blocks < 0 means tail block
-vld1.8 {q3}, [r1, :64]!
+vld1.8 {q3}, [r1]!
veor q3, q0, q3
-vst1.8 {q3}, [r0, :64]!
+vst1.8 {q3}, [r0]!
adds r6, r6, #1 @ increment BE ctr
rev ip, r6
@@ -333,10 +333,8 @@ ENTRY(ce_aes_ctr_encrypt)
vst1.8 {q6}, [r5]
pop {r4-r6, pc}
-.Lctrhalfblock:
-vld1.8 {d1}, [r1, :64]
-veor d0, d0, d1
-vst1.8 {d0}, [r0, :64]
+.Lctrtailblock:
+vst1.8 {q0}, [r0, :64] @ return just the key stream
pop {r4-r6, pc}
.Lctrcarry:
@@ -405,8 +403,8 @@ ENTRY(ce_aes_xts_encrypt)
.Lxtsenc3x:
subs r4, r4, #3
bmi .Lxtsenc1x
-vld1.8 {q0-q1}, [r1, :64]! @ get 3 pt blocks
+vld1.8 {q0-q1}, [r1]! @ get 3 pt blocks
-vld1.8 {q2}, [r1, :64]!
+vld1.8 {q2}, [r1]!
next_tweak q4, q3, q7, q6
veor q0, q0, q3
next_tweak q5, q4, q7, q6
@@ -416,8 +414,8 @@ ENTRY(ce_aes_xts_encrypt)
veor q0, q0, q3
veor q1, q1, q4
veor q2, q2, q5
-vst1.8 {q0-q1}, [r0, :64]! @ write 3 ct blocks
+vst1.8 {q0-q1}, [r0]! @ write 3 ct blocks
-vst1.8 {q2}, [r0, :64]!
+vst1.8 {q2}, [r0]!
vmov q3, q5
teq r4, #0
beq .Lxtsencout
@@ -426,11 +424,11 @@ ENTRY(ce_aes_xts_encrypt)
adds r4, r4, #3
beq .Lxtsencout
.Lxtsencloop:
-vld1.8 {q0}, [r1, :64]!
+vld1.8 {q0}, [r1]!
veor q0, q0, q3
bl aes_encrypt
veor q0, q0, q3
-vst1.8 {q0}, [r0, :64]!
+vst1.8 {q0}, [r0]!
subs r4, r4, #1
beq .Lxtsencout
next_tweak q3, q3, q7, q6
@@ -456,8 +454,8 @@ ENTRY(ce_aes_xts_decrypt)
.Lxtsdec3x:
subs r4, r4, #3
bmi .Lxtsdec1x
-vld1.8 {q0-q1}, [r1, :64]! @ get 3 ct blocks
+vld1.8 {q0-q1}, [r1]! @ get 3 ct blocks
-vld1.8 {q2}, [r1, :64]!
+vld1.8 {q2}, [r1]!
next_tweak q4, q3, q7, q6
veor q0, q0, q3
next_tweak q5, q4, q7, q6
@@ -467,8 +465,8 @@ ENTRY(ce_aes_xts_decrypt)
veor q0, q0, q3
veor q1, q1, q4
veor q2, q2, q5
-vst1.8 {q0-q1}, [r0, :64]! @ write 3 pt blocks
+vst1.8 {q0-q1}, [r0]! @ write 3 pt blocks
-vst1.8 {q2}, [r0, :64]!
+vst1.8 {q2}, [r0]!
vmov q3, q5
teq r4, #0
beq .Lxtsdecout
@@ -477,12 +475,12 @@ ENTRY(ce_aes_xts_decrypt)
adds r4, r4, #3
beq .Lxtsdecout
.Lxtsdecloop:
-vld1.8 {q0}, [r1, :64]!
+vld1.8 {q0}, [r1]!
veor q0, q0, q3
add ip, r2, #32 @ 3rd round key
bl aes_decrypt
veor q0, q0, q3
-vst1.8 {q0}, [r0, :64]!
+vst1.8 {q0}, [r0]!
subs r4, r4, #1
beq .Lxtsdecout
next_tweak q3, q3, q7, q6
......
@@ -278,14 +278,15 @@ static int ctr_encrypt(struct skcipher_request *req)
u8 *tsrc = walk.src.virt.addr;
/*
- * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need
- * to tell aes_ctr_encrypt() to only read half a block.
+ * Tell aes_ctr_encrypt() to process a tail block.
 */
-blocks = (nbytes <= 8) ? -1 : 1;
+blocks = -1;
-ce_aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc,
+ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc,
num_rounds(ctx), blocks, walk.iv);
-memcpy(tdst, tail, nbytes);
+if (tdst != tsrc)
+memcpy(tdst, tsrc, nbytes);
+crypto_xor(tdst, tail, nbytes);
err = skcipher_walk_done(&walk, 0);
}
kernel_neon_end();
@@ -345,7 +346,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
-.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
@@ -361,7 +361,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
-.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
@@ -378,7 +377,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
-.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
@@ -396,7 +394,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_xts_ctx),
-.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = 2 * AES_MIN_KEY_SIZE,
......
/*
* Scalar AES core transform
*
* Copyright (C) 2017 Linaro Ltd.
* Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
.text
.align 5
rk .req r0
rounds .req r1
in .req r2
out .req r3
ttab .req ip
t0 .req lr
t1 .req r2
t2 .req r3
.macro __select, out, in, idx
.if __LINUX_ARM_ARCH__ < 7
and \out, \in, #0xff << (8 * \idx)
.else
ubfx \out, \in, #(8 * \idx), #8
.endif
.endm
.macro __load, out, in, idx
.if __LINUX_ARM_ARCH__ < 7 && \idx > 0
ldr \out, [ttab, \in, lsr #(8 * \idx) - 2]
.else
ldr \out, [ttab, \in, lsl #2]
.endif
.endm
.macro __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
__select \out0, \in0, 0
__select t0, \in1, 1
__load \out0, \out0, 0
__load t0, t0, 1
.if \enc
__select \out1, \in1, 0
__select t1, \in2, 1
.else
__select \out1, \in3, 0
__select t1, \in0, 1
.endif
__load \out1, \out1, 0
__select t2, \in2, 2
__load t1, t1, 1
__load t2, t2, 2
eor \out0, \out0, t0, ror #24
__select t0, \in3, 3
.if \enc
__select \t3, \in3, 2
__select \t4, \in0, 3
.else
__select \t3, \in1, 2
__select \t4, \in2, 3
.endif
__load \t3, \t3, 2
__load t0, t0, 3
__load \t4, \t4, 3
eor \out1, \out1, t1, ror #24
eor \out0, \out0, t2, ror #16
ldm rk!, {t1, t2}
eor \out1, \out1, \t3, ror #16
eor \out0, \out0, t0, ror #8
eor \out1, \out1, \t4, ror #8
eor \out0, \out0, t1
eor \out1, \out1, t2
.endm
.macro fround, out0, out1, out2, out3, in0, in1, in2, in3
__hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
__hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
.endm
.macro iround, out0, out1, out2, out3, in0, in1, in2, in3
__hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
__hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
.endm
.macro __rev, out, in
.if __LINUX_ARM_ARCH__ < 6
lsl t0, \in, #24
and t1, \in, #0xff00
and t2, \in, #0xff0000
orr \out, t0, \in, lsr #24
orr \out, \out, t1, lsl #8
orr \out, \out, t2, lsr #8
.else
rev \out, \in
.endif
.endm
.macro __adrl, out, sym, c
.if __LINUX_ARM_ARCH__ < 7
ldr\c \out, =\sym
.else
movw\c \out, #:lower16:\sym
movt\c \out, #:upper16:\sym
.endif
.endm
.macro do_crypt, round, ttab, ltab
push {r3-r11, lr}
ldr r4, [in]
ldr r5, [in, #4]
ldr r6, [in, #8]
ldr r7, [in, #12]
ldm rk!, {r8-r11}
#ifdef CONFIG_CPU_BIG_ENDIAN
__rev r4, r4
__rev r5, r5
__rev r6, r6
__rev r7, r7
#endif
eor r4, r4, r8
eor r5, r5, r9
eor r6, r6, r10
eor r7, r7, r11
__adrl ttab, \ttab
tst rounds, #2
bne 1f
0: \round r8, r9, r10, r11, r4, r5, r6, r7
\round r4, r5, r6, r7, r8, r9, r10, r11
1: subs rounds, rounds, #4
\round r8, r9, r10, r11, r4, r5, r6, r7
__adrl ttab, \ltab, ls
\round r4, r5, r6, r7, r8, r9, r10, r11
bhi 0b
#ifdef CONFIG_CPU_BIG_ENDIAN
__rev r4, r4
__rev r5, r5
__rev r6, r6
__rev r7, r7
#endif
ldr out, [sp]
str r4, [out]
str r5, [out, #4]
str r6, [out, #8]
str r7, [out, #12]
pop {r3-r11, pc}
.align 3
.ltorg
.endm
ENTRY(__aes_arm_encrypt)
do_crypt fround, crypto_ft_tab, crypto_fl_tab
ENDPROC(__aes_arm_encrypt)
ENTRY(__aes_arm_decrypt)
do_crypt iround, crypto_it_tab, crypto_il_tab
ENDPROC(__aes_arm_decrypt)
/*
* Scalar AES core transform
*
* Copyright (C) 2017 Linaro Ltd.
* Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <crypto/aes.h>
#include <linux/crypto.h>
#include <linux/module.h>
asmlinkage void __aes_arm_encrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
EXPORT_SYMBOL(__aes_arm_encrypt);
asmlinkage void __aes_arm_decrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
EXPORT_SYMBOL(__aes_arm_decrypt);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
__aes_arm_encrypt(ctx->key_enc, rounds, in, out);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
__aes_arm_decrypt(ctx->key_dec, rounds, in, out);
}
static struct crypto_alg aes_alg = {
.cra_name = "aes",
.cra_driver_name = "aes-arm",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
.cra_cipher.cia_min_keysize = AES_MIN_KEY_SIZE,
.cra_cipher.cia_max_keysize = AES_MAX_KEY_SIZE,
.cra_cipher.cia_setkey = crypto_aes_set_key,
.cra_cipher.cia_encrypt = aes_encrypt,
.cra_cipher.cia_decrypt = aes_decrypt,
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
.cra_alignmask = 3,
#endif
};
static int __init aes_init(void)
{
return crypto_register_alg(&aes_alg);
}
static void __exit aes_fini(void)
{
crypto_unregister_alg(&aes_alg);
}
module_init(aes_init);
module_exit(aes_fini);
MODULE_DESCRIPTION("Scalar AES cipher for ARM");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("aes");
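As a usage sketch only (not part of the patch, function name hypothetical), the scalar "aes" cipher registered above can be exercised through the kernel's single-block cipher API:
#include <crypto/aes.h>
#include <linux/crypto.h>
#include <linux/err.h>

/* Hypothetical example: encrypt one 16-byte block with the "aes"
 * cipher, which the scalar driver above provides on ARM. */
static int example_aes_encrypt_block(const u8 *key, unsigned int keylen,
				     const u8 src[AES_BLOCK_SIZE],
				     u8 dst[AES_BLOCK_SIZE])
{
	struct crypto_cipher *tfm;
	int err;

	tfm = crypto_alloc_cipher("aes", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_cipher_setkey(tfm, key, keylen);
	if (!err)
		crypto_cipher_encrypt_one(tfm, dst, src);

	crypto_free_cipher(tfm);
	return err;
}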
This diff is collapsed.
/*
* Bit sliced AES using NEON instructions
*
* Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/cbc.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/xts.h>
#include <linux/module.h>
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("ecb(aes)");
MODULE_ALIAS_CRYPTO("cbc(aes)");
MODULE_ALIAS_CRYPTO("ctr(aes)");
MODULE_ALIAS_CRYPTO("xts(aes)");
asmlinkage void aesbs_convert_key(u8 out[], u32 const rk[], int rounds);
asmlinkage void aesbs_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks);
asmlinkage void aesbs_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks);
asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 ctr[], u8 final[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void __aes_arm_encrypt(const u32 rk[], int rounds, const u8 in[],
u8 out[]);
struct aesbs_ctx {
int rounds;
u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32] __aligned(AES_BLOCK_SIZE);
};
struct aesbs_cbc_ctx {
struct aesbs_ctx key;
u32 enc[AES_MAX_KEYLENGTH_U32];
};
struct aesbs_xts_ctx {
struct aesbs_ctx key;
u32 twkey[AES_MAX_KEYLENGTH_U32];
};
static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_aes_ctx rk;
int err;
err = crypto_aes_expand_key(&rk, in_key, key_len);
if (err)
return err;
ctx->rounds = 6 + key_len / 4;
kernel_neon_begin();
aesbs_convert_key(ctx->rk, rk.key_enc, ctx->rounds);
kernel_neon_end();
return 0;
}
static int __ecb_crypt(struct skcipher_request *req,
void (*fn)(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks))
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, true);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
if (walk.nbytes < walk.total)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
ctx->rounds, blocks);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static int ecb_encrypt(struct skcipher_request *req)
{
return __ecb_crypt(req, aesbs_ecb_encrypt);
}
static int ecb_decrypt(struct skcipher_request *req)
{
return __ecb_crypt(req, aesbs_ecb_decrypt);
}
static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_aes_ctx rk;
int err;
err = crypto_aes_expand_key(&rk, in_key, key_len);
if (err)
return err;
ctx->key.rounds = 6 + key_len / 4;
memcpy(ctx->enc, rk.key_enc, sizeof(ctx->enc));
kernel_neon_begin();
aesbs_convert_key(ctx->key.rk, rk.key_enc, ctx->key.rounds);
kernel_neon_end();
return 0;
}
static void cbc_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
__aes_arm_encrypt(ctx->enc, ctx->key.rounds, src, dst);
}
static int cbc_encrypt(struct skcipher_request *req)
{
return crypto_cbc_encrypt_walk(req, cbc_encrypt_one);
}
static int cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, true);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
if (walk.nbytes < walk.total)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u8 buf[AES_BLOCK_SIZE];
int err;
err = skcipher_walk_virt(&walk, req, true);
kernel_neon_begin();
while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
if (walk.nbytes < walk.total) {
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
final = NULL;
}
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final);
if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
if (dst != src)
memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, 0);
break;
}
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_aes_ctx rk;
int err;
err = xts_verify_key(tfm, in_key, key_len);
if (err)
return err;
key_len /= 2;
err = crypto_aes_expand_key(&rk, in_key + key_len, key_len);
if (err)
return err;
memcpy(ctx->twkey, rk.key_enc, sizeof(ctx->twkey));
return aesbs_setkey(tfm, in_key, key_len);
}
static int __xts_crypt(struct skcipher_request *req,
void (*fn)(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]))
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, true);
__aes_arm_encrypt(ctx->twkey, ctx->key.rounds, walk.iv, walk.iv);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
if (walk.nbytes < walk.total)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk,
ctx->key.rounds, blocks, walk.iv);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static int xts_encrypt(struct skcipher_request *req)
{
return __xts_crypt(req, aesbs_xts_encrypt);
}
static int xts_decrypt(struct skcipher_request *req)
{
return __xts_crypt(req, aesbs_xts_decrypt);
}
static struct skcipher_alg aes_algs[] = { {
.base.cra_name = "__ecb(aes)",
.base.cra_driver_name = "__ecb-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_ctx),
.base.cra_module = THIS_MODULE,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.setkey = aesbs_setkey,
.encrypt = ecb_encrypt,
.decrypt = ecb_decrypt,
}, {
.base.cra_name = "__cbc(aes)",
.base.cra_driver_name = "__cbc-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.base.cra_module = THIS_MODULE,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_cbc_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(aes)",
.base.cra_driver_name = "__ctr-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct aesbs_ctx),
.base.cra_module = THIS_MODULE,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.chunksize = AES_BLOCK_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {
.base.cra_name = "__xts(aes)",
.base.cra_driver_name = "__xts-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
.base.cra_module = THIS_MODULE,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_xts_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
} };
static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)];
static void aes_exit(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(aes_simd_algs); i++)
if (aes_simd_algs[i])
simd_skcipher_free(aes_simd_algs[i]);
crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
}
static int __init aes_init(void)
{
struct simd_skcipher_alg *simd;
const char *basename;
const char *algname;
const char *drvname;
int err;
int i;
if (!(elf_hwcap & HWCAP_NEON))
return -ENODEV;
err = crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
if (err)
return err;
for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
continue;
algname = aes_algs[i].base.cra_name + 2;
drvname = aes_algs[i].base.cra_driver_name + 2;
basename = aes_algs[i].base.cra_driver_name;
simd = simd_skcipher_create_compat(algname, drvname, basename);
err = PTR_ERR(simd);
if (IS_ERR(simd))
goto unregister_simds;
aes_simd_algs[i] = simd;
}
return 0;
unregister_simds:
aes_exit();
return err;
}
module_init(aes_init);
module_exit(aes_exit);
/*
* Glue Code for the asm optimized version of the AES Cipher Algorithm
*/
#include <linux/module.h>
#include <linux/crypto.h>
#include <crypto/aes.h>
#include "aes_glue.h"
EXPORT_SYMBOL(AES_encrypt);
EXPORT_SYMBOL(AES_decrypt);
EXPORT_SYMBOL(private_AES_set_encrypt_key);
EXPORT_SYMBOL(private_AES_set_decrypt_key);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct AES_CTX *ctx = crypto_tfm_ctx(tfm);
AES_encrypt(src, dst, &ctx->enc_key);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct AES_CTX *ctx = crypto_tfm_ctx(tfm);
AES_decrypt(src, dst, &ctx->dec_key);
}
static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
struct AES_CTX *ctx = crypto_tfm_ctx(tfm);
switch (key_len) {
case AES_KEYSIZE_128:
key_len = 128;
break;
case AES_KEYSIZE_192:
key_len = 192;
break;
case AES_KEYSIZE_256:
key_len = 256;
break;
default:
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
if (private_AES_set_encrypt_key(in_key, key_len, &ctx->enc_key) == -1) {
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
/* private_AES_set_decrypt_key expects an encryption key as input */
ctx->dec_key = ctx->enc_key;
if (private_AES_set_decrypt_key(in_key, key_len, &ctx->dec_key) == -1) {
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
return 0;
}
static struct crypto_alg aes_alg = {
.cra_name = "aes",
.cra_driver_name = "aes-asm",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct AES_CTX),
.cra_module = THIS_MODULE,
.cra_list = LIST_HEAD_INIT(aes_alg.cra_list),
.cra_u = {
.cipher = {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
.cia_setkey = aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt
}
}
};
static int __init aes_init(void)
{
return crypto_register_alg(&aes_alg);
}
static void __exit aes_fini(void)
{
crypto_unregister_alg(&aes_alg);
}
module_init(aes_init);
module_exit(aes_fini);
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm (ASM)");
MODULE_LICENSE("GPL");
MODULE_ALIAS_CRYPTO("aes");
MODULE_ALIAS_CRYPTO("aes-asm");
MODULE_AUTHOR("David McCullough <ucdevel@gmail.com>");
#define AES_MAXNR 14
struct AES_KEY {
unsigned int rd_key[4 * (AES_MAXNR + 1)];
int rounds;
};
struct AES_CTX {
struct AES_KEY enc_key;
struct AES_KEY dec_key;
};
asmlinkage void AES_encrypt(const u8 *in, u8 *out, struct AES_KEY *ctx);
asmlinkage void AES_decrypt(const u8 *in, u8 *out, struct AES_KEY *ctx);
asmlinkage int private_AES_set_decrypt_key(const unsigned char *userKey,
const int bits, struct AES_KEY *key);
asmlinkage int private_AES_set_encrypt_key(const unsigned char *userKey,
const int bits, struct AES_KEY *key);
This diff is collapsed.
/*
* linux/arch/arm/crypto/aesbs-glue.c - glue code for NEON bit sliced AES
*
* Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/cbc.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <linux/module.h>
#include <crypto/xts.h>
#include "aes_glue.h"
#define BIT_SLICED_KEY_MAXSIZE (128 * (AES_MAXNR - 1) + 2 * AES_BLOCK_SIZE)
struct BS_KEY {
struct AES_KEY rk;
int converted;
u8 __aligned(8) bs[BIT_SLICED_KEY_MAXSIZE];
} __aligned(8);
asmlinkage void bsaes_enc_key_convert(u8 out[], struct AES_KEY const *in);
asmlinkage void bsaes_dec_key_convert(u8 out[], struct AES_KEY const *in);
asmlinkage void bsaes_cbc_encrypt(u8 const in[], u8 out[], u32 bytes,
struct BS_KEY *key, u8 iv[]);
asmlinkage void bsaes_ctr32_encrypt_blocks(u8 const in[], u8 out[], u32 blocks,
struct BS_KEY *key, u8 const iv[]);
asmlinkage void bsaes_xts_encrypt(u8 const in[], u8 out[], u32 bytes,
struct BS_KEY *key, u8 tweak[]);
asmlinkage void bsaes_xts_decrypt(u8 const in[], u8 out[], u32 bytes,
struct BS_KEY *key, u8 tweak[]);
struct aesbs_cbc_ctx {
struct AES_KEY enc;
struct BS_KEY dec;
};
struct aesbs_ctr_ctx {
struct BS_KEY enc;
};
struct aesbs_xts_ctx {
struct BS_KEY enc;
struct BS_KEY dec;
struct AES_KEY twkey;
};
static int aesbs_cbc_set_key(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
int bits = key_len * 8;
if (private_AES_set_encrypt_key(in_key, bits, &ctx->enc)) {
crypto_skcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
ctx->dec.rk = ctx->enc;
private_AES_set_decrypt_key(in_key, bits, &ctx->dec.rk);
ctx->dec.converted = 0;
return 0;
}
static int aesbs_ctr_set_key(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
int bits = key_len * 8;
if (private_AES_set_encrypt_key(in_key, bits, &ctx->enc.rk)) {
crypto_skcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
ctx->enc.converted = 0;
return 0;
}
static int aesbs_xts_set_key(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int bits = key_len * 4;
int err;
err = xts_verify_key(tfm, in_key, key_len);
if (err)
return err;
if (private_AES_set_encrypt_key(in_key, bits, &ctx->enc.rk)) {
crypto_skcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
ctx->dec.rk = ctx->enc.rk;
private_AES_set_decrypt_key(in_key, bits, &ctx->dec.rk);
private_AES_set_encrypt_key(in_key + key_len / 2, bits, &ctx->twkey);
ctx->enc.converted = ctx->dec.converted = 0;
return 0;
}
static inline void aesbs_encrypt_one(struct crypto_skcipher *tfm,
const u8 *src, u8 *dst)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
AES_encrypt(src, dst, &ctx->enc);
}
static int aesbs_cbc_encrypt(struct skcipher_request *req)
{
return crypto_cbc_encrypt_walk(req, aesbs_encrypt_one);
}
static inline void aesbs_decrypt_one(struct crypto_skcipher *tfm,
const u8 *src, u8 *dst)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
AES_decrypt(src, dst, &ctx->dec.rk);
}
static int aesbs_cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
unsigned int nbytes;
int err;
for (err = skcipher_walk_virt(&walk, req, false);
(nbytes = walk.nbytes); err = skcipher_walk_done(&walk, nbytes)) {
u32 blocks = nbytes / AES_BLOCK_SIZE;
u8 *dst = walk.dst.virt.addr;
u8 *src = walk.src.virt.addr;
u8 *iv = walk.iv;
if (blocks >= 8) {
kernel_neon_begin();
bsaes_cbc_encrypt(src, dst, nbytes, &ctx->dec, iv);
kernel_neon_end();
nbytes %= AES_BLOCK_SIZE;
continue;
}
nbytes = crypto_cbc_decrypt_blocks(&walk, tfm,
aesbs_decrypt_one);
}
return err;
}
static void inc_be128_ctr(__be32 ctr[], u32 addend)
{
int i;
for (i = 3; i >= 0; i--, addend = 1) {
u32 n = be32_to_cpu(ctr[i]) + addend;
ctr[i] = cpu_to_be32(n);
if (n >= addend)
break;
}
}
static int aesbs_ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u32 blocks;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = walk.nbytes / AES_BLOCK_SIZE)) {
u32 tail = walk.nbytes % AES_BLOCK_SIZE;
__be32 *ctr = (__be32 *)walk.iv;
u32 headroom = UINT_MAX - be32_to_cpu(ctr[3]);
/* avoid 32 bit counter overflow in the NEON code */
if (unlikely(headroom < blocks)) {
blocks = headroom + 1;
tail = walk.nbytes - blocks * AES_BLOCK_SIZE;
}
kernel_neon_begin();
bsaes_ctr32_encrypt_blocks(walk.src.virt.addr,
walk.dst.virt.addr, blocks,
&ctx->enc, walk.iv);
kernel_neon_end();
inc_be128_ctr(ctr, blocks);
err = skcipher_walk_done(&walk, tail);
}
if (walk.nbytes) {
u8 *tdst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *tsrc = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
u8 ks[AES_BLOCK_SIZE];
AES_encrypt(walk.iv, ks, &ctx->enc.rk);
if (tdst != tsrc)
memcpy(tdst, tsrc, walk.nbytes);
crypto_xor(tdst, ks, walk.nbytes);
err = skcipher_walk_done(&walk, 0);
}
return err;
}
static int aesbs_xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, false);
/* generate the initial tweak */
AES_encrypt(walk.iv, walk.iv, &ctx->twkey);
while (walk.nbytes) {
kernel_neon_begin();
bsaes_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
walk.nbytes, &ctx->enc, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
}
static int aesbs_xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, false);
/* generate the initial tweak */
AES_encrypt(walk.iv, walk.iv, &ctx->twkey);
while (walk.nbytes) {
kernel_neon_begin();
bsaes_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
walk.nbytes, &ctx->dec, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
}
static struct skcipher_alg aesbs_algs[] = { {
.base = {
.cra_name = "__cbc(aes)",
.cra_driver_name = "__cbc-aes-neonbs",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_cbc_set_key,
.encrypt = aesbs_cbc_encrypt,
.decrypt = aesbs_cbc_decrypt,
}, {
.base = {
.cra_name = "__ctr(aes)",
.cra_driver_name = "__ctr-aes-neonbs",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.chunksize = AES_BLOCK_SIZE,
.setkey = aesbs_ctr_set_key,
.encrypt = aesbs_ctr_encrypt,
.decrypt = aesbs_ctr_encrypt,
}, {
.base = {
.cra_name = "__xts(aes)",
.cra_driver_name = "__xts-aes-neonbs",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE,
},
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_xts_set_key,
.encrypt = aesbs_xts_encrypt,
.decrypt = aesbs_xts_decrypt,
} };
struct simd_skcipher_alg *aesbs_simd_algs[ARRAY_SIZE(aesbs_algs)];
static void aesbs_mod_exit(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(aesbs_simd_algs) && aesbs_simd_algs[i]; i++)
simd_skcipher_free(aesbs_simd_algs[i]);
crypto_unregister_skciphers(aesbs_algs, ARRAY_SIZE(aesbs_algs));
}
static int __init aesbs_mod_init(void)
{
struct simd_skcipher_alg *simd;
const char *basename;
const char *algname;
const char *drvname;
int err;
int i;
if (!cpu_has_neon())
return -ENODEV;
err = crypto_register_skciphers(aesbs_algs, ARRAY_SIZE(aesbs_algs));
if (err)
return err;
for (i = 0; i < ARRAY_SIZE(aesbs_algs); i++) {
algname = aesbs_algs[i].base.cra_name + 2;
drvname = aesbs_algs[i].base.cra_driver_name + 2;
basename = aesbs_algs[i].base.cra_driver_name;
simd = simd_skcipher_create_compat(algname, drvname, basename);
err = PTR_ERR(simd);
if (IS_ERR(simd))
goto unregister_simds;
aesbs_simd_algs[i] = simd;
}
return 0;
unregister_simds:
aesbs_mod_exit();
return err;
}
module_init(aesbs_mod_init);
module_exit(aesbs_mod_exit);
MODULE_DESCRIPTION("Bit sliced AES in CBC/CTR/XTS modes using NEON");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL");
This diff is collapsed.
/*
* ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
*
* Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Based on:
* ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSE3 functions
*
* Copyright (C) 2015 Martin Willi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*/
#include <linux/linkage.h>
.text
.fpu neon
.align 5
ENTRY(chacha20_block_xor_neon)
// r0: Input state matrix, s
// r1: 1 data block output, o
// r2: 1 data block input, i
//
// This function encrypts one ChaCha20 block by loading the state matrix
// in four NEON registers. It performs the matrix operations on four words in
// parallel, but requires shuffling to rearrange the words after each
// round.
//
// x0..3 = s0..3
add ip, r0, #0x20
vld1.32 {q0-q1}, [r0]
vld1.32 {q2-q3}, [ip]
vmov q8, q0
vmov q9, q1
vmov q10, q2
vmov q11, q3
mov r3, #10
.Ldoubleround:
// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
vadd.i32 q0, q0, q1
veor q4, q3, q0
vshl.u32 q3, q4, #16
vsri.u32 q3, q4, #16
// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
vadd.i32 q2, q2, q3
veor q4, q1, q2
vshl.u32 q1, q4, #12
vsri.u32 q1, q4, #20
// x0 += x1, x3 = rotl32(x3 ^ x0, 8)
vadd.i32 q0, q0, q1
veor q4, q3, q0
vshl.u32 q3, q4, #8
vsri.u32 q3, q4, #24
// x2 += x3, x1 = rotl32(x1 ^ x2, 7)
vadd.i32 q2, q2, q3
veor q4, q1, q2
vshl.u32 q1, q4, #7
vsri.u32 q1, q4, #25
// x1 = shuffle32(x1, MASK(0, 3, 2, 1))
vext.8 q1, q1, q1, #4
// x2 = shuffle32(x2, MASK(1, 0, 3, 2))
vext.8 q2, q2, q2, #8
// x3 = shuffle32(x3, MASK(2, 1, 0, 3))
vext.8 q3, q3, q3, #12
// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
vadd.i32 q0, q0, q1
veor q4, q3, q0
vshl.u32 q3, q4, #16
vsri.u32 q3, q4, #16
// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
vadd.i32 q2, q2, q3
veor q4, q1, q2
vshl.u32 q1, q4, #12
vsri.u32 q1, q4, #20
// x0 += x1, x3 = rotl32(x3 ^ x0, 8)
vadd.i32 q0, q0, q1
veor q4, q3, q0
vshl.u32 q3, q4, #8
vsri.u32 q3, q4, #24
// x2 += x3, x1 = rotl32(x1 ^ x2, 7)
vadd.i32 q2, q2, q3
veor q4, q1, q2
vshl.u32 q1, q4, #7
vsri.u32 q1, q4, #25
// x1 = shuffle32(x1, MASK(2, 1, 0, 3))
vext.8 q1, q1, q1, #12
// x2 = shuffle32(x2, MASK(1, 0, 3, 2))
vext.8 q2, q2, q2, #8
// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
vext.8 q3, q3, q3, #4
subs r3, r3, #1
bne .Ldoubleround
add ip, r2, #0x20
vld1.8 {q4-q5}, [r2]
vld1.8 {q6-q7}, [ip]
// o0 = i0 ^ (x0 + s0)
vadd.i32 q0, q0, q8
veor q0, q0, q4
// o1 = i1 ^ (x1 + s1)
vadd.i32 q1, q1, q9
veor q1, q1, q5
// o2 = i2 ^ (x2 + s2)
vadd.i32 q2, q2, q10
veor q2, q2, q6
// o3 = i3 ^ (x3 + s3)
vadd.i32 q3, q3, q11
veor q3, q3, q7
add ip, r1, #0x20
vst1.8 {q0-q1}, [r1]
vst1.8 {q2-q3}, [ip]
bx lr
ENDPROC(chacha20_block_xor_neon)
.align 5
ENTRY(chacha20_4block_xor_neon)
push {r4-r6, lr}
mov ip, sp // preserve the stack pointer
sub r3, sp, #0x20 // allocate a 32 byte buffer
bic r3, r3, #0x1f // aligned to 32 bytes
mov sp, r3
// r0: Input state matrix, s
// r1: 4 data blocks output, o
// r2: 4 data blocks input, i
//
// This function encrypts four consecutive ChaCha20 blocks by loading
// the state matrix in NEON registers four times. The algorithm performs
// each operation on the corresponding word of each state matrix, hence
// requires no word shuffling. For the final XORing step we transpose the
// matrix by interleaving 32- and then 64-bit words, which allows us to
// do XOR in NEON registers.
//
// x0..15[0-3] = s0..3[0..3]
add r3, r0, #0x20
vld1.32 {q0-q1}, [r0]
vld1.32 {q2-q3}, [r3]
adr r3, CTRINC
vdup.32 q15, d7[1]
vdup.32 q14, d7[0]
vld1.32 {q11}, [r3, :128]
vdup.32 q13, d6[1]
vdup.32 q12, d6[0]
vadd.i32 q12, q12, q11 // x12 += counter values 0-3
vdup.32 q11, d5[1]
vdup.32 q10, d5[0]
vdup.32 q9, d4[1]
vdup.32 q8, d4[0]
vdup.32 q7, d3[1]
vdup.32 q6, d3[0]
vdup.32 q5, d2[1]
vdup.32 q4, d2[0]
vdup.32 q3, d1[1]
vdup.32 q2, d1[0]
vdup.32 q1, d0[1]
vdup.32 q0, d0[0]
mov r3, #10
.Ldoubleround4:
// x0 += x4, x12 = rotl32(x12 ^ x0, 16)
// x1 += x5, x13 = rotl32(x13 ^ x1, 16)
// x2 += x6, x14 = rotl32(x14 ^ x2, 16)
// x3 += x7, x15 = rotl32(x15 ^ x3, 16)
vadd.i32 q0, q0, q4
vadd.i32 q1, q1, q5
vadd.i32 q2, q2, q6
vadd.i32 q3, q3, q7
veor q12, q12, q0
veor q13, q13, q1
veor q14, q14, q2
veor q15, q15, q3
vrev32.16 q12, q12
vrev32.16 q13, q13
vrev32.16 q14, q14
vrev32.16 q15, q15
// x8 += x12, x4 = rotl32(x4 ^ x8, 12)
// x9 += x13, x5 = rotl32(x5 ^ x9, 12)
// x10 += x14, x6 = rotl32(x6 ^ x10, 12)
// x11 += x15, x7 = rotl32(x7 ^ x11, 12)
vadd.i32 q8, q8, q12
vadd.i32 q9, q9, q13
vadd.i32 q10, q10, q14
vadd.i32 q11, q11, q15
vst1.32 {q8-q9}, [sp, :256]
veor q8, q4, q8
veor q9, q5, q9
vshl.u32 q4, q8, #12
vshl.u32 q5, q9, #12
vsri.u32 q4, q8, #20
vsri.u32 q5, q9, #20
veor q8, q6, q10
veor q9, q7, q11
vshl.u32 q6, q8, #12
vshl.u32 q7, q9, #12
vsri.u32 q6, q8, #20
vsri.u32 q7, q9, #20
// x0 += x4, x12 = rotl32(x12 ^ x0, 8)
// x1 += x5, x13 = rotl32(x13 ^ x1, 8)
// x2 += x6, x14 = rotl32(x14 ^ x2, 8)
// x3 += x7, x15 = rotl32(x15 ^ x3, 8)
vadd.i32 q0, q0, q4
vadd.i32 q1, q1, q5
vadd.i32 q2, q2, q6
vadd.i32 q3, q3, q7
veor q8, q12, q0
veor q9, q13, q1
vshl.u32 q12, q8, #8
vshl.u32 q13, q9, #8
vsri.u32 q12, q8, #24
vsri.u32 q13, q9, #24
veor q8, q14, q2
veor q9, q15, q3
vshl.u32 q14, q8, #8
vshl.u32 q15, q9, #8
vsri.u32 q14, q8, #24
vsri.u32 q15, q9, #24
vld1.32 {q8-q9}, [sp, :256]
// x8 += x12, x4 = rotl32(x4 ^ x8, 7)
// x9 += x13, x5 = rotl32(x5 ^ x9, 7)
// x10 += x14, x6 = rotl32(x6 ^ x10, 7)
// x11 += x15, x7 = rotl32(x7 ^ x11, 7)
vadd.i32 q8, q8, q12
vadd.i32 q9, q9, q13
vadd.i32 q10, q10, q14
vadd.i32 q11, q11, q15
vst1.32 {q8-q9}, [sp, :256]
veor q8, q4, q8
veor q9, q5, q9
vshl.u32 q4, q8, #7
vshl.u32 q5, q9, #7
vsri.u32 q4, q8, #25
vsri.u32 q5, q9, #25
veor q8, q6, q10
veor q9, q7, q11
vshl.u32 q6, q8, #7
vshl.u32 q7, q9, #7
vsri.u32 q6, q8, #25
vsri.u32 q7, q9, #25
vld1.32 {q8-q9}, [sp, :256]
// x0 += x5, x15 = rotl32(x15 ^ x0, 16)
// x1 += x6, x12 = rotl32(x12 ^ x1, 16)
// x2 += x7, x13 = rotl32(x13 ^ x2, 16)
// x3 += x4, x14 = rotl32(x14 ^ x3, 16)
vadd.i32 q0, q0, q5
vadd.i32 q1, q1, q6
vadd.i32 q2, q2, q7
vadd.i32 q3, q3, q4
veor q15, q15, q0
veor q12, q12, q1
veor q13, q13, q2
veor q14, q14, q3
vrev32.16 q15, q15
vrev32.16 q12, q12
vrev32.16 q13, q13
vrev32.16 q14, q14
// x10 += x15, x5 = rotl32(x5 ^ x10, 12)
// x11 += x12, x6 = rotl32(x6 ^ x11, 12)
// x8 += x13, x7 = rotl32(x7 ^ x8, 12)
// x9 += x14, x4 = rotl32(x4 ^ x9, 12)
vadd.i32 q10, q10, q15
vadd.i32 q11, q11, q12
vadd.i32 q8, q8, q13
vadd.i32 q9, q9, q14
vst1.32 {q8-q9}, [sp, :256]
veor q8, q7, q8
veor q9, q4, q9
vshl.u32 q7, q8, #12
vshl.u32 q4, q9, #12
vsri.u32 q7, q8, #20
vsri.u32 q4, q9, #20
veor q8, q5, q10
veor q9, q6, q11
vshl.u32 q5, q8, #12
vshl.u32 q6, q9, #12
vsri.u32 q5, q8, #20
vsri.u32 q6, q9, #20
// x0 += x5, x15 = rotl32(x15 ^ x0, 8)
// x1 += x6, x12 = rotl32(x12 ^ x1, 8)
// x2 += x7, x13 = rotl32(x13 ^ x2, 8)
// x3 += x4, x14 = rotl32(x14 ^ x3, 8)
vadd.i32 q0, q0, q5
vadd.i32 q1, q1, q6
vadd.i32 q2, q2, q7
vadd.i32 q3, q3, q4
veor q8, q15, q0
veor q9, q12, q1
vshl.u32 q15, q8, #8
vshl.u32 q12, q9, #8
vsri.u32 q15, q8, #24
vsri.u32 q12, q9, #24
veor q8, q13, q2
veor q9, q14, q3
vshl.u32 q13, q8, #8
vshl.u32 q14, q9, #8
vsri.u32 q13, q8, #24
vsri.u32 q14, q9, #24
vld1.32 {q8-q9}, [sp, :256]
// x10 += x15, x5 = rotl32(x5 ^ x10, 7)
// x11 += x12, x6 = rotl32(x6 ^ x11, 7)
// x8 += x13, x7 = rotl32(x7 ^ x8, 7)
// x9 += x14, x4 = rotl32(x4 ^ x9, 7)
vadd.i32 q10, q10, q15
vadd.i32 q11, q11, q12
vadd.i32 q8, q8, q13
vadd.i32 q9, q9, q14
vst1.32 {q8-q9}, [sp, :256]
veor q8, q7, q8
veor q9, q4, q9
vshl.u32 q7, q8, #7
vshl.u32 q4, q9, #7
vsri.u32 q7, q8, #25
vsri.u32 q4, q9, #25
veor q8, q5, q10
veor q9, q6, q11
vshl.u32 q5, q8, #7
vshl.u32 q6, q9, #7
vsri.u32 q5, q8, #25
vsri.u32 q6, q9, #25
subs r3, r3, #1
beq 0f
vld1.32 {q8-q9}, [sp, :256]
b .Ldoubleround4
// x0[0-3] += s0[0]
// x1[0-3] += s0[1]
// x2[0-3] += s0[2]
// x3[0-3] += s0[3]
0: ldmia r0!, {r3-r6}
vdup.32 q8, r3
vdup.32 q9, r4
vadd.i32 q0, q0, q8
vadd.i32 q1, q1, q9
vdup.32 q8, r5
vdup.32 q9, r6
vadd.i32 q2, q2, q8
vadd.i32 q3, q3, q9
// x4[0-3] += s1[0]
// x5[0-3] += s1[1]
// x6[0-3] += s1[2]
// x7[0-3] += s1[3]
ldmia r0!, {r3-r6}
vdup.32 q8, r3
vdup.32 q9, r4
vadd.i32 q4, q4, q8
vadd.i32 q5, q5, q9
vdup.32 q8, r5
vdup.32 q9, r6
vadd.i32 q6, q6, q8
vadd.i32 q7, q7, q9
// interleave 32-bit words in state n, n+1
vzip.32 q0, q1
vzip.32 q2, q3
vzip.32 q4, q5
vzip.32 q6, q7
// interleave 64-bit words in state n, n+2
vswp d1, d4
vswp d3, d6
vswp d9, d12
vswp d11, d14
// xor with corresponding input, write to output
vld1.8 {q8-q9}, [r2]!
veor q8, q8, q0
veor q9, q9, q4
vst1.8 {q8-q9}, [r1]!
vld1.32 {q8-q9}, [sp, :256]
// x8[0-3] += s2[0]
// x9[0-3] += s2[1]
// x10[0-3] += s2[2]
// x11[0-3] += s2[3]
ldmia r0!, {r3-r6}
vdup.32 q0, r3
vdup.32 q4, r4
vadd.i32 q8, q8, q0
vadd.i32 q9, q9, q4
vdup.32 q0, r5
vdup.32 q4, r6
vadd.i32 q10, q10, q0
vadd.i32 q11, q11, q4
// x12[0-3] += s3[0]
// x13[0-3] += s3[1]
// x14[0-3] += s3[2]
// x15[0-3] += s3[3]
ldmia r0!, {r3-r6}
vdup.32 q0, r3
vdup.32 q4, r4
adr r3, CTRINC
vadd.i32 q12, q12, q0
vld1.32 {q0}, [r3, :128]
vadd.i32 q13, q13, q4
vadd.i32 q12, q12, q0 // x12 += counter values 0-3
vdup.32 q0, r5
vdup.32 q4, r6
vadd.i32 q14, q14, q0
vadd.i32 q15, q15, q4
// interleave 32-bit words in state n, n+1
vzip.32 q8, q9
vzip.32 q10, q11
vzip.32 q12, q13
vzip.32 q14, q15
// interleave 64-bit words in state n, n+2
vswp d17, d20
vswp d19, d22
vswp d25, d28
vswp d27, d30
vmov q4, q1
vld1.8 {q0-q1}, [r2]!
veor q0, q0, q8
veor q1, q1, q12
vst1.8 {q0-q1}, [r1]!
vld1.8 {q0-q1}, [r2]!
veor q0, q0, q2
veor q1, q1, q6
vst1.8 {q0-q1}, [r1]!
vld1.8 {q0-q1}, [r2]!
veor q0, q0, q10
veor q1, q1, q14
vst1.8 {q0-q1}, [r1]!
vld1.8 {q0-q1}, [r2]!
veor q0, q0, q4
veor q1, q1, q5
vst1.8 {q0-q1}, [r1]!
vld1.8 {q0-q1}, [r2]!
veor q0, q0, q9
veor q1, q1, q13
vst1.8 {q0-q1}, [r1]!
vld1.8 {q0-q1}, [r2]!
veor q0, q0, q3
veor q1, q1, q7
vst1.8 {q0-q1}, [r1]!
vld1.8 {q0-q1}, [r2]
veor q0, q0, q11
veor q1, q1, q15
vst1.8 {q0-q1}, [r1]
mov sp, ip
pop {r4-r6, pc}
ENDPROC(chacha20_4block_xor_neon)
.align 4
CTRINC: .word 0, 1, 2, 3
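The vector code above runs four ChaCha20 states in parallel; each commented pair such as "x0 += x4, x12 = rotl32(x12 ^ x0, 16)" is one step of the ChaCha quarter-round, with the 16-bit rotation done as a vrev32.16 halfword swap and the 12/8/7-bit rotations expressed as vshl/vsri pairs. As a scalar reference only (illustrative, not part of this commit; u32 comes from <linux/types.h>), the double round that .Ldoubleround4 iterates looks like this:

	/* Scalar reference for the quarter-round encoded by the NEON code above. */
	static inline u32 rotl32_ref(u32 v, int n)
	{
		return (v << n) | (v >> (32 - n));
	}

	static void chacha_quarterround(u32 x[16], int a, int b, int c, int d)
	{
		x[a] += x[b]; x[d] = rotl32_ref(x[d] ^ x[a], 16);
		x[c] += x[d]; x[b] = rotl32_ref(x[b] ^ x[c], 12);
		x[a] += x[b]; x[d] = rotl32_ref(x[d] ^ x[a], 8);
		x[c] += x[d]; x[b] = rotl32_ref(x[b] ^ x[c], 7);
	}

	/* One double round: four column rounds, then four diagonal rounds. */
	static void chacha_doubleround(u32 x[16])
	{
		chacha_quarterround(x, 0, 4,  8, 12);
		chacha_quarterround(x, 1, 5,  9, 13);
		chacha_quarterround(x, 2, 6, 10, 14);
		chacha_quarterround(x, 3, 7, 11, 15);
		chacha_quarterround(x, 0, 5, 10, 15);
		chacha_quarterround(x, 1, 6, 11, 12);
		chacha_quarterround(x, 2, 7,  8, 13);
		chacha_quarterround(x, 3, 4,  9, 14);
	}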
/*
* ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
*
* Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Based on:
* ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
*
* Copyright (C) 2015 Martin Willi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*/
#include <crypto/algapi.h>
#include <crypto/chacha20.h>
#include <crypto/internal/skcipher.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
unsigned int bytes)
{
u8 buf[CHACHA20_BLOCK_SIZE];
while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
chacha20_4block_xor_neon(state, dst, src);
bytes -= CHACHA20_BLOCK_SIZE * 4;
src += CHACHA20_BLOCK_SIZE * 4;
dst += CHACHA20_BLOCK_SIZE * 4;
state[12] += 4;
}
while (bytes >= CHACHA20_BLOCK_SIZE) {
chacha20_block_xor_neon(state, dst, src);
bytes -= CHACHA20_BLOCK_SIZE;
src += CHACHA20_BLOCK_SIZE;
dst += CHACHA20_BLOCK_SIZE;
state[12]++;
}
if (bytes) {
memcpy(buf, src, bytes);
chacha20_block_xor_neon(state, buf, buf);
memcpy(dst, buf, bytes);
}
}
static int chacha20_neon(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u32 state[16];
int err;
if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
return crypto_chacha20_crypt(req);
err = skcipher_walk_virt(&walk, req, true);
crypto_chacha20_init(state, ctx, walk.iv);
kernel_neon_begin();
while (walk.nbytes > 0) {
unsigned int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes = round_down(nbytes, walk.stride);
chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
nbytes);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
kernel_neon_end();
return err;
}
static struct skcipher_alg alg = {
.base.cra_name = "chacha20",
.base.cra_driver_name = "chacha20-neon",
.base.cra_priority = 300,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct chacha20_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = CHACHA20_KEY_SIZE,
.max_keysize = CHACHA20_KEY_SIZE,
.ivsize = CHACHA20_IV_SIZE,
.chunksize = CHACHA20_BLOCK_SIZE,
.walksize = 4 * CHACHA20_BLOCK_SIZE,
.setkey = crypto_chacha20_setkey,
.encrypt = chacha20_neon,
.decrypt = chacha20_neon,
};
static int __init chacha20_simd_mod_init(void)
{
if (!(elf_hwcap & HWCAP_NEON))
return -ENODEV;
return crypto_register_skcipher(&alg);
}
static void __exit chacha20_simd_mod_fini(void)
{
crypto_unregister_skcipher(&alg);
}
module_init(chacha20_simd_mod_init);
module_exit(chacha20_simd_mod_fini);
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("chacha20");
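For context, here is a minimal sketch of requesting the resulting "chacha20" skcipher from other kernel code. The algorithm name and the key/IV sizes come from the registration above; the helper name, buffers, and error handling are illustrative, and a synchronous completion path from process context is assumed (both the generic fallback and this NEON driver complete inline):

	#include <crypto/chacha20.h>
	#include <crypto/skcipher.h>
	#include <linux/scatterlist.h>

	/* Encrypt len bytes of buf in place with ChaCha20 (illustrative sketch).
	 * buf must be addressable via a scatterlist (e.g. kmalloc'ed memory). */
	static int chacha20_encrypt_example(u8 *buf, unsigned int len,
					    const u8 key[CHACHA20_KEY_SIZE],
					    u8 iv[CHACHA20_IV_SIZE])
	{
		struct crypto_skcipher *tfm;
		struct skcipher_request *req;
		struct scatterlist sg;
		int err;

		tfm = crypto_alloc_skcipher("chacha20", 0, 0);
		if (IS_ERR(tfm))
			return PTR_ERR(tfm);

		err = crypto_skcipher_setkey(tfm, key, CHACHA20_KEY_SIZE);
		if (err)
			goto out_free_tfm;

		req = skcipher_request_alloc(tfm, GFP_KERNEL);
		if (!req) {
			err = -ENOMEM;
			goto out_free_tfm;
		}

		sg_init_one(&sg, buf, len);
		skcipher_request_set_callback(req, 0, NULL, NULL);
		skcipher_request_set_crypt(req, &sg, &sg, len, iv);
		err = crypto_skcipher_encrypt(req);

		skcipher_request_free(req);
	out_free_tfm:
		crypto_free_skcipher(tfm);
		return err;
	}

On NEON-capable hardware the priority-300 "chacha20-neon" implementation registered above is typically selected over the generic C implementation.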
...@@ -516,4 +516,3 @@ CONFIG_CRYPTO_GHASH_ARM64_CE=y ...@@ -516,4 +516,3 @@ CONFIG_CRYPTO_GHASH_ARM64_CE=y
CONFIG_CRYPTO_AES_ARM64_CE_CCM=y CONFIG_CRYPTO_AES_ARM64_CE_CCM=y
CONFIG_CRYPTO_AES_ARM64_CE_BLK=y CONFIG_CRYPTO_AES_ARM64_CE_BLK=y
# CONFIG_CRYPTO_AES_ARM64_NEON_BLK is not set # CONFIG_CRYPTO_AES_ARM64_NEON_BLK is not set
CONFIG_CRYPTO_CRC32_ARM64=y
...@@ -37,10 +37,14 @@ config CRYPTO_CRCT10DIF_ARM64_CE ...@@ -37,10 +37,14 @@ config CRYPTO_CRCT10DIF_ARM64_CE
select CRYPTO_HASH select CRYPTO_HASH
config CRYPTO_CRC32_ARM64_CE config CRYPTO_CRC32_ARM64_CE
tristate "CRC32 and CRC32C digest algorithms using PMULL instructions" tristate "CRC32 and CRC32C digest algorithms using ARMv8 extensions"
depends on KERNEL_MODE_NEON && CRC32 depends on CRC32
select CRYPTO_HASH select CRYPTO_HASH
config CRYPTO_AES_ARM64
tristate "AES core cipher using scalar instructions"
select CRYPTO_AES
config CRYPTO_AES_ARM64_CE config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions" tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON depends on ARM64 && KERNEL_MODE_NEON
...@@ -67,9 +71,17 @@ config CRYPTO_AES_ARM64_NEON_BLK ...@@ -67,9 +71,17 @@ config CRYPTO_AES_ARM64_NEON_BLK
select CRYPTO_AES select CRYPTO_AES
select CRYPTO_SIMD select CRYPTO_SIMD
config CRYPTO_CRC32_ARM64 config CRYPTO_CHACHA20_NEON
tristate "CRC32 and CRC32C using optional ARMv8 instructions" tristate "NEON accelerated ChaCha20 symmetric cipher"
depends on ARM64 depends on KERNEL_MODE_NEON
select CRYPTO_HASH select CRYPTO_BLKCIPHER
select CRYPTO_CHACHA20
config CRYPTO_AES_ARM64_BS
tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm"
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_AES_ARM64_NEON_BLK
select CRYPTO_SIMD
endif endif
...@@ -41,15 +41,20 @@ sha256-arm64-y := sha256-glue.o sha256-core.o ...@@ -41,15 +41,20 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
sha512-arm64-y := sha512-glue.o sha512-core.o sha512-arm64-y := sha512-glue.o sha512-core.o
obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o
aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o
obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o
aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
AFLAGS_aes-ce.o := -DINTERLEAVE=4 AFLAGS_aes-ce.o := -DINTERLEAVE=4
AFLAGS_aes-neon.o := -DINTERLEAVE=4 AFLAGS_aes-neon.o := -DINTERLEAVE=4
CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS
obj-$(CONFIG_CRYPTO_CRC32_ARM64) += crc32-arm64.o
CFLAGS_crc32-arm64.o := -mcpu=generic+crc
$(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE $(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE
$(call if_changed_rule,cc_o_c) $(call if_changed_rule,cc_o_c)
......
...@@ -258,7 +258,6 @@ static struct aead_alg ccm_aes_alg = { ...@@ -258,7 +258,6 @@ static struct aead_alg ccm_aes_alg = {
.cra_priority = 300, .cra_priority = 300,
.cra_blocksize = 1, .cra_blocksize = 1,
.cra_ctxsize = sizeof(struct crypto_aes_ctx), .cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE, .cra_module = THIS_MODULE,
}, },
.ivsize = AES_BLOCK_SIZE, .ivsize = AES_BLOCK_SIZE,
......
/*
* Scalar AES core transform
*
* Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
.text
rk .req x0
out .req x1
in .req x2
rounds .req x3
tt .req x4
lt .req x2
.macro __pair, enc, reg0, reg1, in0, in1e, in1d, shift
ubfx \reg0, \in0, #\shift, #8
.if \enc
ubfx \reg1, \in1e, #\shift, #8
.else
ubfx \reg1, \in1d, #\shift, #8
.endif
ldr \reg0, [tt, \reg0, uxtw #2]
ldr \reg1, [tt, \reg1, uxtw #2]
.endm
.macro __hround, out0, out1, in0, in1, in2, in3, t0, t1, enc
ldp \out0, \out1, [rk], #8
__pair \enc, w13, w14, \in0, \in1, \in3, 0
__pair \enc, w15, w16, \in1, \in2, \in0, 8
__pair \enc, w17, w18, \in2, \in3, \in1, 16
__pair \enc, \t0, \t1, \in3, \in0, \in2, 24
eor \out0, \out0, w13
eor \out1, \out1, w14
eor \out0, \out0, w15, ror #24
eor \out1, \out1, w16, ror #24
eor \out0, \out0, w17, ror #16
eor \out1, \out1, w18, ror #16
eor \out0, \out0, \t0, ror #8
eor \out1, \out1, \t1, ror #8
.endm
.macro fround, out0, out1, out2, out3, in0, in1, in2, in3
__hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
__hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
.endm
.macro iround, out0, out1, out2, out3, in0, in1, in2, in3
__hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
__hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
.endm
.macro do_crypt, round, ttab, ltab
ldp w5, w6, [in]
ldp w7, w8, [in, #8]
ldp w9, w10, [rk], #16
ldp w11, w12, [rk, #-8]
CPU_BE( rev w5, w5 )
CPU_BE( rev w6, w6 )
CPU_BE( rev w7, w7 )
CPU_BE( rev w8, w8 )
eor w5, w5, w9
eor w6, w6, w10
eor w7, w7, w11
eor w8, w8, w12
adr_l tt, \ttab
adr_l lt, \ltab
tbnz rounds, #1, 1f
0: \round w9, w10, w11, w12, w5, w6, w7, w8
\round w5, w6, w7, w8, w9, w10, w11, w12
1: subs rounds, rounds, #4
\round w9, w10, w11, w12, w5, w6, w7, w8
csel tt, tt, lt, hi
\round w5, w6, w7, w8, w9, w10, w11, w12
b.hi 0b
CPU_BE( rev w5, w5 )
CPU_BE( rev w6, w6 )
CPU_BE( rev w7, w7 )
CPU_BE( rev w8, w8 )
stp w5, w6, [out]
stp w7, w8, [out, #8]
ret
.endm
.align 5
ENTRY(__aes_arm64_encrypt)
do_crypt fround, crypto_ft_tab, crypto_fl_tab
ENDPROC(__aes_arm64_encrypt)
.align 5
ENTRY(__aes_arm64_decrypt)
do_crypt iround, crypto_it_tab, crypto_il_tab
ENDPROC(__aes_arm64_decrypt)
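The __pair/__hround macros above encode the classic T-table formulation of an AES round: ubfx extracts one state byte, that byte indexes a single 256-entry 32-bit table (crypto_ft_tab or crypto_it_tab), and the ror operands derive the other three column tables by rotation, keeping the lookup footprint to one table. A rough scalar equivalent of the recombination for one output word (illustrative only; the ShiftRows byte selection handled by __pair, and the encrypt/decrypt differences, are omitted):

	#include <linux/bitops.h>	/* rol32() */
	#include <linux/types.h>

	/*
	 * Illustrative sketch: combine four lookups in one 256 x 32-bit table
	 * into a round-output word, mirroring the ror #24/#16/#8 operands used
	 * by __hround above (ror #24 on AArch64 is a rotate-left by 8, etc.).
	 */
	static u32 aes_round_column(const u32 tab[256], u32 rk,
				    u8 b0, u8 b1, u8 b2, u8 b3)
	{
		return rk ^ tab[b0] ^ rol32(tab[b1], 8) ^
		       rol32(tab[b2], 16) ^ rol32(tab[b3], 24);
	}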
/*
* Scalar AES core transform
*
* Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <crypto/aes.h>
#include <linux/crypto.h>
#include <linux/module.h>
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
EXPORT_SYMBOL(__aes_arm64_encrypt);
asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
EXPORT_SYMBOL(__aes_arm64_decrypt);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
__aes_arm64_encrypt(ctx->key_enc, out, in, rounds);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
__aes_arm64_decrypt(ctx->key_dec, out, in, rounds);
}
static struct crypto_alg aes_alg = {
.cra_name = "aes",
.cra_driver_name = "aes-arm64",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
.cra_cipher.cia_min_keysize = AES_MIN_KEY_SIZE,
.cra_cipher.cia_max_keysize = AES_MAX_KEY_SIZE,
.cra_cipher.cia_setkey = crypto_aes_set_key,
.cra_cipher.cia_encrypt = aes_encrypt,
.cra_cipher.cia_decrypt = aes_decrypt
};
static int __init aes_init(void)
{
return crypto_register_alg(&aes_alg);
}
static void __exit aes_fini(void)
{
crypto_unregister_alg(&aes_alg);
}
module_init(aes_init);
module_exit(aes_fini);
MODULE_DESCRIPTION("Scalar AES cipher for arm64");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("aes");
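Below is a minimal sketch of exercising the resulting transform for a single block through the crypto_cipher interface (illustrative only; the crypto core picks the highest-priority registered "aes" implementation, and this scalar version registers at priority 200, so it mainly serves as a fallback and as the core cipher for templates):

	#include <crypto/aes.h>
	#include <linux/crypto.h>
	#include <linux/err.h>

	/* Encrypt one 16-byte block with AES-128 (illustrative sketch). */
	static int aes_one_block_example(const u8 key[AES_KEYSIZE_128],
					 const u8 in[AES_BLOCK_SIZE],
					 u8 out[AES_BLOCK_SIZE])
	{
		struct crypto_cipher *tfm;
		int err;

		tfm = crypto_alloc_cipher("aes", 0, 0);
		if (IS_ERR(tfm))
			return PTR_ERR(tfm);

		err = crypto_cipher_setkey(tfm, key, AES_KEYSIZE_128);
		if (!err)
			crypto_cipher_encrypt_one(tfm, out, in);

		crypto_free_cipher(tfm);
		return err;
	}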
/* /*
* linux/arch/arm64/crypto/aes-glue.c - wrapper code for ARMv8 AES * linux/arch/arm64/crypto/aes-glue.c - wrapper code for ARMv8 AES
* *
* Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org> * Copyright (C) 2013 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
* *
* This program is free software; you can redistribute it and/or modify * This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as * it under the terms of the GNU General Public License version 2 as
...@@ -11,6 +11,7 @@ ...@@ -11,6 +11,7 @@
#include <asm/neon.h> #include <asm/neon.h>
#include <asm/hwcap.h> #include <asm/hwcap.h>
#include <crypto/aes.h> #include <crypto/aes.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h> #include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h> #include <crypto/internal/skcipher.h>
#include <linux/module.h> #include <linux/module.h>
...@@ -31,6 +32,7 @@ ...@@ -31,6 +32,7 @@
#define aes_ctr_encrypt ce_aes_ctr_encrypt #define aes_ctr_encrypt ce_aes_ctr_encrypt
#define aes_xts_encrypt ce_aes_xts_encrypt #define aes_xts_encrypt ce_aes_xts_encrypt
#define aes_xts_decrypt ce_aes_xts_decrypt #define aes_xts_decrypt ce_aes_xts_decrypt
#define aes_mac_update ce_aes_mac_update
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions"); MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else #else
#define MODE "neon" #define MODE "neon"
...@@ -44,11 +46,15 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions"); ...@@ -44,11 +46,15 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#define aes_ctr_encrypt neon_aes_ctr_encrypt #define aes_ctr_encrypt neon_aes_ctr_encrypt
#define aes_xts_encrypt neon_aes_xts_encrypt #define aes_xts_encrypt neon_aes_xts_encrypt
#define aes_xts_decrypt neon_aes_xts_decrypt #define aes_xts_decrypt neon_aes_xts_decrypt
#define aes_mac_update neon_aes_mac_update
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 NEON"); MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 NEON");
MODULE_ALIAS_CRYPTO("ecb(aes)"); MODULE_ALIAS_CRYPTO("ecb(aes)");
MODULE_ALIAS_CRYPTO("cbc(aes)"); MODULE_ALIAS_CRYPTO("cbc(aes)");
MODULE_ALIAS_CRYPTO("ctr(aes)"); MODULE_ALIAS_CRYPTO("ctr(aes)");
MODULE_ALIAS_CRYPTO("xts(aes)"); MODULE_ALIAS_CRYPTO("xts(aes)");
MODULE_ALIAS_CRYPTO("cmac(aes)");
MODULE_ALIAS_CRYPTO("xcbc(aes)");
MODULE_ALIAS_CRYPTO("cbcmac(aes)");
#endif #endif
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>"); MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
...@@ -75,11 +81,25 @@ asmlinkage void aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], ...@@ -75,11 +81,25 @@ asmlinkage void aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[],
int rounds, int blocks, u8 const rk2[], u8 iv[], int rounds, int blocks, u8 const rk2[], u8 iv[],
int first); int first);
asmlinkage void aes_mac_update(u8 const in[], u32 const rk[], int rounds,
int blocks, u8 dg[], int enc_before,
int enc_after);
struct crypto_aes_xts_ctx { struct crypto_aes_xts_ctx {
struct crypto_aes_ctx key1; struct crypto_aes_ctx key1;
struct crypto_aes_ctx __aligned(8) key2; struct crypto_aes_ctx __aligned(8) key2;
}; };
struct mac_tfm_ctx {
struct crypto_aes_ctx key;
u8 __aligned(8) consts[];
};
struct mac_desc_ctx {
unsigned int len;
u8 dg[AES_BLOCK_SIZE];
};
static int skcipher_aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key, static int skcipher_aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len) unsigned int key_len)
{ {
...@@ -215,14 +235,15 @@ static int ctr_encrypt(struct skcipher_request *req) ...@@ -215,14 +235,15 @@ static int ctr_encrypt(struct skcipher_request *req)
u8 *tsrc = walk.src.virt.addr; u8 *tsrc = walk.src.virt.addr;
/* /*
* Minimum alignment is 8 bytes, so if nbytes is <= 8, we need * Tell aes_ctr_encrypt() to process a tail block.
* to tell aes_ctr_encrypt() to only read half a block.
*/ */
blocks = (nbytes <= 8) ? -1 : 1; blocks = -1;
aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc, rounds, aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds,
blocks, walk.iv, first); blocks, walk.iv, first);
memcpy(tdst, tail, nbytes); if (tdst != tsrc)
memcpy(tdst, tsrc, nbytes);
crypto_xor(tdst, tail, nbytes);
err = skcipher_walk_done(&walk, 0); err = skcipher_walk_done(&walk, 0);
} }
kernel_neon_end(); kernel_neon_end();
...@@ -282,7 +303,6 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -282,7 +303,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL, .cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE, .cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx), .cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE, .cra_module = THIS_MODULE,
}, },
.min_keysize = AES_MIN_KEY_SIZE, .min_keysize = AES_MIN_KEY_SIZE,
...@@ -298,7 +318,6 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -298,7 +318,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL, .cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE, .cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx), .cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE, .cra_module = THIS_MODULE,
}, },
.min_keysize = AES_MIN_KEY_SIZE, .min_keysize = AES_MIN_KEY_SIZE,
...@@ -315,7 +334,22 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -315,7 +334,22 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL, .cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1, .cra_blocksize = 1,
.cra_ctxsize = sizeof(struct crypto_aes_ctx), .cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7, .cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.chunksize = AES_BLOCK_SIZE,
.setkey = skcipher_aes_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {
.base = {
.cra_name = "ctr(aes)",
.cra_driver_name = "ctr-aes-" MODE,
.cra_priority = PRIO - 1,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE, .cra_module = THIS_MODULE,
}, },
.min_keysize = AES_MIN_KEY_SIZE, .min_keysize = AES_MIN_KEY_SIZE,
...@@ -333,7 +367,6 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -333,7 +367,6 @@ static struct skcipher_alg aes_algs[] = { {
.cra_flags = CRYPTO_ALG_INTERNAL, .cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE, .cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_xts_ctx), .cra_ctxsize = sizeof(struct crypto_aes_xts_ctx),
.cra_alignmask = 7,
.cra_module = THIS_MODULE, .cra_module = THIS_MODULE,
}, },
.min_keysize = 2 * AES_MIN_KEY_SIZE, .min_keysize = 2 * AES_MIN_KEY_SIZE,
...@@ -344,15 +377,228 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -344,15 +377,228 @@ static struct skcipher_alg aes_algs[] = { {
.decrypt = xts_decrypt, .decrypt = xts_decrypt,
} }; } };
static int cbcmac_setkey(struct crypto_shash *tfm, const u8 *in_key,
unsigned int key_len)
{
struct mac_tfm_ctx *ctx = crypto_shash_ctx(tfm);
int err;
err = aes_expandkey(&ctx->key, in_key, key_len);
if (err)
crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return err;
}
static void cmac_gf128_mul_by_x(be128 *y, const be128 *x)
{
u64 a = be64_to_cpu(x->a);
u64 b = be64_to_cpu(x->b);
y->a = cpu_to_be64((a << 1) | (b >> 63));
y->b = cpu_to_be64((b << 1) ^ ((a >> 63) ? 0x87 : 0));
}
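cmac_gf128_mul_by_x() is the GF(2^128) "doubling" used for CMAC subkey derivation: the 128-bit value is shifted left by one bit and, whenever the top bit falls out, reduced by XORing 0x87 into the low byte. cmac_setkey() below applies it twice to the encrypted all-zero block to obtain the two subkeys. A byte-wise equivalent, shown only for illustration:

	/*
	 * Illustrative byte-wise equivalent of cmac_gf128_mul_by_x() above:
	 * shift a 16-byte big-endian value left by one bit, reduce with 0x87.
	 */
	static void gf128_double_bytes(u8 out[16], const u8 in[16])
	{
		u8 carry = in[0] >> 7;	/* top bit shifted out */
		int i;

		for (i = 0; i < 15; i++)
			out[i] = (in[i] << 1) | (in[i + 1] >> 7);
		out[15] = (in[15] << 1) ^ (carry ? 0x87 : 0);
	}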
static int cmac_setkey(struct crypto_shash *tfm, const u8 *in_key,
unsigned int key_len)
{
struct mac_tfm_ctx *ctx = crypto_shash_ctx(tfm);
be128 *consts = (be128 *)ctx->consts;
u8 *rk = (u8 *)ctx->key.key_enc;
int rounds = 6 + key_len / 4;
int err;
err = cbcmac_setkey(tfm, in_key, key_len);
if (err)
return err;
/* encrypt the zero vector */
kernel_neon_begin();
aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, rk, rounds, 1, 1);
kernel_neon_end();
cmac_gf128_mul_by_x(consts, consts);
cmac_gf128_mul_by_x(consts + 1, consts);
return 0;
}
static int xcbc_setkey(struct crypto_shash *tfm, const u8 *in_key,
unsigned int key_len)
{
static u8 const ks[3][AES_BLOCK_SIZE] = {
{ [0 ... AES_BLOCK_SIZE - 1] = 0x1 },
{ [0 ... AES_BLOCK_SIZE - 1] = 0x2 },
{ [0 ... AES_BLOCK_SIZE - 1] = 0x3 },
};
struct mac_tfm_ctx *ctx = crypto_shash_ctx(tfm);
u8 *rk = (u8 *)ctx->key.key_enc;
int rounds = 6 + key_len / 4;
u8 key[AES_BLOCK_SIZE];
int err;
err = cbcmac_setkey(tfm, in_key, key_len);
if (err)
return err;
kernel_neon_begin();
aes_ecb_encrypt(key, ks[0], rk, rounds, 1, 1);
aes_ecb_encrypt(ctx->consts, ks[1], rk, rounds, 2, 0);
kernel_neon_end();
return cbcmac_setkey(tfm, key, sizeof(key));
}
static int mac_init(struct shash_desc *desc)
{
struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
memset(ctx->dg, 0, AES_BLOCK_SIZE);
ctx->len = 0;
return 0;
}
static int mac_update(struct shash_desc *desc, const u8 *p, unsigned int len)
{
struct mac_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
int rounds = 6 + tctx->key.key_length / 4;
while (len > 0) {
unsigned int l;
if ((ctx->len % AES_BLOCK_SIZE) == 0 &&
(ctx->len + len) > AES_BLOCK_SIZE) {
int blocks = len / AES_BLOCK_SIZE;
len %= AES_BLOCK_SIZE;
kernel_neon_begin();
aes_mac_update(p, tctx->key.key_enc, rounds, blocks,
ctx->dg, (ctx->len != 0), (len != 0));
kernel_neon_end();
p += blocks * AES_BLOCK_SIZE;
if (!len) {
ctx->len = AES_BLOCK_SIZE;
break;
}
ctx->len = 0;
}
l = min(len, AES_BLOCK_SIZE - ctx->len);
if (l <= AES_BLOCK_SIZE) {
crypto_xor(ctx->dg + ctx->len, p, l);
ctx->len += l;
len -= l;
p += l;
}
}
return 0;
}
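As a worked example of the update path above: with an empty buffer (ctx->len == 0) and a 40-byte update, the bulk branch processes 40 / 16 = 2 full blocks in a single aes_mac_update() call (enc_before = 0 because nothing was buffered, enc_after = 1 because 8 bytes remain), and the trailing 8 bytes are XOR-ed into ctx->dg to wait for the next update or the final call.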
static int cbcmac_final(struct shash_desc *desc, u8 *out)
{
struct mac_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
int rounds = 6 + tctx->key.key_length / 4;
kernel_neon_begin();
aes_mac_update(NULL, tctx->key.key_enc, rounds, 0, ctx->dg, 1, 0);
kernel_neon_end();
memcpy(out, ctx->dg, AES_BLOCK_SIZE);
return 0;
}
static int cmac_final(struct shash_desc *desc, u8 *out)
{
struct mac_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
int rounds = 6 + tctx->key.key_length / 4;
u8 *consts = tctx->consts;
if (ctx->len != AES_BLOCK_SIZE) {
ctx->dg[ctx->len] ^= 0x80;
consts += AES_BLOCK_SIZE;
}
kernel_neon_begin();
aes_mac_update(consts, tctx->key.key_enc, rounds, 1, ctx->dg, 0, 1);
kernel_neon_end();
memcpy(out, ctx->dg, AES_BLOCK_SIZE);
return 0;
}
static struct shash_alg mac_algs[] = { {
.base.cra_name = "cmac(aes)",
.base.cra_driver_name = "cmac-aes-" MODE,
.base.cra_priority = PRIO,
.base.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct mac_tfm_ctx) +
2 * AES_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = AES_BLOCK_SIZE,
.init = mac_init,
.update = mac_update,
.final = cmac_final,
.setkey = cmac_setkey,
.descsize = sizeof(struct mac_desc_ctx),
}, {
.base.cra_name = "xcbc(aes)",
.base.cra_driver_name = "xcbc-aes-" MODE,
.base.cra_priority = PRIO,
.base.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct mac_tfm_ctx) +
2 * AES_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = AES_BLOCK_SIZE,
.init = mac_init,
.update = mac_update,
.final = cmac_final,
.setkey = xcbc_setkey,
.descsize = sizeof(struct mac_desc_ctx),
}, {
.base.cra_name = "cbcmac(aes)",
.base.cra_driver_name = "cbcmac-aes-" MODE,
.base.cra_priority = PRIO,
.base.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct mac_tfm_ctx),
.base.cra_module = THIS_MODULE,
.digestsize = AES_BLOCK_SIZE,
.init = mac_init,
.update = mac_update,
.final = cbcmac_final,
.setkey = cbcmac_setkey,
.descsize = sizeof(struct mac_desc_ctx),
} };
static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)]; static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)];
static void aes_exit(void) static void aes_exit(void)
{ {
int i; int i;
for (i = 0; i < ARRAY_SIZE(aes_simd_algs) && aes_simd_algs[i]; i++) for (i = 0; i < ARRAY_SIZE(aes_simd_algs); i++)
simd_skcipher_free(aes_simd_algs[i]); if (aes_simd_algs[i])
simd_skcipher_free(aes_simd_algs[i]);
crypto_unregister_shashes(mac_algs, ARRAY_SIZE(mac_algs));
crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs)); crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
} }
...@@ -369,7 +615,14 @@ static int __init aes_init(void) ...@@ -369,7 +615,14 @@ static int __init aes_init(void)
if (err) if (err)
return err; return err;
err = crypto_register_shashes(mac_algs, ARRAY_SIZE(mac_algs));
if (err)
goto unregister_ciphers;
for (i = 0; i < ARRAY_SIZE(aes_algs); i++) { for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
continue;
algname = aes_algs[i].base.cra_name + 2; algname = aes_algs[i].base.cra_name + 2;
drvname = aes_algs[i].base.cra_driver_name + 2; drvname = aes_algs[i].base.cra_driver_name + 2;
basename = aes_algs[i].base.cra_driver_name; basename = aes_algs[i].base.cra_driver_name;
...@@ -385,6 +638,8 @@ static int __init aes_init(void) ...@@ -385,6 +638,8 @@ static int __init aes_init(void)
unregister_simds: unregister_simds:
aes_exit(); aes_exit();
unregister_ciphers:
crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
return err; return err;
} }
...@@ -392,5 +647,7 @@ static int __init aes_init(void) ...@@ -392,5 +647,7 @@ static int __init aes_init(void)
module_cpu_feature_match(AES, aes_init); module_cpu_feature_match(AES, aes_init);
#else #else
module_init(aes_init); module_init(aes_init);
EXPORT_SYMBOL(neon_aes_ecb_encrypt);
EXPORT_SYMBOL(neon_aes_cbc_encrypt);
#endif #endif
module_exit(aes_exit); module_exit(aes_exit);
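A minimal sketch of driving the newly registered "cmac(aes)" shash from kernel code follows. It is illustrative only: the helper name, allocation style, and error handling are placeholders, and the synchronous shash interface is assumed to be called from process context so the NEON backend can run (the cbcmac(aes) variant is normally consumed internally by the ccm(aes) AEAD rather than called directly):

	#include <crypto/aes.h>
	#include <crypto/hash.h>
	#include <linux/slab.h>

	/* Compute CMAC-AES over a flat buffer (illustrative sketch). */
	static int cmac_aes_digest_example(const u8 *key, unsigned int keylen,
					   const u8 *data, unsigned int len,
					   u8 mac[AES_BLOCK_SIZE])
	{
		struct crypto_shash *tfm;
		struct shash_desc *desc;
		int err;

		tfm = crypto_alloc_shash("cmac(aes)", 0, 0);
		if (IS_ERR(tfm))
			return PTR_ERR(tfm);

		err = crypto_shash_setkey(tfm, key, keylen);
		if (err)
			goto out_free_tfm;

		desc = kzalloc(sizeof(*desc) + crypto_shash_descsize(tfm),
			       GFP_KERNEL);
		if (!desc) {
			err = -ENOMEM;
			goto out_free_tfm;
		}
		desc->tfm = tfm;

		err = crypto_shash_digest(desc, data, len, mac);
		kfree(desc);
	out_free_tfm:
		crypto_free_shash(tfm);
		return err;
	}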
(Further file diffs in this commit are collapsed by the web viewer and omitted here.)
...@@ -122,23 +122,39 @@ ...@@ -122,23 +122,39 @@
#include <linux/linkage.h> #include <linux/linkage.h>
#include <asm/inst.h> #include <asm/inst.h>
.data # constants in mergeable sections, linker can reorder and merge
.section .rodata.cst16.POLY, "aM", @progbits, 16
.align 16 .align 16
POLY: .octa 0xC2000000000000000000000000000001 POLY: .octa 0xC2000000000000000000000000000001
.section .rodata.cst16.POLY2, "aM", @progbits, 16
.align 16
POLY2: .octa 0xC20000000000000000000001C2000000 POLY2: .octa 0xC20000000000000000000001C2000000
TWOONE: .octa 0x00000001000000000000000000000001
# order of these constants should not change. .section .rodata.cst16.TWOONE, "aM", @progbits, 16
# more specifically, ALL_F should follow SHIFT_MASK, and ZERO should follow ALL_F .align 16
TWOONE: .octa 0x00000001000000000000000000000001
.section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16
.align 16
SHUF_MASK: .octa 0x000102030405060708090A0B0C0D0E0F SHUF_MASK: .octa 0x000102030405060708090A0B0C0D0E0F
SHIFT_MASK: .octa 0x0f0e0d0c0b0a09080706050403020100
ALL_F: .octa 0xffffffffffffffffffffffffffffffff .section .rodata.cst16.ONE, "aM", @progbits, 16
ZERO: .octa 0x00000000000000000000000000000000 .align 16
ONE: .octa 0x00000000000000000000000000000001 ONE: .octa 0x00000000000000000000000000000001
.section .rodata.cst16.ONEf, "aM", @progbits, 16
.align 16
ONEf: .octa 0x01000000000000000000000000000000 ONEf: .octa 0x01000000000000000000000000000000
# order of these constants should not change.
# more specifically, ALL_F should follow SHIFT_MASK, and zero should follow ALL_F
.section .rodata, "a", @progbits
.align 16
SHIFT_MASK: .octa 0x0f0e0d0c0b0a09080706050403020100
ALL_F: .octa 0xffffffffffffffffffffffffffffffff
.octa 0x00000000000000000000000000000000
.text .text
......
(The remaining file diffs in this commit are collapsed by the web viewer and omitted here.)