crypto/aes/asm/aesni-sha1-x86_64.pl +2 −0
@@ -25,6 +25,7 @@
 # Sandy Bridge   5.05[+5.0(6.1)]   10.06(11.15)   5.98(7.05)   +68%(+58%)
 # Ivy Bridge     5.05[+4.6]        9.65           5.54         +74%
 # Haswell        4.43[+3.6(4.2)]   8.00(8.58)     4.55(5.21)   +75%(+65%)
+# Skylake        2.63[+3.5(4.1)]   6.17(6.69)     4.23(4.44)   +46%(+51%)
 # Bulldozer      5.77[+6.0]        11.72          6.37         +84%
 #
 # AES-192-CBC
@@ -39,6 +40,7 @@
 # Sandy Bridge   7.05   12.06(13.15)   7.12(7.72)   +69%(+70%)
 # Ivy Bridge     7.05   11.65          7.12         +64%
 # Haswell        6.19   9.76(10.34)    6.21(6.25)   +57%(+65%)
+# Skylake        3.62   7.16(7.68)     4.56(4.76)   +57%(+61$)
 # Bulldozer      8.00   13.95          8.25         +69%
 #
 # (*) There are two code paths: SSSE3 and AVX. See sha1-568.pl for

crypto/aes/asm/aesni-sha256-x86_64.pl +1 −0
@@ -25,6 +25,7 @@
 # Sandy Bridge   5.05/6.05/7.05+11.6   13.0   +28%/36%/43%
 # Ivy Bridge     5.05/6.05/7.05+10.3   11.6   +32%/41%/50%
 # Haswell        4.43/5.29/6.19+7.80   8.79   +39%/49%/59%
+# Skylake        2.62/3.14/3.62+7.70   8.10   +27%/34%/40%
 # Bulldozer      5.77/6.89/8.00+13.7   13.7   +42%/50%/58%
 #
 # (*) there are XOP, AVX1 and AVX2 code pathes, meaning that
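
The gain column in the rows above appears to be derived from the other columns. The following is a minimal sketch, assuming (the column headers are not visible in this hunk) that the first column lists AES cycles per byte for 128/192/256-bit keys followed by standalone SHA256 cycles per byte, and that the second column is the stitched result. The function name `stitch_gain` is hypothetical and used only for illustration.

```python
def stitch_gain(aes_cpb, sha_cpb, stitched_cpb):
    """Percentage gain of the stitched code over running AES and SHA
    back to back, with all inputs in cycles per byte (assumed meaning
    of the table columns)."""
    return ((aes_cpb + sha_cpb) / stitched_cpb - 1.0) * 100.0


# Values copied from the Skylake row of the hunk above.
skylake_aes = [2.62, 3.14, 3.62]   # AES-128/-192/-256, cycles per byte
skylake_sha256 = 7.70              # standalone SHA256, cycles per byte
skylake_stitched = 8.10            # stitched AES+SHA256, cycles per byte

gains = [round(stitch_gain(a, skylake_sha256, skylake_stitched))
         for a in skylake_aes]
print(gains)  # [27, 34, 40]
```

Under that assumption the computed values match the +27%/34%/40% recorded for Skylake.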

crypto/aes/asm/aesni-x86_64.pl +1 −0
@@ -165,6 +165,7 @@
 # Westmere     3.77/1.25   1.25   1.25   1.26
 # * Bridge     5.07/0.74   0.75   0.90   0.85
 # Haswell      4.44/0.63   0.63   0.73   0.63
+# Skylake      2.62/0.63   0.63   0.63   0.63
 # Silvermont   5.75/3.54   3.56   4.12   3.87(*)
 # Bulldozer    5.77/0.70   0.72   0.90   0.70
 #

crypto/modes/asm/aesni-gcm-x86_64.pl +5 −4
@@ -22,10 +22,11 @@
 # [1] and [2], with MOVBE twist suggested by Ilya Albrekht and Max
 # Locktyukhin of Intel Corp. who verified that it reduces shuffles
 # pressure with notable relative improvement, achieving 1.0 cycle per
-# byte processed with 128-bit key on Haswell processor, and 0.74 -
-# on Broadwell. [Mentioned results are raw profiled measurements for
-# favourable packet size, one divisible by 96. Applications using the
-# EVP interface will observe a few percent worse performance.]
+# byte processed with 128-bit key on Haswell processor, 0.74 - on
+# Broadwell, 0.63 - on Skylake... [Mentioned results are raw profiled
+# measurements for favourable packet size, one divisible by 96.
+# Applications using the EVP interface will observe a few percent
+# worse performance.]
 #
 # [1] http://rt.openssl.org/Ticket/Display.html?id=2900&user=guest&pass=guest
 # [2] http://www.intel.com/content/dam/www/public/us/en/documents/software-support/enabling-high-performance-gcm.pdf
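
To put the cycles-per-byte figures in this comment into more familiar terms, they can be converted to throughput once a core clock is fixed. The sketch below is only illustrative: the 3.0 GHz clock is a hypothetical value, not something stated in the diff, and the helper name `throughput_gb_per_s` is made up for this example. Real throughput also depends on packet size and on the EVP overhead the comment mentions.

```python
def throughput_gb_per_s(cycles_per_byte, clock_ghz):
    """Convert cycles per byte into GB/s (10**9 bytes per second)
    for a single core running at clock_ghz."""
    return clock_ghz / cycles_per_byte


ASSUMED_CLOCK_GHZ = 3.0  # hypothetical clock, chosen only for illustration

# AES-128-GCM figures quoted in the comment above, in cycles per byte.
for cpu, cpb in [("Haswell", 1.0), ("Broadwell", 0.74), ("Skylake", 0.63)]:
    rate = throughput_gb_per_s(cpb, ASSUMED_CLOCK_GHZ)
    print(f"{cpu}: {rate:.2f} GB/s at {ASSUMED_CLOCK_GHZ} GHz")
```

Because the conversion is a simple division, going from 1.0 to 0.63 cycles per byte at the same clock corresponds to roughly 1.6 times the throughput.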

crypto/modes/asm/ghash-x86_64.pl +3 −2
@@ -64,6 +64,7 @@
 # Ivy Bridge   1.80(+7%)
 # Haswell      0.55(+93%)   (if system doesn't support AVX)
 # Broadwell    0.45(+110%)  (if system doesn't support AVX)
+# Skylake      0.44(+110%)  (if system doesn't support AVX)
 # Bulldozer    1.49(+27%)
 # Silvermont   2.88(+13%)
@@ -74,8 +75,8 @@
 # CPUs such as Sandy and Ivy Bridge can execute it, the code performs
 # sub-optimally in comparison to above mentioned version. But thanks
 # to Ilya Albrekht and Max Locktyukhin of Intel Corp. we knew that
-# it performs in 0.41 cycles per byte on Haswell processor, and in
-# 0.29 on Broadwell.
+# it performs in 0.41 cycles per byte on Haswell processor, in
+# 0.29 on Broadwell, and in 0.36 on Skylake.
 #
 # [1] http://rt.openssl.org/Ticket/Display.html?id=2900&user=guest&pass=guest