aes/asm/aesni-x86_64.pl: further optimization for Atom Silvermont.
Improve CBC decrypt and CTR by ~13/16%, which adds up to ~25/33%
improvement over "pre-Silvermont" version. [Add performance table to
aesni-x86.pl].
(cherry picked from commit 5599c733)