There is a ppro flag in cast-586 which turns on or off the
generation of Pentium Pro/II friendly code.

This flag makes the inner loop one cycle longer, but generates
code that runs 30% faster on the Pentium Pro/II, while being only 7%
slower on the Pentium.  By default, this flag is on.