Though the ideal max_splits for
million (or so) on x86 seems
to be substantially larger, enabling a roughly 15% speedup for such tests,
this optimization isn't general, and doesn't apply for
million. Too large a max_splits
can make the sort take more than twice as long, so it should be set at
the low end of the reasonable range, where it is right now.