    Apply auto-vectorization to the inner loop of numeric multiplication. · 88709176
    Tom Lane authored
    Compile numeric.c with -ftree-vectorize where available, and adjust
    the innermost loop of mul_var() so that it is amenable to being
    auto-vectorized.  (Mainly, that involves making it process the arrays
    left-to-right not right-to-left.)
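
As a rough illustration of why the traversal order can be changed, here is a minimal sketch and not the actual mul_var() code. The names digit_t, mul_accumulate and propagate_carries are hypothetical, and the sketch assumes base-10000 digits stored most significant first, with inputs short enough (at most 21 digits in the shorter operand) that the 32-bit accumulators cannot overflow. Because the inner loop only adds partial products and defers carry propagation to a separate pass, each iteration is independent, so the loop can walk the arrays in ascending order with unit stride, which is the shape auto-vectorizers handle best.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a base-10000 digit type (an assumption of this sketch). */
typedef int16_t digit_t;

/*
 * Multiply a[0..na-1] by b[0..nb-1], both stored most significant digit
 * first, accumulating raw partial products into acc[0..na+nb-1].
 * No carries are propagated here.
 */
static void
mul_accumulate(const digit_t *a, int na, const digit_t *b, int nb, int32_t *acc)
{
    /* Assumption: short inputs, so sums of products fit in int32_t. */
    assert(na >= 0 && nb >= 0);
    assert((na < nb ? na : nb) <= 21);

    for (int i = 0; i < na + nb; i++)
        acc[i] = 0;

    for (int i1 = na - 1; i1 >= 0; i1--)
    {
        int32_t  a_digit = a[i1];
        int32_t *acc_row = &acc[i1 + 1];    /* a[i1] * b[i2] lands in acc[i1 + i2 + 1] */

        if (a_digit == 0)
            continue;

        /*
         * Inner loop: ascending index, unit stride, no carry dependency
         * between iterations, so the order of traversal does not affect
         * the result.
         */
        for (int i2 = 0; i2 < nb; i2++)
            acc_row[i2] += a_digit * b[i2];
    }
}

/* Separate right-to-left pass that reduces each accumulator to one digit. */
static void
propagate_carries(int32_t *acc, int n)
{
    int32_t carry = 0;

    for (int i = n - 1; i >= 0; i--)
    {
        int32_t v = acc[i] + carry;

        carry = v / 10000;
        acc[i] = v % 10000;
    }
}
```

Whether a given compiler actually vectorizes the inner loop depends on the compiler version and flags; the point of the sketch is only that the carry-free accumulation makes left-to-right and right-to-left traversal equivalent.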
    
    Applying -ftree-vectorize actually makes numeric.o smaller, at least
    with my compiler (gcc 8.3.1 on x86_64), and it's a little faster too.
    Independently of that, fixing the inner loop to be vectorizable also
    makes things a bit faster.  But doing both is a huge win for
    multiplications with lots of digits.  For me, the numeric regression
    test is the same speed to within measurement noise, but numeric_big
    is a full 45% faster.
    
    We also looked into applying -funroll-loops, but that makes numeric.o
    bloat quite a bit, and the additional speed improvement is very
    marginal.
    
    Amit Khandekar, reviewed and edited a little by me
    
    Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com
Makefile 2.24 KB