    tableam: Add and use scan APIs. · c2fe139c
    Andres Freund authored
    To allow table accesses to not depend directly on heap, several
    new abstractions are needed. Specifically:
    
    1) Heap scans need to be generalized into table scans. Do this by
       introducing TableScanDesc, which will be the "base class" for
       individual AMs. This contains the AM independent fields from
       HeapScanDesc.
    
       The previous heap_{beginscan,rescan,endscan} et al. have been
       replaced with table_ versions.
    
       There's no direct replacement for heap_getnext(), as that returned
       a HeapTuple, which is undesirable for other AMs. Instead there's
       table_scan_getnextslot().  But note that heap_getnext() lives on;
       it's still used widely to access catalog tables.
    
       This is achieved by new scan_begin, scan_end, scan_rescan,
       scan_getnextslot callbacks.
    
    2) The portion of parallel scans that's shared between backends needs
       to be set up without the user doing per-AM work. To achieve
       that, new parallelscan_{estimate, initialize, reinitialize}
       callbacks are introduced, which operate on a new
       ParallelTableScanDesc, which again can be subclassed by AMs.
    
       As it is likely that several AMs are going to be block oriented,
       block oriented callbacks that can be shared between such AMs are
       provided and used by heap. These are
       table_block_parallelscan_{estimate, initialize, reinitialize} as
       callbacks, and table_block_parallelscan_{nextpage, init} for use
       in AMs. These operate on a ParallelBlockTableScanDesc.
    
    3) Index scans need to be able to access tables to return a tuple, and
       state such as buffers needs to be kept across individual accesses
       to the heap. That's now handled by introducing a
       sort-of-scan IndexFetchTable, which again is intended to be
       subclassed by individual AMs (for heap IndexFetchHeap).
    
       The relevant callbacks for an AM are index_fetch_{end, begin,
       reset} to create the necessary state, and index_fetch_tuple to
       retrieve an indexed tuple.  Note that index_fetch_tuple
       implementations need to be smarter than just blindly fetching the
       tuples for AMs that have optimizations similar to heap's HOT - the
       currently alive tuple in the update chain needs to be fetched if
       appropriate.
    
       Similar to table_scan_getnextslot(), it's undesirable to continue
       to return HeapTuples. Thus index_fetch_heap (might want to rename
       that later) now accepts a slot as an argument. Core code doesn't
       have a lot of call sites performing index scans without going
       through the systable_* API (in contrast to loads of heap_getnext
       calls and working directly with HeapTuples).
    
       Index scans now store the result of a search in
       IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the
       target is not generally a HeapTuple anymore that seems cleaner.
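
    The "base class" plus callbacks pattern from 1) can be sketched as
    follows. This is a minimal, self-contained toy, not the PostgreSQL
    code: every Toy*/toy_* name, the int-array "relation", and the
    simplified signatures are assumptions made for illustration. Only
    the callback names scan_begin, scan_rescan, scan_getnextslot and
    scan_end are taken from the text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical slot: real slots hold a full tuple, this one holds an int. */
typedef struct ToySlot { int value; bool empty; } ToySlot;

struct ToyTableAm;

/* AM-independent "base class": each AM's scan state starts with this. */
typedef struct ToyTableScanDesc {
    const struct ToyTableAm *am;  /* callbacks of the owning AM */
    const int *rel_data;          /* hypothetical relation: an int array */
    size_t rel_len;
} ToyTableScanDesc;

/* Callback table; field names follow the callbacks named in the text. */
typedef struct ToyTableAm {
    ToyTableScanDesc *(*scan_begin)(const int *data, size_t len);
    void (*scan_rescan)(ToyTableScanDesc *scan);
    bool (*scan_getnextslot)(ToyTableScanDesc *scan, ToySlot *slot);
    void (*scan_end)(ToyTableScanDesc *scan);
} ToyTableAm;

/* AM-specific "subclass": the base struct must be the first member. */
typedef struct ToyHeapScanDesc {
    ToyTableScanDesc base;
    size_t next;                  /* AM-private scan position */
} ToyHeapScanDesc;

ToyTableScanDesc *toy_heap_scan_begin(const int *data, size_t len)
{
    ToyHeapScanDesc *hscan = malloc(sizeof(*hscan));

    if (hscan == NULL)
        return NULL;
    hscan->base.am = NULL;        /* filled in by toy_table_beginscan() */
    hscan->base.rel_data = data;
    hscan->base.rel_len = len;
    hscan->next = 0;
    return &hscan->base;
}

void toy_heap_scan_rescan(ToyTableScanDesc *scan)
{
    ((ToyHeapScanDesc *) scan)->next = 0;
}

bool toy_heap_scan_getnextslot(ToyTableScanDesc *scan, ToySlot *slot)
{
    ToyHeapScanDesc *hscan = (ToyHeapScanDesc *) scan;

    if (hscan->next >= scan->rel_len)
    {
        slot->empty = true;
        return false;
    }
    slot->value = scan->rel_data[hscan->next++];
    slot->empty = false;
    return true;
}

void toy_heap_scan_end(ToyTableScanDesc *scan)
{
    free(scan);                   /* base is the first member, same address */
}

const ToyTableAm toy_heap_am = {
    toy_heap_scan_begin, toy_heap_scan_rescan,
    toy_heap_scan_getnextslot, toy_heap_scan_end
};

/* AM-independent wrappers: callers never touch ToyHeapScanDesc. */
ToyTableScanDesc *toy_table_beginscan(const ToyTableAm *am,
                                      const int *data, size_t len)
{
    ToyTableScanDesc *scan = am->scan_begin(data, len);

    if (scan != NULL)
        scan->am = am;
    return scan;
}

bool toy_table_scan_getnextslot(ToyTableScanDesc *scan, ToySlot *slot)
{
    return scan->am->scan_getnextslot(scan, slot);
}

void toy_table_endscan(ToyTableScanDesc *scan)
{
    scan->am->scan_end(scan);
}

/* Example caller: sums the relation using only the generic API. */
int toy_sum_relation(const ToyTableAm *am, const int *data, size_t len)
{
    ToySlot slot = {0, true};
    int sum = 0;
    ToyTableScanDesc *scan = toy_table_beginscan(am, data, len);

    if (scan == NULL)
        return -1;
    while (toy_table_scan_getnextslot(scan, &slot))
        sum += slot.value;
    toy_table_endscan(scan);
    return sum;
}
```

    The caller in toy_sum_relation() only sees the base struct and a
    slot, which is the property that lets a different AM be swapped in
    without touching call sites.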
    
    To be able to sensibly adapt code to use the above, two further
    callbacks have been introduced:
    
    a) slot_callbacks returns a TupleTableSlotOps* suitable for creating
       slots capable of holding a tuple of the AM's
       type. table_slot_callbacks() and table_slot_create() are based
       upon that, but have additional logic to deal with views, foreign
       tables, etc.
    
       While this change could have been done separately, nearly all the
       call sites that needed to be adapted for the rest of this commit
       would also have needed to be adapted for
       table_slot_callbacks(), making separation not worthwhile.
    
    b) tuple_satisfies_snapshot checks whether the tuple in a slot is
       currently visible according to a snapshot. That's required as a few
       places now don't have a buffer + HeapTuple around, but a
       slot (which in heap's case internally has that information).
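
    How a) and b) fit together can be sketched like this. It is a toy
    under stated assumptions: all Demo*/demo_* names are hypothetical,
    and the visibility rule (a single xlimit cutoff on xmin/xmax) is a
    deliberate simplification, not PostgreSQL's MVCC logic. The point
    is only that the AM's slot carries enough information for the AM's
    own visibility callback, so callers need no buffer or HeapTuple.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoXid;

/* Hypothetical slot operations table, returned by slot_callbacks. */
typedef struct DemoSlotOps { const char *name; } DemoSlotOps;

/* Generic slot "base class" seen by AM-independent code. */
typedef struct DemoSlot { const DemoSlotOps *ops; } DemoSlot;

/* Heap-like slot: internally keeps the tuple's visibility info. */
typedef struct DemoHeapSlot {
    DemoSlot base;     /* must be the first member */
    int      value;
    DemoXid  xmin;     /* inserting transaction */
    DemoXid  xmax;     /* deleting transaction, 0 if none */
} DemoHeapSlot;

/* Toy snapshot: transactions with id < xlimit are treated as visible. */
typedef struct DemoSnapshot { DemoXid xlimit; } DemoSnapshot;

/* The two callbacks described in a) and b), with simplified signatures. */
typedef struct DemoAm {
    const DemoSlotOps *(*slot_callbacks)(void);
    bool (*tuple_satisfies_snapshot)(const DemoSlot *slot,
                                     const DemoSnapshot *snap);
} DemoAm;

const DemoSlotOps demo_heap_slot_ops = { "demo heap slot" };

const DemoSlotOps *demo_heap_slot_callbacks(void)
{
    return &demo_heap_slot_ops;
}

bool demo_heap_tuple_satisfies_snapshot(const DemoSlot *slot,
                                        const DemoSnapshot *snap)
{
    const DemoHeapSlot *hslot = (const DemoHeapSlot *) slot;
    bool inserted = hslot->xmin < snap->xlimit;
    bool deleted = hslot->xmax != 0 && hslot->xmax < snap->xlimit;

    return inserted && !deleted;
}

const DemoAm demo_heap_am = {
    demo_heap_slot_callbacks, demo_heap_tuple_satisfies_snapshot
};

/* AM-independent wrapper: the caller passes a slot, nothing heap-specific. */
bool demo_table_tuple_satisfies_snapshot(const DemoAm *am,
                                         const DemoSlot *slot,
                                         const DemoSnapshot *snap)
{
    return am->tuple_satisfies_snapshot(slot, snap);
}
```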
    
    Additionally a few infrastructure changes were needed:
    
    I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now
       internally uses a slot to keep track of tuples. While
       systable_getnext() still returns HeapTuples, and will do so for the
       foreseeable future, the index API (see 3) above) now only deals with
       slots.
    
    The remainder, and largest part, of this commit is then adjusting all
    scans in postgres to use the new APIs.
    
    Author: Andres Freund, Haribabu Kommi, Alvaro Herrera
    Discussion:
        https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
        https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql