Re: Reduce heap tuple header size

Поиск
Список
Период
Сортировка
От Manfred Koizar
Тема Re: Reduce heap tuple header size
Дата
Msg-id dalkgugaoqd8hqdjvvjn0ord2vtfo68irt@4ax.com
обсуждение исходный текст
Ответ на Re: Reduce heap tuple header size  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-patches
On Fri, 14 Jun 2002 10:16:22 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>As I commented before, I am not in favor of this.  I don't think that a
>four-byte savings justifies a forced initdb with no chance of
>pg_upgrade,

As I don't know other users' preferences, I cannot comment on this.
I just think that four bytes per tuple can amount to a sum that justifies
this effort.  Disk space is not my argument here, but reduced disk IO is.

>plus significantly slower access to
>several critical header fields.  The tqual.c routines are already
>hotspots in many scenarios.  I believe this will make them noticeably
>slower.

Significantly slower?  I tried to analyze HeapTupleSatisfiesUpdate(),
as I think it is the most complicated of these Satisfies functions
(look for the !! comments):

/*
!! HeapTupleHeaderGetXmin  no slowdown
!! HeapTupleHeaderGetXmax  one infomask compare
!! HeapTupleHeaderGetCmin  two infomask compares
!! HeapTupleHeaderGetCMax  two infomask compares
!! HeapTupleHeaderGetXvac  no slowdown
*/
int
HeapTupleSatisfiesUpdate(HeapTuple htuple, CommandId curcid)
{
    HeapTupleHeader tuple = htuple->t_data;

    if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
    {
        if (tuple->t_infomask & HEAP_XMIN_INVALID)
            /*
            !! no slowdown
            */
            return HeapTupleInvisible;

        if (tuple->t_infomask & HEAP_MOVED_OFF)
        {
            if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXvac(tuple)))
                /*
                !! no slowdown
                */
                return HeapTupleInvisible;
            if (!TransactionIdIsInProgress(HeapTupleHeaderGetXvac(tuple)))
            {
                if (TransactionIdDidCommit(HeapTupleHeaderGetXvac(tuple)))
                {
                    tuple->t_infomask |= HEAP_XMIN_INVALID;
                    /*
                    !! no slowdown
                    */
                    return HeapTupleInvisible;
                }
                tuple->t_infomask |= HEAP_XMIN_COMMITTED;
            }
        }
        else if (tuple->t_infomask & HEAP_MOVED_IN)
        {
            if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXvac(tuple)))
            {
                if (TransactionIdIsInProgress(HeapTupleHeaderGetXvac(tuple)))
                    /*
                    !! no slowdown
                    */
                    return HeapTupleInvisible;
                if (TransactionIdDidCommit(HeapTupleHeaderGetXvac(tuple)))
                    tuple->t_infomask |= HEAP_XMIN_COMMITTED;
                else
                {
                    tuple->t_infomask |= HEAP_XMIN_INVALID;
                    /*
                    !! no slowdown
                    */
                    return HeapTupleInvisible;
                }
            }
        }
        /*
        !! no slowdown up to here
        */
        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple)))
        {
            /*
            !! GetCmin does 2 infomask compares
            */
            if (HeapTupleHeaderGetCmin(tuple) >= curcid)
                /*
                !! returning with 2 additional infomask compares
                */
                return HeapTupleInvisible;        /* inserted after scan
                                                 * started */

            if (tuple->t_infomask & HEAP_XMAX_INVALID)        /* xid invalid */
                /*
                !! returning with 2 additional infomask compares
                */
                return HeapTupleMayBeUpdated;

            /*
            !! assertions turned off in production: no slowdown
            */
            Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)));

            if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
                /*
                !! returning with 2 additional infomask compares
                */
                return HeapTupleMayBeUpdated;

            /*
            !! GetCmax does 2 infomask compares
            */
            if (HeapTupleHeaderGetCmax(tuple) >= curcid)
                /*
                !! returning with 4 additional infomask compares
                */
                return HeapTupleSelfUpdated;    /* updated after scan
                                                 * started */
            else
                /*
                !! returning with 4 additional infomask compares
                */
                return HeapTupleInvisible;        /* updated before scan
                                                 * started */
        }
        /*
        !! no slowdown up to here
        */
        else if (!TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
        {
            if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(tuple)))
                tuple->t_infomask |= HEAP_XMIN_INVALID;    /* aborted */
            /*
            !! no slowdown
            */
            return HeapTupleInvisible;
        }
        else
            tuple->t_infomask |= HEAP_XMIN_COMMITTED;
    }

    /*
    !! no slowdown
    */
    /* by here, the inserting transaction has committed */

    if (tuple->t_infomask & HEAP_XMAX_INVALID)        /* xid invalid or aborted */
        /*
        !! no slowdown
        */
        return HeapTupleMayBeUpdated;

    if (tuple->t_infomask & HEAP_XMAX_COMMITTED)
    {
        if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
            /*
            !! no slowdown
            ?? BTW, shouldn't we set HEAP_MAX_INVALID here?
            */
            return HeapTupleMayBeUpdated;
        /*
        !! no slowdown
        */
        return HeapTupleUpdated;    /* updated by other */
    }

    /*
    !! no slowdown up to here,
    !! one infomask compare to follow in GetXmax(),
    !! additional waste of precious CPU cycles could be avoided by:
    !! TransactionId xmax = HeapTupleHeaderGetXmax(tuple);
    */
    if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)))
    {
        if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
            /*
            !! returning with 1 additional infomask compare
            */
            return HeapTupleMayBeUpdated;
        /*
        !! GetCmax does 2 infomask compares
        */
        if (HeapTupleHeaderGetCmax(tuple) >= curcid)
            /*
            !! returning with 3 additional infomask compares
            */
            return HeapTupleSelfUpdated;        /* updated after scan
                                                 * started */
        else
            /*
            !! returning with 3 additional infomask compares
            */
            return HeapTupleInvisible;    /* updated before scan started */
    }

    /*
    !! 1 infomask compare up to here,
    !! another infomask compare ...
    !! or use xmax
    */
    if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
    {
        /*
        !! and a third infomask compare
        */
        if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
        {
            tuple->t_infomask |= HEAP_XMAX_INVALID;        /* aborted */
            /*
            !! returning with 1 or 3 additional infomask compares
            */
            return HeapTupleMayBeUpdated;
        }
        /* running xact */
        /*
        !! returning with 1 or 3 additional infomask compares
        */
        return HeapTupleBeingUpdated;    /* in updation by other */
    }

    /*
    !! 2 (or 1 with a local xmax) infomask compares up to here
    */
    /* xmax transaction committed */
    tuple->t_infomask |= HEAP_XMAX_COMMITTED;

    if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
        /*
        ?? BTW, shouldn't we set HEAP_MAX_INVALID here?
        */
        return HeapTupleMayBeUpdated;

    return HeapTupleUpdated;    /* updated by other */
}

So in the worst case we return after having done four more
compares than without the patch.  Note that in the most common
cases there is no additional cost at all.  If you still think
we have a performance problem here, we could replace GetCmin
and GetCmax by cheaper macros:

#define HeapTupleHeaderGetCminKnowingThatNotMoved(tup) \
( \
    AssertMacro(!((tup)->t_infomask & HEAP_MOVED)),
    ( \
        ((tup)->t_infomask & (HEAP_XMIN_IS_XMAX | HEAP_XMAX_INVALID)) ? \
            (CommandId) (tup)->t_cid \
        : \
            FirstCommandId \
    ) \
)

thus reducing the additional cost to one t_infomask compare,
because the Satisfies functions only access Cmin and Cmax,
when HEAP_MOVED is known to be not set.

OTOH experimenting with a moderatly sized "out of production"
database I got the following results:
                         | pages | pages |
relkind | count | tuples | before| after | savings
--------+-------+--------+-------+-------+--------
i       |    31 | 808146 |  8330 |  8330 |   0.00%
r       |    32 | 612968 | 13572 | 13184 |   2.86%
all     |    63 |        | 21902 | 21514 |   1.77%

2.86% fewer heap pages mean 2.86% less disk IO caused by heap pages.
Considering that index pages tend to benefit more from caching
we conclude that heap pages contribute more to the overall
IO load, so the total savings in the number of disk IOs should
be better than the 1.77% shown in the table above.  I think
this outweighs a few CPU cycles now and then.

Servus
 Manfred

В списке pgsql-patches по дате отправления:

Предыдущее
От: Barry Lind
Дата:
Сообщение: Re: New Patch For CallableStmt (against current CVS)
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: libpq++ fixes