Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Дата
Msg-id 20170125163648.w3q24ifflvcemjpj@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Ответы Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Список pgsql-hackers
Reading 0001_track_root_lp_v9.patch again:

> +/*
> + * We use the same HEAP_LATEST_TUPLE flag to check if the tuple's t_ctid field
> + * contains the root line pointer. We can't use the same
> + * HeapTupleHeaderIsHeapLatest macro because that also checks for TID-equality
> + * to decide whether a tuple is at the of the chain
> + */
> +#define HeapTupleHeaderHasRootOffset(tup) \
> +( \
> +    ((tup)->t_infomask2 & HEAP_LATEST_TUPLE) != 0 \
> +)
>
> +#define HeapTupleHeaderGetRootOffset(tup) \
> +( \
> +    AssertMacro(((tup)->t_infomask2 & HEAP_LATEST_TUPLE) != 0), \
> +    ItemPointerGetOffsetNumber(&(tup)->t_ctid) \
> +)

Interesting stuff; it took me a bit to see why these macros are this
way.  I propose the following wording which I think is clearer:
 Return whether the tuple has a cached root offset.  We don't use HeapTupleHeaderIsHeapLatest because that one also
considersthe slow case of scanning the whole block.
 

Please flag the macros that have multiple evaluation hazards -- there
are a few of them.  

> +/*
> + * If HEAP_LATEST_TUPLE is set in the last tuple in the update chain. But for
> + * clusters which are upgraded from pre-10.0 release, we still check if c_tid
> + * is pointing to itself and declare such tuple as the latest tuple in the
> + * chain
> + */
> +#define HeapTupleHeaderIsHeapLatest(tup, tid) \
> +( \
> +  (((tup)->t_infomask2 & HEAP_LATEST_TUPLE) != 0) || \
> +  ((ItemPointerGetBlockNumber(&(tup)->t_ctid) == ItemPointerGetBlockNumber(tid)) && \
> +   (ItemPointerGetOffsetNumber(&(tup)->t_ctid) == ItemPointerGetOffsetNumber(tid))) \
> +)

I suggest rewording this comment as: Starting from PostgreSQL 10, the latest tuple in an update chain has
HEAP_LATEST_TUPLEset; but tuples upgraded from earlier versions do not.  For those, we determine whether a tuple is
latestby testing that its t_ctid points to itself.
 
(as discussed, there is no "10.0 release"; it's called the "10 release"
only, no ".0".  Feel free to use "v10" or "pg10").

> +/*
> + * Get TID of next tuple in the update chain. Caller should have checked that
> + * we are not already at the end of the chain because in that case t_ctid may
> + * actually store the root line pointer of the HOT chain whose member this
> + * tuple is.
> + */
> +#define HeapTupleHeaderGetNextTid(tup, next_ctid) \
> +do { \
> +    AssertMacro(!((tup)->t_infomask2 & HEAP_LATEST_TUPLE)); \
> +    ItemPointerCopy(&(tup)->t_ctid, (next_ctid)); \
> +} while (0)

Actually, I think this macro could just return the TID so that it can be
used as struct assignment, just like ItemPointerCopy does internally --
callers can doctid = HeapTupleHeaderGetNextTid(tup);

or more precisely, this pattern
> +        if (!HeapTupleHeaderIsHeapLatest(tp.t_data, &tp.t_self))
> +            HeapTupleHeaderGetNextTid(tp.t_data, &hufd->ctid);
> +        else
> +            ItemPointerCopy(&tp.t_self, &hufd->ctid);

becomes    hufd->ctid = HeapTupleHeaderIsHeapLatest(foo) ?        HeapTupleHeaderGetNextTid(foo) : &tp->t_self;
or something like that.  I further wonder if it'd make sense to hide
this into yet another macro.


The API of RelationPutHeapTuple appears a bit contorted, where
root_offnum is both input and output.  I think it's cleaner to have the
argument be the input, and have the output offset be the return value --
please check whether that simplifies things; for example I think this:

> +            root_offnum = InvalidOffsetNumber;
> +            RelationPutHeapTuple(relation, buffer, heaptup, false,
> +                    &root_offnum);

becomes
root_offnum = RelationPutHeapTuple(relation, buffer, heaptup, false,        InvalidOffsetNumber);


Please remove the words "must have" in this comment:

> +    /*
> +     * Also mark both copies as latest and set the root offset information. If
> +     * we're doing a HOT/WARM update, then we just copy the information from
> +     * old tuple, if available or computed above. For regular updates,
> +     * RelationPutHeapTuple must have returned us the actual offset number
> +     * where the new version was inserted and we store the same value since the
> +     * update resulted in a new HOT-chain
> +     */

Many comments lack finishing periods in complete sentences, which looks
odd.  Please fix.


I have not looked at the other patch yet.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Fetter
Дата:
Сообщение: Re: [HACKERS] COPY as a set returning function
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check