Re: a fast bloat measurement tool (was Re: Measuring relation free space)

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: a fast bloat measurement tool (was Re: Measuring relation free space)
Дата
Msg-id 20150509002051.GD12950@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: a fast bloat measurement tool (was Re: Measuring relation free space)  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Ответы Re: a fast bloat measurement tool (was Re: Measuring relation free space)  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Список pgsql-hackers
Hi,

On 2015-04-24 08:46:48 +0530, Abhijit Menon-Sen wrote:
> diff --git a/contrib/pgstattuple/pgstatbloat.c b/contrib/pgstattuple/pgstatbloat.c
> new file mode 100644
> index 0000000..612e22b
> --- /dev/null
> +++ b/contrib/pgstattuple/pgstatbloat.c
> @@ -0,0 +1,389 @@
> +/*
> + * contrib/pgstattuple/pgstatbloat.c
> + *
> + * Abhijit Menon-Sen <ams@2ndQuadrant.com>
> + * Portions Copyright (c) 2001,2002    Tatsuo Ishii (from pgstattuple)

I think for new extension we don't really add authors and such anymore.

> + * Permission to use, copy, modify, and distribute this software and
> + * its documentation for any purpose, without fee, and without a
> + * written agreement is hereby granted, provided that the above
> + * copyright notice and this paragraph and the following two
> + * paragraphs appear in all copies.
> + *
> + * IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT,
> + * INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
> + * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
> + * DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED
> + * OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + * THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE.  THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS
> + * IS" BASIS, AND THE AUTHOR HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE,
> + * SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
> + */

Shouldn't be here in a contrib module.

> +PG_FUNCTION_INFO_V1(pgstatbloat);

> +CREATE FUNCTION pgstatbloat(IN reloid regclass,
> +    OUT table_len BIGINT,               -- physical table length in bytes
> +    OUT scanned_percent FLOAT8,         -- what percentage of the table's pages was scanned
> +    OUT approx_tuple_count BIGINT,      -- estimated number of live tuples
> +    OUT approx_tuple_len BIGINT,        -- estimated total length in bytes of live tuples
> +    OUT approx_tuple_percent FLOAT8,    -- live tuples in % (based on estimate)
> +    OUT dead_tuple_count BIGINT,        -- exact number of dead tuples
> +    OUT dead_tuple_len BIGINT,          -- exact total length in bytes of dead tuples
> +    OUT dead_tuple_percent FLOAT8,      -- dead tuples in % (based on estimate)
> +    OUT free_space BIGINT,              -- exact free space in bytes
> +    OUT free_percent FLOAT8)            -- free space in %

Hm. I do wonder if this should really be called 'statbloat'. Perhaps
it'd more appropriately be called pg_estimate_bloat or somesuch?

Also, is free_space really exact? The fsm isn't byte exact.

> +static Datum
> +build_output_type(pgstatbloat_output_type *stat, FunctionCallInfo fcinfo)
> +{
> +#define NCOLUMNS    10
> +#define NCHARS        32
> +
> +    HeapTuple    tuple;
> +    char       *values[NCOLUMNS];
> +    char        values_buf[NCOLUMNS][NCHARS];
> +    int            i;
> +    double        tuple_percent;
> +    double        dead_tuple_percent;
> +    double        free_percent;    /* free/reusable space in % */
> +    double        scanned_percent;
> +    TupleDesc    tupdesc;
> +    AttInMetadata *attinmeta;
> +
> +    /* Build a tuple descriptor for our result type */
> +    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> +        elog(ERROR, "return type must be a row type");
> +
> +    /*
> +     * Generate attribute metadata needed later to produce tuples from raw C
> +     * strings
> +     */
> +    attinmeta = TupleDescGetAttInMetadata(tupdesc);
> +
> +    if (stat->table_len == 0)
> +    {
> +        tuple_percent = 0.0;
> +        dead_tuple_percent = 0.0;
> +        free_percent = 0.0;
> +    }
> +    else
> +    {
> +        tuple_percent = 100.0 * stat->tuple_len / stat->table_len;
> +        dead_tuple_percent = 100.0 * stat->dead_tuple_len / stat->table_len;
> +        free_percent = 100.0 * stat->free_space / stat->table_len;
> +    }
> +
> +    scanned_percent = 0.0;
> +    if (stat->total_pages != 0)
> +    {
> +        scanned_percent = 100 * stat->scanned_pages / stat->total_pages;
> +    }
> +
> +    for (i = 0; i < NCOLUMNS; i++)
> +        values[i] = values_buf[i];
> +    i = 0;
> +    snprintf(values[i++], NCHARS, INT64_FORMAT, stat->table_len);
> +    snprintf(values[i++], NCHARS, "%.2f", scanned_percent);
> +    snprintf(values[i++], NCHARS, INT64_FORMAT, stat->tuple_count);
> +    snprintf(values[i++], NCHARS, INT64_FORMAT, stat->tuple_len);
> +    snprintf(values[i++], NCHARS, "%.2f", tuple_percent);
> +    snprintf(values[i++], NCHARS, INT64_FORMAT, stat->dead_tuple_count);
> +    snprintf(values[i++], NCHARS, INT64_FORMAT, stat->dead_tuple_len);
> +    snprintf(values[i++], NCHARS, "%.2f", dead_tuple_percent);
> +    snprintf(values[i++], NCHARS, INT64_FORMAT, stat->free_space);
> +    snprintf(values[i++], NCHARS, "%.2f", free_percent);
> +
> +    tuple = BuildTupleFromCStrings(attinmeta, values);
> +
> +    return HeapTupleGetDatum(tuple);
> +}

Why go through C strings?  I personally think we really shouldn't add
more callers to BuildTupleFromCStrings, it's such an absurd interface.

> +    switch (rel->rd_rel->relkind)
> +    {
> +        case RELKIND_RELATION:
> +        case RELKIND_MATVIEW:
> +            PG_RETURN_DATUM(pgstatbloat_heap(rel, fcinfo));
> +        case RELKIND_TOASTVALUE:
> +            err = "toast value";
> +            break;
> +        case RELKIND_SEQUENCE:
> +            err = "sequence";
> +            break;
> +        case RELKIND_INDEX:
> +            err = "index";
> +            break;
> +        case RELKIND_VIEW:
> +            err = "view";
> +            break;
> +        case RELKIND_COMPOSITE_TYPE:
> +            err = "composite type";
> +            break;
> +        case RELKIND_FOREIGN_TABLE:
> +            err = "foreign table";
> +            break;
> +        default:
> +            err = "unknown";
> +            break;
> +    }
>

This breaks translateability. I'm not sure that's worthy of concern. I
think it'd actually be fine to just say that the relation has to be a
table or matview.

> +static Datum
> +pgstatbloat_heap(Relation rel, FunctionCallInfo fcinfo)
> +{
> +    /*
> +     * We use the visibility map to skip over SKIP_PAGES_THRESHOLD or
> +     * more contiguous all-visible pages. See the explanation in
> +     * lazy_scan_heap for the rationale.
> +     */

I don't believe that rationale is really true these days. I'd much
rather just rip this out here than copy the rather complex logic.

> +    for (blkno = 0; blkno < nblocks; blkno++)
> +    {

> +        stat.free_space += PageGetHeapFreeSpace(page);
> +
> +        if (PageIsNew(page) || PageIsEmpty(page))
> +        {
> +            UnlockReleaseBuffer(buf);
> +            continue;
> +        }

I haven't checked, but I'm not sure that it's safe/meaningful to call
PageGetHeapFreeSpace() on a new page.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: GSSAPI, SSPI - include_realm default
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: Async execution of postgres_fdw.