Re: Allow a per-tablespace effective_io_concurrency setting

From: Tomas Vondra
Subject: Re: Allow a per-tablespace effective_io_concurrency setting
Msg-id: 55E71E9E.4090403@2ndquadrant.com
In response to: Re: Allow a per-tablespace effective_io_concurrency setting (Andres Freund <andres@anarazel.de>)
Responses: Re: Allow a per-tablespace effective_io_concurrency setting (Andres Freund <andres@anarazel.de>)
           Re: Allow a per-tablespace effective_io_concurrency setting (Julien Rouhaud <julien.rouhaud@dalibo.com>)
List: pgsql-hackers
Hi,

On 09/02/2015 03:53 PM, Andres Freund wrote:
>
> Hi,
>
> On 2015-07-18 12:17:39 +0200, Julien Rouhaud wrote:
>> I didn't know that the thread must exists on -hackers to be able to add
>> a commitfest entry, so I transfer the thread here.
>
> Please, in the future, also update the title of the thread to something
> fitting.
>
>> @@ -539,6 +541,9 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
>>   {
>>       BitmapHeapScanState *scanstate;
>>       Relation    currentRelation;
>> +#ifdef USE_PREFETCH
>> +    int new_io_concurrency;
>> +#endif
>>
>>       /* check for unsupported flags */
>>       Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
>> @@ -598,6 +603,25 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
>>        */
>>       currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
>>
>> +#ifdef USE_PREFETCH
>> +    /* check if the effective_io_concurrency has been overloaded for the
>> +     * tablespace storing the relation and compute the target_prefetch_pages,
>> +     * or just get the current target_prefetch_pages
>> +     */
>> +    new_io_concurrency = get_tablespace_io_concurrency(
>> +            currentRelation->rd_rel->reltablespace);
>> +
>> +
>> +    scanstate->target_prefetch_pages = target_prefetch_pages;
>> +
>> +    if (new_io_concurrency != effective_io_concurrency)
>> +    {
>> +        double prefetch_pages;
>> +       if (compute_io_concurrency(new_io_concurrency, &prefetch_pages))
>> +            scanstate->target_prefetch_pages = rint(prefetch_pages);
>> +    }
>> +#endif
>
> Maybe it's just me - but imo there should be as few USE_PREFETCH
> dependant places in the code as possible. It'll just be 0 when not
> supported, that's fine?

It's not just you. Dealing with code with plenty of ifdefs is annoying - 
it compiles just fine most of the time, until you compile it with some 
rare configuration. Then it either starts producing strange warnings or 
the compilation fails entirely.

It might make a tiny difference on builds without prefetching support 
because of code size, but realistically I think it's noise.
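For illustration, the hunk quoted above might collapse to something like 
this once the #ifdef guards go away (a sketch only, reusing the names from 
the patch; it assumes the prefetch machinery compiles on every platform 
and that compute_io_concurrency() simply reports no prefetching where it's 
unsupported):

    int         new_io_concurrency;     /* declared unconditionally */

    /* Default to the value derived from effective_io_concurrency. */
    scanstate->target_prefetch_pages = target_prefetch_pages;

    /* Check for a per-tablespace override of effective_io_concurrency. */
    new_io_concurrency =
        get_tablespace_io_concurrency(currentRelation->rd_rel->reltablespace);

    if (new_io_concurrency != effective_io_concurrency)
    {
        double      prefetch_pages;

        if (compute_io_concurrency(new_io_concurrency, &prefetch_pages))
            scanstate->target_prefetch_pages = (int) rint(prefetch_pages);
    }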

> Especially changing the size of externally visible structs depending
> on a configure-detected ifdef seems wrong to me.

+100 to that
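
To make the hazard concrete, here's a hypothetical illustration (not the 
actual struct definition): if USE_PREFETCH is defined for the core build 
but not for an extension that includes the header, the two sides disagree 
about sizeof() and about every field offset past the guarded member:

    typedef struct ExampleScanState
    {
        ScanState   ss;                     /* common scan state */
    #ifdef USE_PREFETCH
        int         target_prefetch_pages; /* present only in some builds... */
    #endif
        int         other_field;           /* ...so this offset silently shifts */
    } ExampleScanState;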

>> +    /*----------
>> +     * The user-visible GUC parameter is the number of drives (spindles),
>> +     * which we need to translate to a number-of-pages-to-prefetch target.
>> +     * The target value is stashed in *extra and then assigned to the actual
>> +     * variable by assign_effective_io_concurrency.
>> +     *
>> +     * The expected number of prefetch pages needed to keep N drives busy is:
>> +     *
>> +     * drives |   I/O requests
>> +     * -------+----------------
>> +     *        1 |   1
>> +     *        2 |   2/1 + 2/2 = 3
>> +     *        3 |   3/1 + 3/2 + 3/3 = 5 1/2
>> +     *        4 |   4/1 + 4/2 + 4/3 + 4/4 = 8 1/3
>> +     *        n |   n * H(n)
>
> I know you just moved this code. But: I don't buy this formula. Like at
> all. Doesn't queuing and reordering entirely invalidate the logic here?

Well, even the comment right after the formula says that:

    * Experimental results show that both of these formulas aren't
    * aggressive enough, but we don't really have any better proposals.

That's the reason why users generally use either 0 or some rather high 
value (16 and 32 are the most common values I see). The problem is that we 
don't really care about the number of spindles (and not just because 
SSDs don't have them at all), but about the target queue length per 
device. Spinning rust uses TCQ/NCQ to optimize head movement, while SSDs 
are parallel by nature (stacking multiple chips with separate channels).

I doubt we can really improve the formula, except maybe by saying "we 
want 16 requests per device" and multiplying the number of devices by 
that. We don't really have the necessary introspection to determine better 
values (and it's not really possible anyway, because the devices may be 
hidden behind a RAID controller or a SAN). So we can't really do much.

Maybe the best thing we can do is just completely abandon the "number of 
spindles" idea, and just say "number of I/O requests to prefetch". 
Possibly with an explanation of how to estimate it (devices * queue length).
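
For instance, under that framing the setting would be estimated by plain 
multiplication (the numbers here are made-up examples, not 
recommendations):

    int         devices = 4;            /* disks or SSD channels behind the tablespace */
    int         queue_per_device = 16;  /* target queue depth per device */

    /* effective_io_concurrency as "number of I/O requests to prefetch" */
    int         io_requests = devices * queue_per_device;   /* = 64 */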

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


