Re: Detoasting optionally to make Explain-Analyze less misleading

From Tomas Vondra
Subject Re: Detoasting optionally to make Explain-Analyze less misleading
Msg-id 14746b40-16a8-b53e-18a6-f2872e696e34@enterprisedb.com
In reply to Re: Detoasting optionally to make Explain-Analyze less misleading  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Detoasting optionally to make Explain-Analyze less misleading  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers

On 11/2/23 22:33, Matthias van de Meent wrote:
> On Thu, 2 Nov 2023 at 22:25, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 11/2/23 21:02, Matthias van de Meent wrote:
>>> On Thu, 2 Nov 2023 at 20:32, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>>> On 11/2/23 20:09, stepan rutz wrote:
>>>>> db1=# explain (analyze, serialize) select * from test;
>>>>>                                             QUERY PLAN
>>>>> ---------------------------------------------------------------------------------------------------
>>>>>  Seq Scan on test  (cost=0.00..22.00 rows=1200 width=40) (actual
>>>>> time=0.023..0.027 rows=1 loops=1)
>>>>>  Planning Time: 0.077 ms
>>>>>  Execution Time: 303.281 ms
>>>>>  Serialized Bytes: 78888953 Bytes. Mode Text. Bandwidth 248.068 MB/sec
>>>> [...]
>>>> BTW if you really want to print amount of memory, maybe print it in
>>>> kilobytes, like every other place in explain.c?
>>>
>>> Isn't node width in bytes, or is it an opaque value not to be
>>> interpreted by users? I've never really investigated that part of
>>> Postgres' explain output...
>>>
>>
>> Right, "width=" is always in bytes. But fields like the amount of sorted
>> data are in kB, and this seems closer to that.
>>
>>>> Also, explain generally
>>>> prints stuff in "key: value" style (in text format).
>>>
>>> That'd be key: metrickey=metricvalue for expanded values like those in
>>> plan nodes and the buffer usage, no?
>>>
>>
>> Possibly. But the proposed output does neither. Also, it starts with
>> "Serialized Bytes" but then prints info about bandwidth.
>>
>>
>>>>>  Serialized Bytes: 78888953 Bytes. Mode Text. Bandwidth 248.068 MB/sec
>>>
>>> I was thinking more along the lines of something like this:
>>>
>>> [...]
>>> Execution Time: xxx ms
>>> Serialization: time=yyy.yyy (in ms) size=yyy (in KiB, or B) mode=text
>>> (or binary)
>>>
>>> This is significantly different from your output, as it doesn't hide
>>> the measured time behind a lossy calculation of bandwidth, but gives
>>> the measured data to the user; allowing them to derive their own
>>> precise bandwidth if they're so inclined.
>>>
>>
>> Might work. I'm still not convinced we need to include the mode, or that
>> the size is that interesting/useful, though.
> 
> I'd say size is interesting for systems where network bandwidth is
> constrained, but CPU isn't. We currently only show estimated widths &
> accurate number of tuples returned, but that's not an accurate
> explanation of why your 30-row 3GB resultset took 1h to transmit on a
> 10mbit line - that is only explained by the bandwidth of your
> connection and the size of the dataset. As we can measure the size of
> the returned serialized dataset here, I think it's in the interest of
> any debuggability to also present it to the user. Sadly, we don't have
> good measures of bandwidth without sending that data across, so that's
> the only metric that we can't show here, but total query data size is
> definitely something that I'd be interested in here.

Yeah, I agree with that.
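
To make the arithmetic concrete, here is a rough sketch (not part of the
patch) of what a user could derive once the raw size and time are reported.
The function names are hypothetical. Two assumptions are baked in: that the
"MB/sec" in the proposed output means MiB per second, and that the bandwidth
figure was computed against the total execution time. With those assumptions
the numbers quoted above (78888953 bytes, 303.281 ms) come out close to the
reported 248.068, and a 3 GB result on a 10 Mbit/s line needs about 40
minutes before any protocol overhead, the same order of magnitude as the
"1h" mentioned above.

```python
# Hypothetical helpers; none of these names exist in PostgreSQL.

def bandwidth_mib_per_s(size_bytes: int, elapsed_ms: float) -> float:
    """Serialization throughput in MiB/s from a byte count and a time in ms."""
    seconds = elapsed_ms / 1000.0
    return size_bytes / seconds / (1024 * 1024)


def transfer_seconds(size_bytes: int, link_bits_per_s: float) -> float:
    """Lower bound on transfer time: payload bits divided by link rate.

    Ignores protocol overhead, latency and compression (assumption).
    """
    return size_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    # Values taken from the EXPLAIN output quoted earlier in this thread.
    bw = bandwidth_mib_per_s(78888953, 303.281)
    print(f"derived bandwidth: {bw:.3f} MiB/s")

    # The 3 GB result over a 10 Mbit/s line, using decimal units (assumption).
    secs = transfer_seconds(3 * 10**9, 10 * 10**6)
    print(f"minimum transfer time: {secs / 60:.0f} minutes")
```

Reporting time and size separately keeps both calculations open to the
reader, which a single precomputed bandwidth number does not.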


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


