Re: Detoasting optionally to make Explain-Analyze less misleading

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Detoasting optionally to make Explain-Analyze less misleading
Дата
Msg-id 78507aa1-f850-50cd-458a-3986d973d092@enterprisedb.com
обсуждение исходный текст
Ответ на Re: Detoasting optionally to make Explain-Analyze less misleading  (stepan rutz <stepan.rutz@gmx.de>)
Ответы Re: Detoasting optionally to make Explain-Analyze less misleading  (stepan rutz <stepan.rutz@gmx.de>)
Re: Detoasting optionally to make Explain-Analyze less misleading  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Список pgsql-hackers

On 11/2/23 20:09, stepan rutz wrote:
> Hi Thomas,
> 
> you are right of course. Thanks!
> 
> I have attached a new version of the patch that supports the syntax like
> suggested. The previous patch was insonsistent in style indeed.
> 
> explain (analyze, serialize)
> 
> and
> 
> explain (analyze, serialize binary)
> 
> That doesn't make too much of a difference for most scenarios I am
> certain. However the the seralize option itself does. Mostly because it
> performs the detoasting and that was a trap for me in the past with just
> plain analyze.
> 
> 
> Eg this scenario really is not too far fetched in a world where people
> have large JSONB values.
> 
> 
> db1=# create table test(id bigint, val text);
> 
> db1=# insert into test(val) select string_agg(s::text, ',') from (select
> generate_series(1, 10_000_000) as s) as a1;
> 
> now we have a cell that has roughly 80Mb in it. A large detoasting that
> will happen in reallife but in explain(analyze).
> 
> and then...
> 
> db1=# explain (analyze) select * from test;
>                                             QUERY PLAN
> ---------------------------------------------------------------------------------------------------
>  Seq Scan on test  (cost=0.00..22.00 rows=1200 width=40) (actual
> time=0.018..0.020 rows=1 loops=1)
>  Planning Time: 0.085 ms
>  Execution Time: 0.044 ms
> (3 rows)
> 
> db1=# explain (analyze, serialize) select * from test;
>                                             QUERY PLAN
> ---------------------------------------------------------------------------------------------------
>  Seq Scan on test  (cost=0.00..22.00 rows=1200 width=40) (actual
> time=0.023..0.027 rows=1 loops=1)
>  Planning Time: 0.077 ms
>  Execution Time: 303.281 ms
>  Serialized Bytes: 78888953 Bytes. Mode Text. Bandwidth 248.068 MB/sec
> (4 rows)
> 
> db1=#
> 
> So the explain(analyze) does not process the ~80 MB in 0.044ms in any
> way of course.

Honestly, I see absolutely no point in printing this. I have little idea
what to do with the "bytes". We have to transfer this over network, but
surely there's other data not included in this sum, right?

But the bandwidth seems pretty bogus/useless, as it's calculated from
execution time, which includes everything else, not just serialization.
So what does it say? It certainly does not include the network transfer.

IMO we should either print nothing or just the bytes. I don't see the
point in printing the mode, which is determined by the command.

> 
> Actually I could print the serialized bytes using 1. grouping-separators
> (eg 78_888_953) or 2. in the way pg_size_pretty does it.
> 
> If doing it the pg_size_pretty way I am uncertain if it would be ok to
> query the actual pg_size_pretty function via its (certainly frozen) oid
> of 3166 and do OidFunctionCall1(3166...) to invoke it. Otherwise I'd say
> it would be nice if the code from that function would be made available
> as a utility function for all c-code.  Any suggestions on this topic?
> 

I'm rather skeptical about this proposal, mostly because it makes it
harder to process the explain output in scripts etc.

But more importantly, it's a completely separate matter from what this
patch does, so if you want to pursue this, I suggest you start a
separate thread. If you want to introduce separators, surely this is not
the only place that should do it (e.g. why not to do that for "rows" or
"cost" estimates)?

BTW if you really want to print amount of memory, maybe print it in
kilobytes, like every other place in explain.c? Also, explain generally
prints stuff in "key: value" style (in text format).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Add new option 'all' to pg_stat_reset_shared()
Следующее
От: stepan rutz
Дата:
Сообщение: Re: Detoasting optionally to make Explain-Analyze less misleading