Re: Detoasting optionally to make Explain-Analyze less misleading
От | stepan rutz |
---|---|
Тема | Re: Detoasting optionally to make Explain-Analyze less misleading |
Дата | |
Msg-id | 03f952d6-b773-be93-e5d5-8f9ea9c2d461@gmx.de обсуждение исходный текст |
Ответ на | Re: Detoasting optionally to make Explain-Analyze less misleading (stepan rutz <stepan.rutz@gmx.de>) |
Ответы |
Re: Detoasting optionally to make Explain-Analyze less misleading
(Tomas Vondra <tomas.vondra@enterprisedb.com>)
|
Список | pgsql-hackers |
Hi, please see a revised version yesterday's mail. The patch attached now provides the following: EXPLAIN(ANALYZE,SERIALIZE) and EXPLAIN(ANALYZE,SERIALIZEBINARY) and timing output. Both options perform the serialization during analyze and provide an additional output in the plan like this: template1=# explain (analyze,serialize) select * from t12 limit 1; QUERY PLAN ------------------------------------------------------------------------------------------------------- ... Serialized Bytes: 36 bytes Execution Time: 0.035 ms (5 rows) or also this template1=# explain (analyze,serialize) select * from t1; QUERY PLAN --------------------------------------------------------------------------------------------- Seq Scan on t1 (cost=0.00..1.02 rows=2 width=19) (actual time=0.101..0.111 rows=5 loops=1) Planning Time: 0.850 ms Serialized Bytes: 85777978 bytes Execution Time: 354.284 ms (4 rows) Its tempting to divide Serialized-Bytes by Execution-Time to get an idea of the serialization bandwidth. This is /dev/null serialization though. The results are length-counted and then discarded. Since detoasting happens implicitly during serialization, the number of bytes becomes huge in this case and accounts for the detoasted lengths as well. I tried to get the number of bytes send for the protocol's messages and the attribute headers correctly. For the actual values I am quite sure I get the correct measures, as one can really tell by sending more values across. Null is 4 bytes on the wire interestingly. I didn't know that, but it makes sense, since its using the same prefix length-field as all values do. I have checked the JBDC driver and it uses binary and text formats depending on an attribute's type oid. So having the SERIALIZEBINARY option is not accurate, as in reality both formats can be occur for the same tuple. Please provide some feedback on the new patch and let me know if this makes sense. In general this kind of option for EXPLAIN is a good thing for sure. Greetings, Stepan On 14.09.23 21:27, stepan rutz wrote: > Hi Tom, Hi Matthias, > > you are right of course. I have looked at the code from printtup.c and > made a new version of the patch. > > Thanks for the MemoryContextReset hint too (@Matthias) > > This time is called EXPLAIN(ANALYZE,SERIALIZE) (hey, it also sounds > nicer phonetically) > > If the option SERIALIZE is active, the output functions are called and > they perform the detoasting, which I have even checked. > > So things are better this way, however I hardcoded the output option > "Text" (format=0). In printtup.c there is an incoming array which > applies Text (format=0) or Binary (format=1) for each column > individually. I am not sure whether this is even needed. I left in the > if-statement from printtup.c which calls the binary output method of a > given type. The result of the output is ignored and apparently free'd > because of the memory-context-reset at the end. > > Please also note, that I added a call to DestReceiver's rDestroy hook, > which was missing from explain.c before altogether. > > Feedback is appreciated. > > /Stepan > > > On 12.09.23 17:26, Tom Lane wrote: >> Matthias van de Meent <boekewurm+postgres@gmail.com> writes: >>> Hmm, maybe we should measure the overhead of serializing the tuples >>> instead. >>> The difference between your patch and "serializing the tuples, but not >>> sending them" is that serializing also does the detoasting, but also >>> includes any time spent in the serialization functions of the type. So >>> an option "SERIALIZE" which measures all the time the server spent on >>> the query (except the final step of sending the bytes to the client) >>> would likely be more useful than "just" detoasting. >> +1, that was my immediate reaction to the proposal as well. Some >> output functions are far from cheap. Doing only the detoast part >> seems like it's still misleading. >> >> Do we need to go as far as offering both text-output and binary-output >> options? >> >> regards, tom lane
Вложения
В списке pgsql-hackers по дате отправления: