RE: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4

Поиск
Список
Период
Сортировка
От ldh@laurent-hasson.com
Тема RE: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4
Дата
Msg-id MN2PR15MB2560B122A49C4C06483183D785CB9@MN2PR15MB2560.namprd15.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4  (Julien Rouhaud <rjuju123@gmail.com>)
Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-performance

   >  -----Original Message-----
   >  From: Tom Lane <tgl@sss.pgh.pa.us>
   >  Sent: Sunday, August 29, 2021 12:19
   >  To: ldh@laurent-hasson.com
   >  Cc: Ranier Vilela <ranier.vf@gmail.com>; Andrew Dunstan
   >  <andrew@dunslane.net>; Justin Pryzby <pryzby@telsasoft.com>; pgsql-
   >  performance@postgresql.org
   >  Subject: Re: Big Performance drop of Exceptions in UDFs between V11.2
   >  and 13.4
   >  
   >  "ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
   >  > Is it possible that the client I am using or the way I am creating the test
   >  database might affect this scenario? I use DBeaver and use the default
   >  settings to create the database:
   >  > - default encoding: UTF8
   >  > - collate: English_United States.1252
   >  > - ctype: English_United States.1252
   >  
   >  Yeah, I was thinking of quizzing you about that.  I wonder whether
   >  something is thinking it needs to transcode to WIN1252 encoding and
   >  then back to UTF8, based on the .1252 property of the LC_XXX settings.
   >  That shouldn't account for any 500X factor either, but we're kind of
   >  grasping at straws here.
   >  
   >  Does Windows have any locale choices that imply UTF8 encoding
   >  exactly, and if so, do your results change when using that?  Alternatively,
   >  try creating a database with WIN1252 encoding and those locale
   >  settings.
   >  
   >              regards, tom lane

Yeah, grasping at straws... and no material changes 😊 This is mystifying.

show lc_messages;
-- English_United States.1252

create table sampletest (a varchar, b varchar);
insert into sampletest (a, b)
select substr(md5(random()::text), 0, 15), (100000000*random())::integer::varchar
  from generate_series(1,100000);

CREATE OR REPLACE FUNCTION toFloat(str varchar, val real)
RETURNS real AS $$
BEGIN
  RETURN case when str is null then val else str::real end;
EXCEPTION WHEN OTHERS THEN
  RETURN val;
END;
$$ LANGUAGE plpgsql COST 1 IMMUTABLE;

explain (analyze,buffers,COSTS,TIMING) 
select MAX(toFloat(a, null)) as "a" from sampletest
--Aggregate  (cost=1477.84..1477.85 rows=1 width=4) (actual time=89527.032..89527.033 rows=1 loops=1)
--  Buffers: shared hit=647
--  ->  Seq Scan on sampletest  (cost=0.00..1197.56 rows=56056 width=32) (actual time=0.024..37.811 rows=100000
loops=1)
--        Buffers: shared hit=637
--Planning:
--  Buffers: shared hit=24
--Planning Time: 0.347 ms
--Execution Time: 89527.501 ms

explain (analyze,buffers,COSTS,TIMING) 
select MAX(toFloat(b, null)) as "b" from sampletest
--Aggregate  (cost=2137.00..2137.01 rows=1 width=4) (actual time=186.605..186.606 rows=1 loops=1)
--  Buffers: shared hit=637
--  ->  Seq Scan on sampletest  (cost=0.00..1637.00 rows=100000 width=8) (actual time=0.008..9.679 rows=100000
loops=1)
--        Buffers: shared hit=637
--Planning:
--  Buffers: shared hit=4
--Planning Time: 0.339 ms
--Execution Time: 186.641 ms


At this point, I am not sure how to proceed except to rethink that toFloat() function and many other places where we
useexceptions. We get such dirty data that I need a "safe" way to convert a string to float without throwing an
exception.BTW, I tried other combinations in case there may have been some weird interactions with the ::REAL
conversionoperator, but nothing made any change. Could you recommend another approach off the top of your head? I could
useregexes for testing etc... Or maybe there is another option like a no-throw conversion that's built in or in some
extensionthat you may know of? Like the "SAFE." Prefix in BigQuery.
 

Thank you,
Laurent.




В списке pgsql-performance по дате отправления:

Предыдущее
От: "ldh@laurent-hasson.com"
Дата:
Сообщение: RE: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4
Следующее
От: Ranier Vilela
Дата:
Сообщение: Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4