RE: text fields and performance for ETL

Поиск
Список
Период
Сортировка
От Grega Jesih
Тема RE: text fields and performance for ETL
Дата
Msg-id 70cf9fe110c246afb4eab054caab0c04@actual-it.si
обсуждение исходный текст
Ответ на Re: text fields and performance for ETL  ("David G. Johnston" <david.g.johnston@gmail.com>)
Ответы Re: text fields and performance for ETL
Список pgsql-docs

David,

 

> Can you demonstrate, with actual numbers, using today's implementation, a situation where defining a column as char(3) or varchar(3) instead of text has a significant performance improvement?

Sure I can.


But first, what am I to prove ?  A performance in dataflow from server A to server B.

 

What is that performance change based on ? It is based on a fact that the tool that pumps the data may calculate block size and thus work with several

rows as a time instead of a row at a time.

 

Is this Postgres performance related ? NO. It is model design related.


So why would it then be published ? Because inexperienced programmers take your statement that "it's the same performance" from a wrong perspective, so it would be fair to note, that the remark "it's the same performance" is meant "within any operation inside Postgres database". In the moment we want to take this data elsewhere,

the problematic lack of model design comes out.

Now for the demo: I can make a video to see it, but here is a column in context that talks about the thing:

https://dba.stackexchange.com/questions/102830/avoiding-row-by-row-fetch-method-when-dealing-with-source-lob-columns

 

If this contribution from stackexchange isn't enough, let me know.

BR
Grega

 





 

From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Wednesday, November 3, 2021 7:40 PM
To: Grega Jesih <Grega.Jesih@actual-it.si>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; grega.jesih@gmail.com; Pg Docs <pgsql-docs@lists.postgresql.org>
Subject: Re: text fields and performance for ETL

 

On Wed, Nov 3, 2021 at 11:09 AM Grega Jesih <Grega.Jesih@actual-it.si> wrote:

The new architectures include more and more data exchange among databases. 
Now when you deal with bigger data sizes that go from millions to billions, this fixed size vs of text - undefined size becomes very  relevant. 

Can you demonstrate, with actual numbers, using today's implementation, a situation where defining a column as char(3) or varchar(3) instead of text has a significant performance improvement?  Without a concrete example to examine I'm unable to be convinced to move away from the status quo.

 

You also need to convince me as to why constraints are an insufficient feature.  i.e., why is char(3) better than (check length(val) = 3)?

 

Even with all that I'd probably still not do anything beyond reviewing a proposed patch (i.e, I wouldn't try to write one myself from scratch...I don't have authority to commit regardless).

 

David J.

 


NOTICE - NOT TO BE REMOVED.
This e-mail and any attachments are confidential and may contain legally privileged information and/or copyright material of Actual I.T. or third parties. If you are not an authorised recipient of this e-mail, please contact Actual I.T. immediately by return email or by telephone or facsimile on the above numbers.
You should not read, print, re-transmit, store or act in reliance on this email or any attachments and you should destroy all copies of them.

В списке pgsql-docs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: text fields and performance for ETL
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: text fields and performance for ETL