Re: Array, bytea and large objects

Поиск
Список
Период
Сортировка
От David Wall
Тема Re: Array, bytea and large objects
Дата
Msg-id 498B2B84.2050407@computer.org
обсуждение исходный текст
Ответ на Re: Array, bytea and large objects  (Filip Rembiałkowski <plk.zuber@gmail.com>)
Ответы Re: Array, bytea and large objects  (Filip Rembiałkowski <plk.zuber@gmail.com>)
Список pgsql-general
Thanks, Filip.
If I have an "unlimited" number of name-value pairs that I'd like to get easy access to for flexible reports, could I store these in two arrays (one for name, the other for value) in a table so that if I had 10 name-value pairs or 200 name-value pairs, I could store these into a single row using arrays so I could retrieve all name-value pairs in a single SELECT from the db?

yes you could, but what for?
what's wrong with many rows? create table kvstore(key text primary key,value text);
what stops you from using single SELECT to get all these kv pairs?

We basically do it that way now, but was thinking we might run tests to see if it's faster.  When we run reports, only 2-3 of the name-value pairs are used in search criteria, so these we'd like to keep in such a table, but other fields (typically 10-30 name-value pairs) are just listed in the report, so we thought it might make sense to keep these in a single row for efficiency sake as we do retrieve them in a group and don't need to sort or select based on their values.  "Single SELECT" was poor word choice as we were thinking more about retrieving a single row with 10-30 name-values stored in an ARRAY would be faster than retrieve 10-30 rows from a joined table.


At what size does it make more sense to store in LO instead of a BYTEA (because of all the escaping and such)?
How many BYTEA fields can I have in a database?
no limit (other than limits mentioned in the FAQ)
 
Are the BYTEA fields stored in the same table as the rest of the data?  

yes - and the TOAST tables if it's larger than 1/3 of a page or so. search for TOAST details if you're interested.
Hmm...  So a page is 8192 bytes, and it leaves your regular table and goes to TOAST if the BYTEA is more than 2730 bytes.  I thought it only went to TOAST when it exceed the page size, not just one-third of its size.  I am sure we have lots of encrypted, compressed XML (so it's all binary at this point, no longer text) that would be less than that.  So perhaps it makes sense to use BYTEA for these smaller binary objects as the data is stored with the row, is simpler to deal with, easy to handle in memory (unlike really big LOs), and the cost of escaping each byte may not be too high.

I have seen a comparison (http://zephid.dk/2008/08/09/oid-vs-bytea-in-postgresql/) that show BYTEA uses more memory (up to 10x more) and is slower (about 4x slower) than LOs, which indicate that most of this is due to escaping the bytes.

I'd avoid LO unless you really need streaming (block-wise) access.
This is interesting only because we've done the opposite.  That is, we store all binary data (mostly compressed, encrypted XML name-values) in LOs today and it works well.  But we are concerned about the pg_largeobject table being a bottleneck, becoming an issue for vaccum/vacuumlo/pg_dump as our database grows.

We'd like to do streaming for large files being uploaded, but today we're not doing that and have a java.sql.Blob interface class that essentially reads/writes using a byte array so we're not getting any benefits of streaming for very large objects, though as I said, most of our LOs are really not that big and thus not an issue for us.  We'll see what it means for us to change this to better support streaming for our truly large objects that we store.

Since you'd avoid LOs, what are the main advantages of BYTEA (since we use JDBC, we can use both with equal ease as both currently work for us using byte arrays in our Java code)?  I'm still thinking we may find that based on the size of the binary data, it may be best to choose BYTEA for smaller (< 8196 or < 2730) data and LOs elsewhere.

Thanks,
David

В списке pgsql-general по дате отправления:

Предыдущее
От: Dimitri Fontaine
Дата:
Сообщение: Re: Pet Peeves?
Следующее
От: Filip Rembiałkowski
Дата:
Сообщение: Re: Array, bytea and large objects