Re: byte-size of column values

Поиск
Список
Период
Сортировка
От Dominique Devienne
Тема Re: byte-size of column values
Дата
Msg-id CAFCRh-8vRSkJ9T1oTM6m9f9nPd9Af+r3j2a4VNT+kDBnryX5CA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: byte-size of column values  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: byte-size of column values  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
On Tue, Oct 18, 2022 at 6:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dominique Devienne <ddevienne@gmail.com> writes:
> > I'm surprised by the result for bit(3) and char
> > The doc does mention 5-8 bytes overhead, but I expected
> > those for varying bit, not fixed-sized bit typed values.
>
> Your expectation is incorrect.  Postgres always treats these types
> as variable-length, whether or not the column has a length constraint.

OK. Still, wasn't such a stretch to assume that, no?
Now I know better, thanks to you and David.
I'm not sure the doc on types talks about that either.
Didn't see it for sure (but could still be there and I missed it).

> Thus, there's always a header to store the actual length.  That can
> be either 1 or 4 bytes (I think the doc you are looking at might be
> a little out of date on that point).

Even the doc on v15 (or devel) still says 5-to-8.
https://www.postgresql.org/docs/15/datatype-bit.html

And on my v12, that's born out from my experimentation.
Being used to SQLite using varints,
I'd have expected fewer overhead bytes for the size, like your 1-to-4.

> Because of the popularity of variable-width character encodings,
> a column declared as N characters wide isn't necessarily a fixed
> number of bytes wide, making it a lot less useful than you might
> think to have optimizations for fixed-width storage.  Between that
> and the fact that most Postgres developers regard CHAR(N) as an
> obsolete hangover from the days of punched cards, no such
> optimizations have been attempted.

Thanks for the background. I definitely appreciate PostgreSQL's large 1GB
limit on text and bytea columns, coming from Oracle's tiny 4K one, which
created us all kind of headaches.

For kicks, I looked at bpchar, blank-padded-char, and its extra byte, which
I assume is again some kind of length, there at least the overhead is small
compared to bit(n). 1 bytes versus 5 bytes is no small difference.

ddevienne=> create table bar (bpc bpchar(16));
CREATE TABLE
ddevienne=> insert into bar values ('foo'), ('bar baz');
INSERT 0 2
ddevienne=> select length(bpc), pg_column_size(bpc), '<'||bpc||'>' from bar;
 length | pg_column_size | ?column?
--------+----------------+-----------
      3 |             17 | <foo>
      7 |             17 | <bar baz>
(2 rows)



В списке pgsql-general по дате отправления:

Предыдущее
От: Dominique Devienne
Дата:
Сообщение: Re: byte-size of column values
Следующее
От: Dominique Devienne
Дата:
Сообщение: How to store "blobs" efficiently for small and large sizes, with random access