Re: Sequential vs. random values - number of pages in B-tree

Поиск
Список
Период
Сортировка
От Francisco Olarte
Тема Re: Sequential vs. random values - number of pages in B-tree
Дата
Msg-id CA+bJJbzHT4SSzE2MTw=1d_u1TFo12w5GzRnbuJR0BjH3eHPrLw@mail.gmail.com
обсуждение исходный текст
Ответ на Sequential vs. random values - number of pages in B-tree  (pinker <pinker@onet.eu>)
Ответы Re: Sequential vs. random values - number of pages in B-tree  ("Daniel Verite" <daniel@manitou-mail.org>)
Re: Sequential vs. random values - number of pages in B-tree  (pinker <pinker@onet.eu>)
Список pgsql-general
Hi:

On Thu, Aug 18, 2016 at 1:32 PM, pinker <pinker@onet.eu> wrote:
...
> create table t01 (id bigint);
> create index i01 on t01(id);
> insert into t01 SELECT s from generate_series(1,10000000) as s;
>
> and random values:
> create table t02 (id bigint);
> create index i02 on t02(id);
> insert into t02 SELECT random()*100 from generate_series(1,10000000) as s;

It's already been told that btrees work that way, if you find it
strange read a bit about them, this is completely normal, but ...

... what I come to point is your test is severely flawed. It probably
does not matter in this case, but you are inserting 10M DIFFERENT
VALUES in the first case and only 100 in the second one, which an
average of 100K DUPLICATES of each. This affects btrees too. You could
try using random*1G, or at least 100M, for a better test ( which may
have even worse behaviour, ideally I would just write 10M integers to
a disk file, then shuffle it and compare COPY FROM times from both ) (
unless you know of an easy way to generate a random permutation on the
fly without using a lot of memory, I do not ).

Francisco Olarte.


В списке pgsql-general по дате отправления:

Предыдущее
От: Francisco Olarte
Дата:
Сообщение: Re: SQL help - multiple aggregates
Следующее
От: "Daniel Verite"
Дата:
Сообщение: Re: Sequential vs. random values - number of pages in B-tree