Re: Can't figure out how to use now() in default for tsrange column (PG 9.2)

Поиск
Список
Период
Сортировка
От Daniele Varrazzo
Тема Re: Can't figure out how to use now() in default for tsrange column (PG 9.2)
Дата
Msg-id CA+mi_8YZ_O2OukZ+TaKimak=-Wn-xC1JNm7Dvamg4PYMmZ_WGA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Can't figure out how to use now() in default for tsrange column (PG 9.2)  (Alban Hertroys <haramrae@gmail.com>)
Ответы Re: Can't figure out how to use now() in default for tsrange column (PG 9.2)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
On Mon, Jul 16, 2012 at 10:30 PM, Chris Bartlett
<c.bartlett@paradise.net.nz> wrote:
> At 8:35 PM +0100 16/7/12, Daniele Varrazzo wrote:
>>
>> Even without this quirk, the problem of mapping timestamps to other
>> languages data types could be an even stronger design factor. I've
>> personally settled for 9999-12-31 which is python's datetime.max, maps
>> ok to doubles and won't create problems for almost 8000 years.

> Stephane Faroult ('The Art of SQL') says that dates so far in the future can
> throw off query planners. He explains it nicely here:
> http://www.youtube.com/watch?v=gu0WJJXgEFM

Nice "article". It made me worry about what I've put into my db... So
I've made a test: I've tried creating a table with about 100K records
uniformly spread across one year:

    => create table datetest (ts timestamp);
    => insert into datetest (ts) select
generate_series('2012-1-1'::timestamp, '2012-12-31'::timestamp, '5
minutes'::interval) ;
    INSERT 0 105121
    => analyze datetest ;
    => select * from pg_stats where tablename = 'datetest' and attname = 'ts';

you get a nice uniform histogram. I've tried screwing it up inserting
10, 100, 1000, 10K, 100K "infinity" and/or "9999-12-31": the histogram
doesn't move of an inch (the values get recorded in most_common_vals
but not in histogram_bounds).

This already confirms that postgres doesn't suffer of the problem
highlighted in the article. I've made another test, adding first 365,
then about 10K distinct values in the year 3000-3001 range:

    => insert into datetest (ts) select
generate_series('3000-1-1'::timestamp, '3000-12-31'::timestamp, '1
day'::interval) ;
    INSERT 0 365
    => analyze datetest ;
    => select * from pg_stats where tablename = 'datetest' and attname = 'ts';

    => insert into datetest (ts) select
generate_series('3000-1-1'::timestamp, '3000-12-31'::timestamp, '1
hour'::interval) ;
    INSERT 0 8737
    => analyze datetest ;
    => select * from pg_stats where tablename = 'datetest' and attname = 'ts';

and it moves... of an inch. In the first case only one bin (of the 100
PG 9.1 defaults) contains dates around the 3000, in the second case
only about 8. And the values around the Y3K are in fact the 8% of the
table. No bin is wasted to record data between the 2013 and the 3000.
This suggests me that the PG histogram doesn't count the occurrence
into bins of the same size over the domain, but instead records the
boundaries on the domain of bins containing the same amount of data.

Pretty solid!

-- Daniele

В списке pgsql-general по дате отправления:

Предыдущее
От: Michael Nolan
Дата:
Сообщение: Re: Replication/cloning: rsync vs modification dates?
Следующее
От: Jeff Davis
Дата:
Сообщение: Re: Can't figure out how to use now() in default for tsrange column (PG 9.2)