Re: Performace comparison of indexes over timestamp fields
От | Alexander Staubo |
---|---|
Тема | Re: Performace comparison of indexes over timestamp fields |
Дата | |
Msg-id | 88daf38c0705220539q5807d2c2t1f20a03922bf8d33@mail.gmail.com обсуждение исходный текст |
Ответ на | Performace comparison of indexes over timestamp fields (Arnau <arnaulist@andromeiberica.com>) |
Ответы |
Re: Performace comparison of indexes over timestamp fields
("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
|
Список | pgsql-performance |
On 5/22/07, Arnau <arnaulist@andromeiberica.com> wrote: > On older versions of PostgreSQL, at least in my experience, queries > on timestamps fields even having indexes where performing quite bad > mainly sequential scans where performed. PostgreSQL uses B-tree indexes for scalar values. For an expression such as "t between a and b", I believe it's going to match both sides of the table independently (ie., t >= a and t <= b) and intersect these subsets. This is inefficient. You should get better performance by mapping timestamps to a one-dimensional plane and indexing them using GiST. GiST implements an R-tree-like structure that supports bounding-box searches. This involves setting up a functional index: create index ... on payment_transactions using gist ( box(point(extract(epoch from time), 0), point(extract(epoch from time), 0)) box_ops) I'm using box() here because GiST doesn't have a concept of points. Then insert as usual, and then query with something like: select ... from payment_transactions where box( point(extract(epoch from '2006-04-01'::date), 0), point(extract(epoch from '2006-08-01'::date), 0)) && box( point(extract(epoch from time), 0), point(extract(epoch from time), 0)); PostgreSQL should be able to exploit the GiST index by recognizing that the result of box() expression operand is already computed in the index. This much less inconvenient and portable -- I would love for PostgreSQL to be provide syntactic sugar and special-casing to make this transparent -- but worth it if you are dealing with a lot of range searches. > Now I have a newer version of PostgreSQL and I've done some tests > comparing the performance of an index over a timestamp field with a > numeric field. To do so, I have the following table: > > Table "public.payment_transactions" > Column | Type | Modifiers > ----------------+-----------------------------+--------------------------------- > transaction_id | character varying(32) | not null > timestamp_in | timestamp without time zone | default now() > credits | integer | > epoch_in | bigint | > epoch_in2 | double precision | [snip] A timestamp is stored internally as an 8-byte double-precision float. Therefore, timestamp_in and epoch_in2 should behave identically. > While doing the tests this table has about 100.000 entries. Make sure PostgreSQL is able to keep the entire table in memory by setting shared_buffers; you don't want to be hitting to the disk. Make sure you run "analyze" on the table before you execute the test. > To test the diferent indexes I have executed the following: Your query plans are roughly identical. The difference in the timings implies that you only ran the queries once. I suggest you run each query at least 10 times, and report the individual numbers (the "total runtime" parts of the output) you get. Arithmetic means are not that interesting. Alexander.
В списке pgsql-performance по дате отправления:
Предыдущее
От: cedricДата:
Сообщение: Re: Key/Value reference table generation: INSERT/UPDATE performance
Следующее
От: "Steinar H. Gunderson"Дата:
Сообщение: Re: Performace comparison of indexes over timestamp fields