Discussion: slow query - will CLUSTER help?


slow query - will CLUSTER help?

From: Sev Zaslavsky
Date:
Hello,

I've got a very simple table with a very simple SELECT query, but it takes longer on the initial run than I'd like, so I want to see if there is a strategy to optimize this.

Table rt_h_nbbo contains several hundred million rows.  All rows for a given entry_date are appended to this table in an overnight process every night - on the order of several million rows per day.

The objective is to select all of the rows for a given product_id on a given entry_date.

There is a b-tree index on (product_id, entry_date). The index appears to be used correctly.  I'm seeing that if the data pages are not in memory, nearly all of the time is spent on disk I/O.  The first time, the query takes 21 sec.  If I run this query a second time, it completes in approx 1-2 ms.

I perceive an inefficiency here and I'd like your input as to how to deal with it: The end result of the query is 1631 rows which is on the order of about a couple hundred Kb of data.  Compare that to the amount of I/O that was done: 1634 buffers were loaded, 16Mb per page - that's about 24 Gb of data!  Query completed in 21 sec.  I'd like to be able to physically re-organize the data on disk so that the data for a given product_id on an entry_date is concentrated on a few pages instead of being scattered like I see here.

First question is: Does loading 24Gb of data in 21 sec seem "about right" (hardware specs at bottom of email)?

Second question: Is it possible to tell postgres to physically store the data in such a way that it parallels an index?  I recall you can do this in Sybase with a CLUSTERED index.  The answer for PostgreSQL seems to be "yes, use the CLUSTER command".  But this command does a one-time clustering and requires periodic re-clustering.  Is this the best approach?  Are there considerations with respect to the type of index (B-tree, GiST, SP-GiST) being used for CLUSTER?

Thanks

-Sev


Table (this is a fairly large table - hundreds of millions of rows):

CREATE TABLE rt_h_nbbo
(
  product_id integer NOT NULL,
  bid_price double precision NOT NULL DEFAULT 0.0,
  bid_size integer NOT NULL DEFAULT 0,
  ask_price double precision NOT NULL DEFAULT 0.0,
  ask_size integer NOT NULL DEFAULT 0,
  last_price double precision NOT NULL DEFAULT 0.0,
  entry_date date NOT NULL,
  entry_time time without time zone NOT NULL,
  event_time time without time zone NOT NULL,
  day_volume bigint NOT NULL DEFAULT 0,
  day_trade_ct integer,
  entry_id bigint NOT NULL,
  CONSTRAINT rt_h_nbbo_pkey PRIMARY KEY (entry_id),
  CONSTRAINT rt_h_nbbo_pfkey FOREIGN KEY (product_id)
      REFERENCES product (product_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);
ALTER TABLE rt_h_nbbo
  OWNER TO postgres;

Index:

CREATE INDEX rt_h_nbbo_idx
  ON rt_h_nbbo
  USING btree
  (product_id, entry_date DESC);

Test:

SET track_io_timing = on;
EXPLAIN (ANALYZE,BUFFERS,VERBOSE,COSTS,TIMING) select * from rt_h_nbbo where product_id=6508 and entry_date='2013-11-26';

Output:

"Index Scan using rt_h_nbbo_idx on public.rt_h_nbbo  (cost=0.00..12768.21 rows=3165 width=76) (actual time=12.549..21654.547 rows=1631 loops=1)"
"  Output: product_id, bid_price, bid_size, ask_price, ask_size, last_price, entry_date, entry_time, event_time, day_volume, day_trade_ct, entry_id"
"  Index Cond: ((rt_h_nbbo.product_id = 6508) AND (rt_h_nbbo.entry_date = '2013-11-26'::date))"
"  Buffers: shared hit=4 read=1634"
"  I/O Timings: read=21645.468"
"Total runtime: 21655.002 ms"

Hardware

Top of the line HP DL380 G7 server with 32 GB RAM, P410i RAID, 10K SAS drives in RAID-1 config.  WAL on separate RAID-1 volume with 15K SAS drives.
The only unusual thing here is that I'm running on Windows Server 2008 R2.

slow loading of pages for SELECT query - will CLUSTER help?

From: Sev Zaslavsky
Date:
Hello,

I've got a very simple table with a very simple SELECT query, but it takes longer on the initial run than I'd like, so I want to see if there is a strategy to optimize this.

Table rt_h_nbbo contains several hundred million rows.  All rows for a given entry_date are appended to this table in an overnight process every night - on the order of several million rows each night.

The objective is to select all of the rows for a given product_id on a given entry_date.

There is a b-tree index on (product_id, entry_date) called rt_h_nbbo_idx. The index appears to be used correctly.  I'm seeing that if the data pages are not in memory, nearly all of the time is spent on disk I/O.  The first time, the query takes 21 sec.  If I run this query a second time, it completes in approx 1-2 ms.

Running SELECT pg_relation_size('rt_h_nbbo') / reltuples FROM pg_class WHERE relname = 'rt_h_nbbo'; yields roughly 135 bytes/row.

I perceive an inefficiency here and I'd like your input as to how to deal with it: The end result of the query is 1631 rows, which is approx 220 KB of data (at 135 bytes/row).  Compare that to the amount of I/O that was done: 1634 buffers were loaded, 8 KB per buffer - that's about 13 MB of data!  Query completed in 21 sec.

So 13 MB of data was read from disk, but only 220 KB was useful - about 1.7%.  I'd like to make this work faster and hopefully more efficiently.

First question is: Does loading 13 MB of data in 21 sec seem kinda slow or about right (hardware specs at bottom of email)?

Second question: Perhaps I can reduce the number of pages that contain the data I want by physically storing the data in such a way that it parallels the rt_h_nbbo_idx index?  I recall you can do this in Sybase with a CLUSTERED index.  The answer for PostgreSQL seems to be "yes, use the CLUSTER command".  But this command does a one-time clustering and requires periodic re-clustering.  Is this the best approach?  Are there considerations with respect to the type of index (B-tree, GiST, SP-GiST) being used for CLUSTER?

Thanks

-Sev


(The table definition, index, test query, EXPLAIN output, and hardware description are identical to those in the previous message.)


Re: slow query - will CLUSTER help?

From: Shaun Thomas
Date:
On 12/12/2013 11:30 AM, Sev Zaslavsky wrote:

> _First question_ is: Does loading 24Gb of data in 21 sec seem "about
>  right" (hardware specs at bottom of email)?

That's actually pretty good. 24GB is a lot of data to process.

> _Second question_: Is it possible to tell postgres to physically store
> the data in such a way that it parallels an index?

Yes and no. Unlike Sybase or SQL Server, a clustered table in PostgreSQL
does not keep its rows stored in the index pages. When you CLUSTER a
table by a particular index, it's only sorted in that order initially.
New inserts and updates no longer honor that ordering.
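For reference, a minimal sketch of the one-time approach, using the table and index names from the original post (CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the whole table, so it needs a maintenance window):

-- one-time physical reorder; must be re-run as new data arrives
CLUSTER rt_h_nbbo USING rt_h_nbbo_idx;
-- refresh planner statistics after the rewrite
ANALYZE rt_h_nbbo;

As for index types: CLUSTER works with btree and GiST indexes, but not with hash, GIN, or SP-GiST.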

However, since you said you're inserting data by date, your data should
already be naturally sorted. Your query plan also looked right to me.
You may be expecting too much of your hardware, though.

A RAID-1 of 15K drives can deliver, at most, 1000 reads per second
depending on your drives and the controller cache. That's a very
optimistic assumption. The plan said it fetched 1631 rows from the
index. In order to weed out dead rows, it verifies visibility by
checking the heap pages, which is another 1631 fetches at minimum. All
by itself, that's about three seconds of IO from a cold cache.
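That back-of-the-envelope estimate is simply:

-- ~1631 index page fetches + ~1631 heap page fetches at ~1000 reads/sec
SELECT (1631 + 1631) / 1000.0 AS est_seconds;  -- about 3.3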

I agree that 21 seconds is rather high for this workload, but Windows
handles data caching and elevator algorithms very differently than
Linux does, so I can't say if anything else is going on.

> Top of the line HP DL380 G7 server with 32 Gb Ram, P410i RAID, 10K
> SAS drives in Raid-1 config. Wal on separate Raid-1 volume with 15K
> SAS drives.The only unusual thing here is that I'm running on Windows
> Server 2008 R2.

In any case, you should really consider both upgrading your hardware
and switching your DB server to Linux. If you are handling millions of
rows on a regular basis, 32GB will not be sufficient for longer than a
few months. Eventually your data will no longer fit in memory, and
you'll see more and more disk-related delays.

Further, a RAID1 is not good enough for that kind of data volume. If you
can't afford a RAID1+0 consisting of several spindles, NVRAM-based
solution (SSD or PCIe card), or a SAN, you simply do not have enough
IOPS to supply a fast database of any description.

I only suggest Linux as your OS because that's the primary use case.
Most testing, development, and users have that setup. You're much more
likely to get meaningful feedback if you follow the herd. :)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: slow query - will CLUSTER help?

From: Kevin Grittner
Date:
Sev Zaslavsky <sevzas@gmail.com> wrote:

I want to agree with everything Shaun said and add a tiny bit.

> Does loading 24Gb of data in 21 sec seem "about right"?

It's a little on the slow side.  You said 1634 page reads.  At 9 ms
per read that would be 14.7 seconds.  But I'm basing the 9 ms per
page read on my Linux experience, and I remember benchmarking the
same application hitting PostgreSQL on the same hardware as about
30% faster on Linux than on Windows, so that *almost* makes up for
the difference.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: slow query - will CLUSTER help?

From: Jeff Janes
Date:
On Thu, Dec 12, 2013 at 9:30 AM, Sev Zaslavsky <sevzas@gmail.com> wrote:
> Hello,
>
> I've got a very simple table with a very simple SELECT query, but it takes longer on the initial run than I'd like, so I want to see if there is a strategy to optimize this.
>
> Table rt_h_nbbo contains several hundred million rows.  All rows for a given entry_date are appended to this table in an overnight process every night - on the order of several million rows per day.
>
> The objective is to select all of the rows for a given product_id on a given entry_date.
>
> There is a b-tree index on (product_id, entry_date). The index appears to be used correctly.  I'm seeing that if the data pages are not in memory, nearly all of the time is spent on disk I/O.  The first time, the query takes 21 sec.  If I run this query a second time, it completes in approx 1-2 ms.
>
> I perceive an inefficiency here and I'd like your input as to how to deal with it: The end result of the query is 1631 rows which is on the order of about a couple hundred Kb of data.  Compare that to the amount of I/O that was done: 1634 buffers were loaded, 16Mb per page - that's about 24 Gb of data!

A page is usually 8KB, not 16MB (nor 16Mb).
 
> Query completed in 21 sec.  I'd like to be able to physically re-organize the data on disk so that the data for a given product_id on an entry_date is concentrated on a few pages instead of being scattered like I see here.

If you load the data in daily batches, it is probably already fairly well clustered by entry_date.  If you sort the batch by product_id before bulk loading it, then it should stay pretty well clustered on (entry_date, product_id).
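As a sketch of that idea (the staging table is hypothetical), the nightly append could look like:

-- load the night's batch into a staging table, then append it sorted by
-- product_id so rows for one product land on adjacent heap pages
INSERT INTO rt_h_nbbo
SELECT * FROM rt_h_nbbo_staging
ORDER BY product_id;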

Cheers,

Jeff

Re: slow query - will CLUSTER help?

From: Sergey Konoplev
Date:
On Thu, Dec 12, 2013 at 9:30 AM, Sev Zaslavsky <sevzas@gmail.com> wrote:
[...]

> Table rt_h_nbbo contains several hundred million rows.  All rows for a given
> entry_date are appended to this table in an overnight process every night -
> on the order of several million rows per day.

[...]

> I perceive an inefficiency here and I'd like your input as to how to deal
> with it: The end result of the query is 1631 rows which is on the order of
> about a couple hundred Kb of data.  Compare that to the amount of I/O that
> was done: 1634 buffers were loaded, 16Mb per page - that's about 24 Gb of
> data!  Query completed in 21 sec.  I'd like to be able to physically
> re-organize the data on disk so that the data for a given product_id on an
> entry_date is concentrated on a few pages instead of being scattered like I
> see here.

Do you perform regular cleaning of the table with DELETEs, or perhaps
use UPDATEs for some other reason?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: slow query - will CLUSTER help?

From: Sergey Konoplev
Date:
On Thu, Dec 19, 2013 at 12:54 PM, Sev Zaslavsky <sevzas@gmail.com> wrote:
> On 12/19/2013 3:34 PM, Sergey Konoplev wrote:
>> On Thu, Dec 12, 2013 at 9:30 AM, Sev Zaslavsky <sevzas@gmail.com> wrote:
>>> Table rt_h_nbbo contains several hundred million rows.  All rows for a
>>> given
>>> entry_date are appended to this table in an overnight process every night
>>> -
>>> on the order of several million rows per day.
>>
>> Do you perform regular cleaning of the table with DELETEs, or perhaps
>> use UPDATEs for some other reason?
>
> At this point we're neither deleting nor updating the data once written to
> the db.

Then I can see two possible reasons for the problem:

1. The indexed data is too big, and index searches are getting worse day by day.

I would try to create a partial index for one day and repeat the
EXPLAIN ANALYZE for that day. If there is a significant improvement,
then I would start creating a partial index for each new day before it
starts, and drop old ones once they become obsolete (see the sketch
below).

2. You are limited by IO.

In this case I would also suggest upgrading your storage.
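A sketch of the partial-index idea from the first suggestion (the index name is hypothetical):

-- index covering a single day only; small, fast to search, cheap to drop
CREATE INDEX rt_h_nbbo_idx_20131126
  ON rt_h_nbbo (product_id)
  WHERE entry_date = '2013-11-26';

-- later, once that day is no longer hot
DROP INDEX rt_h_nbbo_idx_20131126;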

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: slow query - will CLUSTER help?

From: Shaun Thomas
Date:
On 12/19/2013 03:24 PM, Sergey Konoplev wrote:

> 2. You are limited by IO.
> In this case I would also suggest upgrading your storage.

I think this is the case. If I recall correctly, his setup includes a
single RAID-1 for everything, and he only has 32GB of RAM. In fact, the
WAL traffic from those inserts alone is likely saturating the write IO,
especially if a checkpoint starts while the load is still going on. I
wouldn't want to be around for that.

Even with a fairly selective index, just the fetches necessary to
identify the rows and verify the data pages will choke a RAID-1 with
almost every query. Any table with several hundred million rows is also
too big to fit in cache if any significant portion of it is fetched on a
regular basis. The cache turnover is probably extremely high, too.

That workload is just too high for a system of that description. It
would be fine for a prototype, development, or possibly a QA system, but
if that's intended to be a production resource, it needs more memory and IO.

Also, since I can't see part of this conversation and no one else seems
to have mentioned it: for a workload of this volume, the WAL directory
*must* be moved to a separate set of disks. The amount of writes here
will constantly degrade read IO and further increase fetch times.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: slow query - will CLUSTER help?

From: Shaun Thomas
Date:
On 12/20/2013 09:57 AM, Sev Zaslavsky wrote:

> There is a separate RAID-1 for WAL, another for tablespace and another
> for operating system.

I tend to stick to DB-size / 10 as a minimum amount of RAM, but I also
run an OLTP system. For a more OLAP-type workload, the ratio is
negotiable.

The easiest way to tell is to monitor your disk IO stats. If you're
seeing a READ-based utilization percentage over 50% consistently, you
need more RAM. On our system, we average 10% through the day except for
maintenance and loading phases.
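If OS-level disk stats are awkward to collect on Windows, a rough in-database proxy for read pressure is the shared-buffers miss ratio, for example:

-- share of heap block requests that missed PostgreSQL's buffer cache
-- since the last stats reset; note OS-cache hits still count as "read"
SELECT sum(heap_blks_read)::float8
     / nullif(sum(heap_blks_read + heap_blks_hit), 0) AS heap_miss_ratio
FROM pg_statio_user_tables;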

Of course, that's only for the current DB size. A good trick is to
monitor your DB size changes on a daily basis, plot the growth
percentage for a week, and apply compounding growth to estimate the size
in three years. Most companies I've seen are on a 3-year replacement
cycle, so that gives you how much you'll have to buy in order to avoid
another spend until the next iteration.

For example, say you have an 800GB database, and it grows at 10GB per
week - call it 40GB per month. In three years, you could need up to:

800 * (1 + 40/800)^36 = 4632GB of space, which translates to roughly
480-512 GB of RAM. You can probably find a comfortable middle ground
with 240GB.
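Both pieces of that estimate are easy to reproduce in psql:

-- current database size in GB
SELECT pg_database_size(current_database()) / 1024.0^3 AS size_gb;

-- 5% monthly growth (40/800) compounded over 36 months
SELECT 800 * power(1 + 40.0/800, 36) AS projected_gb;  -- about 4632, per the estimate above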

Of course, don't forget to buy modules in multiples of four, otherwise
you're not taking advantage of all the CPU's memory channels. :)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: slow query - will CLUSTER help?

From: Gavin Flower
Date:
On 21/12/13 05:11, Shaun Thomas wrote:
[...]
>
> Of course, don't forget to buy modules in multiples of four, otherwise
> you're not taking advantage of all the CPU's memory channels. :)
>
Note some processors have 3 (three) memory channels!  And I know of some
with 4 memory channels.  So it is important to check your processor &
motherboard.

The desktop I got when I joined a university on contract had 12GB about
2 years ago.


Cheers,
Gavin


Re: slow query - will CLUSTER help?

From: Shaun Thomas
Date:
> What are your thoughts on the right way to use SSDs in a RAID to
> enhance postgres I/O performance?  In an earlier reply, you
>  indicated one of a "RAID1+0 consisting of several spindles,
>  NVRAM-based solution (SSD or PCIe card), or a SAN"

Well, it's a tiered approach. If you can identify your tables with the most critical OLTP needs, you can create a
separate tablespace specifically for SSD storage to give them the performance they need. After that, you might consider
partitioning even those tables, as older data won't be accessed as often, so won't need that kind of IOPS long-term.
Older partitions could be slated toward the RAID.
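A minimal sketch of the tablespace part (the path is hypothetical; the directory must already exist, be empty, and be owned by the postgres OS user):

-- register the SSD mount point as a tablespace
CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg_tblspc';

-- move the hot table onto it (rewrites the table under an exclusive lock)
ALTER TABLE rt_h_nbbo SET TABLESPACE fast_ssd;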

Regarding what kind of SSD, just make sure the drives themselves are capacitor-backed. Current SSDs have only a few
microseconds of write delay, but it's enough of a race condition to lead to corruption in power outages without some
assurance that in-transit data is committed.

If you have the money, stuff like FusionIO PCIe cards are extremely fast, on the order of 10x faster than a standard
SSD. I'd personally reserve these for performance-critical things like online trading platforms, since they're so
costly.

Then of course, SANs can mix the world of RAID and SSD, because they often have internal mechanisms to deliver
requested IOPS by spreading storage allocations along the installed components necessary to match them. This is
probably the most expensive overall approach, but many larger companies either already have SANs, or will eventually
need one anyway.

That's just a bird's eye view of everything. There's obviously more involved. :)

