Re: Help with tuning this query (with explain analyze finally)

From: John Arbash Meinel
Subject: Re: Help with tuning this query (with explain analyze finally)
Date:
Msg-id: 422895D7.80006@arbash-meinel.com
In reply to: Re: Help with tuning this query (with explain analyze finally)  ("Ken Egervari" <ken@upfactor.com>)
List: pgsql-performance
Ken Egervari wrote:

> Josh,
>
...

> I thought about this, but it's very important since shipment and
> shipment_status are both updated in real time 24/7/365.  I think I
> might be able to cache it within the application for 60 seconds at
> most, but it would make little difference since people tend to refresh
> within that time anyway. It's very important that real-time
> information exists though.
>
Is 60s real-time enough for you? That's what it sounds like. It would be
nice if you could get away with 1hr, but even a 60s window gives you a lot
of room to work with.
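
If 60s is good enough, the simplest thing is to precompute the screen's
result set once a minute and let the application read from that. A rough
sketch (I'm inventing the cache table and column names here, I don't have
your schema in front of me):

    CREATE TABLE shipment_summary_cache AS
        SELECT s.id AS shipment_id, ss.status, ss.date
          FROM shipment s
          JOIN shipment_status ss ON ss.shipment_id = s.id;

    -- run this from cron (or the app) every 60 seconds; using DELETE rather
    -- than TRUNCATE keeps the swap fully transactional, so readers never
    -- see an empty table
    BEGIN;
    DELETE FROM shipment_summary_cache;
    INSERT INTO shipment_summary_cache
        SELECT s.id, ss.status, ss.date
          FROM shipment s
          JOIN shipment_status ss ON ss.shipment_id = s.id;
    COMMIT;

Readers then hit shipment_summary_cache with a cheap indexed lookup instead
of re-running the big join 3-4 times a second.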

>> You could also always throw more hardware at it. :) If the
>> shipment_status is one of the bottlenecks, create a 4-disk raid10 and
>> move the table over.
>> I don't remember what your hardware is, but I don't remember it being a
>> quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction
>> log on a solid state disk. :)
>
>
> That sounds like an awesome system.  I'd love to have something like
> that. Unfortunately, the production server is just a single processor
> machine with 1 GB ram.  I think throwing more disks at it is probably
> the best bet, moving the shipment and shipment_status tables over as
> you suggested. That's great advice.
>
Well, disk I/O is one side of it, but sticking another 1GB in (2GB total)
would also be a fairly economical performance upgrade.
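
If you do bump the RAM, remember to tell Postgres about it. Ballpark
postgresql.conf values for a 2GB box (these are just guesses at sane
numbers; on 7.4/8.0 the first two are counted in 8kB pages, so double-check
against your version's docs):

    shared_buffers = 25000         # ~200MB of shared buffer cache
    effective_cache_size = 150000  # tell the planner ~1.2GB of OS cache exists
    sort_mem = 8192                # per-sort memory in kB (work_mem in 8.0)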

You are looking for query performance, not really update performance,
right? So buy a 4-port SATA controller, and some WD Raptor 10k SATA
disks. With this you can create a RAID10 for < $2k (probably like $1k).
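
If you go that route, 8.0's tablespaces make moving the hot tables onto the
new array pretty painless (on 7.4 you'd have to stop the server and symlink
directories instead). Roughly, assuming the RAID10 is mounted at /mnt/raid10
and owned by the postgres user:

    CREATE TABLESPACE fastdisks LOCATION '/mnt/raid10/pgdata';
    ALTER TABLE shipment_status SET TABLESPACE fastdisks;
    ALTER TABLE shipment SET TABLESPACE fastdisks;
    -- indexes move separately; shipment_status_pkey is just the default name,
    -- substitute whatever yours are actually called
    ALTER INDEX shipment_status_pkey SET TABLESPACE fastdisks;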

> 30ms is a good target, although I guess I was naive for setting that
> goal perhaps.  I've just taken queries that ran at 600ms and with 1 or
> 2 indexes, they went down to 15ms.

It all depends on your query. If you have a giant table (1M rows), and
you are doing a seqscan to fetch only 5 rows, then adding an index will
give you enormous performance gains. But you are getting 30k rows, and
combining them with 6k rows, plus a bunch of other stuff. I think we've
tuned the query about as far as we can.
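
Just to illustrate the difference (made-up example, not your schema): pulling
a handful of rows out of a million is where an index shines, while pulling
30k rows and joining them is a different animal.

    -- fetching a few rows: the index turns a full seqscan into a few page reads
    CREATE INDEX shipment_status_shipment_id_idx
        ON shipment_status (shipment_id);
    EXPLAIN ANALYZE
        SELECT * FROM shipment_status WHERE shipment_id = 12345;

    -- fetching tens of thousands of rows: the join and sort work dominates,
    -- so the same index buys you comparatively little
    EXPLAIN ANALYZE
        SELECT * FROM shipment_status
         WHERE date > now() - interval '30 days';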

>
> Let's say we have 200 users signed into the application at the same
> time. The application refreshes their shipment information
> automatically to make sure it's up to date on the user's screen.  The
> application will execute the query we are trying to tune every 60
> seconds for most of these users.  Users can set the refresh time to be
> higher, but 60 is the lowest amount so I'm just assuming everyone has
> it at 60.
>
> Anyway, if you have 200 users logged in, that's 200 queries in the 60
> second period, which is about 3-4 queries every second.  As you can
> see, it's getting maxed out, and because of bad luck, the queries are
> bunched together and are being called at the same time, making 8-9
> queries in the same second and that's where the performance is
> starting to degrade.  I just know that if I could get this down to 30
> ms, or even 100, we'd be okay for a few months without throwing
> hardware at the problem.   Also keep in mind that other application
> logic and Hibernate mapping are occurring too, so 3-4 queries a second is
> already no good when everything is running on a single machine.
>
The other query I just sent, where you do the query for all users at
once, and then cache the result, *might* be cheaper than doing a bunch
of different queries.
However, you may find that doing the query for *all* users takes too
long. So you could keep another table indicating which people have logged
in most recently, and then only cache the info for those people.
This does start getting a little more involved, so see if you can do all
users before heading down this road.
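
If all-users turns out to be too slow, the "recent people" table can be as
dumb as this (names invented, adjust to taste):

    CREATE TABLE recent_activity (
        user_id   integer PRIMARY KEY,
        last_seen timestamp NOT NULL
    );

    -- the app touches this on every refresh request; there's no native UPSERT
    -- in 7.4/8.0, so update first and insert only if nothing was there
    UPDATE recent_activity SET last_seen = now() WHERE user_id = 42;
    INSERT INTO recent_activity (user_id, last_seen)
        SELECT 42, now()
         WHERE NOT EXISTS (SELECT 1 FROM recent_activity WHERE user_id = 42);

    -- the periodic cache refresh then only bothers with users seen recently
    SELECT user_id
      FROM recent_activity
     WHERE last_seen > now() - interval '5 minutes';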

> This isn't the best setup, but it's the best we can afford.  We are
> just a new startup company.  Cheaper servers and open source keep our
> costs low. But money is starting to come in after 10 months of hard
> work, so we'll be able to replace our server within the next 2
> months.  It'll be a necessity because we are signing on some big
> clients now and they'll have 40 or 50 users for a single company.  If
> they are all logged in at the same time, that's a lot of queries.
>
Sure. Just realize you can't really support 200 concurrent connections
with a single P4 and 1GB of ram.

John
=:->


