Re: [GENERAL] Expensive query

Поиск
Список
Период
Сортировка
От Dustin Sallings
Тема Re: [GENERAL] Expensive query
Дата
Msg-id Pine.SGI.3.95.981030140928.28572A-100000@bleu.west.spy.net
обсуждение исходный текст
Ответ на Expensive query  (Christophe Pettus <pettus@postdirect.com>)
Ответы Re: [GENERAL] Expensive query  (Herouth Maoz <herouth@oumail.openu.ac.il>)
Список pgsql-general
On Fri, 30 Oct 1998, Christophe Pettus wrote:

    That users table looks kinda useless.  I'd recommend changing id
to name and making id an int, then putting your events users in as ints.
Currently, it doesn't do anything at all, and you might as well be doing
this query:

    select distinct id from events where code = 'Whatever'
        and age('now',when) <= ' 1 day';

    I would also recommend not making code a char(10).  If you put an
index on code and an index on date, you should be able to get your results
pretty quick.  If you change the userid and the code to integers and do a
three table join, it might be faster, but it would use *much* less disk
space.

# I have two tables, structured as:
#
# users:
#
#     id        varchar(70) not null unique
#
# events:
#
#     userid        varchar(70) not null,
#     code        char(10) not null,
#     when        datetime not null
#
# The query I need to perform answers the question, "Which users do NOT
# have a particular event (selected by code), and which do not have ANY
# event for the last day?"  The query I use is:
#
#     select id from users
#        where
#           id not in (
#              select unique id from events
#              where code = 'some code'
#           )
#           and
#           id not in (
#              select unique id from events
#              where age('now',when) <= '1 day'
#           );
#
# This query is *very* expensive.  With 10,000 users and 40,000 events,
# it can take up to 20 minutes (!) to execute, and the postgres process
# grows to 40 megabytes (!!) of memory (on a Pentium II system running
# BSDI).  This seems surprising, since each of the subqueries only needs
# to be evaluated once, rather than once per row of users.  Is there a
# way to reformulate this query to make it less expensive?
#
#

--
SA, beyond.com           My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________


В списке pgsql-general по дате отправления:

Предыдущее
От: "Michael A. Koerber SR"
Дата:
Сообщение: Re: [GENERAL] Creating web images from postgres data
Следующее
От: Adam Miller
Дата:
Сообщение: thrashing like a beached shark