Re: Subquery WHERE IN or WHERE EXISTS faster?

От: Ulrich
Тема: Re: Subquery WHERE IN or WHERE EXISTS faster?
Дата: ,
Msg-id: 48688B44.6070407@gmx.net
(см: обсуждение, исходный текст)
Ответ на: Re: Subquery WHERE IN or WHERE EXISTS faster?  (Rusty Conover)
Ответы: Re: Subquery WHERE IN or WHERE EXISTS faster?  (Rusty Conover)
Список: pgsql-performance

Скрыть дерево обсуждения

Subquery WHERE IN or WHERE EXISTS faster?  (Ulrich, )
 Re: Subquery WHERE IN or WHERE EXISTS faster?  (Tom Lane, )
  Re: Subquery WHERE IN or WHERE EXISTS faster?  (Ulrich, )
   Re: Subquery WHERE IN or WHERE EXISTS faster?  (Gregory Stark, )
    Re: Subquery WHERE IN or WHERE EXISTS faster?  (Ulrich, )
     Re: Subquery WHERE IN or WHERE EXISTS faster?  (Gregory Stark, )
   Re: Subquery WHERE IN or WHERE EXISTS faster?  (Rusty Conover, )
    Re: Subquery WHERE IN or WHERE EXISTS faster?  (Ulrich, )
     Re: Subquery WHERE IN or WHERE EXISTS faster?  (Rusty Conover, )
      Re: Subquery WHERE IN or WHERE EXISTS faster?  ("Sergio Gabriel Rodriguez", )
   Re: Subquery WHERE IN or WHERE EXISTS faster?  ("Sergio Gabriel Rodriguez", )
  Re: Subquery WHERE IN or WHERE EXISTS faster?  ("Jaime Casanova", )
   Re: Subquery WHERE IN or WHERE EXISTS faster?  (Tom Lane, )

I think it will be fast, because the "IN set", which is the result of
"SELECT processorid FROM users_processors WHERE userid=4040", is limited
to a maximum of ~500 processors which is not very big. Increasing
Postgres' RAM would be difficult for me, because I am only running a
very small server with 256MB RAM and the webserver also likes to use
some RAM.

Does Postgre cache the HASH-Table for later use? For example when the
user reloads the website.

Kind regards
Ulrich

Rusty Conover wrote:
> This is what I've found with tables ranging in the millions of rows.
>
> Using IN is better when you've got lots of rows to check against the
> IN set and the IN set may be large and possibly complicated to
> retrieve (i.e. lots of joins, or expensive functions).
>
> Postgres will normally build a hash table of the IN set and just
> search that hash table. It's especially fast if the entire hash table
> that is built can fit into RAM.  The cpu/io cost of building the IN
> set can be quite large because it needs to fetch every tuple to hash
> it, but this can be faster then searching tuple by tuple through
> possibly many indexes and tables like EXISTS does.  I like to increase
> work_mem a lot (512mb and up) if I know I'm going to be doing a lot of
> matches against a large IN set of rows because I'd prefer for that
> hash table to never to be written to disk.
>
> EXISTS is better when you're doing fewer matches because it will pull
> the rows out one at a time from its query possibly using indexes, its
> main advantage is that it doesn't pull all of the tuples before it
> starts processing matches.
>
> So in summary both are good to know how to use, but choosing which one
> to use can really depend on your data set and resources.
>
> Cheers,
>
> Rusty
> --
> Rusty Conover
> InfoGears Inc.
> http://www.infogears.com
>



В списке pgsql-performance по дате сообщения:

От: "Marko Kreen"
Дата:
Сообщение: Re: Federated Postgresql architecture ?
От: "Jonah H. Harris"
Дата:
Сообщение: Re: Federated Postgresql architecture ?