Re: PostgreSQL General Digest V1 #156

Поиск
Список
Период
Сортировка
От Paul Condon
Тема Re: PostgreSQL General Digest V1 #156
Дата
Msg-id 3922CF61.A2F0AAA1@quiknet.com
обсуждение исходный текст
Список pgsql-general
>

I don't know the internals of PostgreSQL, but logically INTERSECT is a join, but with an automatic
default generation of the WHERE clause. As such, there should not be any difference in performance
on tables that are large enough to mask small differences in the time it takes to parse the
command. If you have an example which shows that there is a big difference in performance in
PostgreSQL, this should be looked at by the relavant hackers, IMHO.

Concerning INTERSECT vs. join: SQL is a turgid, goofy, redundant language. Somebody should invent
a better one. Maybe this should be put on the TODO list of the PostgreSQL organization.

>
> Subject: Re: Am I really stupid???
> Date: Wed, 17 May 2000 10:57:45 +0200
> From: Dragos Stoichita <ddd@genesis.homeip.net>
> To: pgsql-general@hub.org
>
> Original message from: Raul Chirea
> >Anyway, try to learn some SQL bases before asking that kind of things
> >(especially the "join between two or more tables" notion and what is an
> >"index" and how to use it) !
> >You'll save a lot of time, yours and other's.
>
> 1) I intensively use indexes
> 2) I already used joins but my problem is a lot more complicated than this one, I have
> to do around 20 intersects between very complicated SQL queries, so rewriting this in
> joins is very difficult if not near impossible.
> 3) I think I have given some very strong arguments in my message.
>
> I have to write a database system for an online employment site and there is a
> search with more than 20 criteria. There should be around 10000 candidates after 1
> year, but I prefer to be sure the search is very fast with 100000 or 1000000 so that
> in the future there will be no problems. For each one of the search criteria I have done simple
> tables with 2 columns, one being the index and the other an integer indicating the
> candidate identifier. After having done multiple selects, I need to do an intersect.
> Of course I can't say it is impossible to write the same thing in joins, but believe me
> it would be a lot slower, here's my idea:
>
> Each of the selects returns around 2000 integers. I have a 400 Mhz pc for the
> development, but the final machine will be an IBM Netfinity server with several
> pentium 3 processors. What I do is a very high quality work and the server must
> be able to handle a huge demand.
>
> Now of course every people in this forum will tell me to rewrite the query in a join,
> because I gave a simple example, but I could tell you a SQL request that's near
> 1 page long, between multiple intersects and unions.
>
> Now that's ONE thing that I think nobody here will be able to excuse:
>
> Sorting integers on today's 400+ Mhz pc's, especially 10000 ones, is really fast.
> Doing an unique on sorted integers is really fast too.
> Doing an intersect on sorted, unique integers is really fast.
> So intersecting 2000x2000x3000x2000x5000 on today's 400+ Mhz pc's should
> always take less than 1 second (a lot less).
>
> Nobody really answered my question. I did not ask you to tell me how to rewrite
> my question, because I already know that, don't think I do not read the docs,
> tutorials, etc. I need this fast intersect because if I did not have it the complexity
> of the problem would have been multiplied by at least a factor of 10 believe me.


В списке pgsql-general по дате отправления:

Предыдущее
От: Patrick Robin
Дата:
Сообщение: pg_dump return failed sanity check
Следующее
От: mikeo
Дата:
Сообщение: line type