Re: Q: Table scans on set difference

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: Q: Table scans on set difference
Дата
Msg-id 44B7A7D1.5010504@magproductions.nl
обсуждение исходный текст
Ответ на Q: Table scans on set difference  ("G. Ralph Kuntz, MD" <grk@usa.net>)
Ответы Re: Q: Table scans on set difference  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
G. Ralph Kuntz, MD wrote:
> What's happening here?
>
> I have two tables, encounter_properties_table with about 100000 rows and
> xfiles with about 500000 rows. The structures of these tables is as follows:

>  file_name      | character varying(255)   |

>              Table "public.xfiles"
>   Column  |          Type          | Modifiers
> ----------+------------------------+-----------
>  filename | character varying(100) | not null

These columns are of different types, you're forcing a typecast on every
row comparison; I think the varchar(100)'s will be upscaled to
varchar(255) on comparison.

My advice: use the text type. It's more flexible (practically no size
limit) and faster.

> explain select file_name from encounter_properties_table where file_name not
> in (select filename from xfiles);

What about:
explain select file_name from encounter_properties_table
where not exists (
    select file_name from xfiles where filename = file_name);

I often even use "select 1" - a constant - because I'm not interested in
the value, but apparently selecting a column is marginally faster than
selecting a constant. Testing will prove it, I thought I'd mention the
possibilit.

> I ran vacumm analyze on both tables.
>
> We aborted this query when it had not finished after 4 hours.

Probably due to the type cast.
We used to run into this problem when using bigint index columns. We
changed them into int (which was sufficient) and the speed went up a lot.
Later we determined - with input from this list - that the cause wasn't
the size of the column but the type casting required to match the
constant integer values in our queries. In our case explicit casting of
our constant values helped.

> We ran the same query on SQLServer 2005 with the same data and it took under
> one second to finish.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

В списке pgsql-general по дате отправления:

Предыдущее
От: "Merlin Moncure"
Дата:
Сообщение: Re: Performance problem with query
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Timestamp vs timestamptz