Обсуждение: Index on two columns not used

От:
Arnaud Lesauvage
Дата:

Hi list !

I have two table with a 2-column index on both of them.
In the first table, the first colum of the index is the primary key,
the second one is an integer field.
In the second table, the two columns are the primary key.
When I join these two tables, the 2-column index of the first table is
not used.
Why does the query planner think that this plan  is better ?

ALTER TABLE geo.subcities_names
   ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid,
language_id);

CREATE INDEX subcities_gid_language_id
   ON geo.subcities
   USING btree
   (gid, official_language_id);

EXPLAIN ANALYZE
SELECT * FROM geo.subcities sc, geo.subcities_names scn
WHERE sc.gid = scn.subcity_gid AND sc.official_language_id =
scn.language_id;

Result :

  Merge Join  (cost=0.00..4867.91 rows=37917 width=240) (actual
time=0.037..149.022 rows=39323 loops=1)
    Merge Cond: ("outer".gid = "inner".subcity_gid)
    Join Filter: ("outer".official_language_id = "inner".language_id)
    ->  Index Scan using subcities_pkey on subcities sc
(cost=0.00..1893.19 rows=39357 width=200) (actual time=0.015..43.430
rows=39357 loops=1)
    ->  Index Scan using subcities_names_pkey on subcities_names scn
(cost=0.00..2269.39 rows=40517 width=40) (actual time=0.012..35.465
rows=40517 loops=1)
  Total runtime: 157.389 ms
(6 rows)


Thanks for your suggestions !
Regards
--
Arnaud

От:
"Heikki Linnakangas"
Дата:

Arnaud Lesauvage wrote:
> I have two table with a 2-column index on both of them.
> In the first table, the first colum of the index is the primary key, the
> second one is an integer field.
> In the second table, the two columns are the primary key.
> When I join these two tables, the 2-column index of the first table is
> not used.
> Why does the query planner think that this plan  is better ?
>
> ALTER TABLE geo.subcities_names
>   ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid,
> language_id);
>
> CREATE INDEX subcities_gid_language_id
>   ON geo.subcities
>   USING btree
>   (gid, official_language_id);
>
> EXPLAIN ANALYZE
> SELECT * FROM geo.subcities sc, geo.subcities_names scn
> WHERE sc.gid = scn.subcity_gid AND sc.official_language_id =
> scn.language_id;

My theory:

There's no additional restrictions besides the join condition, so the
system has to scan both tables completely. It chooses to use a full
index scan instead of a seq scan to be able to do a merge join. Because
it's going to have to scan the indexes completely anyway, it chooses the
smallest index which is subcities_pkey.

You'd think that the system could do the merge using just the indexes,
and only fetch the heap tuples for matches. If that were the case, using
the 2-column index would indeed be a good idea. However, PostgreSQL
can't use the values stored in the index to check the join condition, so
all the heap tuples are fetched anyway. There was just recently
discussion about this on this list:
http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

От:
Arnaud Lesauvage
Дата:

Heikki Linnakangas a écrit :
> Arnaud Lesauvage wrote:
>> I have two table with a 2-column index on both of them.
>> In the first table, the first colum of the index is the primary key, the
>> second one is an integer field.
>> In the second table, the two columns are the primary key.
>> When I join these two tables, the 2-column index of the first table is
>> not used.
>> Why does the query planner think that this plan  is better ?
>
> You'd think that the system could do the merge using just the indexes,
> and only fetch the heap tuples for matches. If that were the case, using
> the 2-column index would indeed be a good idea. However, PostgreSQL
> can't use the values stored in the index to check the join condition, so
> all the heap tuples are fetched anyway. There was just recently
> discussion about this on this list:
> http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.
>


Thanks for your answer Heikki.
I did not know that joins were not using index values, and
that PostgreSQL had to fecth the heap tuples anyway.
Does this mean that this 2-column index is useless ? (I
created it for the join, I don't often filter on both
columns otherwise)

This query was taken from my "adminsitrative areas" model
(continents, countries, etc...). Whenever I query this
model, I have to join many tables.
I don't really know what the overhead of reading the
heap-tuples is, but would it be a good idea to add
data-redundancy in my tables to avoid joins ? (adding
country_id, continent_id, etc... in the "cities" table)

Regards
--
Arnaud


От:
"Heikki Linnakangas"
Дата:

Arnaud Lesauvage wrote:
> I did not know that joins were not using index values, and that
> PostgreSQL had to fecth the heap tuples anyway.
> Does this mean that this 2-column index is useless ? (I created it for
> the join, I don't often filter on both columns otherwise)

Well, if no-one is using the index, it is useless..

> This query was taken from my "adminsitrative areas" model (continents,
> countries, etc...). Whenever I query this model, I have to join many
> tables.
> I don't really know what the overhead of reading the heap-tuples is, but
> would it be a good idea to add data-redundancy in my tables to avoid
> joins ? (adding country_id, continent_id, etc... in the "cities" table)

It depends. I would advise not to denormalize unless you really really
have to. It's hard to say without more knowledge of the application.

Is the query you showed a typical one? It ran in about 160 ms, is that
good enough? It's doesn't sound too bad, considering that it returned
almost 40000 rows.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

От:
Arnaud Lesauvage
Дата:

Heikki Linnakangas a écrit :
> Arnaud Lesauvage wrote:
>> This query was taken from my "adminsitrative areas" model (continents,
>> countries, etc...). Whenever I query this model, I have to join many
>> tables.
>> I don't really know what the overhead of reading the heap-tuples is, but
>> would it be a good idea to add data-redundancy in my tables to avoid
>> joins ? (adding country_id, continent_id, etc... in the "cities" table)
>
> It depends. I would advise not to denormalize unless you really really
> have to. It's hard to say without more knowledge of the application.
>
> Is the query you showed a typical one? It ran in about 160 ms, is that
> good enough? It's doesn't sound too bad, considering that it returned
> almost 40000 rows.


It is quite typical, yes. It is the base query of a view. In
fact, most views have a lot more joins (they join with all
the upper-level tables).
But 150ms is OK, indeed.

От:
"Heikki Linnakangas"
Дата:

Arnaud Lesauvage wrote:
> It is quite typical, yes. It is the base query of a view. In fact, most
> views have a lot more joins (they join with all the upper-level tables).
> But 150ms is OK, indeed.

If the query using the view does anything more than a "SELECT * FROM
view", you should do an explain analyze of the query instead of the
definition of the view. The access plan might look very different.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

От:
Arnaud Lesauvage
Дата:

Heikki Linnakangas a écrit :
> Arnaud Lesauvage wrote:
>> It is quite typical, yes. It is the base query of a view. In fact, most
>> views have a lot more joins (they join with all the upper-level tables).
>> But 150ms is OK, indeed.
>
> If the query using the view does anything more than a "SELECT * FROM
> view", you should do an explain analyze of the query instead of the
> definition of the view. The access plan might look very different.

The views are used as linked tables in an Access Frontend.
Some accesses are "select * from view", others might filter
on a country_id or something similar.
For the moment performance is good, so I think I'll keep a
normalized database as long as it is possible !

Thanks for your help Heikki !

От:
Tom Lane
Дата:

Arnaud Lesauvage <> writes:
> When I join these two tables, the 2-column index of the first table is
> not used.
> Why does the query planner think that this plan  is better ?

Hm, is gid by itself nearly unique in these tables?  If so, the merge
join would get only marginally more efficient by using both columns as
merge conditions.  Heikki's probably right to guess that the planner
thinks it's better to use the smaller index.

However, if there are lots of duplicate gids, then it ought to have
preferred the two-column merge ...

            regards, tom lane

От:
Arnaud Lesauvage
Дата:

Tom Lane a écrit :
> Arnaud Lesauvage <> writes:
>> When I join these two tables, the 2-column index of the first table is
>> not used.
>> Why does the query planner think that this plan  is better ?
>
> Hm, is gid by itself nearly unique in these tables?  If so, the merge
> join would get only marginally more efficient by using both columns as
> merge conditions.  Heikki's probably right to guess that the planner
> thinks it's better to use the smaller index.
>
> However, if there are lots of duplicate gids, then it ought to have
> preferred the two-column merge ...

gid is the primary key of the first table, so absolutely unique.
Thanks for the information Tom !

--
Arnaud

От:
Péter Kovács
Дата:

Sorry for the amateurish question, but what are "heap tuples"?

Also, my understanding is that the following statement applies only for
composite indexes: "PostgreSQL can't use the values stored in the index
to check the join condition". I assume that PostgreSQL will be able to
use single-column-indexes for join conditions. Is this correct?

Thank you,
Peter

Heikki Linnakangas wrote:
> Arnaud Lesauvage wrote:
>> I have two table with a 2-column index on both of them.
>> In the first table, the first colum of the index is the primary key,
>> the second one is an integer field.
>> In the second table, the two columns are the primary key.
>> When I join these two tables, the 2-column index of the first table
>> is not used.
>> Why does the query planner think that this plan  is better ?
>>
>> ALTER TABLE geo.subcities_names
>>   ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid,
>> language_id);
>>
>> CREATE INDEX subcities_gid_language_id
>>   ON geo.subcities
>>   USING btree
>>   (gid, official_language_id);
>>
>> EXPLAIN ANALYZE
>> SELECT * FROM geo.subcities sc, geo.subcities_names scn
>> WHERE sc.gid = scn.subcity_gid AND sc.official_language_id =
>> scn.language_id;
>
> My theory:
>
> There's no additional restrictions besides the join condition, so the
> system has to scan both tables completely. It chooses to use a full
> index scan instead of a seq scan to be able to do a merge join.
> Because it's going to have to scan the indexes completely anyway, it
> chooses the smallest index which is subcities_pkey.
>
> You'd think that the system could do the merge using just the indexes,
> and only fetch the heap tuples for matches. If that were the case,
> using the 2-column index would indeed be a good idea. However,
> PostgreSQL can't use the values stored in the index to check the join
> condition, so all the heap tuples are fetched anyway. There was just
> recently discussion about this on this list:
> http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.
>

От:
Markus Schaber
Дата:

Hi, Peter,

Péter Kovács wrote:
> Sorry for the amateurish question, but what are "heap tuples"?
>
> Also, my understanding is that the following statement applies only for
> composite indexes: "PostgreSQL can't use the values stored in the index
> to check the join condition". I assume that PostgreSQL will be able to
> use single-column-indexes for join conditions. Is this correct?

Both questions are tightly related:

First, the "heap" is the part of the table where the actual tuples are
stored.

PostgreSQL uses an MVCC system, that means that multiple versions (with
their transaction information) of a single row can coexist in the heap.
This allows for higher concurrency in the backend.

Now, the index basically stores pointers like "pages 23 and 42 contain
rows with value 'foo'", but version information is not replicated to the
index pages, this keeps the index' size requirements low.

Additionally, in most UPDATE cases, the new row version will fit into
the same page as the old version. In this case, the index does not have
to be changed, which is an additional speed improvement.

But when accessing the data via the index, it can only give a
preselection of pages that contain interesting data, and PostgreSQL has
to look into the actual heap pages to check whether there really are row
versions that are visible in the current transaction.


A further problem is that some GIST index types are lossy, that means
the index does not retain the full information, but only an
approximation, for efficiency reasons.

A prominent example are the PostGIS geometry indices, they only store
the bounding box (4 float values) instead of the whole geometry (may be
millions of double precision coordinates). So it may be necessary to
re-check the condition with the real data, using the lossy index for
preselection.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org


От:
Alvaro Herrera
Дата:

Markus Schaber wrote:

> Additionally, in most UPDATE cases, the new row version will fit into
> the same page as the old version. In this case, the index does not have
> to be changed, which is an additional speed improvement.

Actually, when the UPDATE puts a new row version in the same heap page,
the index must be updated anyway.  All the rest of what you said is
correct.

There is another reason not to put visibility info in the index, which
is that it would be extremely complex to update all indexes to contain
the right visibility (and maybe impossible without risking deadlocks).
Updating only the heap is very simple.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

От:
Markus Schaber
Дата:

Hi, Alvaro,

Alvaro Herrera wrote:

>> Additionally, in most UPDATE cases, the new row version will fit into
>> the same page as the old version. In this case, the index does not have
>> to be changed, which is an additional speed improvement.
> Actually, when the UPDATE puts a new row version in the same heap page,
> the index must be updated anyway.

AFAICS only, when the index covers (directly or via function) a column
that's actually changed.

Changing columns the index does not depend on should not need any write
access to that index.

Correct me if I'm wrong.

Thanks,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

От:
Tom Lane
Дата:

Markus Schaber <> writes:
> Alvaro Herrera wrote:
>> Actually, when the UPDATE puts a new row version in the same heap page,
>> the index must be updated anyway.

> AFAICS only, when the index covers (directly or via function) a column
> that's actually changed.
> Changing columns the index does not depend on should not need any write
> access to that index.
> Correct me if I'm wrong.

You're wrong.  An UPDATE always writes a new version of the row (if it
overwrote the row in-place, it wouldn't be rollback-able).  The new
version has a different TID and therefore the index entry must change.
To support MVCC, our approach is to always insert a new index entry
pointing at the new TID --- the old one remains in place so that the old
version can still be found by transactions that need it.  Once the old
row version is entirely dead, VACUUM is responsible for removing both it
and the index entry pointing at it.

Other DBMSes use other approaches that shift the overhead to other
places, but that's how Postgres does it.

            regards, tom lane

От:
Péter Kovács
Дата:

Markus,

Thank you for your kind explanation.

Peter

Markus Schaber wrote:
> Hi, Peter,
>
> Péter Kovács wrote:
>
>> Sorry for the amateurish question, but what are "heap tuples"?
>>
>> Also, my understanding is that the following statement applies only for
>> composite indexes: "PostgreSQL can't use the values stored in the index
>> to check the join condition". I assume that PostgreSQL will be able to
>> use single-column-indexes for join conditions. Is this correct?
>>
>
> Both questions are tightly related:
>
> First, the "heap" is the part of the table where the actual tuples are
> stored.
>
> PostgreSQL uses an MVCC system, that means that multiple versions (with
> their transaction information) of a single row can coexist in the heap.
> This allows for higher concurrency in the backend.
>
> Now, the index basically stores pointers like "pages 23 and 42 contain
> rows with value 'foo'", but version information is not replicated to the
> index pages, this keeps the index' size requirements low.
>
> Additionally, in most UPDATE cases, the new row version will fit into
> the same page as the old version. In this case, the index does not have
> to be changed, which is an additional speed improvement.
>
> But when accessing the data via the index, it can only give a
> preselection of pages that contain interesting data, and PostgreSQL has
> to look into the actual heap pages to check whether there really are row
> versions that are visible in the current transaction.
>
>
> A further problem is that some GIST index types are lossy, that means
> the index does not retain the full information, but only an
> approximation, for efficiency reasons.
>
> A prominent example are the PostGIS geometry indices, they only store
> the bounding box (4 float values) instead of the whole geometry (may be
> millions of double precision coordinates). So it may be necessary to
> re-check the condition with the real data, using the lossy index for
> preselection.
>
> HTH,
> Markus
>

От:
Markus Schaber
Дата:

Hi, Tom,

Tom Lane wrote:

> You're wrong.  An UPDATE always writes a new version of the row (if it
> overwrote the row in-place, it wouldn't be rollback-able).  The new
> version has a different TID and therefore the index entry must change.
> To support MVCC, our approach is to always insert a new index entry
> pointing at the new TID --- the old one remains in place so that the old
> version can still be found by transactions that need it.

OK, good you corrected me.

I had the weird impression that both row versions have the same tuple ID
(as they are different versions of the same tuple), and so an index
change is not necessary when both versions fit on the same page.

Thanks,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org