Обсуждение: discarding duplicate indexes

Поиск
Список
Период
Сортировка

discarding duplicate indexes

От
Josh Kupershmidt
Дата:
I recently came across a scenario like this (tested on git head):

----
CREATE TABLE test (id int);
CREATE INDEX test_idx1 ON test (id);
CREATE INDEX test_idx2 ON test (id);

CREATE TABLE test_copycat (LIKE test INCLUDING ALL);
\d test_copycat
----

Why do we end up with only one index on test_copycat? The culprit
seems to be transformIndexConstraints(), which explains:
  * Scan the index list and remove any redundant index specifications. This  * can happen if, for instance, the user
writesUNIQUE PRIMARY KEY. A  * strict reading of SQL92 would suggest raising an error instead, but  * that strikes me
astoo anal-retentive. - tgl 2001-02-14
 

and this code happily throws out the second index statement in this
example, since its properties are identical to the first. (Side note:
some index properties, such as tablespace specification and comment,
are ignored when determining duplicates). This behavior does seem like
a minor POLA violation to me -- if we do not forbid duplicate indexes
on the original table, it seems surprising to do so silently with
INCLUDING INDEXES.

There was consideration of similar behavior when this patch was
proposed[1], so perhaps the behavior is as-designed, and I guess no
one else has complained. IMO this behavior should at least be
documented under the "LIKE source_table" section of CREATE TABLE's doc
page.

Josh

[1] http://archives.postgresql.org/pgsql-patches/2007-07/msg00173.php



Re: discarding duplicate indexes

От
Gavin Flower
Дата:
<div class="moz-cite-prefix">On 20/12/12 14:57, Josh Kupershmidt wrote:<br /></div><blockquote
cite="mid:CAK3UJRGf6pq-adXT6fFMGQOriwH04JOqE_69j3wZUaf0uC2HXg@mail.gmail.com"type="cite"><pre wrap="">CREATE TABLE test
(idint);
 
CREATE INDEX test_idx1 ON test (id);
CREATE INDEX test_idx2 ON test (id);</pre></blockquote> I initially misread your example code, but after I realised my
mistake,I thought of an alternative scenario that might be worth considering.<br /><pre wrap="">CREATE TABLE test (id
int,int sub, text payload);
 
CREATE INDEX test_idx1 ON test (id, sub);
CREATE INDEX test_idx2 ON test (id);


<big><font face="Liberation Serif, serif"><font size="3"><big>Now <font face="Liberation Mono,
monospace">test_idx2</font>is logically included in <font face="Liberation Mono, monospace">test_idx1</font>, but if
themajority of transactions only query on <font face="Liberation Mono, monospace">id</font>, then <font
face="LiberationMono, monospace">test_idx2</font> would be more better as it ties up less
RAM</big></font></font></big>

</pre><p class="western" style="margin-bottom: 0cm">Cheers,<br /> Gavin<pre wrap=""></pre>

Re: discarding duplicate indexes

От
John R Pierce
Дата:
<div class="moz-cite-prefix">On 12/20/2012 12:26 AM, Gavin Flower wrote:<br /></div><blockquote
cite="mid:50D2CBC4.6070006@archidevsys.co.nz"type="cite"><pre wrap="">CREATE TABLE test (id int, int sub, text
payload);
CREATE INDEX test_idx1 ON test (id, sub);
CREATE INDEX test_idx2 ON test (id);


<big><font face="Liberation Serif, serif"><font size="3"><big>Now <font face="Liberation Mono,
monospace">test_idx2</font>is logically included in <font face="Liberation Mono, monospace">test_idx1</font>, but if
themajority of transactions only query on <font face="Liberation Mono, monospace">id</font>, then <font
face="LiberationMono, monospace">test_idx2</font> would be more better as it ties up less
RAM</big></font></font></big></pre></blockquote><br/> if sub is an integer, that index isn't that much larger.  both
indexesneed to index all the rows, and with the header and block overhead, the extra word isn't that big of a deal.  
aslong as there are some transactions using the other index, most of both of them will likely want to be in memory, so
you'llend up using MORE memory.<br /><br /><br /> 

Re: discarding duplicate indexes

От
Josh Kupershmidt
Дата:
On Thu, Dec 20, 2012 at 1:26 AM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:
> On 20/12/12 14:57, Josh Kupershmidt wrote:
>
> CREATE TABLE test (id int);
> CREATE INDEX test_idx1 ON test (id);
> CREATE INDEX test_idx2 ON test (id);
>
> I initially misread your example code, but after I realised my mistake, I
> thought of an alternative scenario that might be worth considering.
>
> CREATE TABLE test (id int, int sub, text payload);
> CREATE INDEX test_idx1 ON test (id, sub);
> CREATE INDEX test_idx2 ON test (id);
>
>
> Now test_idx2 is logically included in test_idx1, but if the majority of
> transactions only query on id, then test_idx2 would be more better as it
> ties up less RAM

Well, this situation works without any LIKE ... INCLUDING INDEXES
surprises. If you CREATE TABLE test_copycat (LIKE test INCLUDING INDEXES);

you should see test_copycat created with both indexes, since
indexParams is considered for this deduplicating.

Josh