Обсуждение: discarding duplicate indexes
I recently came across a scenario like this (tested on git head): ---- CREATE TABLE test (id int); CREATE INDEX test_idx1 ON test (id); CREATE INDEX test_idx2 ON test (id); CREATE TABLE test_copycat (LIKE test INCLUDING ALL); \d test_copycat ---- Why do we end up with only one index on test_copycat? The culprit seems to be transformIndexConstraints(), which explains: * Scan the index list and remove any redundant index specifications. This * can happen if, for instance, the user writesUNIQUE PRIMARY KEY. A * strict reading of SQL92 would suggest raising an error instead, but * that strikes me astoo anal-retentive. - tgl 2001-02-14 and this code happily throws out the second index statement in this example, since its properties are identical to the first. (Side note: some index properties, such as tablespace specification and comment, are ignored when determining duplicates). This behavior does seem like a minor POLA violation to me -- if we do not forbid duplicate indexes on the original table, it seems surprising to do so silently with INCLUDING INDEXES. There was consideration of similar behavior when this patch was proposed[1], so perhaps the behavior is as-designed, and I guess no one else has complained. IMO this behavior should at least be documented under the "LIKE source_table" section of CREATE TABLE's doc page. Josh [1] http://archives.postgresql.org/pgsql-patches/2007-07/msg00173.php
On 20/12/12 14:57, Josh Kupershmidt wrote:
I initially misread your example code, but after I realised my mistake, I thought of an alternative scenario that might be worth considering.CREATE TABLE test (id int); CREATE INDEX test_idx1 ON test (id); CREATE INDEX test_idx2 ON test (id);
CREATE TABLE test (id int, int sub, text payload); CREATE INDEX test_idx1 ON test (id, sub); CREATE INDEX test_idx2 ON test (id); Now test_idx2 is logically included in test_idx1, but if the majority of transactions only query on id, then test_idx2 would be more better as it ties up less RAM
Cheers,
Gavin
On 12/20/2012 12:26 AM, Gavin Flower wrote:
CREATE TABLE test (id int, int sub, text payload); CREATE INDEX test_idx1 ON test (id, sub); CREATE INDEX test_idx2 ON test (id); Now test_idx2 is logically included in test_idx1, but if the majority of transactions only query on id, then test_idx2 would be more better as it ties up less RAM
if sub is an integer, that index isn't that much larger. both indexes need to index all the rows, and with the header and block overhead, the extra word isn't that big of a deal. as long as there are some transactions using the other index, most of both of them will likely want to be in memory, so you'll end up using MORE memory.
On Thu, Dec 20, 2012 at 1:26 AM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote: > On 20/12/12 14:57, Josh Kupershmidt wrote: > > CREATE TABLE test (id int); > CREATE INDEX test_idx1 ON test (id); > CREATE INDEX test_idx2 ON test (id); > > I initially misread your example code, but after I realised my mistake, I > thought of an alternative scenario that might be worth considering. > > CREATE TABLE test (id int, int sub, text payload); > CREATE INDEX test_idx1 ON test (id, sub); > CREATE INDEX test_idx2 ON test (id); > > > Now test_idx2 is logically included in test_idx1, but if the majority of > transactions only query on id, then test_idx2 would be more better as it > ties up less RAM Well, this situation works without any LIKE ... INCLUDING INDEXES surprises. If you CREATE TABLE test_copycat (LIKE test INCLUDING INDEXES); you should see test_copycat created with both indexes, since indexParams is considered for this deduplicating. Josh