Обсуждение: Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

Поиск
Список
Период
Сортировка

Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

От
Bruce Momjian
Дата:
Does anyone think it is a good idea to document that our indexes cannot
index arbirarily-long strings?  I see nothing in the documentation now
about it.

---------------------------------------------------------------------------

Juho Saarikko wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> >
> >> Juho Saarikko wrote:
> >>
> >>> While I didn't test, I'd imagine that this would also mean that any attempt
> >>> to insert such values to an already unique column would fail.
> >>>
> >
> >
> >> Works here in 8.3:
> >>
> >
> >
> >>     test=> create table test (x text unique);
> >>     NOTICE:  CREATE TABLE / UNIQUE will create implicit index "test_x_key" for table "test"
> >>     CREATE TABLE
> >>     test=> insert into test values (repeat('a', 50000));
> >>     INSERT 0 1
> >>
> >
> > That test only works because it's eminently compressible.
> >
> >
> > The short answer to this bug report is that we're not very concerned
> > about fixing this because there is seldom a good reason to have an
> > index (unique or not) on fields that can get so wide.  As was already
> > noted, if you do need a uniqueness check you can easily make a 99.9999%
> > solution by indexing the md5 hash (or some similar digest) of the
> > column.  It doesn't really seem worthwhile to expend development work
> > on something that would benefit so few people.
> >
> >             regards, tom lane
> >
> >
> But the documentation needs to be updated to mention this nonetheless.
> It is a nasty surprise if it hits unawares.
>
> Besides, it's not such an impossible scenario. I encountered this bug
> when making an Usenet image archival system. Since the same images tend
> to be reposted a lot, it makes sense to store them only once, and simply
> reference the stored image from each context it was posted in. Currently
> my program does the uniqueness constraining by itself; I was examining
> having the database enforce it when I ran into this issue.
>
> Such applications are not exactly rare: bayimg, img.google.com, etc. and
> of course the innumerable Usenet archival sites could all conceivably
> want to do something like this. So could any application which monitors
> potentially repeating phenomena, for that matter. After all, saving a
> single state of the system only once not only reduces the amount of data
> stored, but could also help in actual analysis of it, since it becomes
> trivial to recognize most and least often recurring states.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

От
Phil Frost
Дата:
On Mar 5, 2008, at 11:20 , Bruce Momjian wrote:

>
> Does anyone think it is a good idea to document that our indexes
> cannot
> index arbirarily-long strings?  I see nothing in the documentation now
> about it.

+1

I never even knew. I'd be especially interested to hear about
details, like how long strings can be until they are not guaranteed
unique.

Re: Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

От
Bruce Momjian
Дата:
Phil Frost wrote:
> On Mar 5, 2008, at 11:20 , Bruce Momjian wrote:
>
> >
> > Does anyone think it is a good idea to document that our indexes
> > cannot
> > index arbirarily-long strings?  I see nothing in the documentation now
> > about it.
>
> +1
>
> I never even knew. I'd be especially interested to hear about
> details, like how long strings can be until they are not guaranteed
> unique.

Part of the problem is we don't know the length --- it is the compressed
length that is limited, I think to 8k.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> Part of the problem is we don't know the length --- it is the compressed
> length that is limited, I think to 8k.

That's a problem, and the length limit is different for different index
types.  So you're really not going to be able to say anything very
specific.  Another problem is where would you put the information?

            regards, tom lane

Re: Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

От
"Gurjeet Singh"
Дата:
On Wed, Mar 5, 2008 at 10:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> Part of the problem is we don't know the length --- it is the compressed
> length that is limited, I think to 8k.

That's a problem, and the length limit is different for different index
types.  So you're really not going to be able to say anything very
specific.  Another problem is where would you put the information?

Along with the following statement:

method: The name of the index method to be used. Choices are btree, hash, gist, and gin. The default method is btree.

Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB      http://www.enterprisedb.com

17° 29' 34.37"N,  78° 30' 59.76"E - Hyderabad *
18° 32' 57.25"N,  73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco

http://gurjeet.frihost.net

Mail sent from my BlackLaptop device

Re: Re: [BUGS] BUG #3965: UNIQUE constraint fails on long column values

От
Bruce Momjian
Дата:
Gurjeet Singh wrote:
> On Wed, Mar 5, 2008 at 10:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Bruce Momjian <bruce@momjian.us> writes:
> > > Part of the problem is we don't know the length --- it is the compressed
> > > length that is limited, I think to 8k.
> >
> > That's a problem, and the length limit is different for different index
> > types.  So you're really not going to be able to say anything very
> > specific.  Another problem is where would you put the information?
>
>
> In 'CREATE INDEX'  docs...
>
> http://www.postgresql.org/docs/8.3/interactive/sql-createindex.html
>
> Along with the following statement:
>
> method: The name of the index method to be used. Choices are btree, hash,
> gist, and gin. The default method is btree.

We are trying to figure out how to specify the index-type-specific length
limits, not the index types themselves.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +