Re: ISN patch that applies cleanly with git apply

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: ISN patch that applies cleanly with git apply
Дата
Msg-id AANLkTimydWxkc-qdi8cRj11ArQx74EBVtb_1bmmXzb7e@mail.gmail.com
обсуждение исходный текст
Ответ на Re: ISN patch that applies cleanly with git apply  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: ISN patch that applies cleanly with git apply  (Peter Geoghegan <peter.geoghegan86@gmail.com>)
Re: ISN patch that applies cleanly with git apply  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Whoops...seems there was some minor mangling when I applied the
original patch by:

peter@linux-peter-home:~/postgresql> patch --version
GNU patch 2.6.1.81-5b68
****snip***
peter@linux-peter-home:~/postgresql> patch -c < contrib_isn-1.patch


I've attached a revised version, which I've carefully eye-balled and
corrected manually. The issue before was that an else if(...){...} was
repeated. Sorry about that.


On 2 October 2010 21:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Even more to the point, what about a link to the relevant changes in the
> standard?  It's impossible for anyone to tell whether these changes are
> sane in a vacuum, and a regression test will prove nothing at all except
> perhaps self-consistency.

Here's the problem....as far as I'm aware, there was no change to the
standard. The standard is extremely simple. What has changed is that
the relevant authorities (regional ISBN agencies) have continued to
allocate new publisher codes as new publishers have started trading,
and we have fallen out of lock step with them. That, and an
additional, reserved "bookland" country code has come into use for
ISBN-13 (namely, 979). This affected hyphenation of ISBNs, which is
what the patch author complained about, though it didn't cause valid
ISBNs to be rejected outright.

I have some misgivings about the design of contrib/isn.

The isbn domains are a bit like a regex domain to validate e-mail
addresses that has every single TLD in the world baked in - it fails
to take into account that TLDs come and go, just as contrib/isn fails
to take into account that new ISBN ranges are created over time. You
should either accept that you can't do that because it's beyond your
remit as a domain author, and be happy with just validating the
syntax, or validate against an external, maintained database of valid
TLDs (there's a CPAN module that does just that and more). We cannot
know all ISBN ranges in advance, because they haven't all been
allocated yet, and never will be. The patch even says "Range Table as
of 2010-Jul-29".

While we're on the topic of contrib/isn's shortcomings, I think that
there should really be an additional convenience domain that enforces
that the value is some type of barcode (be it a UPC, EAN-13 or
whatever), which could be implemented with a simple CASE statement in
the check constraint, and there should be an EAN-8 domain and a
GTIN-14 domain. At the moment, I use a domain (which is AS bigint)
with a check constraint that calls a function that's like this:

-- returns checkdigit validity for UPC, EAN-8, EAN-13 and GTIN-14
CREATE OR REPLACE FUNCTION is_gtin(bigint)
RETURNS BOOLEAN
LANGUAGE sql
STRICT IMMUTABLE
AS $$
SELECT ( sum(dgt) % 10 ) = 0
    FROM
    (
            SELECT substring($1::text from idx for 1)::smallint AS dgt
            FROM (SELECT generate_series(length($1::text), 1, -2) as idx) AS foo
        UNION ALL
            SELECT substring($1::text from idx for 1)::smallint * 3 AS dgt
            FROM (SELECT generate_series(length($1::text) -1, 1, -2) as idx) AS foo
    ) AS bar

$$;

I'm not sure that the ISBN datatypes should be internally represented
as a 64-bit integer only - there should be some additional data that
specifies hyphenation. I'm not sure that we should be in the
hyphenation catch-up business. The fact is, however, that we are.
However, as far as I can tell at this stage, the patch doesn't make
the situation any worse, and is a temporary fix.

It's quite possible that I've overstated the problem, and that we
should continue to do our level best to hyphenate correctly for the
user, but it's important to understand that we cannot guarantee
correct hyphenation over time. Perhaps Jan Otto can weigh in here.


--
Regards,
Peter Geoghegan

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hitoshi Harada
Дата:
Сообщение: Re: INSERT ... VALUES... with ORDER BY / LIMIT
Следующее
От: Hitoshi Harada
Дата:
Сообщение: Re: wip: functions median and percentile