Re: Feature Request: Extending PostgreSQL's Identifier Length Limit

Поиск
Список
Период
Сортировка
От Aleksander Alekseev
Тема Re: Feature Request: Extending PostgreSQL's Identifier Length Limit
Дата
Msg-id CAJ7c6TOurV4uA5Yz=aJ-ae4czL_zdFNqxbu47eyVrYFefrWoog@mail.gmail.com
обсуждение исходный текст
Ответ на Feature Request: Extending PostgreSQL's Identifier Length Limit  (David HJ <chuxiongzhong@gmail.com>)
Список pgsql-hackers
Hi David,

> As a long-time PostgreSQL user, I've increasingly run into issues with the 63-byte limit for identifiers,
particularlytable names. This limit, while historically sufficient, is becoming a significant pain point in modern
databasedesign and usage.
 

I can understand your pain. Unfortunately there are a number of
complications involved.

Take pg_class catalog table [1] as an example and its column `relname`
of type `name` [2]. On disk it is stored as an fixed-sized array of
chars:

```
typedef struct nameData
{
    char        data[NAMEDATALEN];
} NameData;
typedef NameData *Name;
```

Why not use TEXT? Mostly for performance reasons. In general case TEXT
data can be TOASTed. When using TEXT one should do an additional call
of heap_deform_tuple().

Using NAME allows the code to interpret tuple data as is, e.g.:
```
typedef struct FormData_phonebook
{
    int32 id;
    NameData name;
    int32 phone;
} FormData_phonebook;

typedef FormData_phonebook* Form_phonebook;

/* ... */

    while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
        Form_phonebook record = (Form_phonebook) GETSTRUCT(tup);

        if(strcmp(record->name.data, name->data) == 0)
        {
            found_phone = record->phone;
            break;
        }
    }
```

So if you change NAME definition several things will happen:

1. You have to rebuild catalog tables. This can't be done in-place
because larger tuples may not fit into pages. Note that page size may
also vary depending on how PostgreSQL was compiled. Note that the
indexes will also be affected.

2. You will break all the extensions that use NAME and the
corresponding heap_* and index_* APIs.

3. The performance will generally decrease - many existing
applications will just waste memory or do unnecessary work due to
extra calls to heap_deform_tuple().

If (1) is doable in theory, I don't think (2) and (3) are something we
do in this project.

On top of that there is are relatively simple workarounds for the situation:

1. An application may have a function like shorten_name(x) = substr(x,
1, 50) || '_' || substr(md5(x), 1, 8). So instead of `SELECT * FROM x`
you just do `SELECT * FROM shorten_name(x)`.

2. You may fork the code and enlarge NAMEDATALEN. This is not
recommended and not guaranteed to work but worth a try.

This makes me think that solving the named limitation isn't worth the effort.

Personally I'm not opposed to the idea in general but you will need to
come up with a specific RFC that explains how exactly you propose to
solve named problems.

[1]: https://www.postgresql.org/docs/current/catalog-pg-class.html
[2]: https://www.postgresql.org/docs/current/datatype-character.html

-- 
Best regards,
Aleksander Alekseev



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Add mention of execution time memory for enable_partitionwise_* GUCs
Следующее
От: Nitin Motiani
Дата:
Сообщение: Re: long-standing data loss bug in initial sync of logical replication