Обсуждение: Feature Request: Extending PostgreSQL's Identifier Length Limit

Поиск

Список

Период

Сортировка

Feature Request: Extending PostgreSQL's Identifier Length Limit

От

David HJ

Дата:

18 июля 2024 г., 08:20:01

The Current Situation

As a long-time PostgreSQL user, I've increasingly run into issues with the 63-byte limit for identifiers, particularly table names. This limit, while historically sufficient, is becoming a significant pain point in modern database design and usage.

Real-World Examples

The problem is particularly evident in English table names, which make up a large portion of PostgreSQL's user base:

"Gross Domestic Product, Billions of Dollars, Not Seasonally Adjusted (GDPA)~Percent Change from Year Ago"
"Inflation, consumer prices for the United States, Percent, Not Seasonally Adjusted (FPCPITOTLZGUSA)"
"Annual Average Consumer Price Index for All Urban Consumers (CPI-U): U.S. City Average, All Items"

These names, while descriptive and useful, exceed our current limit. The issue extends to multi-byte character sets as well, such as this Chinese table name:

"能源消耗统计_全球主要国家石油与天然气使用量_年度碳排放与可再生能源比例表" (Energy consumption statistics table)

Why This Matters

In my experience, the complexity of data and the need for self-documenting schemas have grown significantly. We're dealing with:

Increasingly descriptive table names for clarity and self-documentation
Automated systems generating tables with detailed, often lengthy names
A growing international user base requiring support for multi-byte characters

While workarounds exist, such as using abbreviations or moving descriptions to comments, these solutions often lead to less intuitive database designs and reduced readability.

Potential Solutions

Based on community discussions, I believe we should consider:

Increasing NAMEDATALEN to 256 bytes or more
Making NAMEDATALEN configurable at compile-time
Changing NAMEDATALEN to represent character count instead of byte count
Implementing a variable-length "name" data type

Challenges and Considerations

I'm aware that this change presents significant technical challenges:

Maintaining backward compatibility
Potential impacts on storage and performance
Complexities in implementation, especially if moving to a variable-length system

Despite these challenges, I believe addressing this limitation is crucial for maintaining PostgreSQL's position as a versatile, user-friendly database system.

Call to Action

I respectfully request the PostgreSQL development team to consider this feature request. While I understand the technical complexities involved, I believe the benefits to the user experience would be substantial.

I'm eager to contribute to discussions and potentially assist in testing any proposed solutions. Let's work together to enhance PostgreSQL's capability to handle the evolving needs of modern database users.

Thank you for your consideration and your ongoing efforts in improving PostgreSQL.

Re: Feature Request: Extending PostgreSQL's Identifier Length Limit

От

Aleksander Alekseev

Дата:

18 июля 2024 г., 09:25:01

Hi David,

> As a long-time PostgreSQL user, I've increasingly run into issues with the 63-byte limit for identifiers,
particularlytable names. This limit, while historically sufficient, is becoming a significant pain point in modern
databasedesign and usage.
 

I can understand your pain. Unfortunately there are a number of
complications involved.

Take pg_class catalog table [1] as an example and its column `relname`
of type `name` [2]. On disk it is stored as an fixed-sized array of
chars:

```
typedef struct nameData
{
    char        data[NAMEDATALEN];
} NameData;
typedef NameData *Name;
```

Why not use TEXT? Mostly for performance reasons. In general case TEXT
data can be TOASTed. When using TEXT one should do an additional call
of heap_deform_tuple().

Using NAME allows the code to interpret tuple data as is, e.g.:
```
typedef struct FormData_phonebook
{
    int32 id;
    NameData name;
    int32 phone;
} FormData_phonebook;

typedef FormData_phonebook* Form_phonebook;

/* ... */

    while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
        Form_phonebook record = (Form_phonebook) GETSTRUCT(tup);

        if(strcmp(record->name.data, name->data) == 0)
        {
            found_phone = record->phone;
            break;
        }
    }
```

So if you change NAME definition several things will happen:

1. You have to rebuild catalog tables. This can't be done in-place
because larger tuples may not fit into pages. Note that page size may
also vary depending on how PostgreSQL was compiled. Note that the
indexes will also be affected.

2. You will break all the extensions that use NAME and the
corresponding heap_* and index_* APIs.

3. The performance will generally decrease - many existing
applications will just waste memory or do unnecessary work due to
extra calls to heap_deform_tuple().

If (1) is doable in theory, I don't think (2) and (3) are something we
do in this project.

On top of that there is are relatively simple workarounds for the situation:

1. An application may have a function like shorten_name(x) = substr(x,
1, 50) || '_' || substr(md5(x), 1, 8). So instead of `SELECT * FROM x`
you just do `SELECT * FROM shorten_name(x)`.

2. You may fork the code and enlarge NAMEDATALEN. This is not
recommended and not guaranteed to work but worth a try.

This makes me think that solving the named limitation isn't worth the effort.

Personally I'm not opposed to the idea in general but you will need to
come up with a specific RFC that explains how exactly you propose to
solve named problems.

[1]: https://www.postgresql.org/docs/current/catalog-pg-class.html
[2]: https://www.postgresql.org/docs/current/datatype-character.html

-- 
Best regards,
Aleksander Alekseev

Re: Feature Request: Extending PostgreSQL's Identifier Length Limit

От

Alvaro Herrera

Дата:

18 июля 2024 г., 09:45:45

On 2024-Jul-18, David HJ wrote:

> As a long-time PostgreSQL user, I've increasingly run into issues with the
> 63-byte limit for identifiers, particularly table names. This limit, while
> historically sufficient, is becoming a significant pain point in modern
> database design and usage.

This has been discussed before.  I think the latest discussion, and some
preliminary proof-of-concept patches, were around here:

https://postgr.es/m/CAFBsxsF2V8n9w0SGK56bre3Mk9fzZS=9aaA8Gfs_n+woa3Dr-Q@mail.gmail.com

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)

Re: Feature Request: Extending PostgreSQL's Identifier Length Limit

От

Bruce Momjian

Дата:

19 августа 2024 г., 04:17:38

On Thu, Jul 18, 2024 at 11:45:45AM +0200, Álvaro Herrera wrote:
> On 2024-Jul-18, David HJ wrote:
> 
> > As a long-time PostgreSQL user, I've increasingly run into issues with the
> > 63-byte limit for identifiers, particularly table names. This limit, while
> > historically sufficient, is becoming a significant pain point in modern
> > database design and usage.
> 
> This has been discussed before.  I think the latest discussion, and some
> preliminary proof-of-concept patches, were around here:
> 
> https://postgr.es/m/CAFBsxsF2V8n9w0SGK56bre3Mk9fzZS=9aaA8Gfs_n+woa3Dr-Q@mail.gmail.com

FYI, the COMMENT ON command can help to document identifiers.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Feature Request: Extending PostgreSQL's Identifier Length Limit

Feature Request: Extending PostgreSQL's Identifier Length Limit

Re: Feature Request: Extending PostgreSQL's Identifier Length Limit

Re: Feature Request: Extending PostgreSQL's Identifier Length Limit

Re: Feature Request: Extending PostgreSQL's Identifier Length Limit