Re: \describe*

Поиск
Список
Период
Сортировка
От Corey Huinker
Тема Re: \describe*
Дата
Msg-id CADkLM=fxfsrHASKk_bY_A4uomJ1Te5MfGgD_rwwQfV8wP68ewg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: \describe*  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: \describe*  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
It seems this topic is ongoing so I've moved it to the September CF,
but it's in "Waiting on Author" because we don't have a concrete patch
that applies (or agreement on what it should do?) right now.

All recent work has been investigating the need(s) we're trying to address. This is as good of a time as any to share my findings (with much collaboration with Dave Fetter) so far.

1. Adding helper commands to psql aids only psql, and a great number of users do not, or can not, use psql. So adding something on the server side would have broader usage and appeal. Furthermore, some access tools (especially browser-based ones) are not good about returning non-tabular results, so helper commands that return result sets would have the broadest usage.

2. Our own interest in server-side commands is all over the map. Some just want the convenience of having them server side, or familiarity with $OTHER_DB. Others want to eliminate the need for some code in pg_dump, JDBC, or elsewhere.

3. There isn't much consensus in the other databases, though all of them do something:

SQLServer
---------------

which contextually returns one of two different result sets (name, owner, object type) or (column name, type, storage, length, precision, scale, nullable, default, rule, collation)

DB2
------
Has a describe command (source: https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0002019.html)  which can be used to describe query output (data type, data type length, column name, column name length).

It also has an option to DESCRIBE TABLE foo which returns a set of (col_name, schema_of_datatype, data_type, data_type_length, data_type_scale, Nulls t/f)

It also has DESCRIBE INDEXES FOR TABLE foo which returns a set of (schema of index, name of index, unique flag, number of columns, index type)

It also has DESCRIBE DATA PARTITIONS FOR TABLE which as you might guess shows partitions.

All of these options have a SHOW DETAIL modifier which adds more columns.

MySQL
----------

MySSQL has SHOW COLUMNS which also returns a set of  (name, type similar to format_type(), null flag, PK or index indicator, default value, notes about auto-increment/autogreneration/implicit trggers), and can be extended to show privileges and comments with the EXTENDED and FULL options.

MySQL has a DESCRIBE command, but it is a synonym of EXPLAIN.

MySQL also has a raft of commands like SHOW CREATE USER, SHOW CREATE VIEW, SHOW CREATE TRIGGER, SHOW CREATE TABLE, etc. (ex: https://dev.mysql.com/doc/refman/8.0/en/show-create-user.html)  These commands all return a result set of of exactly one column, each row representing one SQL statement, essentially doing a single-object schema-only pg_dump.

Oracle
---------

https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12019.htm

SQL*Plus has a describe command that works on tables and views and composite types (tabular set of: name, null, type) procedures (tabular set of: arg name, type, in/out), and packages (a series of sets one per type and procedure)

SQLcl has the INFO statement, which is roughly analogous to psql's \d in that it is a mix of tabular and non-tabular information.

Oracle itself has dbms_metadata.get_ddl() which seems analogous to mysql's SHOW CREATE commands.

Snowflake
--------------


Which return a set of: (name, type, column type, null flag, default, primary key, unique key, check, expression, comment).

It also has an option for describing "stage" tables, which are s3 buckets with a file format associated, the closest postgresql analog would be a file_fdw foreign table, and there is a separate result set format for that.

Snowflake has no concept of indexes (it understands that there's things called a unique keys, and it remembers that you said you wanted one, but does nothing to enforce it), so no command for that.

These result sets are not composable in a query, however, they are stored in the RESULT_SCAN cache, which means that you can run a describe, and then immediately fetch the results of that command as if it was a table.

Snowflake also has a get_ddl() function https://docs.snowflake.net/manuals/sql-reference/sql/desc-view.html which is a one-column result set of statements to re-create the given object.


From all this, I have so far concluded:

1. There is real demand to be able to easily see the basic structure of tables, views, and indexes in a way that strikes a balance between detail and clutter.
2. There is some acknowledgement that this data be useful if it was further filtered through SQL, though only one vendor has attempted to implement that (Snowflake) and even that was far from seamless.
3. There's a clear need to be able to get the DDL steps needed to re-create most common objects. This could be to copy-paste the info into another session to create a similar object elsewhere, or for test cases, or so an experienced person can see the "real guts" of an object without worrying about what details have been hidden.
4. The needs in #1 and #3 are in direct opposition to each other, and cannot easily be handled by the same command. Indeed, no one has tried.
5. The SHOW CREATE commands are neat, but the plethora of options (include comments? include RI constraints? does current session locale figure into the answer? dependencies?) probably mean that a function the get_ddl() examples above, can be jam-packed with default parameters and it's still going to have a very RETURNS SETOF TEXT output.
6. The convenience-function DESCRIBE / SHOW COLUMNS commands strive to have a tabular format with following: column name, a user-readable datatype, null/notnull flag, indicator that the column participates in a PK, in a unique index, in any other sort of index, without naming the index, and options for showing default values, and comments
7. People coming from other databases have expectations of a command like DESCRIBE existing, and those expectations are reasonable.
8. The commands we do make should strive to have a stable result-set format to allow their use in situations where the results can easily be handled via JDBC/DBD/DBI,   and maybe somewhere down the road the commands themselves can be used as a subquery the same way that TABLE foo is an alias for SELECT * FROM foo.
9. Since the relevant columns for describing a table are different than those for a view or an index or another object, they should be separate commands.

I'm interested to hear what other people thing, and which of these goals are most worth pursuing in the near term to make postgres more usable to newbies and veterans alike.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Batch insert in CTAS/MatView code
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Avoid full GIN index scan when possible