Обсуждение: Row ordering after CREATE TABLE AS...SELECT regexp_split_to_table(source_text, regexp) AS new_column
This is a two-part question: 1) I have a source_text that I want to divide into smaller subunits that will be contained in rows in a column in a new table. Is it absolutely certain that the initial order of the rows in the resultant table after this operation: CREATE TABLE new_table AS SELECT regexp_split_to_table(source_text, E'regexp') as subunits FROM source_table; will be the same as the order of these subunits in the original text? Emphasis *initial order*. 2) I would like to be able to create a serial-type column during CREATE TABLE AS in the new table that "memorizes" this order so that I can reconstruct the original text using ORDER BY on that serial column. However, I am stumped how to do that. I do not see how to put the name of that column into my SELECT statement which generates the table, and I do not see where else to put it. Please forgive my stupidity. The "work-around" to this problem has been to ALTER my table after its creation with a new serial-type column. But this assumes that the answer to Question 1) above is always "Yes". Thanking you for your understanding, John
Re: Row ordering after CREATE TABLE AS...SELECT regexp_split_to_table(source_text, regexp) AS new_column
От
"Bret S. Lambert"
Дата:
On Wed, Feb 24, 2010 at 07:51:54AM +0100, John Gage wrote: > This is a two-part question: > > 1) I have a source_text that I want to divide into smaller subunits > that will be contained in rows in a column in a new table. Is it > absolutely certain that the initial order of the rows in the > resultant table after this operation: > > CREATE TABLE new_table AS SELECT regexp_split_to_table(source_text, > E'regexp') as subunits FROM source_table; > > will be the same as the order of these subunits in the original > text? Emphasis *initial order*. I'd put money on not; this is not what databases are designed for. > > 2) I would like to be able to create a serial-type column during > CREATE TABLE AS in the new table that "memorizes" this order so that > I can reconstruct the original text using ORDER BY on that serial > column. However, I am stumped how to do that. I do not see how to > put the name of that column into my SELECT statement which generates > the table, and I do not see where else to put it. Please forgive my > stupidity. Pre- or append an increasing serial number to the data, and use that as a column named "initial_order" or something else that will make it clear to you and other users what it is, and then import. But if you have the original data, in order, why do you need to be able to reconstruct it from a database dump? It just looks like adding a step to add a step, to me. > > The "work-around" to this problem has been to ALTER my table after > its creation with a new serial-type column. But this assumes that > the answer to Question 1) above is always "Yes". > > Thanking you for your understanding, > > John > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
John Gage <jsmgage@numericable.fr> writes: > This is a two-part question: > 1) I have a source_text that I want to divide into smaller subunits > that will be contained in rows in a column in a new table. Is it > absolutely certain that the initial order of the rows in the resultant > table after this operation: > CREATE TABLE new_table AS SELECT regexp_split_to_table(source_text, > E'regexp') as subunits FROM source_table; > will be the same as the order of these subunits in the original text? If you have a version new enough to have synchronize_seqscans, you'd need to turn that off. Otherwise should be OK. > 2) I would like to be able to create a serial-type column during > CREATE TABLE AS in the new table that "memorizes" this order so that I > can reconstruct the original text using ORDER BY on that serial > column. However, I am stumped how to do that. I think the trick is to get the SRF to be expanded before the serial values are assigned. There's more than one way to do it, but I think (too tired to experiment) this would work: CREATE TABLE new_table (id serial, subunits text); INSERT INTO new_table(subunits) SELECT regexp_split_to_table(source_text, E'regexp') FROM source_table; regards, tom lane
Thank you very much for this explanation/reply. It precisely answers my question. Unfortunately, it prompts a new question. I am using 8.4.2 which I assume is new enough to trigger a "yes" response to "If you have a version new enough to have synchronize_seqscans...". I have absolutely no idea how to turn that off. Perhaps the best thing would be to direct me to the documentation where turning it off is described so that I can become more autonomous. However, accompanying that with explicit directions would be welcome too. I am in Greenwich +1 timezone, but I fear you are in the 2AM time zone. Thank you again, John On Feb 24, 2010, at 8:06 AM, Tom Lane wrote: > John Gage <jsmgage@numericable.fr> writes: >> This is a two-part question: >> 1) I have a source_text that I want to divide into smaller subunits >> that will be contained in rows in a column in a new table. Is it >> absolutely certain that the initial order of the rows in the >> resultant >> table after this operation: > >> CREATE TABLE new_table AS SELECT regexp_split_to_table(source_text, >> E'regexp') as subunits FROM source_table; > >> will be the same as the order of these subunits in the original text? > > If you have a version new enough to have synchronize_seqscans, you'd > need to turn that off. Otherwise should be OK. > >> 2) I would like to be able to create a serial-type column during >> CREATE TABLE AS in the new table that "memorizes" this order so >> that I >> can reconstruct the original text using ORDER BY on that serial >> column. However, I am stumped how to do that. > > I think the trick is to get the SRF to be expanded before the serial > values are assigned. There's more than one way to do it, but I think > (too tired to experiment) this would work: > > CREATE TABLE new_table (id serial, subunits text); > > INSERT INTO new_table(subunits) SELECT > regexp_split_to_table(source_text, > E'regexp') FROM source_table; > > regards, tom lane
John Gage wrote: > Unfortunately, it prompts a new question. I am using 8.4.2 which I > assume is new enough to trigger a "yes" response to "If you have a > version new enough to have synchronize_seqscans...". I have > absolutely no idea how to turn that off. Perhaps the best thing > would be to direct me to the documentation where turning it off is > described so that I can become more autonomous. However, > accompanying that with explicit directions would be welcome too. See postgresql.conf, but you probably want to leave it turned on in general and turn it off only for the specific case of this usage. (Using the SET command, or ALTER ROLE, or ALTER DATABASE). See here http://www.postgresql.org/docs/8.4/static/runtime-config-compatible.html#RUNTIME-CONFIG-COMPATIBLE-VERSION -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support