Обсуждение: [GENEAL] dynamically changing table
Hi, In the next project I'm going to have a number of colums in my tables, but I don't know how many, they change. They all use integers as datatype though.. One day, I get 2 new columns, a week later I loose one column, and so on in a random pattern. I will most likely have a few million rows of data so I just wonder if there are any problems with running alter table x add column ..... or alter table x drop column ..... Adding a column, will it place data "far away" on the disc so that select * from x where id=y will result in not quite optimal performance since it has to fetch columns from a lot of different places? Will deleting a column result in a lot of empty space that will anoy me later on? Are there any other clever solutions of this problem?
IMHO one of the possible solutions is to review the table and storage externaly the stable columns. So when you run 'alter table xxxx' it was less expensive (i never test this, but maybe its true :P ) So the mutable columns will be separated from the others and the phisical structure could be more light to changes. Emanauel 2009/3/30, A B <gentosaker@gmail.com>: > Hi, > In the next project I'm going to have a number of colums in my tables, > but I don't know how many, they change. They all use integers as > datatype though.. One day, I get 2 new columns, a week later I loose > one column, and so on in a random pattern. > > I will most likely have a few million rows of data so I just wonder if > there are any problems with running > alter table x add column ..... > or > alter table x drop column ..... > > Adding a column, will it place data "far away" on the disc so that > select * from x where id=y will result in not quite optimal > performance since it has to fetch columns from a lot of different > places? > Will deleting a column result in a lot of empty space that will anoy > me later on? > > Are there any other clever solutions of this problem? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Emanuel Calvo Franco Sumate al ARPUG ! (www.postgres-arg.org - www.arpug.com.ar) ArPUG / AOSUG Member Postgresql Support & Admin
Without knowing to much currently.. can you create one table with enough columns? Then create a view to query the table and 'reflect' the changes and correct column names. Using rule you could also even simulate the update to the view and update to the correct columns. This so that you don't have to drop/create columns over and over again. I hope I made myself clear... Ries On Mar 30, 2009, at 10:39 AM, A B wrote: > Hi, > In the next project I'm going to have a number of colums in my tables, > but I don't know how many, they change. They all use integers as > datatype though.. One day, I get 2 new columns, a week later I loose > one column, and so on in a random pattern. > > I will most likely have a few million rows of data so I just wonder if > there are any problems with running > alter table x add column ..... > or > alter table x drop column ..... > > Adding a column, will it place data "far away" on the disc so that > select * from x where id=y will result in not quite optimal > performance since it has to fetch columns from a lot of different > places? > Will deleting a column result in a lot of empty space that will anoy > me later on? > > Are there any other clever solutions of this problem? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
On Mon, Mar 30, 2009 at 05:39:19PM +0200, A B wrote: > Hi, > In the next project I'm going to have a number of colums in my tables, > but I don't know how many, they change. They all use integers as > datatype though.. One day, I get 2 new columns, a week later I loose > one column, and so on in a random pattern. I think you should think of something else. > I will most likely have a few million rows of data so I just wonder if > there are any problems with running > alter table x add column ..... > or > alter table x drop column ..... Well, not as such. Except that deleting a column doesn't really delete it, it hides it, so it never really goes away. So the number of "columns" in your table will only go up and eventually you're going to reach the point (around 1600 IIRC, probably earlier) where it will simply stop working. > Adding a column, will it place data "far away" on the disc so that > select * from x where id=y will result in not quite optimal > performance since it has to fetch columns from a lot of different > places? Nope, no extra cost there. > Will deleting a column result in a lot of empty space that will anoy > me later on? Yes, the space isn't actually released until the next time you update that row. > Are there any other clever solutions of this problem? If this is just for development where the actual space/columns used is just temporary, your trick might work. Otherwise I'd suggest normalising so the columns to become rows in another table. But you're going to have to be more specific as to what you're trying to do if you want proper answers. Or perhaps an array of integers? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Вложения
> Well, not as such. Except that deleting a column doesn't really delete > it, it hides it, so it never really goes away. So the number of > "columns" in your table will only go up and eventually you're going to > reach the point (around 1600 IIRC, probably earlier) where it will > simply stop working. Oh, that would be a not very plesant surprise. > If this is just for development where the actual space/columns used is > just temporary, your trick might work. Otherwise I'd suggest > normalising so the columns to become rows in another table. But you're > going to have to be more specific as to what you're trying to do if you > want proper answers. Well, I want to store information about certain objects. Some columns will be fixed from the start, other columns will be added or removed (like when someone comes up with a brilliant idea of adding new information about the object, or removing something that is not needed) It's a little hard to specify what operations will be performed on the data, but mostly it will be "fetch all data for object number X", or "increase integer field nr 4 by 1 for object number X". And then I'll also have the operations: "add a new field with default value 0" and "remove integer field number 2" The fixed columns could be placed in a special table, as suggested above. > Or perhaps an array of integers? That sounds to be a better way. I'll start reading about arrays. I have not used them earlier so I wasn't thinking about them. How would you handle fields of other datatype than integers? Have an array for timestamps, one for texts, one for varchar(50), one for floatingpoint numbers, etc.?
On Mon, Mar 30, 2009 at 05:39:19PM +0200, A B wrote: > Hi, > In the next project I'm going to have a number of colums in my tables, > but I don't know how many, they change. Stop right there. You need to get some sanity into your project, either by changing that requirement, or by not using an RDBMS for it. Cheers, David. > They all use integers as > datatype though.. One day, I get 2 new columns, a week later I loose > one column, and so on in a random pattern. > > I will most likely have a few million rows of data so I just wonder if > there are any problems with running > alter table x add column ..... > or > alter table x drop column ..... > > Adding a column, will it place data "far away" on the disc so that > select * from x where id=y will result in not quite optimal > performance since it has to fetch columns from a lot of different > places? > Will deleting a column result in a lot of empty space that will anoy > me later on? > > Are there any other clever solutions of this problem? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
From: michaelblack75052@hotmail.com
To: gentosaker@gmail.com
Subject: RE: [GENERAL] [GENEAL] dynamically changing table
Date: Mon, 30 Mar 2009 16:05:52 +0000
Or you could use the Drop table and Create table instead of Delete data and Alter Table. Also by varying the number of columns you have programming considerations in addition. The the input and process meths will need to check the meta data to determine how many columns it is dealing with.
Those are just to options that come to mind.
Michael
> Date: Mon, 30 Mar 2009 17:39:19 +0200
> Subject: [GENERAL] [GENEAL] dynamically changing table
> From: gentosaker@gmail.com
> To: pgsql-general@postgresql.org
>
> Hi,
> In the next project I'm going to have a number of colums in my tables,
> but I don't know how many, they change. They all use integers as
> datatype though.. One day, I get 2 new columns, a week later I loose
> one column, and so on in a random pattern.
>
> I will most likely have a few million rows of data so I just wonder if
> there are any problems with running
> alter table x add column .....
> or
> alter table x drop column .....
>
> Adding a column, will it place data "far away" on the disc so that
> select * from x where id=y will result in not quite optimal
> performance since it has to fetch columns from a lot of different
> places?
> Will deleting a column result in a lot of empty space that will anoy
> me later on?
>
> Are there any other clever solutions of this problem?
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
2009/3/30 David Fetter <david@fetter.org>: > On Mon, Mar 30, 2009 at 05:39:19PM +0200, A B wrote: >> Hi, >> In the next project I'm going to have a number of colums in my tables, >> but I don't know how many, they change. > > Stop right there. You need to get some sanity into your project, > either by changing that requirement, or by not using an RDBMS for it. > > Cheers, > David. Well, the requirement is: keep a lot of data stored, don't loose any of it, and you will not know what you will have to store (changing number of fields and of different types) But it is not all that bad, the fields will be integers, or text, or floatingpoint numbers. One option is, put it in a db as a huge text (or in textfiles, one per object) and parse it when you need it. That might also work. Why are you demanding sanity? I need crazy ideas to get this to work ;-)
On Mon, Mar 30, 2009 at 11:04:06AM -0700, David Fetter wrote: > On Mon, Mar 30, 2009 at 05:39:19PM +0200, A B wrote: > > Hi, > > In the next project I'm going to have a number of colums in my tables, > > but I don't know how many, they change. > > Stop right there. You need to get some sanity into your project, > either by changing that requirement, or by not using an RDBMS for it. I'd agree with that sentiment as well. It's very easy to add columns to relational databases; much harder and more time consuming is following this through the rest of the software; and harder still the fuzzier aspects of tying down what's actually needed, documenting and testing it. Databases aren't fixed in stone from the minute they're created; columns come and go, tables come and go. The more information you have at the beginning the better job you can do initially, but it's also very easy to over-design. In my experience over-design normally manifests itself in an over-complication of the design by making the database handle the few weird exceptions as though it's the normal data. Obviously some exceptions are useful for the database to know about, but there are some that it really doesn't matter---determining which is which up front isn't easy. This is where the "KISS principle" comes in; optimising a database design so it's easy to add/remove integer columns to/from tables doesn't sound like a good choice to be making in the long run. If you don't know what's going on; try leaving the data you're unsure about in a spreadsheet until you understand it better. It's much easier that way than rewriting user interfaces every day because somebody decides that something's changed. -- Sam http://samason.me.uk/
2009/3/31 A B <gentosaker@gmail.com>: > One option is, put it in a db as a huge text (or in textfiles, one > per object) and parse it when you need it. That might also work. > > Why are you demanding sanity? I need crazy ideas to get this to work ;-) Heh ... sorry, but "dynamic table" just SCREAMS "design flaw!!" ... as pointed out above, an approach with the "new columns" being rows in a separate table sounds quite sane. Cheers, Andrej -- Please don't top post, and don't use HTML e-Mail :} Make your quotes concise. http://www.american.edu/econ/notes/htmlmail.htm
On Mon, Mar 30, 2009 at 08:50:59PM +0200, A B wrote: > Well, the requirement is: keep a lot of data stored, don't loose any > of it, and you will not know what you will have to store (changing > number of fields and of different types) As you've not said anything about getting said data back; it doesn't seem to matter if it actually gets lost! Sounds as though the main unanswered constraint on the database at the moment is what people what to do with the data once they've handed it to you. When you've figured that out you may have a better idea of what to do. -- Sam http://samason.me.uk/
Is it possible that what you want is a lookup table with a string index, i.e. attribute-value pairs? If so, that would be represented as a hash in Perl, or a map in C++. In a database, you could design a very simple schema for it. Or do you just have unclear requirements, as others have suggested? -Will -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of A B Sent: 30 March 2009 14:51 To: pgsql-general@postgresql.org Subject: Re: [GENERAL] [GENEAL] dynamically changing table Well, the requirement is: keep a lot of data stored, don't loose any of it, and you will not know what you will have to store (changing number of fields and of different types) But it is not all that bad, the fields will be integers, or text, or floatingpoint numbers. One option is, put it in a db as a huge text (or in textfiles, one per object) and parse it when you need it. That might also work.
On Mar 30, 2009, at 5:39 PM, A B wrote: > Hi, > In the next project I'm going to have a number of colums in my tables, > but I don't know how many, they change. They all use integers as > datatype though.. One day, I get 2 new columns, a week later I loose > one column, and so on in a random pattern. Ignoring design implications (you got enough replies about that I think)... You could add the columns you're sure that you need and put the rest in an XML field. That field can then contain any custom fields that you don't need right away while you still have the data available in a useful fashion. If it turns out some of those custom fields should end up in the table as a column it isn't hard to extract the data from the XML field. It's probably a good idea to add a version attribute or field to your XML that you increment on each model change (an xsd would be even better, it allows you to specify types for your data) so that you know which fields to expect in the document. You can query XML fields using xpath expressions in your queries. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,49d20a0c129741113880388!
In article <437FAA9F-DF2D-429E-9856-EB2026B55940@solfertje.student.utwente.nl>, Alban Hertroys <dalroi@solfertje.student.utwente.nl> writes: > On Mar 30, 2009, at 5:39 PM, A B wrote: >> Hi, >> In the next project I'm going to have a number of colums in my tables, >> but I don't know how many, they change. They all use integers as >> datatype though.. One day, I get 2 new columns, a week later I loose >> one column, and so on in a random pattern. > Ignoring design implications (you got enough replies about that I > think)... > You could add the columns you're sure that you need and put the rest > in an XML field. <mantra> If you have a problem and want to solve it using XML, you have two problems. </mantra> Why serializing the rest of the data in an XML field? contrib/hstore seems to accomplish the same, without the XML overhead.
On Mar 31, 2009, at 6:41 PM, Harald Fuchs wrote: > In article <437FAA9F-DF2D-429E-9856-EB2026B55940@solfertje.student.utwente.nl > >, > Alban Hertroys <dalroi@solfertje.student.utwente.nl> writes: > >> You could add the columns you're sure that you need and put the rest >> in an XML field. > > <mantra> > If you have a problem and want to solve it using XML, you have two > problems. > </mantra> <mantra> A mantra is never good argumentation, whether for or against. </mantra> I don't like mantras, they're like dogmas, they prevent people from thinking. I consider them dangerous ;) > Why serializing the rest of the data in an XML field? contrib/hstore > seems to accomplish the same, without the XML overhead. Although I'm no fan of XML, it does have it's benefits. In this case, it's standardised (which means there are libraries for about every language to deal with XML data), it can store and constrain type information (although a bit limited IMO) about the data and there are many tools for manipulating data contained in XML and those can usually also be used at the application side of things. contrib/hstore doesn't do these things for you, so it adds problems of its own. The OP may run into problems with database layers in his programming language of choice that can't deal with the specifics of querying a hstore field for example. If type information in the data is of importance, he'll have to explicitly cast the data (based on the key field). If certain fields are constrained to certain value ranges, he'll have to check those in his application. It all depends on the requirements of the OP and how far he wants to go with this, if hstore solves his problem, by all means, go with it. In either case, using an XML field is a valid solution. Nobody said it's a perfect one. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,49d2554e129747441114695!