Обсуждение: Custom Fields Database Architecture
Hi, I'm designing a database schema in which I should allow user to create custom fields at the application level. My application is a web-based system and it has multiple companies in a single database. So this means that each company can create their own custom fields. A custom field created in a company should not be visibile to the other company. Also, we don't want to restrict the number of fields allowed to create. I also read some article which talks about the type of patterns: 1. Meta-database 2. Mutating 3. Fixed 4. LOB My question here is, what is the best approach to define the architecture for custom fields. Performance should not be compromised. Thank you in advance. Regards, Gnanam. -- View this message in context: http://www.nabble.com/Custom-Fields-Database-Architecture-tp24034270p24034270.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On Mon, Jun 15, 2009 at 06:04:25AM -0700, Gnanam wrote: > > Hi, > > I'm designing a database schema in which I should allow user to create > custom fields at the application level. This is called EAV (Entity-Attribute-Value), and it's a multi-decade-old mistake. Re-think your design. http://archives.postgresql.org/pgsql-general/2008-02/msg00075.php http://decipherinfosys.wordpress.com/2007/01/29/name-value-pair-design/ http://en.wikipedia.org/wiki/Inner-Platform_Effect Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Monday 15 June 2009, Gnanam wrote: > Hi, > > I'm designing a database schema in which I should allow user to create > custom fields at the application level. My application is a web-based > system and it has multiple companies in a single database. So this means > that each company can create their own custom fields. A custom field > created in a company should not be visibile to the other company. Also, we > don't want to restrict the number of fields allowed to create. > > I also read some article which talks about the type of patterns: > 1. Meta-database > 2. Mutating > 3. Fixed > 4. LOB > > My question here is, what is the best approach to define the architecture > for custom fields. Performance should not be compromised. > > Thank you in advance. > > Regards, > Gnanam. > > -- > View this message in context: > http://www.nabble.com/Custom-Fields-Database-Architecture-tp24034270p240342 >70.html Sent from the PostgreSQL - general mailing list archive at > Nabble.com. It depends a bit how you want to use the data. If you are not wedded to the RDMS model, you might look at CouchDB which is a schema-less DB. But do not expect to run SQL against it - it takes a rather different approach. There are others around, some of them proprietary, Lotus Notes/Domino is probably the best know of these. David
@David: You wrote in the links cited "The "flexibility" stems from fear of making a design decision.". That's an important note. Nevertheless, there are use cases where you *can not* know in advance what the name is of the attribute! To me that's not fear but adaptiveness, modesty and knowing when to break the rules! An apparent successful example of this EAV design is OpenStreetMap (http://wiki.openstreetmap.org/wiki/Database_schema). @Gnanam: Look also at hstore which is an indexable(!) PostgreSQL data type for storing sets of (key,value) pairs within a single data field: http://www.postgresql.org/docs/8.3/static/hstore.html -S. 2009/6/15 David Fetter <david@fetter.org>: > On Mon, Jun 15, 2009 at 06:04:25AM -0700, Gnanam wrote: >> >> Hi, >> >> I'm designing a database schema in which I should allow user to create >> custom fields at the application level. > > This is called EAV (Entity-Attribute-Value), and it's a > multi-decade-old mistake. Re-think your design. > > http://archives.postgresql.org/pgsql-general/2008-02/msg00075.php > http://decipherinfosys.wordpress.com/2007/01/29/name-value-pair-design/ > http://en.wikipedia.org/wiki/Inner-Platform_Effect > > Cheers, > David. > -- > David Fetter <david@fetter.org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fetter@gmail.com > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On Mon, Jun 15, 2009 at 10:37:04PM +0200, Stefan Keller wrote: > @David: You wrote in the links cited "The "flexibility" stems from > fear of making a design decision.". That's an important note. > Nevertheless, there are use cases where you *can not* know in > advance what the name is of the attribute! Those cases are extremely rare, and they don't fit with an RDBMS. > To me that's not fear but adaptiveness, modesty and knowing when to > break the rules! An apparent successful example of this EAV design > is OpenStreetMap > (http://wiki.openstreetmap.org/wiki/Database_schema). Funny you should mention that. The OpenStreetMap people are hitting exactly the issues I named, and they're ruing the day they decided on that "schema." I don't recommend that anybody re-make their mistake. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Custom fields are a fact of life, and used in many, many business critical applications. EAV sucks, as you mentioned, but that doesn't take away from the requirement to build that kind of system. From the user's perspective: If you design an application for me and I want to add a new data field or a new form, should I have to call you back and pay your exorbitant consulting fees? I would prefer to pay a little bit more at the beginning and be able to add what I want into the framework that was already built. We handled this at one client by actually generating the ddl statements and actually building the table/fields, including relationships (user chooses a related object from a list and that is generated as a foreign key). This was after we threw out their EAV system, which sucked. This can lead to design inefficiencies and not-normalized structure, will will lead to reporting havoc, but it depends on the requirements of the user. Gnanam's problem is exasperated by having multiple customers adding multiple fields that only they can see. I don't know your situation, so this might be off-base for your needs, but I would try a similar approach to what I suggested above. Have base fields in one table, with a customerid, indicating who can see the row, and then create a custom table per client who wants to add fields. The tablename can start with their customerid and can have security rights automatically assigned to it. Problems with this approach that I have seen is when the user adds 10 numeric fields, that should be normalized and then wants to generate an aggregate query from all of them. For most data gathering, this should be fine. Sim David Fetter wrote: > On Mon, Jun 15, 2009 at 06:04:25AM -0700, Gnanam wrote: >> Hi, >> >> I'm designing a database schema in which I should allow user to create >> custom fields at the application level. > > This is called EAV (Entity-Attribute-Value), and it's a > multi-decade-old mistake. Re-think your design. > > http://archives.postgresql.org/pgsql-general/2008-02/msg00075.php > http://decipherinfosys.wordpress.com/2007/01/29/name-value-pair-design/ > http://en.wikipedia.org/wiki/Inner-Platform_Effect > > Cheers, > David.
On Mon, Jun 15, 2009 at 2:04 PM, Gnanam<gnanam@zoniac.com> wrote: > > I also read some article which talks about the type of patterns: > 1. Meta-database > 2. Mutating > 3. Fixed > 4. LOB > > My question here is, what is the best approach to define the architecture > for custom fields. Performance should not be compromised. The reason there are multiple patterns are because the best approach depends very much on the specifics of your needs. For all David's dogma there are use cases where EAV is the best fit. But there are downsides and if those downsides are a problem then one of the other patterns may be a better fit. -- greg http://mit.edu/~gsstark/resume.pdf
On Mon, Jun 15, 2009 at 10:21 PM, Sim Zacks<sim@compulab.co.il> wrote: > From the user's perspective: If you design an application for me and I > want to add a new data field or a new form, should I have to call you > back and pay your exorbitant consulting fees? I would prefer to pay a > little bit more at the beginning and be able to add what I want into the > framework that was already built. The problem with this approach is that it becomes a "slippery slope" for the client. We have no control over what they do with this design. So you give the client an EAV which they overuse. They call you back anyway because they amassed redundant attributes keys many have nearly identical key names. The client no long knows which of the keys is the "most" correct key holding the data they want. And in the keys they do have, the values have non-nonsensical data. It is much easier to fix the former problem. It extremely painful to fix the latter. -- Regards, Richard Broersma Jr. Visit the Los Angeles PostgreSQL Users Group (LAPUG) http://pugs.postgresql.org/lapug
On Tue, Jun 16, 2009 at 12:50:28PM +0100, Greg Stark wrote: > On Mon, Jun 15, 2009 at 2:04 PM, Gnanam<gnanam@zoniac.com> wrote: > > > > I also read some article which talks about the type of patterns: > > 1. Meta-database > > 2. Mutating > > 3. Fixed > > 4. LOB > > > > My question here is, what is the best approach to define the > > architecture for custom fields. Performance should not be > > compromised. > > The reason there are multiple patterns are because the best approach > depends very much on the specifics of your needs. > > For all David's dogma there are use cases where EAV is the best fit. Sure there are, just not until every other option has been exhausted. The amount of maintenance needed for EAV always increases, usually with quite nasty complexity terms, which means you need to budget resources for that maintenance if it turns out you can't do it any other way. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate