Discussion: generic modelling of data models; enforcing constraints dynamically...

generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
Dear List,

I am trying to implement the following:

In a database I wish to implement a GENERIC datamodel, thus on a meta-level. All RELATIONS (part of a MODEL) will be a view on some base (being a table) JOINed with (an) extra column(s). Thus, this view consists of a number of FIELDS. I wish to make this view editable (INSERT, UPDATE) using the RULE system. Some constraints will apply; enforcing these is the problem I am trying to solve by modelling these rules with a table "RELATION_CONSTRAINTS" (see below).

Tables:

BASE(col1, col2, col3)

MODELS(name)

RELATIONS(modelname, name)

FIELDS(modelname, relation_name, name, datatype)

RELATION_CONSTRAINTS(modelname, relation_name, constraint_name, constraining_expression)

I was thinking of implementing this using a FUNCTION that takes a polymorphic record parameter (and the relation name); then checking this record against the applicable constraint expression.
This whole idea may sound like a DBMS-in-a-DBMS kind of thing... What I am trying to do is write as little table/view-specific code as necessary, while still collecting all base data in one central table...
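To make this concrete, the check I picture would look something along these lines (a rough sketch only: it assumes EXECUTE ... USING from 8.4, unquoted table names, and that "constraining_expression" holds a boolean SQL expression over the row's fields; all names are illustrative):

-- Rough sketch (names illustrative). Evaluates every stored expression
-- for the given model/relation against one candidate row of BASE.
CREATE FUNCTION check_constraints(_model TEXT, _relation TEXT, _row base)
RETURNS void AS $$
DECLARE
    c  RECORD;
    ok BOOLEAN;
BEGIN
    FOR c IN SELECT constraint_name, constraining_expression
               FROM relation_constraints
              WHERE modelname = _model AND relation_name = _relation
    LOOP
        -- expose the row's fields to the stored expression,
        -- e.g. 'col2 > 0 AND col3 IS NOT NULL'
        EXECUTE 'SELECT ' || c.constraining_expression
             || ' FROM (SELECT ($1).*) AS r'
           INTO ok USING _row;
        IF ok IS NOT TRUE THEN
            RAISE EXCEPTION 'constraint "%" violated', c.constraint_name;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;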

All suggestions are very much appreciated,
regards,


Rob


Re: generic modelling of data models; enforcing constraints dynamically...

From:
Ben Chobot
Date:
InterRob wrote:
> Dear List,
>
> I am trying to implement the following:
>
> [snip]
>
> All suggestions are very much appreciated,
> regards,
>
>
> Rob
>
>

It's not clear to me what you're asking, but I suspect the suggestion
you need is the same as if you had asked how to best implement an
Entity-Attribute-Value scheme: don't do it. While it may be possible,
performance is going to go into the toilet, constraints are going to be
difficult to enforce, and maintenance will be difficult at best.
Spending the effort upfront to define a schema will have drastic
long-term payoffs. It can be tempting to believe an application can
define the appropriate schema for itself at runtime if you just give it
a big enough sandbox, but this rarely works out well.

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Sam Mason
Date:
On Thu, Sep 24, 2009 at 06:28:28PM +0200, InterRob wrote:
> I am trying to implement the following:
>
> In a database I wish to implement a GENERIC datamodel, thus on a meta-level.

Sounds like you're describing an EAV design:

  http://en.wikipedia.org/wiki/Entity-attribute-value_model

Designs like this tend to result in you getting very little support
from the database and get awkward as they grow.  If your problem really
is suited to this then go for it, but surprisingly few actually are.
I'd highly recommend using a more traditional design until you've been
through at least one big revision and then you'll know whether EAV
really fits.

--
  Sam  http://samason.me.uk/

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Rob Marjot
Date:
Thank you, Ben. Well, I'm afraid you got the basic idea... I intend to implement a hybrid between a fixed schema and an Entity-Attribute-Value scheme. The schema will be able to cover 90% of the data needs; in other cases (specific projects) additional fields (and/or tables/relations) will be needed, including their constraints...

I'm now experimenting with an idea that just came up: passing a set of key/value pairs to a function that will test the new row; on insert/update the following could then be checked (as part of a RULE set):

SELECT doesComply('relationname', keyValues.*) FROM (VALUES ('col1', CAST(col1 AS TEXT)), ('col2', CAST(col2 AS TEXT))) AS keyValues(the_key, the_value);

The function "doesComply()" will then process the CONSTRAINTS table and raise an error if the new / updated row does not fit...
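Wired into a view, the RULE might then look roughly like this (a sketch only; "my_view" over BASE is made up, and it assumes a doesComply() that raises an error on violation and returns true otherwise):

-- Sketch: funnel every new row through doesComply() before writing
-- to the base table; doesComply() raises on violation.
CREATE RULE my_view_insert AS ON INSERT TO my_view
DO INSTEAD
    INSERT INTO base (col1, col2)
    SELECT NEW.col1, NEW.col2
    WHERE (SELECT bool_and(doesComply('relationname', kv.the_key, kv.the_value))
             FROM (VALUES ('col1', CAST(NEW.col1 AS TEXT)),
                          ('col2', CAST(NEW.col2 AS TEXT)))
                  AS kv(the_key, the_value));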


Any thoughts?


Rob

2009/9/24 Ben Chobot <bench@silentmedia.com>
InterRob wrote:
Dear List,

I am trying to implement the following:

[snip]


All suggestions are very much appreciated,
regards,


Rob



It's not clear to me what you're asking, but I suspect the suggestion you need is the same as if you had asked how to best implement an Entity-Attribute-Value scheme: don't do it. While it may be possible, performance is going to go into the toilet, constraints are going to be difficult to enforce, and maintenance will be difficult at best. Spending the effort upfront to define a schema will have drastic long-term payoffs. It can be tempting to believe an application can define the appropriate schema for itself at runtime if you just give it a big enough sandbox, but this rarely works out well.


Re: generic modelling of data models; enforcing constraints dynamically...

From:
Ben Chobot
Date:
Rob Marjot wrote:
> Thank you, Ben. Well, I'm afraid you got the basic idea... I intend to
> implement a hybrid between a fixed schema and an
> Entity-Attribute-Value scheme. The schema will be able to cover 90% of
> the data needs; in other cases (specific projects) additional fields
> (and/or tables/relations) will be needed; including their constraints...

If you absolutely must have a dynamic schema like this, and can't have a
DBA simply add tables as needed, then I think it would be less work,
overall, to create a schema that your application has DDL rights to, and
then let it create and modify normal tables with normal constraints there.
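For example (a sketch; role and schema names made up):

-- Sketch: give the application its own sandbox schema.
CREATE ROLE app_user LOGIN PASSWORD 'secret';
CREATE SCHEMA project_data AUTHORIZATION app_user;
-- app_user can now CREATE and ALTER ordinary tables -- with ordinary,
-- fully enforced constraints -- inside project_data, and nothing else.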

There certainly are some cases where an EAV solution is the proper one,
and yours may be one of them. But most aren't.

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Sam Mason
Date:
On Thu, Sep 24, 2009 at 09:23:35PM +0200, Rob Marjot wrote:
> SELECT doesComply('relationname', keyValues.*) FROM (VALUES ('col1',
> CAST(col1 AS TEXT)), ('col2', CAST(col2 AS TEXT))) AS
> keyValues(the_key, the_value);
>
> The function "doesComply()" will then process the CONSTRAINTS table and
> raise an Error if the new / updated row does not fit...

I'd have a set of doesComply functions, with the first two parameters
as you have them, but overloaded to support different datatypes
specifically.  Something like:

  CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val INT) ...
  CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val DATE) ...
  CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val TEXT) ...
  CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val NUMERIC) ...

And then have a set of "attribute" tables (one for each datatype) to
store the actual values in.  At least PG can do some type checking for
you that way.  Either that, or just leave them all as text to text
mappings in the database and only attempt to type things out in the
client code.
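By "attribute" tables I mean something like this (names invented; one
table per datatype, so PG itself checks the value's type):

  CREATE TABLE attr_int (
      relation_name TEXT NOT NULL,
      row_id        INT  NOT NULL,
      key           TEXT NOT NULL,
      val           INT  NOT NULL,
      PRIMARY KEY (relation_name, row_id, key)
  );

  CREATE TABLE attr_date (
      relation_name TEXT NOT NULL,
      row_id        INT  NOT NULL,
      key           TEXT NOT NULL,
      val           DATE NOT NULL,
      PRIMARY KEY (relation_name, row_id, key)
  );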

Not sure why you're doing the VALUES contortions as well, why not just:

  SELECT doesComply('relationname', 'col1', col1);

?

--
  Sam  http://samason.me.uk/

Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
Sam, Thanks for thinking along.

The thing is that a SINGLE constraint might apply to MULTIPLE fields; therefore it seems best to build a set of key/value pairs... Multiple doesComply()s won't do the job :(

BY THE WAY:
I came to think of another option: putting the additional columns (that is: additional to the default set of fields) in XML, in a column that is part of the row (= object) it belongs to.
Has anybody done so before? Does anybody have experience with XML schema validation within PostgreSQL?
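Roughly what I have in mind (a sketch; xpath() does the extraction, but as far as I can see there is no built-in XML Schema validation to lean on -- table and field names are just play data):

-- Sketch: extra, model-specific fields ride along in an xml column.
CREATE TABLE find (
    id    SERIAL PRIMARY KEY,
    model TEXT NOT NULL,
    extra XML
);

INSERT INTO find (model, extra)
VALUES ('MODEL1', '<extra><colour>red</colour><depth_cm>42</depth_cm></extra>');

-- pulling one of the extra fields back out again
SELECT id,
       (xpath('/extra/colour/text()', extra))[1]::text AS colour
  FROM find
 WHERE model = 'MODEL1';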

Cheerz,

Rob

2009/9/24 Sam Mason <sam@samason.me.uk>
On Thu, Sep 24, 2009 at 09:23:35PM +0200, Rob Marjot wrote:
> SELECT doesComply('relationname', keyValues.*) FROM (VALUES ('col1',
> CAST(col1 AS TEXT)), ('col2', CAST(col2 AS TEXT))) AS
> keyValues(the_key, the_value);
>
> The function "doesComply()" will then process the CONSTRAINTS table and
> raise an Error if the new / updated row does not fit...

I'd have a set of doesComply functions, the first two parameters
as you have them but overload a set to support different datatypes
specifically.  Something like:

 CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val INT) ...
 CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val DATE) ...
 CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val TEXT) ...
 CREATE FUNCTION doesComply(_rel TEXT, _key TEXT, _val NUMERIC) ...

And then have a set of "attribute" tables (one for each datatype) to
store the actual values in.  At least PG can do some type checking for
you that way.  Either that, or just leave them all as text to text
mappings in the database and only attempt to type things out in the
client code.

Not sure why you're doing the VALUES contortions as well, why not just:

 SELECT doesComply('relationname', 'col1', col1);

?

--
 Sam  http://samason.me.uk/


Re: generic modelling of data models; enforcing constraints dynamically...

From:
Sam Mason
Date:
On Thu, Sep 24, 2009 at 10:33:37PM +0200, InterRob wrote:
> I came to think of another option: putting additional columns (that is:
> addittional to the default set of fields) in xml, in a column that is part
> of row (=object) it belongs to.
> Any body has done so before? Any body has experience with XML schema
> validation within PostgreSQL?

Sorry, but this was sounding a little over-engineered before, and it
seems to be blowing out of proportion now.  By whom, and how, are these
(immensely complicated) rule sets going to be maintained? How is using
XML going to make this any easier than using the tools native to the
database?

If they're validated inside the database then it's going to be done by a
DB admin anyway, or am I missing something?  If they're done by the DB
admin, isn't it easier to just use the tools they're used to?

--
  Sam  http://samason.me.uk/

Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
I guess it IS quite overengineered indeed...

What I'm trying to do is to facilitate different fieldwork methodologies for archaeological research (on a project basis); there is no final agreement on data structure and semantics; however, on a meta-level all choices are rational and can be modelled... In fact, all models can be related to each other: that's where the "hybrid" part comes in: I wish to implement the common denominator (90%) and then further extend this, enabling specific data model implementations -- including checks for data integrity.

As soon as that works, it becomes possible to record changes at row level -- providing access to data snapshots in time. Furthermore, it becomes possible to build upon this central database automated tools for management and filing of information, and different modes of data entry (including web-based)...

The thing is: altering table structures (changes should be limited to adding columns) is required on an ad hoc basis and end users should be able to do so... I guess that requires some over-engineering...?


Rob

2009/9/24 Sam Mason <sam@samason.me.uk>
[snip]

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Johan Nel
Date:
Hi Rob,

> In a database I wish to implement a GENERIC datamodel, thus on a
> meta-level. All RELATIONS (part of a MODEL) will be a view on some base
> (being a table) JOINed with (an) extra column(s). Thus, this view
> consists of a number of FIELDS. I wish to make this view editable
> (INSERT, UPDATE) using the RULE system. Some constraints will apply;
> enforcing these is the problem I am trying to solve by modeling these
> rules with a table "RELATION_CONSTRAINTS" (see below).

Although many people will shoot it down, I follow a very similar approach:
a data-driven application framework with exactly what you have described.
In short, my application will read the "Metadata" at runtime and "build"
the application so to speak on the fly.

For this I use the following two-table approach:

meta_master:
   master_no SERIAL NOT NULL PRIMARY KEY,
   master_type VARCHAR(30) NOT NULL REFERENCES master_type(master_type),
   master_id VARCHAR(30) NOT NULL,
   master_property TEXT,
   UNIQUE (master_type, master_id)

meta_link:
   link_no SERIAL NOT NULL PRIMARY KEY,
   master_no INT NOT NULL REFERENCES meta_master(master_no),
   link_type VARCHAR(30) NOT NULL REFERENCES master_type(master_type),
   member_no INT NOT NULL REFERENCES meta_master(master_no),
   member_property TEXT,
   UNIQUE (master_no, member_no)

Some explanation:
master_type and link_type have values like database, table, column etc.

Thus, at runtime a method FormLoad(nID) will use a recursive query to load
everything needed to build the "Form" and associate it with the correct
database, table, etc.

It is necessary to recurse all members via meta_master.master_no =
meta_link.master_no and meta_link.member_no = meta_master.master_no (use
contrib/tablefunc's connectby() pre-8.4, or WITH RECURSIVE queries from 8.4 on).
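On 8.4 that recursion looks something like this (a sketch, loading
master_no 1 and everything linked below it):

  -- Sketch (8.4+): fetch a master row plus, recursively, all members.
  WITH RECURSIVE members AS (
      SELECT m.master_no, m.master_type, m.master_id, m.master_property
        FROM meta_master m
       WHERE m.master_no = 1            -- the "Form" being loaded
      UNION ALL
      SELECT c.master_no, c.master_type, c.master_id, c.master_property
        FROM members p
        JOIN meta_link   l ON l.master_no = p.master_no
        JOIN meta_master c ON c.master_no = l.member_no
  )
  SELECT * FROM members;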

Where applicable the %_property columns are used to define additional
information in the format 'name=value;nextname=value;' etc.
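Splitting such a property string apart server-side is also easy when
needed (8.3+), e.g.:

  -- Sketch: unpack 'name=value;' pairs into rows.
  SELECT split_part(kv, '=', 1) AS name,
         split_part(kv, '=', 2) AS value
    FROM regexp_split_to_table('label=Site;width=120;', ';') AS kv
   WHERE kv <> '';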

> I was thinking of implementing this using a FUNCTION that takes a
> polymorphic record parameter (and the relation name); then checking this
> record against the applicable constraint expression.
> This whole idea may sound like a DBMS-in-a-DBMS kind of thing... What I
> am trying is to write as little as table/view-specific code as would be
> necessary, while still collecting all base data in one central table...

I take the %_property column even further: in my business data I have a
property column again in tables where additional columns can be defined on
the fly based on the metadata, until such time as users have a clear
picture of what exactly they need.  This is also used in tables where
multi-company differences would otherwise leave more than 50% of a table's
columns redundant.

If you need more info regarding this approach, feel free to contact me in
private.

Johan Nel
Pretoria, South Africa.

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Sam Mason
Date:
On Thu, Sep 24, 2009 at 11:07:31PM +0200, InterRob wrote:
> What I'm trying to do is to facilitate different fieldwork methodologies for
> archaeological research (on a project basis); there is no final agreement on
> data structure and semantics; however, on a meta-level all choices are
> rational and can be modelled... In fact, all models can be related to each
> other: that's where the "hybrid" part comes in: I wish to implement the
> common denominator (90%) and then further extend this, enabling specific data
> model implementations -- including checks for data integrity.

In my experience it depends on how technically competent your users
are.  Most of mine are fine working with "their own" data files/formats
and only want the data in the database to keep track of the larger
structural stuff.

I therefore tend to just leave their data as opaque blobs (stored in
large objects, as they're reasonably small) of data and only pull out
the parts of it that are needed to keep the other parts of the projects
happy.  That way I can make sure the bits the database takes care of can
be appropriately designed and the users get to keep their data exactly as
they want.

To support this I've written various bits of code that get automatically
run when users insert their data files to pull them out into the
appropriate tables.  The detailed bits of the structure are of course
missed, but most of the time this data isn't needed and when it is they
want the rest of the original (normally proprietary binary file formats
that I've had to reverse engineer) file so that their program can figure
out what's going on.

It all depends on the problem domain of course, but this seems to work
OK for us!  I really want to hack Samba around so that the users can
view the files directly from inside the database, but I'm not sure how
good an idea this really is.

The bigger datasets (1GB+) tend to be nicely structured, so they get
handled specially.

> As soon as that works, it becomes possible to record changes at row-level --
> providing access to data-snapshots in time.

I think these are what my blobs are...

> Furthermore, it becomes possible
> to build upon this central database automated tools for management and
> filing of information and different modes of data entry (including
> webbased)...

...and this is what I'd call my structural bits.

> The thing is: altering table structures (changes should be limited to adding
> columns) is required on a ad hoc basis and End User should be able to do
> so...

I generally find it's easier if I'm involved in that.  Maybe it's just
my users!

> I guess that requires some over engineering... ?

By "over engineering" I was meaning that you seem to be trying to solve
a more complicated problem than is necessary.  There will be some
essential complexity inherent in any problem, but it's the job of every
engineer (software or hardware) to ensure that only minimal amounts of
incidental complexity are introduced.


In my case the "important" thing is to make sure that we know the
state of what's going on in the projects.  I can do this by getting a
combination of data from the user (through traditional means) and by
pulling apart their data files.  The "incidental complexity" I've added,
that of writing fiddly little programs to interpret their files, seems
to be better than getting the users to input the data twice; once in
their programs and once into the database.

In your case you've introduced this strange new EAV style design and the
constraint system on top of it.  The benefits of this design may well be
better than the costs of developing it, but I have a feeling it may be
easier to "side-step" the problem somehow.

That all got a bit longer than I was expecting, but I hope it's useful!

--
  Sam  http://samason.me.uk/

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Ron Mayer
Date:
Sam Mason wrote:
> It all depends on the problem domain of course, but this seems to work
> OK for us!  I really want to hack Samba around so that the users can
> view the files directly from inside the database, but I'm not sure how
> good an idea this really is.

"hack Samba"?   Wouldn't it be easier to use one of the database-as-a
filesystem FUSE bindings and run stock samba over that?

The perl Fuse::DBI module's example  sounds pretty similar to the
system you described where he "file" seems to be a column in a table.
http://www.rot13.org/~dpavlin/fuse_dbi.html

If that doesn't suit your needs there are more elaborate ones (libferris)
that seem more complex and more flexible, and simpler ones (dumbofs) that
seem to be meant more as example code you could hack for your purposes:
http://lwn.net/Articles/306860/
http://yiannnos.com/dumbofs


And then you could use unmodified samba out-of-the-box exporting
that to whatever the heck speaks SMB/CIFS these days.


Re: generic modelling of data models; enforcing constraints dynamically...

From:
Sam Mason
Date:
On Fri, Sep 25, 2009 at 11:01:02AM -0700, Ron Mayer wrote:
> Sam Mason wrote:
> > It all depends on the problem domain of course, but this seems to work
> > OK for us!  I really want to hack Samba around so that the users can
> > view the files directly from inside the database, but I'm not sure how
> > good an idea this really is.
>
> "hack Samba"?   Wouldn't it be easier to use one of the database-as-a
> filesystem FUSE bindings and run stock samba over that?

Huh, that would indeed be much easier.  I hadn't thought about this for
a while and Rob's post reminded me.  I don't think FUSE existed when I
started thinking about it and as all our clients are Windows boxes it
didn't matter at the time.

> The perl Fuse::DBI module's example  sounds pretty similar to the
> system you described where he "file" seems to be a column in a table.
> http://www.rot13.org/~dpavlin/fuse_dbi.html

FUSE looks pretty easy to get going and I think I'd want more control
over how files were presented than this gives so I'd probably end up
rolling my own code.  Thanks for pointing out FUSE though, not sure
why I'd not thought of it before.  I'll probably still never get around
to it, but maybe I will!

--
  Sam  http://samason.me.uk/

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Ron Mayer
Date:
Drifting off topic so I'm no longer ccing the lists.

Sam Mason wrote:
>
>> The perl Fuse::DBI module's example  sounds pretty similar to the
>> system you described, where the "file" seems to be a column in a table.
>> http://www.rot13.org/~dpavlin/fuse_dbi.html
>
> FUSE looks pretty easy to get going and I think I'd want more control
> over how files were presented than this gives so I'd probably end up
> rolling my own code.  Thanks for pointing out that FUSE though, not sure

FUSE is really more the framework that wraps around your code.

There are applications using fuse that expose gmail as a filesystem.

Here's a simple example that uses FUSE to expose a
perl HASH and a few hello-world-like perl functions.
http://cpansearch.perl.org/src/NOSEYNICK/Fuse-Simple-1.00/README

> why I'd not thought of it before.  I'll probably still never get around
> to it, but maybe I will!

It's actually easy enough that I wouldn't be surprised if you
try it, and get it working just for fun even if no one uses it.

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Erik Jones
Date:
On Sep 24, 2009, at 2:07 PM, InterRob wrote:

> I guess it IS quite overengineered indeed...
>
> What I'm trying to do is to facilitate different fieldwork
> methodologies for archaeological research (on a project basis); there
> is no final agreement on data structure and semantics; however, on a
> meta-level all choices are rational and can be modelled... In fact,
> all models can be related to each other: that's where the "hybrid"
> part comes in: I wish to implement the common denominator (90%) and
> then further extend this, enabling specific data model
> implementations -- including checks for data integrity.

Have you considered a non-relational, "schema-less" database such as
MongoDB or Cassandra?  You're pretty much throwing out the relational
features of this database anyway, so it seems that it would make sense
to use something more geared to that kind of work.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: generic modelling of data models; enforcing constraints dynamically...

From:
David Fetter
Date:
On Thu, Sep 24, 2009 at 06:28:28PM +0200, InterRob wrote:
> Dear List,
> I am trying to implement the following:
>
> In a database I wish to implement a GENERIC datamodel, thus on a
> meta-level.

That's not a very bright idea, even though it seems so when you first
think of it.

Relational database management explicitly trades flexibility for size
and speed.  You won't be able to constrain the things you think you'll
be constraining, and the query complexity will go up like O(n!).

Instead of going down this dead-end road, get the stakeholders
together, try a few prototypes of your schema, get them together
again, etc.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
Dear David, dear all,

I very well understand what you are saying... However, the solution won't be found in the direction you are suggesting: the system I am designing will be used by archaeologists involved in archaeological research (fieldwork). Their research strategy (and with it their methodology and techniques) may vary during research, depending on their findings and understanding of the past that is reconstructed on-site... Along with these methodologies and techniques, differing data models may need to be introduced...

Relationships between these models may be formalised (that's what I will put stakeholders together for); in fact this is what I am trying to model -- trying to develop a system that will centralize and version the data gathered. On a supra-project level, in fact.

Meanwhile, I made some progress; in another mail I sent to this same list, I described the technological challenge at hand as follows:
-----
What I am trying to do is build views on a base table, extended by one or more columns, extracted (hence the name of the function "deserialize()") from a SINGLE column (XML) **that is in this same base table** (see below). Instructions for deserialization (that is: which 'fields' to look for) reside in some other table. There are MULTIPLE base tables; they basically look like this:

[table definition:]
BASETABLE(ID INT, model TEXT, field1 some_type, field2 some_type, ... fieldN some_type, serialized_data XML)

So, I wish to define multiple VIEWs based on a BASETABLE; one for each "model" (as stated in the above table definition: "model" is a property of each row). This QUERY would look like this (producing a VIEW for "MODEL1"; the query below is invalid, unfortunately):

>> SELECT base_t.*, deserialized.* FROM "BASETABLE" base_t, deserialize('MODEL1', base_t) as deserialized(fieldX some_type, fieldY some_type) WHERE base_t.model = 'MODEL1';

I have no problem with the requirement to supply the table type in the query; in fact this is logical. Still, this query is impossible, obviously, because "base_t" as a target is not known in the context of the FROM-clause, where I wish to use it in calling "deserialize(...)". Of course, I could write a "deserialize()" function for each base table (e.g. "deserialize_base1(...)") but I wish it to perform its action only on rows that will actually be part of the result set; thus I want the WHERE-clause to apply to the function's seq scan also. When provided, I wish to incorporate the user's WHERE-clause as well; this is done by the PostgreSQL RULE system...

Alternatively, the VIEW could be defined by the following query:
>> SELECT base_t.*, (deserialize('MODEL1', base_t) as temp(fieldX some_type, field_Y some_type)).* FROM "BASETABLE" base_t WHERE base_t.model = 'MODEL1';

This approach does not work either: deserialize(...) will return its set of fields as ONE field (comma-separated, wrapped in parentheses); expressions within a SELECT-list seem to be only allowed to result in ONE column, apart from the * shorthand...

** So, the question is: how can I feed my "deserialize()" function a record subset (e.g. filtered by a WHERE-clause) of the BASETABLE, while still returning a record?!? **

I tried the following approach also:
>> SELECT (SELECT fieldX FROM deserialize(base_t) deserialized(fieldX some_type, fieldY some_type)) "fieldX", (SELECT fieldY FROM deserialize(base_t) deserialized(fieldX some_type, fieldY some_type)) "fieldY" FROM "BASETABLE" base_t WHERE model = 'MODEL1';

Which in fact worked, but caused the function to get invoked TWICE FOR EACH ROW (even when marked IMMUTABLE STRICT; or did I have to flush cached query plans in psql?).

Another approach would be to put all key/value pairs into a separate table (as one would do when implementing an EAV model within an RDBMS) which is then to be joined (and joined again... and possibly again (!), in case of MULTIPLE additional rows -- depending on the definition of the VIEW) onto the BASETABLE, rather than deserializing from XML stored within the same record... How does this approach then actually translate in terms of table scans? Will they be limited by the filter on the BASETABLE, as the available values to join on will be limited? At any rate: this approach will be more difficult to implement / maintain in the case of EDITABLE VIEWS (insert, update, delete)...

Hope any of you has some useful thoughts on this... It appears to me that updating the additional (virtual) fields in the BASETABLE is much easier: the "serialize()" function can be fed a list of key/value pairs, producing some XML that can be stored in the "serialized_data" xml field of this same base table...
All this needs to be implemented fully in the database back-end; client applications will not know they are talking to VIEWS rather than tables... Thus: the hosted database must appear to provide various tables, whereas these are in fact stored in a limited number of base tables.
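The least bad formulation I have found so far is a composite type per model, with the whole record produced ONCE per row in a subquery (a sketch only; type, function, column and view names are illustrative, and the OFFSET 0 is there to keep the planner from expanding the function call once per output column):

-- Sketch: one composite type per model; deserialize_model1() is called
-- once per row inside the subquery, then expanded field by field.
CREATE TYPE model1_extra AS (fieldX INTEGER, fieldY TEXT);

CREATE FUNCTION deserialize_model1(data XML) RETURNS model1_extra AS $$
    SELECT ((xpath('/row/fieldX/text()', $1))[1]::text)::integer,
           (xpath('/row/fieldY/text()', $1))[1]::text;
$$ LANGUAGE sql IMMUTABLE;

CREATE VIEW model1 AS
    SELECT s.id, s.field1, (s.extra).fieldX, (s.extra).fieldY
      FROM (SELECT b.id, b.field1,
                   deserialize_model1(b.serialized_data) AS extra
              FROM basetable b
             WHERE b.model = 'MODEL1'
            OFFSET 0) s;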

----

 Thanks in advance, you guys out there!


Rob

2009/9/26 David Fetter <david@fetter.org>
[snip]


Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
In fact, I considered doing so, yes... But no luck: to complicate things, I will need support for spatial datatypes, as implemented by the PostGIS add-on... Moreover: various applications that will make up the front-end will only be able to talk to mainstream or ODBC-compatible databases  :((


Rob

2009/9/26 Erik Jones <ejones@engineyard.com>

[snip]







Re: generic modelling of data models; enforcing constraints dynamically...

From:
David Fetter
Date:
On Sun, Sep 27, 2009 at 08:26:27PM +0200, InterRob wrote:
> Dear David, dear all,
> I very well understand what you are saying...

Clearly you do not.  What you are proposing has been tried many, many
times before, and universally fails.

That your people are failing to get together and agree to a data model
is not a reason for you to prop up their failure with a technological
"fix" that you know from the outset can't be made to work.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: generic modelling of data models; enforcing constraints dynamically...

From:
Peter Hunsberger
Date:
On Sun, Sep 27, 2009 at 2:22 PM, David Fetter <david@fetter.org> wrote:
> On Sun, Sep 27, 2009 at 08:26:27PM +0200, InterRob wrote:
>> Dear David, dear all,
>> I very well understand what you are saying...
>
> Clearly you do not.  What you are proposing has been tried many, many
> times before, and universally fails.

I've been refraining from jumping on this due to time constraints, but
this statement is silly.  We have a system that does almost exactly
what the OP wants although the implementation is slightly different:
we use an EAV-like model with strong typing and build set / subset
forests to maintain arbitrary hierarchies of relationships.  Our
reasons for doing this are similar to the OP's; it's for research (in
our case medical research).  We maintain over 200,000 pieces of
end-user-generated metadata, describing what would be, in a conventional
relational model, over 20,000 columns and some thousands of tables, but
the actual physical model is some 40 tables.  Yes, the flip side is that
such a system won't support millions of transactions per day, but that's
not why you build them.

>
> That your people are failing to get together and agree to a data model
> is not a reason for you to prop up their failure with a technological
> "fix" that you know from the outset can't be made to work.
>

Spoken like someone who has always had the luxury of working in areas
with well defined problem domains...   I can't tell you the number of
people that told us exactly the same thing when we started on it.
That was 8 years ago.  Not only can such systems be built, they can be
made to scale reasonably well.  You do need to understand what you are
doing and why: the costs can be high, but when it comes to research,
the benefits can far outweigh the costs.

--
Peter Hunsberger

Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
Dear David, dear Peter, dear all,

Peter, I was happy reading your reply right after I opened and read David's. I do think I am on the right track; it is not a matter of building the one-and-only right schema, not in this case. Archaeology has the same twist as ethnography, anthropology and the like: they work with (what I would call) "narratives" (in fact, in the case of archaeology this seems to me to be an archaeologist's monologue...). They try to support their findings with statistics and other means of quantification -- as this modern, rationalist world requires them to do, to be taken seriously as science... I seek to implement all this in a hybrid form; a fusion between the relational and EAV concepts.

Peter, may I invite you to privately share some more details on the system you are using and the design of it? Did you implement it using PostgreSQL? Looking forward to your reply.
(And with respect to your previous message: whom are you actually referring to by the acronym "OPs"?)

Cheerz,


Rob

2009/9/27 Peter Hunsberger <peter.hunsberger@gmail.com>
[snip]


Re: generic modelling of data models; enforcing constraints dynamically...

From:
Oleg Bartunov
Date:
Have you considered contrib/hstore to build a flexible database schema?

Oleg
On Sun, 27 Sep 2009, InterRob wrote:

> [snip]

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
At first glance: brilliant! I was about to implement this key/value thing with an XML type... I will take a closer look at this, thanks a lot, Oleg!

Tips & tricks to get this going in PostgreSQL?


Rob

2009/9/28 Oleg Bartunov <oleg@sai.msu.su>
Have you considered contrib/hstore to build a flexible database schema?

Oleg

On Sun, 27 Sep 2009, InterRob wrote:

[snip]


Re: generic modelling of data models; enforcing constraints dynamically...

From:
InterRob
Date:
Second glance: brilliant again! Even support for indexing is available; nice job.

I found the hstore.sql -- that will add type, functions and stuff to my db.

I will give it a serious try!
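For the record, the minimal test I am starting from (8.4 with hstore.sql loaded; table and key names are just play data):

-- Sketch: free-form attributes in an hstore column, plus an index.
CREATE TABLE finds (
    id    SERIAL PRIMARY KEY,
    model TEXT NOT NULL,
    attrs hstore
);

INSERT INTO finds (model, attrs)
VALUES ('MODEL1', 'colour=>red, depth_cm=>42');

SELECT id, attrs -> 'colour' AS colour
  FROM finds
 WHERE attrs ? 'depth_cm';          -- "?" tests for presence of a key

CREATE INDEX finds_attrs_idx ON finds USING gist (attrs);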


Rob

2009/9/28 InterRob <rob.marjot@gmail.com>
At first glance: brilliant! I was about to implement this key/value thing with an XML type... I will take a closer look at this, thanks a lot, Oleg!

Tips & tricks to get this going in PostgreSQL?


Rob

2009/9/28 Oleg Bartunov <oleg@sai.msu.su>

Have you considered contrib/hstore to build a flexible database schema?

Oleg

On Sun, 27 Sep 2009, InterRob wrote:

[snip]



Re: generic modelling of data models; enforcing constraints dynamically...

From:
Oleg Bartunov
Date:
Rob,
There are many users of hstore, so you can get support here. Also, someone
is working on a new, improved version of hstore; check pgfoundry and the
-hackers mailing list.

Oleg
On Mon, 28 Sep 2009, InterRob wrote:

> Second glance: brilliant again! Even support for indexing is available; nice
> job.
> I found the hstore.sql -- that will add type, functions and stuff to my db.
>
> I will give it a serious try!
>
>
> Rob
>
> [snip]

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83