Обсуждение: Polymorphic delete help needed
			
				I am doing a project using Ruby On Rails with PostgreSQL as the database.  I have not seen the term polymorphic used with databases except with Rails so I will quickly describe it. 
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
			
		
		
	Instead of holding just an id as a foreign key, the record holds a "type" field which is a string and an id.  The string is the name of the table to which the id applies.  (That is slightly simplified).
The first problem that creates is it makes it hard to do a constraint on the name/id pair.  I thought about writing a function that would take the pair and search the appropriate table.  If it found a match, it would return true and if not, it would return false.  I have not done that because the string used to "name" the table has been modified to look like a class name.  So, a foreign key pointing to the table happy_people would have "HappyPeople" in the string (and not "happy_people").  It is not an impossible task to transform the string but I just have not attacked it yet for a couple of reasons.
One reason is that I can put this check into Rails much easier.  I don't know which would be faster to execute or if it would make any significant different.
But I have a much bigger problem.  One that I can not really visualize how to properly solve and that is how do I do deletes.
To back up a step, I have a table called relationships which has two polymorphic foreign keys in it call parent and child.  Then I have a set of tables such as people, companies, addresses, etc.  Collectively, I call these items.
The relationships are the glue that point between items like a person and a company for example.
Now, suppose I want to delete a person.  That implies that some relationships pointing to that person are no longer valid.  If I remove those, that could imply that there are other items that now have no relationships pointing to them.  How should I delete the item, extra relationships, and extra items and still make this update so that if something fails in the middle, it will get properly rolled back?
Thank you for your help,
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
> I am doing a project using Ruby On Rails with PostgreSQL as the
> database.  I have not seen the term polymorphic used with databases
> except with Rails so I will quickly describe it.
>
> Instead of holding just an id as a foreign key, the record holds a
> "type" field which is a string and an id.  The string is the name of
> the table to which the id applies.  (That is slightly simplified).
    Here is how I implemented something very similar (in PHP) :
    - "Node" class and several derived classes.
    - "nodes" table which contains the fields for the base class with node_id
as a PK and a field which indicates the class
    - "nodes_***" tables which contain the extra fields for the derived class
"***", having node_id as a primary key.
    As you see this is very similar to what you got.
    All the "nodes_***" tables have :
    node_id PRIMARY KEY REPERENCES nodes( id ) ON DELETE CASCADE
    So when I delete a Node, the derived class records are automatically
deleted in the auxiliary tables.
    Since there can be only one distinct node_id per node, you can put ON
DELETE CASCADE safely.
    Now, for your tree-thing, the fact that references are polymorphic isn't
important since they all refer to the same main "nodes" table.
    However as soon as you say "when a node no longer has any relations
pointing to it", then you get to choose between various flavors of garbage
collection and reference counting...
    Personally I like to move the deleted or orphaned rows to a "trash"
folder so they can be recovered if the user did delete the wrong node for
instance. Once in a while i "empty trash".
> The first problem that creates is it makes it hard to do a constraint
> on the name/id pair.  I thought about writing a function that would
    Is this ClassName / id only found in the "main" table or is every FK
implemented as a ClassName / id pair ?
			
		On Jul 6, 2007, at 2:31 AM, PFC wrote: > >> I am doing a project using Ruby On Rails with PostgreSQL as the >> database. I have not seen the term polymorphic used with databases >> except with Rails so I will quickly describe it. >> >> Instead of holding just an id as a foreign key, the record holds a >> "type" field which is a string and an id. The string is the name of >> the table to which the id applies. (That is slightly simplified). > > Here is how I implemented something very similar (in PHP) : > > - "Node" class and several derived classes. > - "nodes" table which contains the fields for the base class with > node_id as a PK and a field which indicates the class > - "nodes_***" tables which contain the extra fields for the > derived class "***", having node_id as a primary key. > > As you see this is very similar to what you got. > All the "nodes_***" tables have : > node_id PRIMARY KEY REPERENCES nodes( id ) ON DELETE CASCADE > > So when I delete a Node, the derived class records are > automatically deleted in the auxiliary tables. > Since there can be only one distinct node_id per node, you can put > ON DELETE CASCADE safely. > > Now, for your tree-thing, the fact that references are polymorphic > isn't important since they all refer to the same main "nodes" table. > > However as soon as you say "when a node no longer has any > relations pointing to it", then you get to choose between various > flavors of garbage collection and reference counting... > > Personally I like to move the deleted or orphaned rows to a > "trash" folder so they can be recovered if the user did delete the > wrong node for instance. Once in a while i "empty trash". Your method is not exactly what I am doing because I do not have the "nodes" table. I have only the "nodes_***" tables. But, I believe your approach has many advantages. Rails has an inheritance ability but they do it differently. They simply have a "nodes" table with all the "nodes_***" tables smashed in to it. I did not like that approach at all. But doing what you are doing, I believe I can very nicely fit into Rails and (obviously) PostgreSQL. Plus, your suggestion of moving entries to a "trash" bin seems very wise. > >> The first problem that creates is it makes it hard to do a constraint >> on the name/id pair. I thought about writing a function that would > > Is this ClassName / id only found in the "main" table or is every > FK implemented as a ClassName / id pair ? The ClassName / id pair is found only in the relationships table. There are two instances of it however: parent and child.
On Thu, Jul 05, 2007 at 09:56:12PM -0500, Perry Smith wrote: > I am doing a project using Ruby On Rails with PostgreSQL as the > database. I have not seen the term polymorphic used with databases > except with Rails so I will quickly describe it. You have now :) http://archives.postgresql.org/sfpug/2005-04/msg00022.php > Instead of holding just an id as a foreign key, the record holds a > "type" field which is a string and an id. The string is the name of > the table to which the id applies. (That is slightly simplified). This is brittle by nature. The above link sketches out a way to make it stable. If you have questions, ask :) > The first problem of many ;) [other stuff snipped] > that creates is it makes it hard to do a constraint > on the name/id pair. Let PostgreSQL work *for* you instead of picking a fight with it and then piling on heaps of unnecessary code. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
O>>     Here is how I implemented something very similar (in PHP) :
>>
>>     - "Node" class and several derived classes.
>>     - "nodes" table which contains the fields for the base class with
>> node_id as a PK and a field which indicates the class
>>     - "nodes_***" tables which contain the extra fields for the derived
>> class "***", having node_id as a primary key.
>>
>>     As you see this is very similar to what you got.
>>     All the "nodes_***" tables have :
>>     node_id PRIMARY KEY REPERENCES nodes( id ) ON DELETE CASCADE
> Your method is not exactly what I am doing because I do not have the
> "nodes" table.  I have only the "nodes_***" tables.  But, I believe your
> approach has many advantages.
    OK, I see.
    The advantage of my approach is that :
    - all common fields that are shared by all derived classes are stored in
the "base" nodes table
    - the PK on this table uniquely identifies any instance, whatever its
object class
    - the tables holding the derived classes's extra members (ie columns)
merely refer to it via a FK which is clean
    When you want to get a Node instance by its id, the ORM grabs the row in
the main table, which contains the type information, then instantiates the
right object class, grabs the extra row in the extra tables (which shares
the same id), and gives this data to the instantiated object which then
uses it to populate its member variables.
    Therefore, there can be no duplicates, and you only need to look into one
table to know if an object with a specific id exists or not, or to delete
it.
    Say Nodes can be, for instance, Photos, or Albums, or Articles ; say an
Album contains Photos and Articles and you want to display its contents :
    - select from relations r JOIN nodes n ON n.id=r.child_id WHERE
r.parent_id = (id of album)
    => gets the ids of children along with extra info stored in the relations
table (like sort order, etc)
    - scan the result set and group it by object class
    - for each object class :
        - grab the rows into the extra tables for the ids of objects of this
class using SELECT WHERE id IN (...)
        - instantiate and populate
    So if you have an Album with 5 Articles and 10 Photos, you do 3 queries -
    - get 15 rows from Nodes JOIN relations
    - get 5 rows from nodes_articles
    - get 10 rows from nodes_photos
    It's pretty efficient. I believe you can tweak Rails' ORM into doing that.
    FYI it's a PHP ORM that I wrote over the course of various projects.
Unfortunately PHP is very bad at generic/meta programming, so the
implementation is quite convoluted, and I did not opensource it because
the documentation would be quite an effort to write ;)
> Rails has an inheritance ability but they do it differently.  They
> simply have a "nodes" table with all the "nodes_***" tables smashed in
> to it.  I did not like that approach at all.
    Hm, another reason why I don't really like the Rails approach... do they
at least use unique IDs ? Please say yes ;)
> But doing what you are doing, I believe I can very nicely fit into Rails
> and (obviously) PostgreSQL.  Plus, your suggestion of moving entries to
> a "trash" bin seems very wise.
>>> The first problem that creates is it makes it hard to do a constraint
>>> on the name/id pair.  I thought about writing a function that would
>>
>>     Is this ClassName / id only found in the "main" table or is every FK
>> implemented as a ClassName / id pair ?
>
> The ClassName / id pair is found only in the relationships table.  There
> are two instances of it however: parent and child.
    I believe duplicating the ClassName and storing it everywhere to be bad.
With my approach you only need to use the id in reference since two
instances of the same base class cannot have the same id regardless of
their class, and you only need the id to instantiate a row, whatever its
class.
    If you want to specify that a class instance can only have children of
specific classes (or their subclasses), for example a FriendList can only
contain instances of People, or whatever, you can use a constraint trigger
which will check the class of the referenced row against a table
describing the allowed parent/child combinations.
			
		On Jul 6, 2007, at 8:01 AM, David Fetter wrote:
On Thu, Jul 05, 2007 at 09:56:12PM -0500, Perry Smith wrote:I am doing a project using Ruby On Rails with PostgreSQL as thedatabase. I have not seen the term polymorphic used with databasesexcept with Rails so I will quickly describe it.You have now :)Instead of holding just an id as a foreign key, the record holds a"type" field which is a string and an id. The string is the name ofthe table to which the id applies. (That is slightly simplified).This is brittle by nature. The above link sketches out a way to makeit stable. If you have questions, ask :)The first problemof many ;)[other stuff snipped]that creates is it makes it hard to do a constrainton the name/id pair.Let PostgreSQL work *for* you instead of picking a fight with it andthen piling on heaps of unnecessary code.
I really want to follow this particular edict.  Rails makes it easy to put the checking up in Rails but, I assume that if the interface between PostgreSQL and one of its languages like "SQL"  or Python, etc is an inch thick, then the interface between Rails and PostgreSQL would be 12 inches thick.
You and PFC I think are on a similar track.  I need to figure out how to get Rails to do it this way.  I was thinking just teach Rails about the flex_key view and then add rules for insert, delete, set.  That is where I start to feel intimidated.  It doesn't look hard but I've just never done it.
The other place where my knowledge is lacking is when I get a 'thing' from the database into Rails, making it an instance of a particular class.  That can't be too hard though.
Let me work on this over the weekend.  This is not a "for work" project.  I really appreciate the help.
Perry
Ease Software, Inc. ( http://www.easesoftware.com )
On Jul 6, 2007, at 8:23 AM, PFC wrote: > The advantage of my approach is that : [ snip -- all very good stuff ] I think I can make something like this into Rails fairly easy. > Hm, another reason why I don't really like the Rails approach... > do they at least use unique IDs ? Please say yes ;) Yes. Rails makes it really hard NOT to have id's -- but you can if you need t hook into existing databases. But, two things: 1) Please don't infer what Rails can and can not do from me. I'm new to Rails. 2) The single inheritance is one approach. The polymophic way (that I described first) is an unrelated approach. . >> But doing what you are doing, I believe I can very nicely fit into >> Rails and (obviously) PostgreSQL. Plus, your suggestion of moving >> entries to a "trash" bin seems very wise. > >>>> The first problem that creates is it makes it hard to do a >>>> constraint >>>> on the name/id pair. I thought about writing a function that would >>> >>> Is this ClassName / id only found in the "main" table or is >>> every FK implemented as a ClassName / id pair ? >> >> The ClassName / id pair is found only in the relationships table. >> There are two instances of it however: parent and child. > > I believe duplicating the ClassName and storing it everywhere to > be bad. With my approach you only need to use the id in reference > since two instances of the same base class cannot have the same id > regardless of their class, and you only need the id to instantiate > a row, whatever its class. This is really big to me. I did not mention it but I have a constraint currently implemented in Rails that two things that have a Relationship, must have a relationship that comes from another table. So A can relate to B only if the link_types table has an entry from A to B. So, I have duplicate information sprinkled all over. That is partly my fault but it is also, as you point out, partly the way that Rails puts the type in the Relationship rather than down in a common table. The common table approach makes so much sense. I need to go back and re-read how Rails does single inheritance. Maybe I got that all confused. Because, really, that is what all this is flowing back to create. > > If you want to specify that a class instance can only have > children of specific classes (or their subclasses), for example a > FriendList can only contain instances of People, or whatever, you > can use a constraint trigger which will check the class of the > referenced row against a table describing the allowed parent/child > combinations. I kinda have this but it is implemented up in rails, not down in the db where it would be more efficient. I replied to David as well. You both have similar ideas. I will work on this over the weekend and I hope I can give an update when I'm done. Thank you very much. You have helped me a lot. Perry Smith ( pedz@easesoftware.com ) Ease Software, Inc. ( http://www.easesoftware.com ) Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
On Fri, Jul 06, 2007 at 08:39:50AM -0500, Perry Smith wrote: > > On Jul 6, 2007, at 8:01 AM, David Fetter wrote: > > >On Thu, Jul 05, 2007 at 09:56:12PM -0500, Perry Smith wrote: > >>I am doing a project using Ruby On Rails with PostgreSQL as the > >>database. I have not seen the term polymorphic used with > >>databases except with Rails so I will quickly describe it. > > > >You have now :) > > > >http://archives.postgresql.org/sfpug/2005-04/msg00022.php > > > >>Instead of holding just an id as a foreign key, the record holds a > >>"type" field which is a string and an id. The string is the name > >>of the table to which the id applies. (That is slightly > >>simplified). > > > >This is brittle by nature. The above link sketches out a way to > >make it stable. If you have questions, ask :) > > > >>The first problem > > > >of many ;) > > > >[other stuff snipped] > > > >>that creates is it makes it hard to do a constraint on the name/id > >>pair. > > > >Let PostgreSQL work *for* you instead of picking a fight with it > >and then piling on heaps of unnecessary code. > > I really want to follow this particular edict. Rails makes it easy > to put the checking up in Rails but, I assume that if the interface > between PostgreSQL and one of its languages like "SQL" or Python, > etc is an inch thick, then the interface between Rails and > PostgreSQL would be 12 inches thick. At some point, you're going to realize that Rails is the problem, not the solution. It's written by people who do not understand what a shared data store is and reflects problems inherent in its native database platform: MySQL 3.23. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
On Jul 6, 2007, at 10:36 AM, David Fetter wrote:
At some point, you're going to realize that Rails is the problem, not
the solution. It's written by people who do not understand what a
shared data store is and reflects problems inherent in its native
database platform: MySQL 3.23.
Thats depressing...  I admit that I've questioned why MySQL is used as the default.
From what I can see, Rails is mostly written by people trying to get web applications up and running.  So, admittedly, they are no db experts or javascript experts or whatever else experts but they do know how to get a base set of tools so that a pretty good web application can be implemented very quickly.  By pretty good, I mean better than I could do with other tools I've tried to use.
And, the nice thing is, while they greatly influence which way to go, they allow (or Ruby allows) you to do most anything and without tremendous pain.
e.g.  I think I can plumb this into Rails without chain sawing Rails to pieces.
But, I would like to hear more specifically your thoughts on Rails.  May not be appropriate for this list.  I think adding (or removing) things from Rails to better work with "real" databases is very possible.  If nothing else, as extensions to Rails.
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
On 7/6/07, David Fetter <david@fetter.org> wrote: > At some point, you're going to realize that Rails is the problem, not > the solution. It's written by people who do not understand what a > shared data store is and reflects problems inherent in its native > database platform: MySQL 3.23. This is rampant FUD; generalizations such as these help nobody. There are plenty of developers, myself included, who use Rails in a relational way, with relational integrity, transactions and normalized schemas, none of which are particularly hampered by Rails' MySQLisms. On the other hand, it is true that ActiveRecord lacks some tools out of the box -- for example, model-level foreign key constraints are accessible through a plugin you need to install separately. It is also true that some of ActiveRecord's mechanisms are less than clean, polymorphic associations in particular. Keep in mind that ActiveRecord is an object-relational mapper. There is the well-known impedance mismatch between object-oriented programming and the relational model, one that is not trivially overcome (the view-based example cited earlier in this thread is a paticularly egregious example of a non-trivial solution). Polymorphic associations is a practical solution that compromises relational purity for performance and ease of implementation, and it could be argued that it's a "good enough" solution for many developers, who are ultimately interested in getting things done, even if they "do not understand what a shared data store is". ActiveRecord's polymorphism can be explained on a paper napkin; I haven't encountered a purely relational solution in this thread that fits this description. Alexander.