Optimizing the implementation of an optimized tree

Поиск
Список
Период
Сортировка
От Christian Rishoej
Тема Optimizing the implementation of an optimized tree
Дата
Msg-id 1020733336.892.79.camel@penelope.dyndns.dk
обсуждение исходный текст
Ответы Re: Optimizing the implementation of an optimized tree  ("Christopher Kings-Lynne" <chriskl@familyhealth.com.au>)
Список pgsql-sql

Hi,

Recently I needed to store a hiearchy of pages on a website in Postgres
in a way that would allow for fast extraction of parts of the tree for
menu generation and "path" specification (which nodes in the tree make
up the path from the root to my position?).

This can be accomplished by letting each node in the tree have an l and
r value with values determined by traversing the edge of the tree and
assigning the value of integer that is incremented at each node visit to
l and doing the same thing for r, this time traversing the edge of the
tree backwards.

The tree then becomes:

CREATE TABLE tree (
        id      serial PRIMARY KEY,
        parent  int REFERENCES tree,
        l       int DEFAULT NULL,
        r       int DEFAULT NULL,
);

The parent id strictly not needed, but I chose to include it for
convenience.

I can easily extract a complete branch like this:

        SELECT treeNode.id
        FROM tree treeNode, tree treeParent
        WHERE treeNode.l BETWEEN treeParent.l AND treeParent.r
                AND treeParent.id = $1
        ORDER BY treeNode.l

And the "path" from the root to a particular node like this:

        SELECT treeNode.id
        FROM tree treeNode, tree currentNode
        WHERE currentNode.r BETWEEN treeNode.l AND treeNode.r
                AND currentNode.id = $1
        ORDER BY treeNode.l;

Now, in order to avoid having to maintain the values of l and r for each
node when the tree is modified I created triggers in PL/PGSQL that
handle INSERTs of new nodes into the tree, DELETEs of existing nodes and
UPDATEs of a node's position in the tree (i.e. the parant field).

The implementation is included as an attachment.

I am very interested in comments on the implementation - especially
hints on possible optimizations.

It seems to me that the PL/PGSQL-definition of each function is parsed
at each call. Is this true? Is it possible to avoid this?

It is my understanding that Postgres does not yet support triggers that
fire on the update of a paricular column. Therefore I have a check in
the UPDATE trigger that checks if the parent-field was modified. Is
there any was to do this in a nicer manner?

Regards,
Christian

CREATE TABLE tree (
    id     serial PRIMARY KEY,
    parent     int REFERENCES tree,
    l     int DEFAULT NULL,
    r     int DEFAULT NULL,
);

-- Are these a good idea?
CREATE INDEX tree_l_idx ON tree (l);
CREATE INDEX tree_r_idx ON tree (r);

CREATE OR REPLACE FUNCTION handle_tree_insert() RETURNS opaque AS '
    DECLARE
        parentR INTEGER;
    BEGIN
        IF NEW.l ISNULL THEN
            parentR := ( SELECT r FROM tree WHERE id = NEW.parent );
            UPDATE tree SET l=l+2 WHERE l >= parentR;
            UPDATE tree SET r=r+2 WHERE r >= parentR;
            NEW.l := parentR;
            NEW.r := parentR + 1;
        END IF;
        RETURN NEW;
    END;
' LANGUAGE PLPGSQL;

CREATE TRIGGER on_tree_insert BEFORE INSERT
    ON tree
    FOR EACH ROW
    EXECUTE PROCEDURE handle_tree_insert();


CREATE OR REPLACE FUNCTION handle_tree_delete() RETURNS opaque AS '
    DECLARE
        decr INTEGER;
    BEGIN
        -- RAISE NOTICE ''Gets here, ID %'', OLD.id;
        IF NOT OLD.l ISNULL THEN
            decr := (((OLD.r - OLD.l - 1) / 2 ) + 1) * 2;
            -- RAISE NOTICE ''...and executes stuff (decrementing with %)'', decr;

            UPDATE tree SET parent = NULL, l = NULL, r = NULL WHERE l > OLD.l AND r < OLD.r;
            DELETE FROM tree WHERE parent ISNULL AND l ISNULL AND r ISNULL;

            UPDATE tree SET l = l - decr WHERE l > OLD.l;
            UPDATE tree SET r = r - decr WHERE r > OLD.r;
        END IF;
        RETURN OLD;
    END;
' LANGUAGE PLPGSQL;

CREATE TRIGGER on_tree_delete AFTER DELETE
    ON tree
    FOR EACH ROW
    EXECUTE PROCEDURE handle_tree_delete();

CREATE OR REPLACE FUNCTION handle_tree_update() RETURNS opaque AS '
    DECLARE
        parentR INTEGER;
        span INTEGER;
        dist INTEGER;
    BEGIN
        IF (NOT OLD.parent ISNULL) AND (NOT NEW.parent ISNULL) AND (OLD.parent <> NEW.parent) THEN
            span := OLD.r - OLD.l + 1;

            -- Get the R value of the new parent
            parentR := ( SELECT r FROM tree WHERE id = NEW.parent );

            -- Determine the distance
            dist = parentR - OLD.l;

            -- Make the gab
            UPDATE tree SET l = l + span WHERE l >= parentR;
            UPDATE tree SET r = r + span WHERE r >= parentR;

            -- Move our branch to the newly created gab
            -- Decrement nodes after the old position to close the old gab
            -- If dist is < 0 then our own branch was moved <span> steps - this must be taken into consideration
            IF (dist > 0)
                THEN UPDATE tree SET l = l + (dist), r = r + (dist) WHERE l >= OLD.l AND r <= OLD.r;
                UPDATE tree SET l = l - span WHERE l >= OLD.l;
                UPDATE tree SET r = r - span WHERE r >= OLD.r;
            ELSE
                UPDATE tree SET l = l + (dist - span), r = r + (dist - span) WHERE l >= (OLD.l + span) AND r <= (OLD.r
+span); 
                UPDATE tree SET l = l - span WHERE l >= (OLD.l + span);
                UPDATE tree SET r = r - span WHERE r >= (OLD.r + span);
            END IF;

        END IF;
        RETURN NULL;
    END;
' LANGUAGE PLPGSQL;

CREATE TRIGGER on_tree_update AFTER UPDATE
    ON tree
    FOR EACH ROW
    EXECUTE PROCEDURE handle_tree_update();



В списке pgsql-sql по дате отправления:

Предыдущее
От: Gary Stainburn
Дата:
Сообщение: Re: summary VIEW
Следующее
От: "Christopher Kings-Lynne"
Дата:
Сообщение: Re: Optimizing the implementation of an optimized tree