Обсуждение: Beginner's questions about creating a custom data type in PostgreSQL...

Поиск
Список
Период
Сортировка

Beginner's questions about creating a custom data type in PostgreSQL...

От
"Redefined Horizons"
Дата:
I've got a few questions about creating custom data types for PostgreSQL.

I'm not sure how to phrase the questions without an example, so here goes:

I'd like to create a custom data type representing a Medieval Knight.
(This isn't what I really want to do, it's just an example.)

One of the "properties" of this Medieval Knight data type that you can
access through custom functions is "Combat Skill", which represents a
Knights fighting ability.

The Knight's fighting ability depends on two things. His horse, and
his sword. The custom functions that work with the Medieval Knight
data type know how to manipulate the properties of Horses and Swords
as well.

My Medieval Knight datatype "contains" a horse and sword. In the C
programming language implementation of my datatype I would like to
represent Swords and Horses with seperate structs, rather than
throwing everything together into a single Medieval Knight struct.

Is it possible to define the Sword and Horse structs in the same DLL
that I define my Knight struct in, but not have Swords and Horses
available as custom data types themselves?

I want to have 3 separate structs for programming simplicity, but I
want to embody them in a single custom data type and set of custom
functions for PostgreSQL.

Is this possible?

Thanks,

Scott Huey

Re: Beginner's questions about creating a custom data type in PostgreSQL...

От
"Merlin Moncure"
Дата:
On 7/24/06, Redefined Horizons <redefined.horizons@gmail.com> wrote:
> I've got a few questions about creating custom data types for PostgreSQL.

ok, i have a weakness for medieval stuff, so:

> I'm not sure how to phrase the questions without an example, so here goes:
>
> I'd like to create a custom data type representing a Medieval Knight.
> (This isn't what I really want to do, it's just an example.)

> One of the "properties" of this Medieval Knight data type that you can
> access through custom functions is "Combat Skill", which represents a
> Knights fighting ability.
>
> The Knight's fighting ability depends on two things. His horse, and
> his sword. The custom functions that work with the Medieval Knight
> data type know how to manipulate the properties of Horses and Swords
> as well.
>
> My Medieval Knight datatype "contains" a horse and sword. In the C
> programming language implementation of my datatype I would like to
> represent Swords and Horses with seperate structs, rather than
> throwing everything together into a single Medieval Knight struct.
>
> Is it possible to define the Sword and Horse structs in the same DLL
> that I define my Knight struct in, but not have Swords and Horses
> available as custom data types themselves?
>
> I want to have 3 separate structs for programming simplicity, but I
> want to embody them in a single custom data type and set of custom
> functions for PostgreSQL.


This is not necessarily a custom data type question, this is a data
modeling question.  Your phrasing suggests that you have a programming
background...in the database world it's a bit different.  The closest
thing to the custom type as you described it is the composite type.
Composite types are groupings of other types.  When you create a
table, by the way, a composite type is automatically created for you.
you can also create your own using
http://www.postgresql.org/docs/8.1/static/sql-createtype.html (look at
the first form of create type).

so,
create table knight(combat_skill numeric, weapon text, horse text);

gives you a 'knight' type, now you can:

select (1.0, 'sword', 'trigger')::knight;

which composes the knight type out of the fields using a row
constructor.  This isn't really necessary though, in most cases it's
safe just to use tables without consideration of its type at all, a
table is just a set of a type defined in its create table statement.
Let's say you wanted to develop the horse further...

create table horse(name text primary key, age int, speed numeric);

let's link the knight to his horse:

alter table knight add foreign key(horse) references horse;

this is not the only way to do this, but this is more of a relational
approach than what you suggested in your email.  the key to
understanding sql and data modeling is reducing data to it's
functional dependencies so changes/additions to your data are as local
as possible.  This is a little bit diffferent than you might think of
things from a C perspective.

merlin

Re: Beginner's questions about creating a custom data type in PostgreSQL...

От
"Redefined Horizons"
Дата:
Merlin,

Thank you very much for taking the time to write that detailed
response to my question. I appreciate it very much.

You have helped me understand my design issue better, but I have
another question if you have the patience to answer! :]

I am actually working on a spatial extension for PostgreSQL that is
similar to PostGIS, but compatible with a geometry library I am
developing.

I had first thought of using a "table" only desgin, with no custom
data types, to store the geometry information in the database.

For example, the "many_lines" table might store records representing
multi-segmented lines. (One geometry made up of smaller connected line
segments.) A separate table called "single_lines" would contain the
actual lines. The link between the line segment geometries and the
"many lines" geometries would be maintained by a one-to-many
relationship.

However, I began to realize that as my geometries became more complex
I would begin to have more and more interdependent relationships. this
in itself isn't a problem, until you consider that in a typical GIS
system I might be dealing with thousands or tens of thousands of
geometries. In the example above, if I wanted to find all of the line
segment geometries that belonged to a"many line" geometry I would have
to run a query on the "single_lines" table.

I am concerned about the performance of a system that would frequently
rely on a cascade of these types of searches.

If I instead implement a custom "many lines" geometry data type I can
now directly access the line segment geometries.

Do you think this reasoning is sound? Are custom data types the right
solution? Or am I being overly concerned about the performance issues?

If custom data types aren't a good idea in this situation, when are
they? I can think of almost no situation when I can't mirror the
functionality of a custom data type with a series of related database
tables.

Thanks Again,

Scott Huey



On 7/24/06, Merlin Moncure <mmoncure@gmail.com> wrote:
> On 7/24/06, Redefined Horizons <redefined.horizons@gmail.com> wrote:
> > I've got a few questions about creating custom data types for PostgreSQL.
>
> ok, i have a weakness for medieval stuff, so:
>
> > I'm not sure how to phrase the questions without an example, so here goes:
> >
> > I'd like to create a custom data type representing a Medieval Knight.
> > (This isn't what I really want to do, it's just an example.)
>
> > One of the "properties" of this Medieval Knight data type that you can
> > access through custom functions is "Combat Skill", which represents a
> > Knights fighting ability.
> >
> > The Knight's fighting ability depends on two things. His horse, and
> > his sword. The custom functions that work with the Medieval Knight
> > data type know how to manipulate the properties of Horses and Swords
> > as well.
> >
> > My Medieval Knight datatype "contains" a horse and sword. In the C
> > programming language implementation of my datatype I would like to
> > represent Swords and Horses with seperate structs, rather than
> > throwing everything together into a single Medieval Knight struct.
> >
> > Is it possible to define the Sword and Horse structs in the same DLL
> > that I define my Knight struct in, but not have Swords and Horses
> > available as custom data types themselves?
> >
> > I want to have 3 separate structs for programming simplicity, but I
> > want to embody them in a single custom data type and set of custom
> > functions for PostgreSQL.
>
>
> This is not necessarily a custom data type question, this is a data
> modeling question.  Your phrasing suggests that you have a programming
> background...in the database world it's a bit different.  The closest
> thing to the custom type as you described it is the composite type.
> Composite types are groupings of other types.  When you create a
> table, by the way, a composite type is automatically created for you.
> you can also create your own using
> http://www.postgresql.org/docs/8.1/static/sql-createtype.html (look at
> the first form of create type).
>
> so,
> create table knight(combat_skill numeric, weapon text, horse text);
>
> gives you a 'knight' type, now you can:
>
> select (1.0, 'sword', 'trigger')::knight;
>
> which composes the knight type out of the fields using a row
> constructor.  This isn't really necessary though, in most cases it's
> safe just to use tables without consideration of its type at all, a
> table is just a set of a type defined in its create table statement.
> Let's say you wanted to develop the horse further...
>
> create table horse(name text primary key, age int, speed numeric);
>
> let's link the knight to his horse:
>
> alter table knight add foreign key(horse) references horse;
>
> this is not the only way to do this, but this is more of a relational
> approach than what you suggested in your email.  the key to
> understanding sql and data modeling is reducing data to it's
> functional dependencies so changes/additions to your data are as local
> as possible.  This is a little bit diffferent than you might think of
> things from a C perspective.
>
> merlin
>

Re: Beginner's questions about creating a custom data type in PostgreSQL...

От
"Merlin Moncure"
Дата:
On 7/25/06, Redefined Horizons <redefined.horizons@gmail.com> wrote:
> Merlin,
> I am actually working on a spatial extension for PostgreSQL that is
> similar to PostGIS, but compatible with a geometry library I am
> developing.
>
> I had first thought of using a "table" only desgin, with no custom
> data types, to store the geometry information in the database.
>
> For example, the "many_lines" table might store records representing
> multi-segmented lines. (One geometry made up of smaller connected line
> segments.) A separate table called "single_lines" would contain the
> actual lines. The link between the line segment geometries and the
> "many lines" geometries would be maintained by a one-to-many
> relationship.

ah, that changes things completely.

If you are going the custom route, I would try this approach first and
discard it only when you have determined to to be not fast enough.
the major problems here are going to be inefficient data size and
problems with spatial indexing.  postgis makes use of the gist index
and if you want fast spatial indexing you will need to generate index
ops for any custom type regardless of it's implementation.

this means you have to have a pretty good understanding of how gist
works and how the postgresql type system works.  to be perfectly
honest, if you pull this off and make it work reasonably efficient,
you should be giving me advice, not vice versa :)

have you considered implementing postgis and cutting some c code to do
the type translation on the backend?

> However, I began to realize that as my geometries became more complex
> I would begin to have more and more interdependent relationships. this
> in itself isn't a problem, until you consider that in a typical GIS
> system I might be dealing with thousands or tens of thousands of
> geometries. In the example above, if I wanted to find all of the line
> segment geometries that belonged to a"many line" geometry I would have
> to run a query on the "single_lines" table.
>
> I am concerned about the performance of a system that would frequently
> rely on a cascade of these types of searches.
>
> If I instead implement a custom "many lines" geometry data type I can
> now directly access the line segment geometries.
>
> Do you think this reasoning is sound? Are custom data types the right
> solution? Or am I being overly concerned about the performance issues?

that really depends on your requriements.  you are asking some really
broad questions.  generally though, in GIS environments performance is
extremely relevant and you need to carefully consider your
implementation approach.  the good news is postgresql is the right
place to be.

> If custom data types aren't a good idea in this situation, when are
> they? I can think of almost no situation when I can't mirror the
> functionality of a custom data type with a series of related database
> tables.

I am differentiating between custom and composite types.  composite
types are rarely worthwhile although I do use them sometimes to pass
data in and out of functions.  I suggest reading the composite types
chapter of the documentation and making your own determination.  they
are analgous to anonymous 'C' structs as opposed to typedef'd C
structs which are closer to tables.

custom types OTOH are worthwhile if you are filling a need that is not
served by an existing type.  if you want a good example on
implementing a custom type check out the 'cube' example in contrib and
also 'earthdistance' which shows how the cube might be implemted in
gist scenarios.  you will also get a good feel for the complexity
involved, including but not limited to custom grammars, backend
coding, and deep knowledge of the postgresql type system.  It could
also be a great learning experience.

regards,
merlin