Discussion: libpq

libpq

From: Chris
Date:
I've been poking around in the libpq area and I'm thinking of tackling
the streaming interface which was suggested recently.

What I have in mind is that a new API PQexecStream() doesn't retrieve
the results. The tuples are then read back one by one with
PQnextObject(). You can also use PQnextObject with regular PQexec, but
in that case you lose most of the benefit of streaming because it
would allocate memory for the whole result. So the proposal is...


/* like PQexec, but streams the results */
PGresult *PQexecStream(PGconn *conn, const char *query)
/* retrieve the next object from the stream */
PGobject *PQnextObject(PGconn *conn)
/* get value from an object/tuple */
char *PQgetObjectValue(const PGobject *res, int field_num)
/* free tuple when done */
void PQclearObject(PGobject *obj)
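
To make the calling sequence concrete, here is roughly how I imagine it
being used. (The NULL-return-for-end-of-data convention here is just one
possibility, not something I've settled on.)

/* hypothetical usage of the proposed streaming calls */
PGresult *res = PQexecStream(conn, "SELECT * FROM big_table");
PGobject *obj;

while ((obj = PQnextObject(conn)) != NULL)    /* assume NULL = no more tuples */
{
    printf("%s\n", PQgetObjectValue(obj, 0)); /* first field of this tuple */
    PQclearObject(obj);                       /* free each tuple as we go */
}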

Oh yeah, can I fix the COPY protocol while I'm at it to conform more to
the other types of messages?

BTW, what is this PQ thing? Does it stand for postquel? Are we ever
going to dump that?

-- 
Chris Bitmead
mailto:chris@bitmead.com


Re: [HACKERS] libpq

From: Tom Lane
Date:
Chris <chris@bitmead.com> writes:
> What I have in mind is that a new API PQexecStream() doesn't retrieve
> the results. The tuples are then read back one by one with
> PQnextObject().

OK, but how does this interact with asynchronous retrieval?  It
should be possible to run it in a nonblocking (select-waiting) mode.

> /* like PQexec, but streams the results */
> PGresult *PQexecStream(PGconn *conn, const char *query)
> /* retrieve the next object from the stream */
> PGobject *PQnextObject(PGconn *conn)
> /* get value from an object/tuple */
> char *PQgetObjectValue(const PGobject *res, int field_num)
> /* free tuple when done */
> void PQclearObject(PGobject *obj)

There are two other big gaps here: you haven't specified how you
represent (a) errors and (b) end of query result.  I assume you
intend the initial PQexecStream call to wait for the first tuple to come
back, so *most* sorts of errors will be reported at that point, but
you have to be able to cope with errors reported later on too.

Rather than inventing a new PGobject struct type, I'd suggest returning
the partial results as PGresults.  This has a couple of benefits:

* easy representation of an error encountered midway (you just return
  an error PGresult).
* it's no big trick to "batch" retrieval, ie, return 10 or 100 tuples
  at a time, if that happens to prove useful.
* each tuple batch could carry its own tuple description, which is
  something you will need if you want to go anywhere with that
  polymorphic-results idea.
* end-of-query could be represented as a PGresult with zero tuples.
  (This would leave a null-pointer result open for use in the nonblock
  case, to indicate "haven't got a response yet".)
* no need for an entire new set of API functions to query PGobjects.
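
For instance, the fetch loop might look about like this.  PQnextResult
is just a strawman name for the get-next-batch call; the other calls
are existing libpq API:

/* hypothetical consumer loop for the batched-PGresult approach */
PGresult *res = PQexecStream(conn, "SELECT * FROM big_table");

while (res != NULL)
{
    if (PQresultStatus(res) == PGRES_FATAL_ERROR)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));   /* error mid-query */
        PQclear(res);
        break;
    }
    if (PQntuples(res) == 0)        /* zero tuples = successful end of query */
    {
        PQclear(res);
        break;
    }
    for (int i = 0; i < PQntuples(res); i++)
        printf("%s\n", PQgetvalue(res, i, 0));
    PQclear(res);                   /* free this batch... */
    res = PQnextResult(conn);       /* ...and fetch the next (strawman call) */
}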
 

BTW, an earlier proposal for this same sort of thing didn't see it
as an entirely new operating mode, but just a "limit" option added
to a variant of PQexec: the limit says "return no more than N tuples
per PQresult".
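
That is, something on the order of (the name is purely illustrative):

/* hypothetical limit-variant: at most 100 tuples per returned PGresult */
PGresult *res = PQexecLimited(conn, "SELECT * FROM big_table", 100);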

> Oh yeah, can I fix the COPY protocol while I'm at it to conform more to
> the other types of messages?

I looked at that before, and while COPY is certainly ugly as sin, it's
not clear that it's worth creating cross-version compatibility problems
to fix it.  I'm inclined to leave it alone until such time as we
undertake a really massive protocol change (moving to CORBA, say).

> BTW, what is this PQ thing? Does it stand for postquel? Are we ever
> going to dump that?

Yes, and no.  We aren't going to break existing app code by indulging
in cosmetic renaming of API names.  Moreover we have to have *some*
prefix to minimize the probability of global-symbol conflicts with apps
and other libraries, so that one's as good as any.

To the extent that there is any system in the names in libpq (which I
admit ain't much), it's:

PQfoo  --- exported public-API routine
pqfoo  --- internal routine not meant for apps to call, but must be a
           global symbol because it is called cross-module
PGfoo  --- type name, enum const, etc

I'd suggest sticking to those conventions in any new code you write.
        regards, tom lane


Re: [HACKERS] libpq

From: Chris Bitmead
Date:
Tom Lane wrote:
> 
> Chris <chris@bitmead.com> writes:
> > What I have in mind is that a new API PQexecStream() doesn't retrieve
> > the results. The tuples are then read back one by one with
> > PQnextObject().
> 
> OK, but how does this interact with asynchronous retrieval?  It
> should be possible to run it in a nonblocking (select-waiting) mode.

I didn't know that was a requirement. Well, when doing this sort of
stuff you never know what other sources of data the app may want
to wait for, so the only way is to have PQfileDescriptor or something,
but I don't think that affects these decisions does it? If they want
async, they are given the fd and select. When ready they call
nexttuple.
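
Something like this, I mean.  (PQfileDescriptor is my strawman name;
I gather PQsocket already does this job.)

/* sketch: app-driven select() loop around libpq's socket */
int fd = PQsocket(conn);
fd_set rfds;

for (;;)
{
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);           /* plus whatever other fds the app watches */
    if (select(fd + 1, &rfds, NULL, NULL, NULL) < 0)
        break;                   /* select() failure */
    if (FD_ISSET(fd, &rfds))
    {
        /* data has arrived: read back tuples with the next-tuple call */
    }
}
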
> BTW, an earlier proposal for this same sort of thing didn't see it
> as an entirely new operating mode, but just a "limit" option added
> to a variant of PQexec: the limit says "return no more than N tuples
> per PQresult".

As in changing the interface to PQexec?

I can't see the benefit of specifically asking for N tuples. Presumably
behind the scenes it will read from the socket in a respectably
large chunk (8k for example). Beyond that I can't see any more reason 
for customisation.

> I looked at that before, and while COPY is certainly ugly as sin, it's
> not clear that it's worth creating cross-version compatibility problems
> to fix it.  I'm inclined to leave it alone until such time as we
> undertake a really massive protocol change (moving to CORBA, say).

I'll look at that situation further later. Is there a policy on
protocol compatibility? If so, one way or both ways?

The other comments you made, I have to think about further.


Re: [HACKERS] libpq

From: Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Tom Lane wrote:
>> OK, but how does this interact with asynchronous retrieval?  It
>> should be possible to run it in a nonblocking (select-waiting) mode.

> I didn't know that was a requirement.

Well, there may not be anyone holding a gun to your head about it...
but there have been a number of people sweating to make the existing
facilities of libpq usable in a non-blocking fashion.  Seems to me
that that sort of app would be particularly likely to want to make
use of a streaming API --- so if you don't think about it, there is
going to be someone else coming along to clean up after you pretty
soon.  Better to get it right the first time.

> to wait for, so the only way is to have PQfileDescriptor or something,
> but I don't think that affects these decisions does it? If they want
> async, they are given the fd and select. When ready they call
> nexttuple.

Not really.  The app can and does wait for select() to show read ready
on libpq's input socket --- but that only indicates that there is a TCP
packet's worth of data available, *not* that a whole tuple is available.
libpq must provide the ability to consume data from the kernel (to
clear the select-read-ready condition) and then either hand back a
completed tuple (or several) or say "sorry, no complete data yet".
I'd suggest understanding the existing facilities more carefully before
you set out to improve on them.
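
For reference, the existing nonblocking pattern looks like this (these
are all real libpq entry points today):

/* after select() reports read-ready on PQsocket(conn): */
if (!PQconsumeInput(conn))              /* drain kernel data into libpq */
    fprintf(stderr, "%s", PQerrorMessage(conn));

if (PQisBusy(conn))
{
    /* a packet arrived, but no complete response yet: select() again */
}
else
{
    PGresult *res = PQgetResult(conn);  /* a complete result is available */
    /* ... */
}

A streaming API would need the analogous "is at least one whole tuple
available?" test.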

>> to a variant of PQexec: the limit says "return no more than N tuples
>> per PQresult".

> As in changing the interface to PQexec?

I did say "variant", no?  We don't get to break existing callers of
PQexec.

> I can't see the benefit of specifically asking for N tuples. Presumably
> behind the scenes it will read from the socket in a respectably
> large chunk (8k for example). Beyond that I can't see any more reason 
> for customisation.

Well, that's true from one point of view, but I think it's just libpq's
point of view.  The application programmer is fairly likely to have
specific knowledge of the size of tuple he's fetching, and maybe even
to have a global perspective that lets him decide he doesn't really
*want* to deal with retrieved tuples on a packet-by-packet basis.
Maybe waiting till he's got 100K of data is just right for his app.

But I can also believe that the app programmer doesn't want to commit to
a particular tuple size any more than libpq does.  Do you have a better
proposal for an API that doesn't commit any decisions about how many
tuples to fetch at once?

>> not clear that it's worth creating cross-version compatibility problems
>> to fix it.  I'm inclined to leave it alone until such time as we
>> undertake a really massive protocol change (moving to CORBA, say).

> I'll look at that situation further later. Is there a policy on
> protocol compatibility? If so, one way or both ways?

The general policy so far has been that backends should be able to
talk to any vintage of frontend, but frontend clients need only be
able to talk to backends of same or later version.  (The idea is to
be able to upgrade your server without breaking existing clients,
and then you can go around and update client apps at your
convenience.)

The last time we actually changed the protocol was in 6.4 (at my
instigation BTW) --- and while we didn't get a tidal wave of
"hey my new psql won't talk to an old server" complaints, we got
a pretty fair number of 'em.  So I'm very hesitant to break either
forwards or backwards compatibility in new releases.  I certainly
don't want to do it just for code beautification; we need a reason
that is compelling to the end users who will be inconvenienced.
        regards, tom lane


Re: [HACKERS] libpq

From: Chris Bitmead
Date:
Tom Lane wrote:

> Well, that's true from one point of view, but I think it's just libpq's
> point of view.  The application programmer is fairly likely to have
> specific knowledge of the size of tuple he's fetching, and maybe even
> to have a global perspective that lets him decide he doesn't really
> *want* to deal with retrieved tuples on a packet-by-packet basis.
> Maybe waiting till he's got 100K of data is just right for his app.
> 
> But I can also believe that the app programmer doesn't want to commit to
> a particular tuple size any more than libpq does.  Do you have a better
> proposal for an API that doesn't commit any decisions about how many
> tuples to fetch at once?

If you think applications may like to keep buffered 100k of data, isn't
that an argument for the PGobject interface instead of the PGresult
interface?

I'm trying to think of a situation where you want to buffer data. Let's
say psql has something like "more" inbuilt and it needs to buffer
a screenful, and go forward line by line. Now you want to keep the last
40 tuples buffered. First up you want 40 tuples, then you want one
at a time every time you press Enter.

This seems too much responsibility to push onto libpq, but if the user
has control over destruction of PGobjects they can buffer what they
want, how they want, when they want.


Re: [HACKERS] libpq

From: Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> If you think applications may like to keep buffered 100k of data, isn't
> that an argument for the PGobject interface instead of the PGresult
> interface?

How so?  I haven't actually figured out what you think PGobject will do
differently from PGresult.  Given the considerations I mentioned before,
I think PGobject *is* a PGresult; it has to have all the same
functionality, including carrying a tuple descriptor and a query
status (+ error message if needed).

> This seems too much responsibility to push onto libpq, but if the user
> has control over destruction of PGobjects they can buffer what they
> want, how they want, when they want.

The app has always had control over when to destroy PGresults, too.
I still don't see the difference...
        regards, tom lane


Re: [HACKERS] libpq

From: Chris
Date:
Tom Lane wrote:
> How so?  I haven't actually figured out what you think PGobject will do
> differently from PGresult.  Given the considerations I mentioned before,
> I think PGobject *is* a PGresult; it has to have all the same
> functionality, including carrying a tuple descriptor and a query
> status (+ error message if needed).

All I mean to say is that it is often desirable to have control over
when each individual object is destroyed, rather than having to destroy
each batch at once. 

The result status and query status are only temporarily interesting. Once
I know the tuple arrived safely I don't care much about the state of
affairs at that moment, and don't care to waste memory on a structure
that has space for all these error fields.

For example, if I want to buffer the last 20 tuples at all times I could
have...

PGobject *cache[20];

GetFirst() {
   for (int i = 0; i < 20; i++)
      cache[i] = getNextObject(...);
}

GetNext() {
   /* drop the oldest tuple, shift the window down, fetch one more */
   memmove(&cache[0], &cache[1], 19 * sizeof(PGobject *));
   cache[19] = getNextObject(...);
}

I don't see why the app programmer shouldn't have to write a loop like
GetFirst himself. Why should this be forced onto libpq when it doesn't
help performance or anything? If I understand you correctly, the
PGresult idea doesn't give this flexibility. Correct me if I'm wrong.

The other thing about the PGobject idea is that when I do a real OO
database interface, getNextObject will optionally populate user-supplied
data instead. I.e. I can optionally pass a C++ object and a list of field
offsets. So probably I would want getNextObject to take optional args of
a block of memory and a structure describing field offsets. Only if
these are null does getNextObject allocate space for you.
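
In rough terms the signature might become something like this (all
names illustrative):

/* hypothetical: fill caller-supplied memory when buf/positions are given */
PGobject *PQnextObject(PGconn *conn, void *buf,
                       const struct FieldPositions *positions);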

> > This seems too much responsibility to push onto libpq, but if the user
> > has control over destruction of PGobjects they can buffer what they
> > want, how they want, when they want.
> 
> The app has always had control over when to destroy PGresults, too.
> I still don't see the difference...
> 
>                         regards, tom lane

-- 
Chris Bitmead
mailto:chris@bitmead.com


Re: [HACKERS] libpq

From: Tom Lane
Date:
Chris <chris@bitmead.com> writes:
> All I mean to say is that it is often desirable to have control over
> when each individual object is destroyed, rather than having to destroy
> each batch at once. 

Right, so if you really want to destroy retrieved tuples one at a time,
you request only one per retrieved PGresult.  I claim that the other
case where you want them in small batches (but not necessarily only one
at a time) is at least as interesting; therefore the mechanism should
not be limited to the exactly-one-at-a-time case.  Once you allow for
the other requirements, you have something that looks enough like a
PGresult that it might as well just *be* a PGresult.

> The result status and query status are only temporarily interesting. Once
> I know the tuple arrived safely I don't care much about the state of
> affairs at that moment, and don't care to waste memory on a structure
> that has space for all these error fields.

Let's see (examines PGresult declaration).  Four bytes for the
resultStatus, four for the errMsg pointer, 40 for cmdStatus,
out of a struct that is going to occupy close to 100 bytes on
typical hardware --- and that's not counting the tuple descriptor
data and the tuple(s) proper.  You could easily reduce the cmdStatus
overhead by making it a pointer to an allocated string instead of
an in-line array, if the 40 bytes were really bothering you.  So the
above seems a pretty weak argument for introducing a whole new datatype
and a whole new set of access functions for it.  Besides which, you
haven't explained how it is that you are going to avoid the need to
be able to represent error status in a PGObject.  The function that
fetches the next tuple(s) in a query has to be able to return an
error status, and that has to be distinguishable from "successful
end of query" and from "no more data available yet".

> The other thing about the PGobject idea is that when I do a real OO
> database interface, getNextObject will optionally populate
> user-supplied data instead.

And that can't be done from a PGresult because?

So far, the *only* valid reason you've given for inventing a new
datatype, rather than just using PGresult for the purpose, is to save a
few bytes by eliminating unnecessary fields.  That seems a pretty weak
argument (even assuming that the fields are unnecessary, which I doubt).
Having to support and document a whole set of essentially-identical
access functions for both PGresult and PGObject is the overhead that
we ought to be worried about, ISTM.  Don't forget that it's not just
libpq we are talking about, either; this additional API will also have
to propagate into libpq++, libpgtcl, the perl5 and python modules,
etc etc etc.
        regards, tom lane


Re: [HACKERS] libpq

From: Chris Bitmead
Date:
100 bytes, or even 50 bytes seems like a huge price to pay. If I'm 
retrieving 10 byte tuples that's a 500% or 1000% overhead.

There are other issues too. Like if I want to be able to populate
a C++ object without the overhead of copying, I need to know
in advance the type of tuple I'm getting back. So I need something 
like a nextClass() API.

Here is what I'm imagining (in very rough terms with details glossed
over).
How would you do this with the PGresult idea?...

class Base {
   int a;
};
class Sub1 : Base {
   int b;
};
class Sub2 : Base {
   int c;
};

#define OFFSET(cls, field) (&((cls *)NULL)->field)

struct FieldPositions f1[] = { { "a", OFFSET(Sub1, a) },
                               { "b", OFFSET(Sub1, b) } };
struct FieldPositions f2[] = { { "a", OFFSET(Sub2, a) },
                               { "c", OFFSET(Sub2, c) } };

PGresult *q = PQexecStream(conn, "SELECT ** from Base");
List<Base> results;
for (;;) {
   PGClass *cls = PQnextClass(q);
   if (PQresultStatus(q) == ERROR)
      processError(q);
   else if (PQresultStatus(q) == NO_MORE)
      break;
   if (strcmp(cls->name, "Sub1") == 0)
      results.add(PQnextObject(q, new Sub1, FieldPositions(f1)));
   else if (strcmp(cls->name, "Sub2") == 0)
      results.add(PQnextObject(q, new Sub2, FieldPositions(f2)));
}

Of course in a full ODBMS front end, some of the above code would
be generated or something.

In this case PQnextObject is populating memory supplied by the
programmer. There is no overhead whatsoever, nor can there be, because
we are supplying memory for the fields we care about.

In this case we don't even need to store tuple descriptors, because
the C++ object has its vtbl, which is enough. If we cared about
tuple descriptors though, we could hang onto the PGClass and do
something like PQgetValue(class, object, "fieldname"), which
would be useful for some language interfaces no doubt.

A basic C example would look like this...

PGresult *q = PQexecStream(conn, "SELECT ** from Base");
for (;;) {
   PGClass *cls = PQnextClass(q);
   if (PQresultStatus(q) == ERROR)
      processError(q);
   else if (PQresultStatus(q) == NO_MORE)
      break;
   PGobject *obj = PQnextObject(q, NULL, NULL);
   for (int c = 0; c < PQnColumns(cls); c++)
      printf("%s: %s, ", PQcolumnName(cls, c), PQcolumnValue(cls, c, obj));
   printf("\n");
}

The points to note here are:
(1) Yes, the error message stuff comes from PGresult as it does now.
(2) You don't allocate a wasteful new PGresult every time you get
the next result.
(3) You are certainly not required to store a whole lot of PGresults
just because you want to cache tuples.
(4) Because the tuple descriptor is explicit (PGClass*) you can
keep it or not as you please. If you are doing pure relational
with fixed number of columns, there is ZERO overhead per tuple
because you only need keep one pointer to the PGClass. This is
even though you retrieve results one at a time.
(5) Because of (4) I can't see the need for any API to support
getting multiple tuples at a time since it is trivially implemented
in terms of nextObject with no overhead.

While a PGresult interface like you described could be built, I can't
see that it fulfills all the requirements that I would have. It could
be trivially built on top of the above building blocks, but it doesn't
sound fine-grained enough for me. If you disagree, tell me how you'd
do it.

Tom Lane wrote:
> 
> Chris <chris@bitmead.com> writes:
> > All I mean to say is that it is often desirable to have control over
> > when each individual object is destroyed, rather than having to destroy
> > each batch at once.
> 
> Right, so if you really want to destroy retrieved tuples one at a time,
> you request only one per retrieved PGresult.  I claim that the other
> case where you want them in small batches (but not necessarily only one
> at a time) is at least as interesting; therefore the mechanism should
> not be limited to the exactly-one-at-a-time case.  Once you allow for
> the other requirements, you have something that looks enough like a
> PGresult that it might as well just *be* a PGresult.
> 
> > > The result status and query status are only temporarily interesting. Once
> > I know the tuple arrived safely I don't care much about the state of
> > affairs at that moment, and don't care to waste memory on a structure
> > that has space for all these error fields.
> 
> Let's see (examines PGresult declaration).  Four bytes for the
> resultStatus, four for the errMsg pointer, 40 for cmdStatus,
> out of a struct that is going to occupy close to 100 bytes on
> typical hardware --- and that's not counting the tuple descriptor
> data and the tuple(s) proper.  You could easily reduce the cmdStatus
> overhead by making it a pointer to an allocated string instead of
> an in-line array, if the 40 bytes were really bothering you.  So the
> above seems a pretty weak argument for introducing a whole new datatype
> and a whole new set of access functions for it.  Besides which, you
> haven't explained how it is that you are going to avoid the need to
> be able to represent error status in a PGObject.  The function that
> fetches the next tuple(s) in a query has to be able to return an
> error status, and that has to be distinguishable from "successful
> end of query" and from "no more data available yet".
> 
> > > The other thing about the PGobject idea is that when I do a real OO
> > > database interface, getNextObject will optionally populate
> > > user-supplied data instead.
> 
> And that can't be done from a PGresult because?
> 
> So far, the *only* valid reason you've given for inventing a new
> datatype, rather than just using PGresult for the purpose, is to save a
> few bytes by eliminating unnecessary fields.  That seems a pretty weak
> argument (even assuming that the fields are unnecessary, which I doubt).
> Having to support and document a whole set of essentially-identical
> access functions for both PGresult and PGObject is the overhead that
> we ought to be worried about, ISTM.  Don't forget that it's not just
> libpq we are talking about, either; this additional API will also have
> to propagate into libpq++, libpgtcl, the perl5 and python modules,
> etc etc etc.
> 
>                         regards, tom lane


Re: [HACKERS] libpq

From: Chris Bitmead
Date:
I posted this about a week ago, and it passed without comment.
Does this mean I'm so far off track that no-one cares to comment,
or I got it so right that no comment was needed?

Quick summary: I want to work on libpq, partly to implement
my OO plans in libpq, and partly to implement the streaming
interface. But my feeling is that a lower-level interface
will give better control and better efficiency.

Also, this is a fair amount of hacking. I have heard talk of
"when we go to using CORBA" and such. I could look at doing
this at the same time, but I remain to be convinced of the benefit.
What would be the method? Something like sequence<Attribute>?
I would have thought this would be a big protocol overhead. I
also would have thought that the protocol for a database
would be sufficiently simple and static that CORBA would be
overkill. Am I wrong?


Chris Bitmead wrote:
> 
> 100 bytes, or even 50 bytes seems like a huge price to pay. If I'm
> retrieving 10 byte tuples that's a 500% or 1000% overhead.
> 
> There are other issues too. Like if I want to be able to populate
> a C++ object without the overhead of copying, I need to know
> in advance the type of tuple I'm getting back. So I need something
> like a nextClass() API.
> 
> Here is what I'm imagining (in very rough terms with details glossed
> over).
> How would you do this with the PGresult idea?...
> 
> class Base {
>    int a;
> };
> class Sub1 : Base {
>    int b;
> };
> class Sub2 : Base {
>    int c;
> };
> 
> #define OFFSET(cls, field) (&((cls *)NULL)->field)
> 
> struct FieldPositions f1[] = { { "a", OFFSET(Sub1, a) },
>                                { "b", OFFSET(Sub1, b) } };
> struct FieldPositions f2[] = { { "a", OFFSET(Sub2, a) },
>                                { "c", OFFSET(Sub2, c) } };
> 
> PGresult *q = PQexecStream(conn, "SELECT ** from Base");
> List<Base> results;
> for (;;) {
>    PGClass *cls = PQnextClass(q);
>    if (PQresultStatus(q) == ERROR)
>       processError(q);
>    else if (PQresultStatus(q) == NO_MORE)
>       break;
>    if (strcmp(cls->name, "Sub1") == 0)
>       results.add(PQnextObject(q, new Sub1, FieldPositions(f1)));
>    else if (strcmp(cls->name, "Sub2") == 0)
>       results.add(PQnextObject(q, new Sub2, FieldPositions(f2)));
> }
> 
> Of course in a full ODBMS front end, some of the above code would
> be generated or something.
> 
> In this case PQnextObject is populating memory supplied by the
> programmer. There is no overhead whatsoever, nor can there be, because
> we are supplying memory for the fields we care about.
> 
> In this case we don't even need to store tuple descriptors, because
> the C++ object has its vtbl, which is enough. If we cared about
> tuple descriptors though, we could hang onto the PGClass and do
> something like PQgetValue(class, object, "fieldname"), which
> would be useful for some language interfaces no doubt.
> 
> A basic C example would look like this...
> 
> PGresult *q = PQexecStream(conn, "SELECT ** from Base");
> for (;;) {
>    PGClass *cls = PQnextClass(q);
>    if (PQresultStatus(q) == ERROR)
>       processError(q);
>    else if (PQresultStatus(q) == NO_MORE)
>       break;
>    PGobject *obj = PQnextObject(q, NULL, NULL);
>    for (int c = 0; c < PQnColumns(cls); c++)
>       printf("%s: %s, ", PQcolumnName(cls, c), PQcolumnValue(cls, c, obj));
>    printf("\n");
> }
> 
> The points to note here are:
> (1) Yes, the error message stuff comes from PGresult as it does now.
> (2) You don't allocate a wasteful new PGresult every time you get
> the next result.
> (3) You are certainly not required to store a whole lot of PGresults
> just because you want to cache tuples.
> (4) Because the tuple descriptor is explicit (PGClass*) you can
> keep it or not as you please. If you are doing pure relational
> with fixed number of columns, there is ZERO overhead per tuple
> because you only need keep one pointer to the PGClass. This is
> even though you retrieve results one at a time.
> (5) Because of (4) I can't see the need for any API to support
> getting multiple tuples at a time since it is trivially implemented
> in terms of nextObject with no overhead.
> 
> While a PGresult interface like you described could be built, I can't
> see that it fulfills all the requirements that I would have. It could
> be trivially built on top of the above building blocks, but it doesn't
> sound fine-grained enough for me. If you disagree, tell me how you'd
> do it.
> 
> Tom Lane wrote:
> >
> > Chris <chris@bitmead.com> writes:
> > > All I mean to say is that it is often desirable to have control over
> > > when each individual object is destroyed, rather than having to destroy
> > > each batch at once.
> >
> > Right, so if you really want to destroy retrieved tuples one at a time,
> > you request only one per retrieved PGresult.  I claim that the other
> > case where you want them in small batches (but not necessarily only one
> > at a time) is at least as interesting; therefore the mechanism should
> > not be limited to the exactly-one-at-a-time case.  Once you allow for
> > the other requirements, you have something that looks enough like a
> > PGresult that it might as well just *be* a PGresult.
> >
> > > The result status and query status are only temporarily interesting. Once
> > > I know the tuple arrived safely I don't care much about the state of
> > > affairs at that moment, and don't care to waste memory on a structure
> > > that has space for all these error fields.
> >
> > Let's see (examines PGresult declaration).  Four bytes for the
> > resultStatus, four for the errMsg pointer, 40 for cmdStatus,
> > out of a struct that is going to occupy close to 100 bytes on
> > typical hardware --- and that's not counting the tuple descriptor
> > data and the tuple(s) proper.  You could easily reduce the cmdStatus
> > overhead by making it a pointer to an allocated string instead of
> > an in-line array, if the 40 bytes were really bothering you.  So the
> > above seems a pretty weak argument for introducing a whole new datatype
> > and a whole new set of access functions for it.  Besides which, you
> > haven't explained how it is that you are going to avoid the need to
> > be able to represent error status in a PGObject.  The function that
> > fetches the next tuple(s) in a query has to be able to return an
> > error status, and that has to be distinguishable from "successful
> > end of query" and from "no more data available yet".
> >
> > > The other thing about the PGobject idea is that when I do a real OO
> > > database interface, getNextObject will optionally populate
> > > user-supplied data instead.
> >
> > And that can't be done from a PGresult because?
> >
> > So far, the *only* valid reason you've given for inventing a new
> > datatype, rather than just using PGresult for the purpose, is to save a
> > few bytes by eliminating unnecessary fields.  That seems a pretty weak
> > argument (even assuming that the fields are unnecessary, which I doubt).
> > Having to support and document a whole set of essentially-identical
> > access functions for both PGresult and PGObject is the overhead that
> > we ought to be worried about, ISTM.  Don't forget that it's not just
> > libpq we are talking about, either; this additional API will also have
> > to propagate into libpq++, libpgtcl, the perl5 and python modules,
> > etc etc etc.
> >
> >                         regards, tom lane


Re: [HACKERS] libpq

From: Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> I posted this about a week ago, and it passed without comment.
> Does this mean I'm so far off track that no-one cares to comment,
> or I got it so right that no comment was needed?

I haven't looked at it because I am trying to finish up other stuff
before we go beta.  Will get back to you later.  I imagine other
people are in deadline mode also...
        regards, tom lane


Re: [HACKERS] libpq

From: Chris Bitmead
Date:
Tom Lane wrote:

> I haven't looked at it because I am trying to finish up other stuff
> before we go beta.  Will get back to you later.  I imagine other
> people are in deadline mode also...

Ok, sure.