Обсуждение: interesting observatation regarding views and V7.0

Поиск
Список
Период
Сортировка

interesting observatation regarding views and V7.0

От
Don Baccus
Дата:
We've already seen how column alias were breaking pg_dump's ability to
restore views unless a table alias were created, fixed now thanks to
Tom's hack.

Here's an observation that's not really a bug report but which is
proving to be an annoyance.

The creation of column aliases for tables referenced by views causes
the rule created on the underlying virtual table to be in some cases
considerably longer than the corresponding rule in V6.5.

In practice, this means that several of the views used in the web
toolkit I'm porting no longer can be created.  In some cases, the
views had changed and I'd assumed that this was the cause, but now
I'm seeing it in a module (ecommerce) that as yet has not been
ported.  I'd ported the data model to 6.5 with no problem, but the
views can't be created in 7.0.  I just tried this yesterday, when
I decided to put some effort into porting the module (it contains
about 2000 lines of PL/SQL which need to be re-written in PL/pgSQL
so it's not entirely a trivial task to move it over).

Seeing that these views - which hadn't changed - and in light of
the column alias vs. pg_dump issue, I realized that the rule
strings are just getting much longer.  

(The error I'm getting is that the tuple size is too long)

Of course, TOAST will solve the problem, but we don't have TOAST
yet.  

I'm assuming Thomas put this in as part of the 'outer join' work.

In my case, I recompiled PG with a blocksize of 16KB rather
than 8KB, which I've been intending to do anyway for the time 
being since the 8KB blocksize causes other limitations on the size
of text vars, i.e. the discussion forum table is limited to about
6KB chars for the message text when the blocksize is 8KB, really
too small.  With TOAST coming in 7.1, I'm sticking with "text"
rather than segmenting messages into a series of rows and kludging
a "solution" by compiling with a 16KB blocksize.

This "fixed" my problem with views, too.

But I thought I'd share my experience with the group.  I don't
know how many folks use views in complex ways, but if many do
quite a few of them will run into the same problem and we'll
probably hear about it.



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Ed Loehr
Дата:
Don Baccus wrote:
> 
> The creation of column aliases for tables referenced by views causes
> the rule created on the underlying virtual table to be in some cases
> considerably longer than the corresponding rule in V6.5.
> 
> ...In my case, I recompiled PG with a blocksize of 16KB...
> 
> ...This "fixed" my problem with views, too.

Thanks for this info, Don.  Would you mind posting your patch, simple
as it may be?

Cheers,
Ed Loehr


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Don Baccus
Дата:
At 11:04 AM 2/23/00 -0600, Ed Loehr wrote:
>Don Baccus wrote:
>> 
>> The creation of column aliases for tables referenced by views causes
>> the rule created on the underlying virtual table to be in some cases
>> considerably longer than the corresponding rule in V6.5.
>> 
>> ...In my case, I recompiled PG with a blocksize of 16KB...
>> 
>> ...This "fixed" my problem with views, too.
>
>Thanks for this info, Don.  Would you mind posting your patch, simple
>as it may be?

That was it, I just recompiled PG with a blocksize of 16KB, i.e.
edited src/include/config.h.in's BLCKSZ definition, ran configure,
and did a gmake all/gmake install.

As I mentioned, I had other reasons for wanting to run with a 16KB
blocksize while waiting for TOASTed large text (and other) types,
so it's no biggie for me.

Others might find this change a lot more annoying.



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Tom Lane
Дата:
Don Baccus <dhogaza@pacifier.com> writes:
> The creation of column aliases for tables referenced by views causes
> the rule created on the underlying virtual table to be in some cases
> considerably longer than the corresponding rule in V6.5.
> In practice, this means that several of the views used in the web
> toolkit I'm porting no longer can be created.

Yes, this is exactly the concern I raised last week.  Thomas didn't
seem to be very worried about the issue, but when he gets back from
his vacation we can lean on him to fix it.

Something else we might consider as a stopgap is to resurrect the
"compressed text" datatype that Jan wrote, and then removed in
anticipation of having TOAST.  Jan was concerned about creating
future compatibility problems by having a datatype with only a
one-release-cycle expected lifetime ... but I think it might be
OK to use it just internally for rules.
        regards, tom lane


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Don Baccus
Дата:
At 05:54 PM 2/23/00 -0500, Tom Lane wrote:
>Don Baccus <dhogaza@pacifier.com> writes:
>> The creation of column aliases for tables referenced by views causes
>> the rule created on the underlying virtual table to be in some cases
>> considerably longer than the corresponding rule in V6.5.
>> In practice, this means that several of the views used in the web
>> toolkit I'm porting no longer can be created.
>
>Yes, this is exactly the concern I raised last week.  Thomas didn't
>seem to be very worried about the issue, but when he gets back from
>his vacation we can lean on him to fix it.

OK, I saw some of the exchange last week but was so busy I
didn't really read it, other than to note when he'd committed changes
so I could update and throw the web toolkit at them.  The ecommerce
module wasn't part of what I was throwing at it last week since
I knew it wasn't going to get ported from Oracle in time for our
very preliminary first cut at a port.  This week, though, hasn't
been as crazy.  Otherwise I would've yelped at Thomas a week ago.

"Here, YOU rewrite all these queries that use these views!" :)

>Something else we might consider as a stopgap is to resurrect the
>"compressed text" datatype that Jan wrote, and then removed in
>anticipation of having TOAST.  Jan was concerned about creating
>future compatibility problems by having a datatype with only a
>one-release-cycle expected lifetime ... but I think it might be
>OK to use it just internally for rules.

Yeah, that's not a bad idea at all.  

Also...interbase's "text" type is apparently compressed, and that's
an interesting idea for "text" itself (as opposed to "varchar()" of
a given size).   Someone who just says "text" probably wants to be
able to stuff as much text into the column as possible, I know
I do!  The price of compression/decompression is to some extent
balanced by not having to drag as many bytes around during joins
and sorts and the like.  Decompression in particular should be
very cheap and in the kind of systems I'm working on one hopes
one's ad, product description, Q&A post etc is selected (read)
many more times than inserted (written).  One hopes!  

Just an interesting notion...I was kinda excited about lzText when
Jan implemented it, though a smart TOASTer is even more exciting so
I won't whine about the delay.



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Tom Lane
Дата:
Don Baccus <dhogaza@pacifier.com> writes:
>> Something else we might consider as a stopgap is to resurrect the
>> "compressed text" datatype that Jan wrote, and then removed in
>> anticipation of having TOAST.

> Also...interbase's "text" type is apparently compressed, and that's
> an interesting idea for "text" itself (as opposed to "varchar()" of
> a given size).   Someone who just says "text" probably wants to be
> able to stuff as much text into the column as possible, I know
> I do!

Just quietly make text compressed-under-the-hood, you mean?  Hmm.
Interesting idea, all right, and it wouldn't create any long-term
compatibility problem since users couldn't see it directly.  I think
we might have some places in the system that assume char/varchar/text
all have the same internal representation, but that could probably
be fixed without too much grief.

> The price of compression/decompression is to some extent
> balanced by not having to drag as many bytes around during joins
> and sorts and the like.

Also, there could be a threshold: don't bother trying to compress
fields that are less than, say, 1K bytes.

Jan, what do you think?  I might be able to find some time to try this,
if you approve of the idea but just don't have cycles to spare.
        regards, tom lane


Re: [HACKERS] interesting observatation regarding views and V7.0

От
wieck@debis.com (Jan Wieck)
Дата:
Tom Lane wrote:

> Something else we might consider as a stopgap is to resurrect the
> "compressed text" datatype that Jan wrote, and then removed in
> anticipation of having TOAST.  Jan was concerned about creating
> future compatibility problems by having a datatype with only a
> one-release-cycle expected lifetime ... but I think it might be
> OK to use it just internally for rules.
   Ech - must be YOU!
   If I hadn't deleted the entire (including catalog changes for   pg_type ... pg_rewrite) patch, I'd be the one to
suggest. We   could easily add some warning, like "LZTEXT will disappear in   a subsequent release again -  be
warned", spit  out  during   parse, if someone explicitly uses the lztext type.
 
   I'll  spend some time with CVS to see if I can regenerate the   patch from there.
   But I can feel the punches of Marc already - this patch  will   cause catalog changes after official BETA start - Uh
-Oh.
 


Jan

BTW: Good  chance for Vince to LOL if I fail on that one, since I    got very impatiant once about "correct usage of
CVS". Was a    little  off-list  flamewar that turned out to be mostly "you    got me wrong" during a phone  call.  But
things  like  that    linger in background until prooven :-). Showtime!
 

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] interesting observatation regarding views and V7.0

От
wieck@debis.com (Jan Wieck)
Дата:
Tom Lane wrote:

> Don Baccus <dhogaza@pacifier.com> writes:
>
> > Also...interbase's "text" type is apparently compressed, and that's
> > an interesting idea for "text" itself (as opposed to "varchar()" of
> > a given size).   Someone who just says "text" probably wants to be
> > able to stuff as much text into the column as possible, I know
> > I do!
>
> Just quietly make text compressed-under-the-hood, you mean?  Hmm.
> Interesting idea, all right, and it wouldn't create any long-term
> compatibility problem since users couldn't see it directly. ...
   If  we  wheren't  in BETA code freeze right now, I'd call for   another month delay - surely.

> > The price of compression/decompression is to some extent
> > balanced by not having to drag as many bytes around during joins
> > and sorts and the like.
>
> Also, there could be a threshold: don't bother trying to compress
> fields that are less than, say, 1K bytes.
>
> Jan, what do you think?  I might be able to find some time to try this,
> if you approve of the idea but just don't have cycles to spare.
   It's a very  temping  solution,  turn  "text"  into  "lztext"   silently,  and  revert  that  internal  changes  in
thenext   release again while implementing TOAST.   Remember  that  the   lztext I implemented had the mentioned
thresholdparamenter -   say 256 - from the very beginning. And you know 256->1K is  a   one-liner  in  my  coding
style.  Moreover, it was a global   parameter set driven value, and thus potentially prepared  to   be  a  runtime
configurable one  (the  other  values of the   parameter  set  where  minimum  compression  ratio  to  gain,   maximum
resultsize to force compression even if ratio below,   GOOD size to stop history lookup and finally  history  lookup
GOODlowering factor during lookups).
 
   The algorithm I used for compression is one, loosing possible   compression  ratio  to  gain  speed.  It  uses  a
poor  XOR   combination  of  the  next 4 input-bytes, to lookup a history   table - and  that's  anything  but  perfect
from  a  hashing   algorithms  point  of  view.  But it was enough to make a 50+   column view fit easily into
pg_rewrite. And that's  what  it   was made for.
 
   Anyway,  there  are far too many direct references to VARDATA   on "text" plus all the assumptions  on  binary
compatibility  between text, varchar etc. in the code, to start on it during   BETA.
 
   Thus, I see a good chance for  a  7.1  release,  really  soon   after  7.0.   Then  have  a  longer  delay  for the
nextone,   featuring TOAST.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #




Re: [HACKERS] interesting observatation regarding views and V7.0

От
Don Baccus
Дата:
At 06:46 PM 2/23/00 -0500, Tom Lane wrote:

>Just quietly make text compressed-under-the-hood, you mean?  Hmm.

Yep...

>Interesting idea, all right, and it wouldn't create any long-term
>compatibility problem since users couldn't see it directly.  I think
>we might have some places in the system that assume char/varchar/text
>all have the same internal representation, but that could probably
>be fixed without too much grief.

I've kind of assumed this might be the case, but have truly been
too busy to dig around looking (which in my case takes a fairly
long time because I'm really only barely familiar with the code)

>> The price of compression/decompression is to some extent
>> balanced by not having to drag as many bytes around during joins
>> and sorts and the like.
>
>Also, there could be a threshold: don't bother trying to compress
>fields that are less than, say, 1K bytes.

Right, I thought about that possibility, too, but it seems a bit
more complicated so I thought I'd raise the simpler-sounding idea
first :)



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Tom Lane
Дата:
wieck@debis.com (Jan Wieck) writes:
>     But I can feel the punches of Marc already - this patch  will
>     cause catalog changes after official BETA start - Uh - Oh.

You can hide behind me ;-) ... I just did commit some catalog changes
(but didn't need to force initdb, since they were only additions).

Also, I am more than half expecting that I will have to force an initdb
to clean up the INET/CIDR comparison business; very likely we are
going to end up needing to have separate comparison operators for
INET and CIDR.

Still waiting for input on that from the folks who use the datatypes,
though.   (D'Arcy, are you still out there?)
        regards, tom lane


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Thomas Lockhart
Дата:
> Yes, this is exactly the concern I raised last week.  Thomas didn't
> seem to be very worried about the issue, but when he gets back from
> his vacation we can lean on him to fix it.

OK Tom I'll try to sound more concerned next time :))

I'm using the rte->ref Attr structure to carry internal info on table
names and column names. What I should be able to do is decouple the
internal ref structure from the table name/column list specified by a
user, so the "query recreation" code can ignore the internal structure
and just use the original list from the user.

Should be able to go into v7.0 with no problem (other than initdb, but
it *is* a beta!!).

> Something else we might consider as a stopgap is to resurrect the
> "compressed text" datatype that Jan wrote, and then removed in
> anticipation of having TOAST.  Jan was concerned about creating
> future compatibility problems by having a datatype with only a
> one-release-cycle expected lifetime ... but I think it might be
> OK to use it just internally for rules.

Naw, the above should be easier all around.
                  - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> Something else we might consider as a stopgap is to resurrect the
>> "compressed text" datatype that Jan wrote, and then removed in
>> anticipation of having TOAST.

> Naw, the above should be easier all around.

When you finish catching up on your mail, you'll find lztext is already
back in ;-).  At this point, whether you change the representation is
pretty much irrelevant for rule size, I think.  However, I am still
concerned by the hack I had to put into ruleutils.c to get pg_dump
to produce valid output for cases likecreate view foo as select * from int8_tbl;
See the note and code at about line 1000 of utils/adt/ruleutils.c.
Ideally we want to be able to tell from the parsetree whether the user
wrote any column aliases or not (and if possible, distinguish the ones
he wrote from any that got added by the system).  So that may force
a representation change anyway.
        regards, tom lane


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Thomas Lockhart
Дата:
> When you finish catching up on your mail, you'll find lztext is already
> back in ;-).  At this point, whether you change the representation is
> pretty much irrelevant for rule size, I think.  However, I am still
> concerned by the hack I had to put into ruleutils.c to get pg_dump
> to produce valid output for cases like
>         create view foo as select * from int8_tbl;
> See the note and code at about line 1000 of utils/adt/ruleutils.c.
> Ideally we want to be able to tell from the parsetree whether the user
> wrote any column aliases or not (and if possible, distinguish the ones
> he wrote from any that got added by the system).  So that may force
> a representation change anyway.

Well, if I add another field/list to the RangeTblEntry structure to
hold my working aliases, and if I keep the ref structure as a pristine
copy of the parameters specified by the user, then everything will go
back to working as expected. There may be other places in the code
which really want one or the other of the fields, but as a first cut
I'll isolate the changes to just the parser directory, more or less.
                      - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] interesting observatation regarding views and V7.0

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Well, if I add another field/list to the RangeTblEntry structure to
> hold my working aliases, and if I keep the ref structure as a pristine
> copy of the parameters specified by the user, then everything will go
> back to working as expected. There may be other places in the code
> which really want one or the other of the fields, but as a first cut
> I'll isolate the changes to just the parser directory, more or less.

Sounds like a good plan.
        regards, tom lane