Jesus, what have I done (was: LONG)

From wieck@debis.com (Jan Wieck)
Subject Jesus, what have I done (was: LONG)
Date
Msg-id m11wxsq-0003kJC@orion.SAPserv.Hamburg.dsh.de
In reply to Re: [HACKERS] LONG  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Jesus, what have I done (was: LONG)  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Jesus, what have I done (was: LONG)  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian wrote (in several messages):

> Actually, in looking at the fsync() system call, it does write the
> entire file descriptor before marking the transaction as complete, so
> there is no hard reason not to raise it, but because the OS has to do
> two reads to get 16k, I think we are better keeping 8k as our base block
> size.

    Agreed. Let's stay with the 8K default.

> -1|oid|attno

    Actually I think you need two more pieces of information  to
    move  it around independently.  As you agreed somewhere else
    (on my TRUNCATE issue), it would be better to keep the  long
    values  in a per-table expansion relation, so you need atte
    least the Oid of that relation too.  Also, it would be  good
    to  know  the size of the data before fetching it, so you
    need that as well.

    But that's not the important issue; there's  also  an  (IMHO
    dangerous) assumption behind it, see below.

> Now it would be nice if we could set the varlena length to 12, it's
> actual length, and then just somehow know that the varlena of 12 was a
> long data entry.  Our current varlena has a maximum length of 64k.
>
> Or make struct varlena vl_len a 15-bit field, and make islong a 1-bit
> field.  I don't remember if using & manually or bit fields is faster.

    I don't see vl_len as a 15-bit field.  In the current sources
    (in postgres.h), it is an int32, and I'm sure  no  existing
    code  is  aware  that  some  magic  bits in it could carry a
    special meaning.  At least the types I recently added
    (numeric and lztext) aren't.  Nor am I sure that a  variable
    length  Datum  is never duplicated somewhere just by using
    the information from vl_len, with or without the macro.
    Thus we would have to visit a lot of code to make sure  this
    new variable length Datum can be passed around as you like.

    And the IMHO weightiest drawback is that existing user  type
    definitions  treat the first 32 bits of a variable length
    data type just as I have interpreted them up to now.  So  we
    could occasionally break more than we are aware of.

> In your system, a sequential scan of the table will pull in all this
> stuff because you are going to expand the tuple.  That could be very
> costly.  In my system, the expansion only happens on output if the LONG
> field does not appear in the WHERE or ORDER BY clauses.

In my system, it would do exactly as in yours, because they are mostly the
same. The modification done to the tuple in heap_insert() and heap_replace(),
just before the call to RelationPutHeapTupleAtEnd(), turns each
LONG Datum into one of varsize 20. Just that the first 32 bits don't contain
any magic information.

> >     Maybe  we  make  this  mechanism  so  general  that   it   is
> >     automatically applied to ALL varsize attributes? We'll end up
> >     with on big pg_long where 90+% of the databases content  will
> >     be stored.
>
> If most joins, comparisons are done on the 10% in the main table, so
> much the better.

    Yes, but how would you want to judge which varsize value  to
    push  into  the "secondary" relation, and which one to keep
    in the "primary" table for fast comparisons?

    I think you forgot one little detail.  In our model, you can
    only  move the Datum's extended information around as is. It
    will never be expanded in place, so it  must  be  fetched
    (index scan) again at any place  the  value  itself  is
    required.

    The installed base currently  uses  varsize  attributes with
    indices  on  them  to  qualify, sort and group. Occasionally
    pushing such a field into "secondary" will cause a
    substantial loss of performance.

    So again, how do you determine which of the attributes is  a
    candidate to push into "secondary"?  It  is  such  a generic
    approach that I cannot imagine any fail-safe method.

    I'd rather have another LONG data type that lets me store
    huge strings, where I know exactly what I can't do  with it,
    than some automatic detection process that I  cannot  force
    to  do what I want.  It has happened just too often to me
    that these "user friendly, knowing better  what  I  might
    want"  systems  got  me  by  the  balls.  I'm  a  real
    programmer, so there's always a way out for  me,  but  what
    should a real user do?

> Let's face it.  Most long tuples are store/retrieve, not ordered on or
> used in WHERE clauses.  Moving them out of the main table speeds up
> things.  It also prevents expansion of rows that never end up in the
> result set.

    Having a tuple consisting of 30+ attributes, where 20 of them
    are varsize ones (CHAR, VARCHAR, NUMERIC etc.), what makes it
    a long tuple? Yes, I'm repeating this  question  once  again,
    because we're talking about a "one size must fit all cases"
    approach here.

> stuff because you are going to expand the tuple.  That could be very
> costly.  In my system, the expansion only happens on output if the LONG
> field does not appear in the WHERE or ORDER BY clauses.

    No I won't.  As explained, I would return a tuple as is, just
    with  the  LONG reference information.  It will only, but
    then always again, be expanded if needed to compare,  store
    again  or  be output to the client.  This "always again" is
    one of my drawbacks against your "treat all varsize as
    pushable" concept.  In one of my  early  projects,  I  had
    to  manage  a  MicroVAX for a year, and I have loved systems
    that can be fine-tuned ever since, really!  Auto detection
    is a nice feature, but if it fails and you don't  have  any
    override option, you're hosed.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
