Thread: RE: WAL & SHM principles

RE: WAL & SHM principles

From
"Mikheev, Vadim"
Date:
> It is possible to build a logging system so that you mostly don't care
> when the data blocks get written; a particular data block on disk is 
> considered garbage until the next checkpoint, so that you 

How to know if a particular data page was modified if there is no
log record for that modification?
(Ie how to know where is garbage? -:))

> might as well allow the blocks to be written any time,
> even before the log entry.

And what to do with index tuples pointing to unupdated heap pages
after that?

Vadim


RE: WAL & SHM principles

From
"Mikheev, Vadim"
Date:
> 1) WAL
> We have a buffer manager, ok. So why not use WAL as part of
> it and, instead of logging INSERT/UPDATE/DELETE xlog records, log
> changes to buffer pages directly? When someone dirties a page, it has to
> inform bmgr about the dirty region, and bmgr would formulate the xlog record.
> The record could be, for example, a fixed bitmap where each bit corresponds
> to a part of the page (of size pgsize/no-of-bits) that was changed. The
> changed regions follow. Multiple writes (by multiple backends) can be
> coalesced together as long as their transactions overlap and there is
> enough memory to keep the changed buffer pages in memory.
> 
> Pros: upper layers can assume that buffers are always safe/logged and
>       there is no special handling for indices; very simple/fast redo
> Cons: can't implement undo, but with non-overwriting storage it is not needed (?)

But it is needed if we want to get rid of vacuum and have savepoints.

Vadim
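
The bitmap record proposed in the quoted text can be illustrated with a
minimal sketch. Everything here is an assumption made for illustration:
the page size, the 32-bit bitmap width, and the names LoggedPage, write,
make_record and redo are hypothetical and are not PostgreSQL code.

```python
PAGE_SIZE = 8192               # assumed page size in bytes
NUM_BITS = 32                  # assumed width of the fixed bitmap
CHUNK = PAGE_SIZE // NUM_BITS  # bytes covered by one bitmap bit


class LoggedPage:
    """Toy buffer page that tracks which chunks were dirtied."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.dirty_bits = 0

    def write(self, offset, payload):
        """Modify the page and tell the 'bmgr' which region got dirty."""
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        first = offset // CHUNK
        last = (offset + len(payload) - 1) // CHUNK
        for i in range(first, last + 1):
            self.dirty_bits |= 1 << i

    def make_record(self):
        """Return (bitmap, chunks) for all coalesced writes, then reset."""
        bitmap = self.dirty_bits
        chunks = [bytes(self.data[i * CHUNK:(i + 1) * CHUNK])
                  for i in range(NUM_BITS) if bitmap >> i & 1]
        self.dirty_bits = 0
        return bitmap, chunks


def redo(page_bytes, bitmap, chunks):
    """Replay a record: copy each logged chunk back into place."""
    remaining = iter(chunks)
    for i in range(NUM_BITS):
        if bitmap >> i & 1:
            page_bytes[i * CHUNK:(i + 1) * CHUNK] = next(remaining)
```

Redo is just a copy of the logged chunks, which is the "very simple/fast
redo" claimed above. The record holds only after-images, so it carries
nothing that could be used for undo.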


RE: WAL & SHM principles

From
Martin Devera
Date:
> > Pros: upper layers can assume that buffers are always safe/logged and
> >       there is no special handling for indices; very simple/fast redo
> > Cons: can't implement undo, but with non-overwriting storage it is not needed (?)
> 
> But it is needed if we want to get rid of vacuum and have savepoints.

Hmm. How do you implement savepoints? When there is a rollback to a savepoint,
do you use xlog to undo all the changes that the particular transaction has
made? Hmmm, it seems nice ... these records are locked by that transaction,
so it can safely undo them :-)
Am I right?

But how can you use xlog to get rid of vacuum? Do you treat all delete
log records as candidates for free space?

regards, devik



RE: WAL & SHM principles

From
"Mikheev, Vadim"
Date:
> > But it is needed if we want to get rid of vacuum and have savepoints.
> 
> Hmm. How do you implement savepoints? When there is a rollback
> to a savepoint, do you use xlog to undo all the changes that the particular
> transaction has made? Hmmm, it seems nice ... these records are locked by
> that transaction, so it can safely undo them :-)
> Am I right?

Yes, but there are no savepoints in 7.1; hopefully in 7.2.

> But how can you use xlog to get rid of vacuum? Do you treat
> all delete log records as candidates for free space?

Vacuum removes deleted records *and* records inserted by aborted
transactions; the latter will be removed by UNDO.

Vadim
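
Rollback to a savepoint by undoing a transaction's own log records can be
sketched as follows. This is a toy model over an in-memory dict, not the
design discussed for 7.2: the class Txn, its methods and the _MISSING
sentinel are hypothetical names introduced only for illustration.

```python
_MISSING = object()  # sentinel: the key did not exist before the change


class Txn:
    """Toy transaction that keeps before-images in an undo log."""

    def __init__(self, table):
        self.table = table   # shared key -> value store
        self.undo_log = []   # list of (key, previous value or _MISSING)

    def savepoint(self):
        """A savepoint is just a position in the undo log."""
        return len(self.undo_log)

    def put(self, key, value):
        self.undo_log.append((key, self.table.get(key, _MISSING)))
        self.table[key] = value

    def delete(self, key):
        self.undo_log.append((key, self.table.pop(key, _MISSING)))

    def rollback_to(self, savepoint):
        """Undo changes newest-first until the savepoint is reached."""
        while len(self.undo_log) > savepoint:
            key, old = self.undo_log.pop()
            if old is _MISSING:
                self.table.pop(key, None)   # remove a row we inserted
            else:
                self.table[key] = old       # restore the before-image
```

Calling rollback_to(0) undoes the whole transaction, including removing
rows it inserted, which is the work that would otherwise be left to vacuum
for aborted transactions.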


Re: WAL & SHM principles

From
"Kevin T. Manley (Home)"
Date:
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> wrote in message
news:8F4C99C66D04D4118F580090272A7A234D32FA@sectorbase1.sectorbase.com...
> > It is possible to build a logging system so that you mostly don't care
> > when the data blocks get written; a particular data block on disk is
> > considered garbage until the next checkpoint, so that you
>
> How to know if a particular data page was modified if there is no
> log record for that modification?
> (Ie how to know where is garbage? -:))
>

You could store a log sequence number in the data page header that indicates
the log address of the last log record that was applied to the page. This is
described in Bernstein and Newcomer's book (sec 8.5 operation logging).
Sorry if I'm misunderstanding the question. Back to lurking mode...
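
The page-header LSN check can be sketched like this. It is a minimal
illustration with an assumed log record layout (lsn, page number, key,
value); the Page class and the redo function are hypothetical names, not
an existing API.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    lsn: int = 0                      # LSN of the last record applied
    rows: dict = field(default_factory=dict)


def redo(pages, log):
    """Replay log records in LSN order; return how many were applied."""
    applied = 0
    for lsn, page_no, key, value in log:
        page = pages.setdefault(page_no, Page())
        if page.lsn >= lsn:
            continue                  # the page already has this change
        page.rows[key] = value
        page.lsn = lsn
        applied += 1
    return applied
```

Because the page remembers the last applied LSN, replaying the same log
twice is harmless: the second pass skips every record.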


RE: WAL & SHM principles

From
"Mikheev, Vadim"
Date:
> > > It is possible to build a logging system so that you
> > > mostly don't care when the data blocks get written;
> > > a particular data block on disk is considered garbage
> > > until the next checkpoint, so that you
> > >
> > How to know if a particular data page was modified if there is no
> > log record for that modification?
> > (Ie how to know where is garbage? -:))
> 
> You could store a log sequence number in the data page header 
> that indicates the log address of the last log record that was
> applied to the page.

We do. But how do we know at recovery time that there is
a page in a multi-GB index file with a tuple pointing to an uninserted
table row?
Well, actually we could make some improvements in this area:
a buffer without a "first after checkpoint" modification could be
written without flushing log records: the entire block will be
rewritten on recovery. Not sure how much we would gain, though -:)

Vadim
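
The phrase "entire block will be rewritten on recovery" refers to logging
a whole-page image the first time a page is modified after a checkpoint.
The sketch below illustrates only that mechanism, not the proposed
optimization of skipping log flushes. The Wal class, the record tuples and
the recover function are hypothetical names, and pages are modelled as
plain dicts.

```python
class Wal:
    """Toy WAL: full page image on first change after a checkpoint."""

    def __init__(self):
        self.records = []
        self.redo_start = 0    # index of the first record to replay
        self.imaged = set()    # pages already imaged since checkpoint

    def checkpoint(self):
        self.redo_start = len(self.records)
        self.imaged.clear()

    def modify(self, page_no, page, key, value):
        """Apply a change in memory and log it."""
        page[key] = value
        if page_no not in self.imaged:
            # first modification after checkpoint: log the whole page
            self.records.append(("full", page_no, dict(page)))
            self.imaged.add(page_no)
        else:
            self.records.append(("delta", page_no, key, value))


def recover(disk, wal):
    """Replay from the last checkpoint; full images replace the block."""
    for rec in wal.records[wal.redo_start:]:
        if rec[0] == "full":
            _, page_no, image = rec
            disk[page_no] = dict(image)   # on-disk content is ignored
        else:
            _, page_no, key, value = rec
            disk.setdefault(page_no, {})[key] = value
```

Whatever state the block was in on disk at the time of the crash, the
first record replayed for it overwrites it completely.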


Re: WAL & SHM principles

From
ncm@zembu.com (Nathan Myers)
Date:
Sorry for taking so long to reply...

On Wed, Mar 07, 2001 at 01:27:34PM -0800, Mikheev, Vadim wrote:
> Nathan wrote:
> > It is possible to build a logging system so that you mostly don't care
> > when the data blocks get written [after being changed, as long as they
> > get written by an fsync]; a particular data block on disk is
> > considered garbage until the next checkpoint, so that you
> 
> How to know if a particular data page was modified if there is no
> log record for that modification?
> (Ie how to know where is garbage? -:))

In such a scheme, any block on disk not referenced up to (and including) 
the last checkpoint is garbage, and is either blank or reflects a recent 
logged or soon-to-be-logged change.  Everything written (except in the 
log) after the checkpoint thus has to happen in blocks not otherwise 
referenced from on-disk -- except in other post-checkpoint blocks.

During recovery, the log contents get written to those pages during
startup. Blocks that actually got written before the crash are not
changed by being overwritten from the log, but that's ok. If they got
written before the corresponding log entry, too, nothing references
them, so they are considered blank.

> > might as well allow the blocks to be written any time,
> > even before the log entry.
> 
> And what to do with index tuples pointing to unupdated heap pages
> after that?

Maybe index pages are cached in shm and copied to mmapped blocks 
after it is ok for them to be written.

What platforms does PG run on that don't have mmap()?

Nathan Myers
ncm@zembu.com
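
One possible reading of the checkpoint scheme in this message, as a
minimal sketch: blocks the last checkpoint does not reference are treated
as blank, and the post-checkpoint log is then written into such blocks.
The function recover, the set checkpoint_refs and the (block number, data)
log layout are assumptions made for illustration only.

```python
def recover(disk, checkpoint_refs, log):
    """Rebuild the valid block set after a crash.

    disk            -- block number -> content found on disk
    checkpoint_refs -- block numbers referenced by the last checkpoint
    log             -- (block number, content) records since then
    """
    # Anything the checkpoint does not reference is garbage, even if
    # it was written before the crash.
    state = {no: data for no, data in disk.items()
             if no in checkpoint_refs}
    for block_no, data in log:
        if block_no in checkpoint_refs:
            # The scheme only writes to unreferenced blocks after a
            # checkpoint, so this would be a violated invariant.
            raise ValueError("log touches a checkpointed block")
        state[block_no] = data
    return state
```

A block that reached disk before its log entry simply disappears from the
recovered state, since nothing references it.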