Обсуждение: Are all unlogged tables in any case truncated after a server-crash?

Поиск
Список
Период
Сортировка

Are all unlogged tables in any case truncated after a server-crash?

От
sch8el@posteo.de
Дата:
Hi everyone,

every few weeks I use Postgres ability, to import huge data sets very 
fast by means of "unlogged tables". The bulk load (consisting of plenty 
"copy"- & DML-Stmts) and the spatial index creation afterwards, takes 
about 5 hours on a proper server  (pg12.7 & PostGIS-Extension). After 
that all unlogged tables remain completely unchanged (no 
DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get 
never "unclean" and should not be truncated after a server crash.

BTW, if I set all unlogged tables to logged after bulk load, it takes 
additional 1.5 hours, mainly because of re-indexing, I suppose. I assume 
that a restart of the database after a server crash takes another 1.5 
hours (reading from WAL) until the database is up and running.

Therefore I am seeking a strategy, to not tagging those tables as 
"unclean" and not truncating all unlogged tables on server restart.


Cheers and regards.




Re: Are all unlogged tables in any case truncated after a server-crash?

От
"David G. Johnston"
Дата:
On Thu, Nov 11, 2021 at 11:39 AM <sch8el@posteo.de> wrote:
After
that all unlogged tables remain completely unchanged (no
DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get
never "unclean" and should not be truncated after a server crash.

The server cannot make this assumption so it truncates unlogged relations upon an unclean shutdown/crash because it has no WAL with which to ensure a proper restoration.

BTW, if I set all unlogged tables to logged after bulk load, it takes
additional 1.5 hours, mainly because of re-indexing, I suppose.

More likely it is writing the entire table, and all of its indexes, to WAL.

I assume
that a restart of the database after a server crash takes another 1.5
hours (reading from WAL) until the database is up and running.

That would be incorrect.  See "CHECKPOINT".


Therefore I am seeking a strategy, to not tagging those tables as
"unclean" and not truncating all unlogged tables on server restart.


There is no middle ground that I am aware of.  Either the contents of the table are in WAL ,or they are not.  If not, they can be lost upon an unclean shutdown.  For manually initiated shutdowns you do have the option to do so cleanly.

This topic (unlogged optimizations) does draw quite a bit of attention every year but so far the problem of proving to the system that the physical file on disk is a truly accurate representation of the post-crash relation is yet unsolved.

David J.

Re: Are all unlogged tables in any case truncated after a server-crash?

От
Laurenz Albe
Дата:
On Thu, 2021-11-11 at 18:39 +0000, sch8el@posteo.de wrote:
> every few weeks I use Postgres ability, to import huge data sets very 
> fast by means of "unlogged tables". The bulk load (consisting of plenty 
> "copy"- & DML-Stmts) and the spatial index creation afterwards, takes 
> about 5 hours on a proper server  (pg12.7 & PostGIS-Extension). After 
> that all unlogged tables remain completely unchanged (no 
> DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get 
> never "unclean" and should not be truncated after a server crash.

There is no way to achieve that.

But you could keep the "huge data sets" around and load them again if
your server happens to crash (which doesn't happen often, I hope).

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Are all unlogged tables in any case truncated after a server-crash?

От
sch8el@posteo.de
Дата:
Hi David,

thx for your comments and your advice on reading docs on "checkpoint".

Of course consistency is most important to any DBMS, and if in doubt about that, truncate data rows and restore from WAL.
But in this case, where data is never modified after bulk load, I thought there might be an undocumented feature or workaround, like ...
  - option to set the datafiles of those tables in read-only mode and record this in the metadata
  - on server-recovery spare these unlogged tables and indexes from truncating all data rows

Its truly a "nice to have"-thing, but I have learned
now, that there is not feature like that.

Mart


Am 11.11.2021 um 22:10 schrieb David G. Johnston:
On Thu, Nov 11, 2021 at 11:39 AM <sch8el@posteo.de> wrote:
After
that all unlogged tables remain completely unchanged (no
DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get
never "unclean" and should not be truncated after a server crash.

The server cannot make this assumption so it truncates unlogged relations upon an unclean shutdown/crash because it has no WAL with which to ensure a proper restoration.

BTW, if I set all unlogged tables to logged after bulk load, it takes
additional 1.5 hours, mainly because of re-indexing, I suppose.

More likely it is writing the entire table, and all of its indexes, to WAL.

I assume
that a restart of the database after a server crash takes another 1.5
hours (reading from WAL) until the database is up and running.

That would be incorrect.  See "CHECKPOINT".


Therefore I am seeking a strategy, to not tagging those tables as
"unclean" and not truncating all unlogged tables on server restart.


There is no middle ground that I am aware of.  Either the contents of the table are in WAL ,or they are not.  If not, they can be lost upon an unclean shutdown.  For manually initiated shutdowns you do have the option to do so cleanly.

This topic (unlogged optimizations) does draw quite a bit of attention every year but so far the problem of proving to the system that the physical file on disk is a truly accurate representation of the post-crash relation is yet unsolved.

David J.


Re: Are all unlogged tables in any case truncated after a server-crash?

От
sch8el@posteo.de
Дата:

Am 12.11.2021 um 08:41 schrieb Laurenz Albe:
> On Thu, 2021-11-11 at 18:39 +0000, sch8el@posteo.de wrote:
>> every few weeks I use Postgres ability, to import huge data sets very
>> fast by means of "unlogged tables". The bulk load (consisting of plenty
>> "copy"- & DML-Stmts) and the spatial index creation afterwards, takes
>> about 5 hours on a proper server  (pg12.7 & PostGIS-Extension). After
>> that all unlogged tables remain completely unchanged (no
>> DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get
>> never "unclean" and should not be truncated after a server crash.
> There is no way to achieve that.
>
> But you could keep the "huge data sets" around and load them again if
> your server happens to crash (which doesn't happen often, I hope).
Thx Laurenz for yr reply! Yes, that's what we did after server crashes 
(~ 2/yr on different locations).
But the system is at least 5 hours offline plus the time until the admin 
manually re-starts the bulk loads. On my system, I have 6 databases 
configured like this. For all I have to redo the bulk loads.
I hoped there was a 'switch' on crash-recovery, to avoid truncating the 
datafiles of these unlogged tables, which are definitely in a perfect 
condition.

Mart
>
> Yours,
> Laurenz Albe




Re: Are all unlogged tables in any case truncated after a server-using

От
Michael Lewis
Дата:
Why keep them as unlogged tables? If data is static, can you spare the disk space to gradually copy data from existing unlogged table to new copy that is logged, and then have brief exclusive lock to drop unlogged and rename new one?

Re: Are all unlogged tables in any case truncated after a server-crash?

От
Michael Lewis
Дата:
Curious... why keep the table as unlogged if it is static? If you can spare the disk space, perhaps just create a regular table with same definition, gradually copy the data to spread the impact on WAL, and when complete, just drop the old table and rename the new one.