Re: pg_upgrade and rsync

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: pg_upgrade and rsync
Дата
Msg-id 20150122234331.GA15865@momjian.us
обсуждение исходный текст
Ответ на Re: pg_upgrade and rsync  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Ответы Re: pg_upgrade and rsync  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список pgsql-hackers
On Thu, Jan 22, 2015 at 10:48:37PM +0200, Heikki Linnakangas wrote:
> >>>          * If we need to protect hint bit updates from torn writes, WAL-log a
> >>>          * full page image of the page. This full page image is only necessary
> >>>          * if the hint bit update is the first change to the page since the
> >>>          * last checkpoint.
> >>>          *
> >>>          * We don't check full_page_writes here because that logic is included
> >>>          * when we call XLogInsert() since the value changes dynamically.
> >>>          */
> >>>         if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT))
> >>>         {
> >>>             /*
> >>>              * If we're in recovery we cannot dirty a page because of a hint.
> >>>              * We can set the hint, just not dirty the page as a result so the
> >>>              * hint is lost when we evict the page or shutdown.
> >>>              *
> >>>              * See src/backend/storage/page/README for longer discussion.
> >>>              */
> >>>             if (RecoveryInProgress())
> >>>                 return;
> >
> >What if XLogHintBitIsNeeded is false? That would be the case if we're not wall logging hints *on the standby*.
> 
> Then the page will be updated without writing a WAL record. Just
> like in the master, if wal_log_hints is off. wal_log_hints works the
> same on the master or the standby.

[ see below for why this entire idea might not work ]

OK, I was confused by your previous "no".  It means we do update hint
bits on read-only slave queries --- we just don't WAL log them.  In
fact, we can't update hint bits on the standby if we are wal logging
them ---- is that right?

My text was saying:
these differences can be reduced by using a fresh standby and byenabling <xref linkend="guc-wal-log-hints">.
(While<varname>wal_log_hints</>transfers hint bits from the primary tostandbys, additional hint bits are still set on
thestandbys byread-only queries.)
 

meaning if you don't run any read-only queries on the standby, the files
will be same on master/standby because the hint bits will be the same,
and rsync will not copy the files.

This brings up the other problem that the mod times of the files are
likely to be different between master and slave --- should I recommend
to only use rsync --checksum?

I would really like to get a way to pg_upgrade the standbys but we have
never really be able to get a solution.  Ideally we would update just
the system table files, and if the order of pg_upgrade file renames is
exactly the same, everything else would match, but I can't imagine what
such an API would look like.  Have pg_upgrade spit out a list of files
to be copied?

In fact, these are the relfilenodes pg_upgrade preserves:
*  While pg_class.oid and pg_class.relfilenode are initially the same*  in a cluster, they can diverge due to CLUSTER,
REINDEX,or VACUUM*  FULL.  In the new cluster, pg_class.oid and pg_class.relfilenode will*  be the same and will match
theold pg_class.oid value.  Because of*  this, old/new pg_class.relfilenode values will not match if CLUSTER,*
REINDEX,or VACUUM FULL have been performed in the old cluster.**  We control all assignments of pg_type.oid because
theseoids are stored*  in user composite type values.**  We control all assignments of pg_enum.oid because these oids
arestored*  in user tables as enum values.**  We control all assignments of pg_authid.oid because these oids are
stored* in pg_largeobject_metadata.
 

so if the table/index relfilenodes no longer match the oid on the old
cluster, due to CLUSTER, REINDEX, or VACUUM FULL, the file name will not
match on the new cluster and rsync will copy the entire file.  In fact,
rsync is going to copy it to the wrong file name, and delete the right
file.  

I am going to now conclude that rsync is never going to work for this,
unless we have pg_upgrade preserve relfilenodes as well.  However, I am
not even sure that is possible due to conflicts with system table
relfilenodes created in the new cluster.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andreas Karlsson
Дата:
Сообщение: Re: Using 128-bit integers for sum, avg and statistics aggregates
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Back-branch update releases scheduled