Обсуждение: pg_upgrade and epoch
Hi All!
--
Sergey Burladyan
Current pg_upgrade copy XID into new cluster, but not it epoch. Why?
Without epoch from old cluster txid_current() in upgraded database return
lower value than before upgrade. This break, for example, PgQ and it must
be fixed by hand after upgrade with pg_resetxlog.
Sergey Burladyan
On Tue, Apr 22, 2014 at 6:33 PM, Sergey Burladyan <eshkinkot@gmail.com> wrote: > Current pg_upgrade copy XID into new cluster, but not it epoch. Why? > > Without epoch from old cluster txid_current() in upgraded database return > lower value than before upgrade. This break, for example, PgQ and it must > be fixed by hand after upgrade with pg_resetxlog. > > PS: see > http://lists.pgfoundry.org/pipermail/skytools-users/2014-April/001812.html BTW, I didn't manage to make a test case yet. Recently, when I was migrating several servers to skytools3 and upgrading from 9.0 to 9.2, I noticed that epoch was copied, timeline id was >0 after upgrade, but skytools3 sometimes still didn't like it. Also note "sometimes" here, so in some cases everything was okay, but in some it wasn't. I still can't explain this, but incrementing timeline id always helped. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
On Wed, Apr 23, 2014 at 6:38 AM, Sergey Konoplev <gray.ru@gmail.com> wrote:
BTW, I didn't manage to make a test case yet. Recently, when I was
migrating several servers to skytools3 and upgrading from 9.0 to 9.2,
I noticed that epoch was copied, timeline id was >0 after upgrade, but
...
This is strange, if I not mistaken XID copied by copy_clog_xlog_xid(void):
and there is no epoch (-e XIDEPOCH) in pg_resetxlog call args
~/inst/pg9.2/bin/initdb d9.2
/usr/lib/postgresql/9.3/bin/initdb d9.3
~/inst/pg9.2/bin/postgres -D d9.2 -p 5433
^C
~/inst/pg9.2/bin/pg_resetxlog -e 8 d9.2/
~/inst/pg9.2/bin/postgres -D d9.2 -p 5433
psql -At -h localhost -p 5433 -c 'select txid_current()' postgres
34359739064
^C
/usr/lib/postgresql/9.3/bin/pg_upgrade -d d9.2 -D d9.3 -b ~/inst/pg9.2/bin -B /usr/lib/postgresql/9.3/bin -k
/usr/lib/postgresql/9.3/bin/postgres -D d9.3 -p 5433 -k`pwd`/s
psql -At -h localhost -p 5433 -c 'select txid_current()' postgres
756
34359739064 switched to 756 after upgrade
Sergey Burladyan
On Tue, Apr 22, 2014 at 8:08 PM, Sergey Burladyan <eshkinkot@gmail.com> wrote: > On Wed, Apr 23, 2014 at 6:38 AM, Sergey Konoplev <gray.ru@gmail.com> wrote: >> BTW, I didn't manage to make a test case yet. Recently, when I was >> migrating several servers to skytools3 and upgrading from 9.0 to 9.2, >> I noticed that epoch was copied, timeline id was >0 after upgrade, but > ... Sorry, just noticed a typo: * timeline id = next xid > This is strange, if I not mistaken XID copied by copy_clog_xlog_xid(void): > http://doxygen.postgresql.org/pg__upgrade_8c_source.html#l00398 > and there is no epoch (-e XIDEPOCH) in pg_resetxlog call args I can only guess here. + Bruce Momjian -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
On Wed, Apr 23, 2014 at 07:08:42AM +0400, Sergey Burladyan wrote: > On Wed, Apr 23, 2014 at 6:38 AM, Sergey Konoplev <gray.ru@gmail.com> wrote: > > > BTW, I didn't manage to make a test case yet. Recently, when I was > migrating several servers to skytools3 and upgrading from 9.0 to 9.2, > I noticed that epoch was copied, timeline id was >0 after upgrade, but > > ... > > This is strange, if I not mistaken XID copied by copy_clog_xlog_xid(void): > http://doxygen.postgresql.org/pg__upgrade_8c_source.html#l00398 > and there is no epoch (-e XIDEPOCH) in pg_resetxlog call args ... > 34359739064 switched to 756 after upgrade Yes, that looks about right, though not exact: 34359739064 - 8 * 2^32 = 696 I looked at this last night and am trying to figure out the extent of the bug. We have had timestamp epochs since pg_upgrade was created and this is the first time I am hearing that not preserving timestamp epochs is a problem. Do we store the timestamp epoch anywhere in the data files, or just in pg_controldata? (pg_upgrade does not preserve WAL files.) I know we have talked about using epochs on data pages but I am not sure we have ever done it. Sergey, are you seeing a problem only because you are interacting with other systems that didn't reset their epoch? It is an easy fix, but I need to understand the scope of the problem. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Apr 23, 2014 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote: > Sergey, are you seeing a problem only because you are > interacting with other systems that didn't reset their epoch? I faced this after upgrading clusters with PgQ Skytools3 installed only. They didn't interact with any other systems. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
On Wed, Apr 23, 2014 at 12:41:41PM -0700, Sergey Konoplev wrote: > On Wed, Apr 23, 2014 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote: > > Sergey, are you seeing a problem only because you are > > interacting with other systems that didn't reset their epoch? > > I faced this after upgrading clusters with PgQ Skytools3 installed > only. They didn't interact with any other systems. I have developed the attached patch which causes pg_upgrade to preserve the transaction epoch. I plan to apply this for PG 9.5. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Вложения
On Tue, Sep 2, 2014 at 7:59 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Wed, Apr 23, 2014 at 12:41:41PM -0700, Sergey Konoplev wrote: >> On Wed, Apr 23, 2014 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote: >> > Sergey, are you seeing a problem only because you are >> > interacting with other systems that didn't reset their epoch? >> >> I faced this after upgrading clusters with PgQ Skytools3 installed >> only. They didn't interact with any other systems. > > I have developed the attached patch which causes pg_upgrade to preserve > the transaction epoch. I plan to apply this for PG 9.5. That is a great news. Thank you, Bruce. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979 gray.ru@gmail.com
On Tue, Sep 2, 2014 at 10:38:55PM -0700, Sergey Konoplev wrote: > On Tue, Sep 2, 2014 at 7:59 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Wed, Apr 23, 2014 at 12:41:41PM -0700, Sergey Konoplev wrote: > >> On Wed, Apr 23, 2014 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote: > >> > Sergey, are you seeing a problem only because you are > >> > interacting with other systems that didn't reset their epoch? > >> > >> I faced this after upgrading clusters with PgQ Skytools3 installed > >> only. They didn't interact with any other systems. > > > > I have developed the attached patch which causes pg_upgrade to preserve > > the transaction epoch. I plan to apply this for PG 9.5. > > That is a great news. Thank you, Bruce. Patch applied. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Sep 3, 2014 at 3:59 AM, Bruce Momjian <bruce@momjian.us> wrote: > I have developed the attached patch which causes pg_upgrade to preserve > the transaction epoch. I plan to apply this for PG 9.5. I would say this is a simple bug and should be back patched to 9.4 and 9.3. We're only going to continue to get complaints from people running into this. -- greg
On Sat, Sep 6, 2014 at 12:26:55AM +0100, Greg Stark wrote: > On Wed, Sep 3, 2014 at 3:59 AM, Bruce Momjian <bruce@momjian.us> wrote: > > I have developed the attached patch which causes pg_upgrade to preserve > > the transaction epoch. I plan to apply this for PG 9.5. > > I would say this is a simple bug and should be back patched to 9.4 and > 9.3. We're only going to continue to get complaints from people > running into this. Yes, I did think about that, but it seems like a behavior change. However, it is tempting to avoid future bug reports about this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Fri, Sep 5, 2014 at 07:35:42PM -0400, Bruce Momjian wrote: > On Sat, Sep 6, 2014 at 12:26:55AM +0100, Greg Stark wrote: > > On Wed, Sep 3, 2014 at 3:59 AM, Bruce Momjian <bruce@momjian.us> wrote: > > > I have developed the attached patch which causes pg_upgrade to preserve > > > the transaction epoch. I plan to apply this for PG 9.5. > > > > I would say this is a simple bug and should be back patched to 9.4 and > > 9.3. We're only going to continue to get complaints from people > > running into this. > > Yes, I did think about that, but it seems like a behavior change. > However, it is tempting to avoid future bug reports about this. When this came up in March, Tom and I agreed that this wasn't something we wanted to slip into 9.4. Given that, it is hard to argue we should now slip this into 9.5, 9.4, and 9.3, so unless someone else votes for inclusion, I think I will leave this as 9.5-only. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Sat, Sep 6, 2014 at 09:30:06AM -0400, Bruce Momjian wrote: > On Fri, Sep 5, 2014 at 07:35:42PM -0400, Bruce Momjian wrote: > > On Sat, Sep 6, 2014 at 12:26:55AM +0100, Greg Stark wrote: > > > On Wed, Sep 3, 2014 at 3:59 AM, Bruce Momjian <bruce@momjian.us> wrote: > > > > I have developed the attached patch which causes pg_upgrade to preserve > > > > the transaction epoch. I plan to apply this for PG 9.5. > > > > > > I would say this is a simple bug and should be back patched to 9.4 and > > > 9.3. We're only going to continue to get complaints from people > > > running into this. > > > > Yes, I did think about that, but it seems like a behavior change. > > However, it is tempting to avoid future bug reports about this. > > When this came up in March, Tom and I agreed that this wasn't something > we wanted to slip into 9.4. Given that, it is hard to argue we should > now slip this into 9.5, 9.4, and 9.3, so unless someone else votes for > inclusion, I think I will leave this as 9.5-only. With no one replying, I will consider this issue closed and not backpatch this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Tue, Sep 9, 2014 at 4:05 PM, Bruce Momjian <bruce@momjian.us> wrote: >> > Yes, I did think about that, but it seems like a behavior change. >> > However, it is tempting to avoid future bug reports about this. >> >> When this came up in March, Tom and I agreed that this wasn't something >> we wanted to slip into 9.4. Given that, it is hard to argue we should >> now slip this into 9.5, 9.4, and 9.3, so unless someone else votes for >> inclusion, I think I will leave this as 9.5-only. > > With no one replying, I will consider this issue closed and not > backpatch this. I think the reason nobody's responding is because nobody has anything significant to add. It's a behaviour change from not-working to working. Why wouldn't it be backpatched? -- greg
On Wed, Sep 10, 2014 at 02:24:17AM +0100, Greg Stark wrote: > On Tue, Sep 9, 2014 at 4:05 PM, Bruce Momjian <bruce@momjian.us> wrote: > >> > Yes, I did think about that, but it seems like a behavior change. > >> > However, it is tempting to avoid future bug reports about this. > >> > >> When this came up in March, Tom and I agreed that this wasn't something > >> we wanted to slip into 9.4. Given that, it is hard to argue we should > >> now slip this into 9.5, 9.4, and 9.3, so unless someone else votes for > >> inclusion, I think I will leave this as 9.5-only. > > > > With no one replying, I will consider this issue closed and not > > backpatch this. > > I think the reason nobody's responding is because nobody has anything > significant to add. It's a behavior change from not-working to > working. Why wouldn't it be backpatched? OK, Greg seems to be passionate about this. Does anyone _object_ to my back-patching the epoch preservation fix through 9.3. Tom? The patch is commit a74a4aa23bb95b590ff01ee564219d2eacea3706. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Bruce Momjian <bruce@momjian.us> writes: > On Wed, Sep 10, 2014 at 02:24:17AM +0100, Greg Stark wrote: >> I think the reason nobody's responding is because nobody has anything >> significant to add. It's a behavior change from not-working to >> working. Why wouldn't it be backpatched? > OK, Greg seems to be passionate about this. Does anyone _object_ to my > back-patching the epoch preservation fix through 9.3. Tom? Not I. This is a data-loss bug fix, no? Why would we not back-patch it? regards, tom lane
On 2014-09-11 16:58:12 -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Wed, Sep 10, 2014 at 02:24:17AM +0100, Greg Stark wrote: > >> I think the reason nobody's responding is because nobody has anything > >> significant to add. It's a behavior change from not-working to > >> working. Why wouldn't it be backpatched? > > > OK, Greg seems to be passionate about this. Does anyone _object_ to my > > back-patching the epoch preservation fix through 9.3. Tom? > > Not I. This is a data-loss bug fix, no? Why would we not back-patch it? Also, what possible reason could there be for preservation to be problematic? Epoch overflow maybe :P Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 11, 2014 at 04:58:12PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Wed, Sep 10, 2014 at 02:24:17AM +0100, Greg Stark wrote: > >> I think the reason nobody's responding is because nobody has anything > >> significant to add. It's a behavior change from not-working to > >> working. Why wouldn't it be backpatched? > > > OK, Greg seems to be passionate about this. Does anyone _object_ to my > > back-patching the epoch preservation fix through 9.3. Tom? > > Not I. This is a data-loss bug fix, no? Why would we not back-patch it? Seems I was thinking of another pg_upgrade feature we decided not to backpatch, though I can't find it now. Backpatched through 9.3. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +