Thread: errors on restoring postgresql binary dump to glusterfs

errors on restoring postgresql binary dump to glusterfs

From
Liang Ma
Date:
Hi There,

While trying to restore a ~700GB binary dump with the command

pg_restore -d dbdata < sampledbdata-20120327.pgdump

I encountered the following errors repeatedly:

pg_restore: [archiver (db)] Error from TOC entry 2882463; 2613
10267347 BLOB 10267347 sdmcleod
pg_restore: [archiver (db)] could not execute query: ERROR:
unexpected data beyond EOF in block 500 of relation base/16386/11743
HINT:  This has been seen to occur with buggy kernels; consider
updating your system.
   Command was: SELECT pg_catalog.lo_create('10267347');



pg_restore: [archiver (db)] could not execute query: ERROR:  large
object 10267347 does not exist
   Command was: ALTER LARGE OBJECT 10267347 OWNER TO sdmcleod;


pg_restore: [archiver (db)] Error from TOC entry 2882464; 2613
10267348 BLOB 10267348 sdmcleod
pg_restore: [archiver (db)] could not execute query: ERROR:
unexpected data beyond EOF in block 500 of relation base/16386/11743
HINT:  This has been seen to occur with buggy kernels; consider
updating your system.
   Command was: SELECT pg_catalog.lo_create('10267348');



pg_restore: [archiver (db)] could not execute query: ERROR:  large
object 10267348 does not exist
   Command was: ALTER LARGE OBJECT 10267348 OWNER TO sdmcleod;


......
......


pg_restore: [archiver (db)] Error from TOC entry 53398; 0 16503 TABLE
DATA l1aaux_sci sdmcleod
pg_restore: [archiver (db)] COPY failed for table "l1aaux_sci": ERROR:
 unexpected data beyond EOF in block 9391 of relation base/16386/17043
HINT:  This has been seen to occur with buggy kernels; consider
updating your system.
CONTEXT:  COPY l1aaux_sci, line 319329: "1854661        \N
1.05156717906094999     1378796678.44843268     2012-02-01
07:04:39.5+00        2012-02-01 07:04:38.4484..."
pg_restore: [archiver (db)] Error from TOC entry 53399; 0 16528 TABLE
DATA l1afts_dbl sdmcleod
pg_restore: [archiver (db)] COPY failed for table "l1afts_dbl": ERROR:
 unexpected data beyond EOF in block 10097 of relation
base/16386/17068
HINT:  This has been seen to occur with buggy kernels; consider
updating your system.
CONTEXT:  COPY l1afts_dbl, line 454411: "459755 2012-03-23
05:31:02.185562+00   ace.sr45190     52867958        299     2591429
FTS     1.1.0   1376321941.75799..."
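
For reading these messages: a path such as base/16386/17043 is the
relation's file under the data directory, made up of the database OID
and the relation's filenode. Below is a minimal sketch, in Python, of
turning such a path into catalog lookups. The helper name
`lookup_queries` is hypothetical, and the second query has to be run
while connected to the database that the first one names.

```python
import re

# Matches the relation paths that appear in the error text, e.g.
# "base/16386/17043" (database OID, then relation filenode).
REL_PATH = re.compile(r"^base/(\d+)/(\d+)$")


def lookup_queries(rel_path):
    """Return (database_query, relation_query) for a path from the error text.

    Hypothetical helper for illustration only; it builds SQL strings and
    does not connect to any server.
    """
    m = REL_PATH.match(rel_path)
    if m is None:
        raise ValueError("not a base/<dboid>/<filenode> path: %r" % rel_path)
    db_oid, filenode = int(m.group(1)), int(m.group(2))
    return (
        "SELECT datname FROM pg_database WHERE oid = %d;" % db_oid,
        "SELECT relname, relkind FROM pg_class WHERE relfilenode = %d;" % filenode,
    )


if __name__ == "__main__":
    for query in lookup_queries("base/16386/17043"):
        print(query)
```

Knowing which tables or indexes are hit may help when describing the
problem to the filesystem developers.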


The server runs Ubuntu Server 10.04 LTS with PostgreSQL upgraded to
version 9.1.3-1~lucid. The PostgreSQL data directory is located in a
directory mounted via glusterfs from the replicated volume vol-2:

192.168.244.101:/vol-2
                    5731222400 3041313920 2398779136  56% /mnt/gluster-2

Here is the gluster info for vol-2:

Volume Name: vol-2
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.244.101:/data/glbrk-2
Brick2: 192.168.244.102:/data/glbrk-2

The version of glusterfs is 3.2.6.

I think this may have something to do with glusterfs, because when I
restore the same dump on an identical Ubuntu 10.04 server, with
PostgreSQL upgraded to the same 9.1.3-1~lucid and its data directory on
a local ext4 filesystem, pg_restore completes without a single error.

Has anyone seen something similar before?

Thank you in advance.

Liang Ma

Re: errors on restoring postgresql binary dump to glusterfs

From
Magnus Hagander
Date:
On Mon, Apr 30, 2012 at 8:34 PM, Liang Ma <ma.satops@gmail.com> wrote:
> Hi There,
>
> While trying to restore a ~700GB binary dump with the command
>
> pg_restore -d dbdata < sampledbdata-20120327.pgdump
>
> I encountered following errors repeatedly
>
> pg_restore: [archiver (db)] Error from TOC entry 2882463; 2613
> 10267347 BLOB 10267347 sdmcleod
> pg_restore: [archiver (db)] could not execute query: ERROR:
> unexpected data beyond EOF in block 500 of relation base/16386/11743
> HINT:  This has been seen to occur with buggy kernels; consider
> updating your system.

Note the message right here...

There may be further indications in the server log about what's wrong.
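
When going through the server log for this, a small script can at least
show which relation files are affected and how often. This is a minimal
sketch, assuming the log carries the same message text as the client
output quoted above. The function name `summarize_eof_errors` is made up
for illustration, and the log line prefix is ignored because it depends
on the local logging configuration.

```python
import re
from collections import Counter

# Message body as quoted in this thread; only the block number and the
# relation path are captured.
EOF_ERROR = re.compile(
    r"unexpected data beyond EOF in block (\d+) of relation\s+(\S+)"
)


def summarize_eof_errors(log_text):
    """Count 'unexpected data beyond EOF' errors per relation path."""
    # Messages can be wrapped over several lines (as in the mail above),
    # so collapse all whitespace runs before matching.
    flat = " ".join(log_text.split())
    return Counter(m.group(2) for m in EOF_ERROR.finditer(flat))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python summarize.py /path/to/postgresql.log
    with open(sys.argv[1], errors="replace") as fh:
        for rel, count in summarize_eof_errors(fh.read()).most_common():
            print(count, rel)
```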

> The server runs Ubuntu server 10.04 LTS with postgresql upgraded to
> version 9.1.3-1~lucid. The postgresql data directory is located in a
> glusterfs mounted directory to a replicated volume vol-2

I assume you don't have more than one node actually *accessing* the
data directory at the same time, right?

Even with that said, I haven't heard of anybody running PostgreSQL on
glusterfs, and I'm not sure it fulfills the basic requirements that
PostgreSQL places on a filesystem. In particular, the messages above
about a buggy kernel certainly indicate that there is a problem with
the filesystem.
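
As far as I understand it, this error is raised when PostgreSQL goes to
extend a relation and finds a non-empty page at the position the
operating system had just reported as end of file, i.e. the file size it
was given was stale. A rough way to look for that kind of behaviour
outside PostgreSQL is to append blocks to a scratch file and compare the
size the filesystem reports with what was written. The sketch below
assumes Python is available on the server; `probe_file_extension` is a
hypothetical name, and a clean run does not prove the filesystem is safe.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def probe_file_extension(directory, blocks=256):
    """Append blocks to a scratch file and compare the size reported by
    lseek(SEEK_END) with the number of bytes actually written.

    Returns a list of (block_number, expected_size, reported_size) for
    every mismatch; an empty list means no discrepancy was observed.
    """
    mismatches = []
    fd, path = tempfile.mkstemp(dir=directory, prefix="extend-probe-")
    try:
        for blkno in range(blocks):
            # Non-zero fill so a written page never looks like an empty one.
            payload = bytes([blkno % 255 + 1]) * BLOCK_SIZE
            os.lseek(fd, blkno * BLOCK_SIZE, os.SEEK_SET)
            if os.write(fd, payload) != BLOCK_SIZE:
                raise OSError("short write at block %d" % blkno)
            expected = (blkno + 1) * BLOCK_SIZE
            reported = os.lseek(fd, 0, os.SEEK_END)
            if reported != expected:
                mismatches.append((blkno, expected, reported))
    finally:
        os.close(fd)
        os.unlink(path)
    return mismatches
```

It could be run against the mount point from the original mail, for
example `probe_file_extension("/mnt/gluster-2")`, and against a local
directory for comparison. A single process on one client may well not
trigger whatever is going wrong here.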

> I think this may have something to do with glusterfs, because when I
> restore the same dump to a same ubuntu 10.04 server with postgresql
> upgraded to the same 9.1.3-1~lucid located in a local ext4 filesystem,
> the pg_restore went well without a single error.

Yes, it certainly sounds like that. You probably need to bring it up
with the glusterfs folks...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: errors on restoring postgresql binary dump to glusterfs

From
Liang Ma
Date:
Hi Magnus,

Thank you for answering my post.

Please see comments on your answer below.

On Fri, May 4, 2012 at 3:58 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Mon, Apr 30, 2012 at 8:34 PM, Liang Ma <ma.satops@gmail.com> wrote:
>> Hi There,
>>
>> While trying to restore a ~700GB binary dump with the command
>>
>> pg_restore -d dbdata < sampledbdata-20120327.pgdump
>>
>> I encountered following errors repeatedly
>>
>> pg_restore: [archiver (db)] Error from TOC entry 2882463; 2613
>> 10267347 BLOB 10267347 sdmcleod
>> pg_restore: [archiver (db)] could not execute query: ERROR:
>> unexpected data beyond EOF in block 500 of relation base/16386/11743
>> HINT:  This has been seen to occur with buggy kernels; consider
>> updating your system.
>
> Note the message right here...
>
> There may be further indications in the server log about what's wrong.
>

The server's system logs (the messages file) were clean.


>> The server runs Ubuntu server 10.04 LTS with postgresql upgraded to
>> version 9.1.3-1~lucid. The postgresql data directory is located in a
>> glusterfs mounted directory to a replicated volume vol-2
>
> I assume you don't have more than one node actually *accessing* the
> data directory at the same time, right?
>

Yes, you are right. I just set up this glusterfs and PostgreSQL server
with two nodes for testing purposes. There was no other activity on the
gluster filesystem at the time I tried to restore the PostgreSQL dump.
Do you know whether PostgreSQL recommends any other cluster filesystem,
or does it not get along with cluster filesystems at all?


> Even with that said, I haven't heard of anybody running PostgreSQL on
> glusterfs, and I'm not sure it fulfills the basic requirements that
> PostgreSQL has on a filesystem. In particular, the messages above
> about a buggy kernel certainly indicate that there is a problem with
> the filesystem.
>
>> I think this may have something to do with glusterfs, because when I
>> restore the same dump to a same ubuntu 10.04 server with postgresql
>> upgraded to the same 9.1.3-1~lucid located in a local ext4 filesystem,
>> the pg_restore went well without a single error.
>
> Yes, it certainly sounds like that. You probably need to bring it up
> with the glusterfs folks...
>

I posted to the glusterfs mailing list at the same time but haven't
received any feedback yet. I think it is more likely related to
glusterfs, but I would like to know if any other PostgreSQL users have
had similar experiences or have ideas.

> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/

Thanks.

Liang

Re: errors on restoring postgresql binary dump to glusterfs

From
Magnus Hagander
Date:
On Mon, May 7, 2012 at 5:02 PM, Liang Ma <ma.satops@gmail.com> wrote:
> On Fri, May 4, 2012 at 3:58 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Mon, Apr 30, 2012 at 8:34 PM, Liang Ma <ma.satops@gmail.com> wrote:
>>> Hi There,
>>>
>>> While trying to restore a ~700GB binary dump with the command
>>>
>>> pg_restore -d dbdata < sampledbdata-20120327.pgdump
>>>
>>> I encountered following errors repeatedly
>>>
>>> pg_restore: [archiver (db)] Error from TOC entry 2882463; 2613
>>> 10267347 BLOB 10267347 sdmcleod
>>> pg_restore: [archiver (db)] could not execute query: ERROR:
>>> unexpected data beyond EOF in block 500 of relation base/16386/11743
>>> HINT:  This has been seen to occur with buggy kernels; consider
>>> updating your system.
>>
>> Note the message right here...
>>
>> There may be further indications in the server log about what's wrong.
>>
>
> The server's logs in message file were clean.

Then your logging is incorrectly configured, because it should *at
least* have the same message as the one that showed up in the client.


>>> The server runs Ubuntu server 10.04 LTS with postgresql upgraded to
>>> version 9.1.3-1~lucid. The postgresql data directory is located in a
>>> glusterfs mounted directory to a replicated volume vol-2
>>
>> I assume you don't have more than one node actually *accessing* the
>> data directory at the same time, right?
>>
>
> Yes, you are right. I just set up this glusterfs and postgresql server
> with two nodes for testing purpose. There was no other gluster
> filesystem access activity at the time I tried to restore the
> postgresql dump. Do you know if postgresql recommends any other
> cluster filesystem, or it may not like cluster filesystem at all?


Did you have PostgreSQL started on both nodes? That is *not*
supported. If PostgreSQL only runs on one node at a time it should in
theory work, provided the cluster filesystem provides all the services
that a normal filesystem does, such as respecting fsync.
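
One basic sanity check along these lines is to write a block, fsync it,
and read it back through a fresh file handle. This is only a sketch
under the assumption that Python is available on the node;
`write_fsync_readback` is a hypothetical name. It can show that fsync
returns an error or that the data read back differs, but it cannot show
that the data would actually survive a crash or power loss.

```python
import os
import tempfile


def write_fsync_readback(directory, size=8192):
    """Write random data, fsync it, reopen the file and compare.

    Returns True if fsync succeeded and both the contents and the file
    size seen through a fresh handle match what was written. os.fsync
    raises OSError if the filesystem reports a failure.
    """
    payload = os.urandom(size)
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()             # push Python's buffer to the OS
            os.fsync(fh.fileno())  # ask the OS to flush to stable storage
        with open(path, "rb") as fh:
            data = fh.read()
        return data == payload and os.path.getsize(path) == size
    finally:
        os.unlink(path)
```

Running it once on the cluster filesystem mount and once on a local
directory gives a first comparison, nothing more.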

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: errors on restoring postgresql binary dump to glusterfs

From
Liang Ma
Date:
On Mon, May 7, 2012 at 12:54 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Mon, May 7, 2012 at 5:02 PM, Liang Ma <ma.satops@gmail.com> wrote:
>> On Fri, May 4, 2012 at 3:58 AM, Magnus Hagander <magnus@hagander.net> wrote:
>>> On Mon, Apr 30, 2012 at 8:34 PM, Liang Ma <ma.satops@gmail.com> wrote:
>>>> Hi There,
>>>>
>>>> While trying to restore a ~700GB binary dump with the command
>>>>
>>>> pg_restore -d dbdata < sampledbdata-20120327.pgdump
>>>>
>>>> I encountered following errors repeatedly
>>>>
>>>> pg_restore: [archiver (db)] Error from TOC entry 2882463; 2613
>>>> 10267347 BLOB 10267347 sdmcleod
>>>> pg_restore: [archiver (db)] could not execute query: ERROR:
>>>> unexpected data beyond EOF in block 500 of relation base/16386/11743
>>>> HINT:  This has been seen to occur with buggy kernels; consider
>>>> updating your system.
>>>
>>> Note the message right here...
>>>
>>> There may be further indications in the server log about what's wrong.
>>>
>>
>> The server's logs in message file were clean.
>
> Then your logging is incorrectly configured, because it should *at
> least* have the same message as the one that showed up in the client.
>

Oh, yes, the same error messages were logged in the PostgreSQL log
file, but with no further information. I thought you meant there might
be some indication in the server's system logs, where I couldn't find
anything.

>
>>>> The server runs Ubuntu server 10.04 LTS with postgresql upgraded to
>>>> version 9.1.3-1~lucid. The postgresql data directory is located in a
>>>> glusterfs mounted directory to a replicated volume vol-2
>>>
>>> I assume you don't have more than one node actually *accessing* the
>>> data directory at the same time, right?
>>>
>>
>> Yes, you are right. I just set up this glusterfs and postgresql server
>> with two nodes for testing purpose. There was no other gluster
>> filesystem access activity at the time I tried to restore the
>> postgresql dump. Do you know if postgresql recommends any other
>> cluster filesystem, or it may not like cluster filesystem at all?
>
>
> Did you have PostgreSQL started on both nodes? That is *not*
> supported. If PostgreSQL only runs on one node at a time it should in
> theory work, provided the cluster filesystem provides all the services
> that a normal filesystem does, such as respecting fsync.
>

PostgreSQL is installed on both nodes, but only one node's PostgreSQL
data directory points to the glusterfs filesystem. The other node's
data directory is in its default location on the local ext4 filesystem.
This is the one I used to show that the dump file can be restored
without any problem when glusterfs is not involved.

According to its introduction and documentation, glusterfs is supposed
to appear as a normal filesystem when mounted, although I don't know
how well it respects things like fsync.

> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/

Liang

Re: errors on restoring postgresql binary dump to glusterfs

From
Magnus Hagander
Date:
On Mon, May 7, 2012 at 7:34 PM, Liang Ma <ma.satops@gmail.com> wrote:
> On Mon, May 7, 2012 at 12:54 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Mon, May 7, 2012 at 5:02 PM, Liang Ma <ma.satops@gmail.com> wrote:
>>> On Fri, May 4, 2012 at 3:58 AM, Magnus Hagander <magnus@hagander.net> wrote:
>>>> On Mon, Apr 30, 2012 at 8:34 PM, Liang Ma <ma.satops@gmail.com> wrote:
>>>>> Hi There,
>>>>>
>>>>> While trying to restore a ~700GB binary dump with the command
>>>>>
>>>>> pg_restore -d dbdata < sampledbdata-20120327.pgdump
>>>>>
>>>>> I encountered following errors repeatedly
>>>>>
>>>>> pg_restore: [archiver (db)] Error from TOC entry 2882463; 2613
>>>>> 10267347 BLOB 10267347 sdmcleod
>>>>> pg_restore: [archiver (db)] could not execute query: ERROR:
>>>>> unexpected data beyond EOF in block 500 of relation base/16386/11743
>>>>> HINT:  This has been seen to occur with buggy kernels; consider
>>>>> updating your system.
>>>>
>>>> Note the message right here...
>>>>
>>>> There may be further indications in the server log about what's wrong.
>>>>
>>>
>>> The server's logs in message file were clean.
>>
>> Then your logging is incorrectly configured, because it should *at
>> least* have the same message as the one that showed up in the client.
>>
>
> Oh, yes, the same error messages were logged in the postgresql log
> file but no further information. I thought you implied that there may
> be some indication in server's system logs, which I couldn't find any.

Well, there might be, I wasn't sure :-) I guess there wasn't.


>>>>> The server runs Ubuntu server 10.04 LTS with postgresql upgraded to
>>>>> version 9.1.3-1~lucid. The postgresql data directory is located in a
>>>>> glusterfs mounted directory to a replicated volume vol-2
>>>>
>>>> I assume you don't have more than one node actually *accessing* the
>>>> data directory at the same time, right?
>>>>
>>>
>>> Yes, you are right. I just set up this glusterfs and postgresql server
>>> with two nodes for testing purpose. There was no other gluster
>>> filesystem access activity at the time I tried to restore the
>>> postgresql dump. Do you know if postgresql recommends any other
>>> cluster filesystem, or it may not like cluster filesystem at all?
>>
>>
>> Did you have PostgreSQL started on both nodes? That is *not*
>> supported. If PostgreSQL only runs on one node at a time it should in
>> theory work, provided the cluster filesystem provides all the services
>> that a normal filesystem does, such as respecting fsync.
>>
>
> PostgreSQL is installed on both nodes, but only one node's PostgreSQL
> data directory points to glusterfs filesystem. Another one's data
> directory is in its default location in the local ext4 filesystem.
> This is the one I used to prove the dump file can be restored without
> any problem when glusterfs is not involved.

OK. That should in theory be safe. Having two active nodes against the
filesystem is never safe.


> According to its introduction and document, glusterfs is supposed to
> appear as a normal filesystem when being mounted, although I don't
> know how well it respects things like fsync.

It certainly looks like it's failing at some point. So yeah, I'm
pretty sure you need to get in touch with the glusterfs folks -
hopefully you get a response from them soon.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: errors on restoring postgresql binary dump to glusterfs

From
Liang Ma
Date:
Thank you, Magnus, for all the input. If I get any comments from the
gluster community, I will post an update here.

Liang
