Обсуждение: Update statement results in Out of memory

Поиск
Список
Период
Сортировка

Update statement results in Out of memory

От
Ivan Bianchi
Дата:
Hello,

I am trying to update a column using a PostGIS ST_Buffer function into a table of 4.257.769 rows, but after 6 hours, an Out of memory error appears and the kernel starts killing processes until a Kernel Panic shows up.

I have simplified the buffer target geometry and also added a gist index to that column.

The statement is the following:
psql -h host -U user -W -d database -c "UPDATE table SET buffer = ST_Buffer(simplified_geometry, 0.005);"

After reading and tunning the configuration, I still have the same result.

Here's the initial memory stats:

              total        used        free shared  buff/cache   available
Mem:            15G        1.5G         12G        503M        1.4G         13G
Swap:          7.8G          0B        7.8G


I'm running out of ideas, as I think the postgresql.conf memory parameters are quite low for the machine specs. I understand I can split the process and paginate the rows, but I can't see why I can't deal with this full statement right now.

Do you think this issue is related with the postgres memory parameters configuration? Why is not respecting the shared_buffers or effective_cache_size parameters and keeps growing?


Here's some info:

Machine specs
  • Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
  • 16 GB of memory
  • Fedora release 23 (Twenty Three)
  • Kernel - 4.5.7-202.fc23.x86_64
postgresql.conf
  • effective_cache_size = 5GB
  • shared_buffers = 3GB
  • work_mem = 10MB
  • maintenance_work_mem = 800MB
  • wal_buffers = 16MB
Kernel parameters
  • vm.overcommit_memory=2
  • kernel.shmmax = 8340893696
  • kernel.shmall = 2036351
Versions:
  • PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), 64-bit
  • POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3" LIBJSON="0.12" RASTER

Many thanks,

--
Ivan

Re: Update statement results in Out of memory

От
Adrian Klaver
Дата:
On 07/06/2016 02:13 AM, Ivan Bianchi wrote:
> Hello,
>
> I am trying to update a column using a PostGIS ST_Buffer
> <http://postgis.net/docs/ST_Buffer.html> function into a table
> of 4.257.769 rows, but after 6 hours, an /Out of memory/ error appears
> and the kernel starts killing processes until a /Kernel Panic/ shows up.
>
> I have simplified the buffer target geometry and also added a gist index
> to that column.
>
> The statement is the following:
>
>     psql -h host -U user -W -d database -c "UPDATE table SET buffer =
>     ST_Buffer(simplified_geometry, 0.005);"

I would say the issue is the above, you are running in a single
transaction. Given that an UPDATE in Postgres is a DELETE/INSERT and
that both the new and old rows have to be kept around until the
transaction completes I see only problems with doing it this way.

>
>
> After reading and tunning the configuration, I still have the same result.
>
> Here's the initial memory stats:
>
>                   total        used        freeshared  buff/cache
>     available
>     Mem:            15G        1.5G         12G        503M        1.4G
>             13G
>     Swap:          7.8G          0B        7.8G
>
>
>
> I'm running out of ideas, as I think the postgresql.conf memory
> parameters are quite low for the machine specs. I understand I can split
> the process and paginate the rows, but I can't see why I can't deal with
> this full statement right now.

See above.

>
> Do you think this issue is related with the postgres memory parameters
> configuration? Why is not respecting the shared_buffers or
> effective_cache_size parameters and keeps growing?
>
>
> Here's some info:
> *
> *
> _Machine specs_
>
>   * Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
>   * 16 GB of memory
>   * Fedora release 23 (Twenty Three)
>   * Kernel - 4.5.7-202.fc23.x86_64
>
> _postgresql.conf_
>
>   * effective_cache_size = 5GB
>   * shared_buffers = 3GB
>   * work_mem = 10MB
>
>   * maintenance_work_mem = 800MB
>   * wal_buffers = 16MB
>
> _Kernel parameters_
>
>   * vm.overcommit_memory=2
>
>   * kernel.shmmax = 8340893696
>   * kernel.shmall = 2036351
>
> _Versions:_
>
>   * PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
>     20160406 (Red Hat 5.3.1-6), 64-bit
>   * POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel.
>     4.9.1, 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26"
>     LIBXML="2.9.3" LIBJSON="0.12" RASTER
>
>
> Many thanks,
>
> --
> Ivan


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Update statement results in Out of memory

От
Paul Ramsey
Дата:
Running a multi-million row update will take a long time.
It's possible you've exposed a memory leak in ST_Buffer (the older
your version of GEOS, the more likely that is) but it's also possible
you're just running a really long update.
I find for batch processing purposes that creating fresh tables is far
preferable:

CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable;

If you still see memory issues with the above then you probably do
have a leak, *or* you're just running buffer on a sufficiently large
input geometry or with a large enough radius to blow up the memory
naturally.

P


On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi <ivan@wikiloc.com> wrote:
> Hello,
>
> I am trying to update a column using a PostGIS ST_Buffer function into a
> table of 4.257.769 rows, but after 6 hours, an Out of memory error appears
> and the kernel starts killing processes until a Kernel Panic shows up.
>
> I have simplified the buffer target geometry and also added a gist index to
> that column.
>
> The statement is the following:
>>
>> psql -h host -U user -W -d database -c "UPDATE table SET buffer =
>> ST_Buffer(simplified_geometry, 0.005);"
>
>
> After reading and tunning the configuration, I still have the same result.
>
> Here's the initial memory stats:
>
>>               total        used        free shared  buff/cache   available
>> Mem:            15G        1.5G         12G        503M        1.4G
>> 13G
>> Swap:          7.8G          0B        7.8G
>
>
>
> I'm running out of ideas, as I think the postgresql.conf memory parameters
> are quite low for the machine specs. I understand I can split the process
> and paginate the rows, but I can't see why I can't deal with this full
> statement right now.
>
> Do you think this issue is related with the postgres memory parameters
> configuration? Why is not respecting the shared_buffers or
> effective_cache_size parameters and keeps growing?
>
>
> Here's some info:
>
> Machine specs
>
> Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
> 16 GB of memory
> Fedora release 23 (Twenty Three)
> Kernel - 4.5.7-202.fc23.x86_64
>
> postgresql.conf
>
> effective_cache_size = 5GB
> shared_buffers = 3GB
> work_mem = 10MB
>
> maintenance_work_mem = 800MB
> wal_buffers = 16MB
>
> Kernel parameters
>
> vm.overcommit_memory=2
>
> kernel.shmmax = 8340893696
> kernel.shmall = 2036351
>
> Versions:
>
> PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
> 20160406 (Red Hat 5.3.1-6), 64-bit
> POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04
> March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3"
> LIBJSON="0.12" RASTER
>
>
> Many thanks,
>
> --
> Ivan


Re: Update statement results in Out of memory

От
Rémi Cura
Дата:
You could check the max number of points in your geometries :

SELECT max(ST_NumPoints(geom))
FROM ...

Of course you could still have invalid / abberant geometry,
which you could also check (ST_IsValid, St_IsSimple).

You could solve both those hypotheses if you could perform your buffer by batch.

Cheers,
Rémi-C

2016-07-06 15:36 GMT+02:00 Paul Ramsey <pramsey@cleverelephant.ca>:
Running a multi-million row update will take a long time.
It's possible you've exposed a memory leak in ST_Buffer (the older
your version of GEOS, the more likely that is) but it's also possible
you're just running a really long update.
I find for batch processing purposes that creating fresh tables is far
preferable:

CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable;

If you still see memory issues with the above then you probably do
have a leak, *or* you're just running buffer on a sufficiently large
input geometry or with a large enough radius to blow up the memory
naturally.

P


On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi <ivan@wikiloc.com> wrote:
> Hello,
>
> I am trying to update a column using a PostGIS ST_Buffer function into a
> table of 4.257.769 rows, but after 6 hours, an Out of memory error appears
> and the kernel starts killing processes until a Kernel Panic shows up.
>
> I have simplified the buffer target geometry and also added a gist index to
> that column.
>
> The statement is the following:
>>
>> psql -h host -U user -W -d database -c "UPDATE table SET buffer =
>> ST_Buffer(simplified_geometry, 0.005);"
>
>
> After reading and tunning the configuration, I still have the same result.
>
> Here's the initial memory stats:
>
>>               total        used        free shared  buff/cache   available
>> Mem:            15G        1.5G         12G        503M        1.4G
>> 13G
>> Swap:          7.8G          0B        7.8G
>
>
>
> I'm running out of ideas, as I think the postgresql.conf memory parameters
> are quite low for the machine specs. I understand I can split the process
> and paginate the rows, but I can't see why I can't deal with this full
> statement right now.
>
> Do you think this issue is related with the postgres memory parameters
> configuration? Why is not respecting the shared_buffers or
> effective_cache_size parameters and keeps growing?
>
>
> Here's some info:
>
> Machine specs
>
> Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
> 16 GB of memory
> Fedora release 23 (Twenty Three)
> Kernel - 4.5.7-202.fc23.x86_64
>
> postgresql.conf
>
> effective_cache_size = 5GB
> shared_buffers = 3GB
> work_mem = 10MB
>
> maintenance_work_mem = 800MB
> wal_buffers = 16MB
>
> Kernel parameters
>
> vm.overcommit_memory=2
>
> kernel.shmmax = 8340893696
> kernel.shmall = 2036351
>
> Versions:
>
> PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
> 20160406 (Red Hat 5.3.1-6), 64-bit
> POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04
> March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3"
> LIBJSON="0.12" RASTER
>
>
> Many thanks,
>
> --
> Ivan


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Update statement results in Out of memory

От
Ivan Bianchi
Дата:
Thanks for all for the good feedback,

I realize that there's no problem with the configuration, and I assume that the full update statement is a bad decision by itself, so I made a batch process with single updates statements which commits every 50 single operation.

What I can discover now is that a memory leak appeared with a specific geometry of 6624 points. The memory starts falling dramatically untill the system collapses. So now, I can assume that the hanging was caused by the ST_Buffer function in that geometry (which I have been able to trace thanks to the split). I've already checked that the geometry is valid with ST_isValid(), so I should ask to the PostGIS people for further information.

So my learning here is that a very large single transaction is a bad practice by itself, not only for the (more than probably) memory errors, but for the loss of tracking in case of error.

Best regards,
Ivan

2016-07-06 15:42 GMT+02:00 Rémi Cura <remi.cura@gmail.com>:
You could check the max number of points in your geometries :

SELECT max(ST_NumPoints(geom))
FROM ...

Of course you could still have invalid / abberant geometry,
which you could also check (ST_IsValid, St_IsSimple).

You could solve both those hypotheses if you could perform your buffer by batch.

Cheers,
Rémi-C

2016-07-06 15:36 GMT+02:00 Paul Ramsey <pramsey@cleverelephant.ca>:
Running a multi-million row update will take a long time.
It's possible you've exposed a memory leak in ST_Buffer (the older
your version of GEOS, the more likely that is) but it's also possible
you're just running a really long update.
I find for batch processing purposes that creating fresh tables is far
preferable:

CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable;

If you still see memory issues with the above then you probably do
have a leak, *or* you're just running buffer on a sufficiently large
input geometry or with a large enough radius to blow up the memory
naturally.

P


On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi <ivan@wikiloc.com> wrote:
> Hello,
>
> I am trying to update a column using a PostGIS ST_Buffer function into a
> table of 4.257.769 rows, but after 6 hours, an Out of memory error appears
> and the kernel starts killing processes until a Kernel Panic shows up.
>
> I have simplified the buffer target geometry and also added a gist index to
> that column.
>
> The statement is the following:
>>
>> psql -h host -U user -W -d database -c "UPDATE table SET buffer =
>> ST_Buffer(simplified_geometry, 0.005);"
>
>
> After reading and tunning the configuration, I still have the same result.
>
> Here's the initial memory stats:
>
>>               total        used        free shared  buff/cache   available
>> Mem:            15G        1.5G         12G        503M        1.4G
>> 13G
>> Swap:          7.8G          0B        7.8G
>
>
>
> I'm running out of ideas, as I think the postgresql.conf memory parameters
> are quite low for the machine specs. I understand I can split the process
> and paginate the rows, but I can't see why I can't deal with this full
> statement right now.
>
> Do you think this issue is related with the postgres memory parameters
> configuration? Why is not respecting the shared_buffers or
> effective_cache_size parameters and keeps growing?
>
>
> Here's some info:
>
> Machine specs
>
> Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
> 16 GB of memory
> Fedora release 23 (Twenty Three)
> Kernel - 4.5.7-202.fc23.x86_64
>
> postgresql.conf
>
> effective_cache_size = 5GB
> shared_buffers = 3GB
> work_mem = 10MB
>
> maintenance_work_mem = 800MB
> wal_buffers = 16MB
>
> Kernel parameters
>
> vm.overcommit_memory=2
>
> kernel.shmmax = 8340893696
> kernel.shmall = 2036351
>
> Versions:
>
> PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
> 20160406 (Red Hat 5.3.1-6), 64-bit
> POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04
> March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3"
> LIBJSON="0.12" RASTER
>
>
> Many thanks,
>
> --
> Ivan


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




--
Ivan