Re: Understanding memory usage

From: Daniele Varrazzo
Subject: Re: Understanding memory usage
Date:
Msg-id: CA+mi_8ZVuL1+igjsO1wOTK9vvZ7C4MBSrRcgAOQ0KnZf2gPMKA@mail.gmail.com
In reply to: Re: Understanding memory usage (Damiano Albani <damiano.albani@gmail.com>)
Responses: Re: Understanding memory usage
List: psycopg

On Wed, Oct 30, 2013 at 5:24 PM, Damiano Albani <damiano.albani@gmail.com> wrote:
> Hello,
>
> On Tue, Oct 29, 2013 at 12:23 AM, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
>>
>> Because the result is returned to the client as the response for the
>> query and is stored inside the cursor. fetch*() only returns it to
>> Python.
>
> So why does calling "fetch*()" use additional memory then? Does it copy
> the data returned from the database?

Data in the cursor is stored in the form of a PQresult structure, which is an opaque object for which the libpq provides access functions.

Such data is converted into Python objects when fetch*() is used. This usually implies a copy: Python strings, for example, own their data, and even returning numbers to Python generally implies creating new instances.
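
To make it concrete, here is a minimal sketch of where the memory goes (the connection string and the query are made up for illustration):

    import psycopg2

    conn = psycopg2.connect("dbname=test")   # made-up connection
    cur = conn.cursor()

    # After execute() the whole result set is already on the client,
    # stored inside the cursor as a PQresult.
    cur.execute("SELECT generate_series(1, 1000000)")

    # fetchall() builds new Python objects from the PQresult data, so at
    # this point both the PQresult and the Python list are in memory.
    rows = cur.fetchall()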


> By the way, I've re-run my tests but focused on the VmRSS metric, which
> represents how much actual physical memory is used by the process.
>
> And I got the same behavior, that is, almost no memory is reclaimed after
> having fetched a large number of rows. For instance, if I fetch 2 million
> small rows, memory usage peaks around 500 MB and then only lowers to
> ~450 MB after the data is freed.

What do you mean by "freed"? Have you deleted the cursor and made sure the gc reclaimed it? The cursor doesn't destroy its internal data until it is deleted or another query is run (because after fetchall() you can invoke scroll(0) and return the data to Python again). And of course, when the data returned by fetch*() is released depends on how the client uses it. After a big query you may see memory usage go down as soon as you execute "select 1 where false", because the result is replaced by a smaller one.
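
In practice, releasing that memory would look something like this (again a made-up sketch, with a hypothetical connection string):

    import gc
    import psycopg2

    conn = psycopg2.connect("dbname=test")   # made-up connection
    cur = conn.cursor()
    cur.execute("SELECT generate_series(1, 1000000)")

    rows = cur.fetchall()   # Python objects built from the cursor's PQresult
    del rows                # drops the Python-side copy only

    # The cursor still holds the PQresult: the data could be read again
    # with cur.scroll(0, 'absolute') and another fetchall(). Replace it
    # with a smaller result by running another query...
    cur.execute("SELECT 1")

    # ...or delete the cursor and let the garbage collector reclaim it.
    del cur
    gc.collect()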

> On the other hand, fetching 100 large rows amounts to a 3 GB peak, which
> subsequently falls back to 10 MB.
>
> So is it a problem related to Psycopg itself, or rather to how Python
> handles memory in general?

The only "problem" you could attribute to Psycopg is unbounded memory usage. If you run a piece of code in a loop and see memory increase linearly, you have found a leak. Otherwise you can attribute the artefacts you see to the Python GC.
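
Such a leak test could look something like this (connection string made up again; VmRSS is read from /proc, so Linux only):

    import psycopg2

    def vm_rss_kb():
        # Read the process's VmRSS from /proc/self/status, the same
        # metric discussed above.
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])

    conn = psycopg2.connect("dbname=test")   # made-up connection
    cur = conn.cursor()

    # Run the same query repeatedly: VmRSS growing linearly with the
    # iteration count would indicate a leak; a stable plateau is just the
    # allocator/GC keeping memory around for reuse.
    for i in range(100):
        cur.execute("SELECT generate_series(1, 100000)")
        cur.fetchall()
        print("iteration %d: VmRSS %s kB" % (i, vm_rss_kb()))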

-- Daniele
