Thread: controlling memory management with regard to a specific query (or groups of connections)

controlling memory management with regard to a specific query (or groups of connections)

From:
Jonathan Vanasco
Date:
As a temporary fix I need to write some uploaded image files to PostgreSQL until a task server can read/process/delete
them.

The problem I've run into (via server load tests that model our production environment) is that these reads/writes end
up pushing the indexes used by other queries out of memory -- causing them to be re-read from disk.  These files can be
anywhere from 200k to 5MB.

Has anyone dealt with situations like this before and have any suggestions?  I could use a dedicated db connection if
that would introduce any options.
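
For what it's worth, a minimal sketch of the pattern being described, assuming psycopg2 and a hypothetical
pending_uploads table with a bytea column (the table and helper names are illustrative, not the actual schema):

import psycopg2

# Hypothetical table:
#   CREATE TABLE pending_uploads (id serial PRIMARY KEY, data bytea);

def stash_upload(conn, file_bytes):
    """Write one uploaded file into Postgres for the task server to pick up later."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO pending_uploads (data) VALUES (%s) RETURNING id",
            (psycopg2.Binary(file_bytes),),
        )
        upload_id = cur.fetchone()[0]
    conn.commit()
    return upload_id

def process_and_delete(conn, upload_id, handler):
    """Task-server side: read, process, then delete the row."""
    with conn.cursor() as cur:
        cur.execute("SELECT data FROM pending_uploads WHERE id = %s", (upload_id,))
        row = cur.fetchone()
        if row is not None:
            handler(bytes(row[0]))  # hand the image bytes to whatever does the processing
            cur.execute("DELETE FROM pending_uploads WHERE id = %s", (upload_id,))
    conn.commit()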



Re: controlling memory management with regard to a specific query (or groups of connections)

From:
Roxanne Reid-Bennett
Date:
On 11/18/2015 5:10 PM, Jonathan Vanasco wrote:
> As a temporary fix I need to write some uploaded image files to PostgreSQL until a task server can
> read/process/delete them.
>
> The problem I've run into (via server load tests that model our production environment) is that these reads/writes
> end up pushing the indexes used by other queries out of memory -- causing them to be re-read from disk.  These files
> can be anywhere from 200k to 5MB.
>
> Has anyone dealt with situations like this before and have any suggestions?  I could use a dedicated db connection if
> that would introduce any options.

We have a system that loads a bunch of files up to be processed - we
queue them for processing behind the scenes.  We don't load them into
Postgres before processing.  We put them in a temp directory and just
save the location of the file to the database.  This configuration does
have limitations.  Post-processing cannot be load balanced across
servers unless the temp directory is shared.
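
A minimal sketch of that arrangement, assuming psycopg2 and a hypothetical upload_queue table that stores only the
file path (the directory, table, and function names are illustrative):

import os
import shutil
import tempfile

import psycopg2

# Hypothetical table:
#   CREATE TABLE upload_queue (id serial PRIMARY KEY, path text NOT NULL);

UPLOAD_DIR = "/var/tmp/uploads"  # would have to be shared for multi-server post-processing

def queue_file(conn, src_path):
    """Copy the uploaded file into the temp directory and record only its location."""
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    fd, dest_path = tempfile.mkstemp(dir=UPLOAD_DIR)
    os.close(fd)
    shutil.copyfile(src_path, dest_path)
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO upload_queue (path) VALUES (%s) RETURNING id",
            (dest_path,),
        )
        queued_id = cur.fetchone()[0]
    conn.commit()
    return queued_id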

I'm sure you'll get more DB centric answers from others on the list.

Roxanne

--
[At other schools] I think the most common fault in general is to teach students how to pass exams instead of teaching
them the science.
Donald Knuth



Re: controlling memory management with regard to a specific query (or groups of connections)

From:
Bill Moran
Date:
On Wed, 18 Nov 2015 20:10:00 -0500
Jonathan Vanasco <postgres@2xlp.com> wrote:

> As a temporary fix I need to write some uploaded image files to PostgreSQL until a task server can
> read/process/delete them.
>
> The problem I've run into (via server load tests that model our production environment) is that these reads/writes
> end up pushing the indexes used by other queries out of memory -- causing them to be re-read from disk.  These files
> can be anywhere from 200k to 5MB.
>
> Has anyone dealt with situations like this before and have any suggestions?  I could use a dedicated db connection if
> that would introduce any options.

PostgreSQL doesn't have any provisions for preferring one thing
or another for storing in memory.

The easiest thing I can think of would be to add memory to the machine
(or configure Postgres to use more) such that those files aren't
pushing enough other pages out of memory to have a problematic
impact.

Another idea would be to put the image database on a different
physical server, or run 2 instances of Postgres on a single
server with the files in one database configured with a low
shared_buffers value, and the rest of the data on the other
database server configured with higher shared_buffers.
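
A minimal sketch of how an application might route the two workloads, assuming a second instance listening on another
port whose postgresql.conf sets a deliberately small shared_buffers (the DSNs and names are illustrative):

import psycopg2

# Main instance: the application's normal data, generous shared_buffers.
MAIN_DSN = "dbname=app host=localhost port=5432"

# Blob instance: only the image rows, run with a deliberately small
# shared_buffers (e.g. shared_buffers = 128MB in its postgresql.conf),
# so the large reads/writes cannot evict much of anything else.
BLOB_DSN = "dbname=app_blobs host=localhost port=5433"

def connect_for(workload):
    """Pick a connection based on whether the query touches image blobs."""
    return psycopg2.connect(BLOB_DSN if workload == "blob" else MAIN_DSN)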

I know these probably aren't the kind of answers you're looking
for, but I don't have anything better to suggest; and the rest
of the mailing list seems to be devoid of ideas as well.

--
Bill Moran


Re: controlling memory management with regard to a specific query (or groups of connections)

From:
Roxanne Reid-Bennett
Date:
On 11/19/2015 12:29 PM, Bill Moran wrote:
> On Wed, 18 Nov 2015 20:10:00 -0500
> Jonathan Vanasco <postgres@2xlp.com> wrote:
>
>> As a temporary fix I need to write some uploaded image files to PostgreSQL until a task server can
>> read/process/delete them.
>>
>> The problem I've run into (via server load tests that model our production environment) is that these reads/writes
>> end up pushing the indexes used by other queries out of memory -- causing them to be re-read from disk.  These files
>> can be anywhere from 200k to 5MB.

> ... PostgreSQL doesn't have any provisions for preferring one thing or
> another for storing in memory. The easiest thing I can think of would be
> to add memory to the machine (or configure Postgres to use more) such
> that those files aren't pushing enough other pages out of memory to
> have a problematic impact.

Perhaps this is just noise - but how is "just a" 5MB file upload pushing
critical data out of memory?  Throttle your file uploads ...
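
One simple way to read that suggestion, assuming the application handles uploads in threads and reusing the
stash_upload() helper sketched earlier in the thread (the limit of 2 is illustrative):

import threading

# Cap how many uploads may be writing into Postgres at once, so the blob
# traffic trickles through the buffer cache instead of flooding it.
upload_slots = threading.BoundedSemaphore(2)

def throttled_stash(conn, file_bytes):
    with upload_slots:
        return stash_upload(conn, file_bytes)  # stash_upload() as sketched earlier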

Roxanne



Re: controlling memory management with regard to a specific query (or groups of connections)

From:
Jonathan Vanasco
Date:
Thanks. Unfortunately, this is in a clustered environment.  NFS and other shared drive systems won't scale well.  I'd
need to run a service that can serve/delete the local files, which is why I'm just stashing them in Postgres for now.

> On Nov 19, 2015, at 2:26 AM, Roxanne Reid-Bennett <rox@tara-lu.com> wrote:
>
> We have a system that loads a bunch of files up to be processed - we queue them for processing behind the scenes.  We
> don't load them into Postgres before processing.  We put them in a temp directory and just save the location of the
> file to the database.  This configuration does have limitations.  Post-processing cannot be load balanced across
> servers unless the temp directory is shared.
>
> I'm sure you'll get more DB centric answers from others on the list.
>
> Roxanne


Re: controlling memory management with regard to a specific query (or groups of connections)

From:
Jeff Janes
Date:
On Wed, Nov 18, 2015 at 5:10 PM, Jonathan Vanasco <postgres@2xlp.com> wrote:
> As a temporary fix I need to write some uploaded image files to PostgreSQL until a task server can
> read/process/delete them.
>
> The problem I've run into (via server load tests that model our production environment) is that these reads/writes
> end up pushing the indexes used by other queries out of memory -- causing them to be re-read from disk.  These files
> can be anywhere from 200k to 5MB.

Are you storing them as large object, or as bytea?  Can you share the
load testing scripts?
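
For reference, the two storage paths look roughly like this with psycopg2 (the table names are illustrative, not the
poster's actual schema):

import psycopg2

def store_as_bytea(conn, file_bytes):
    # bytea: the file travels through an ordinary INSERT and is TOASTed.
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO images_bytea (data) VALUES (%s) RETURNING id",
            (psycopg2.Binary(file_bytes),),
        )
        new_id = cur.fetchone()[0]
    conn.commit()
    return new_id

def store_as_large_object(conn, file_bytes):
    # Large object: the data goes through the lo_* API and the table
    # keeps only the resulting OID.
    lob = conn.lobject(0, "wb")  # oid 0 asks the server to assign one
    lob.write(file_bytes)
    lob.close()
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO images_lo (loid) VALUES (%s) RETURNING id",
            (lob.oid,),
        )
        new_id = cur.fetchone()[0]
    conn.commit()
    return new_id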

Cheers,

Jeff