Обсуждение: [Q] optmizing postgres for 'single client' / many small queries

Поиск
Список
Период
Сортировка

[Q] optmizing postgres for 'single client' / many small queries

От
"V S P"
Дата:
Hi,
our application is using Postgres in a rather unusuall way.
It is used by a GUI application to store several hundred
thousand 'parameters'.  Basically it is used like a big INI
file.

There are about 50 tables with various parameters.

The application typicall goes like this

select id, child_tb_key_id, <fields with parms> from tb1

then for each selected row above
     select from the child table do a select (like the above)

and so on -- many levels deep



I know that it is not a proper way to use SQL
Instead we should be selecting many rows at once, joining them/etc

But it is  what it is now...

Queries are very fast though, Postgres reports that the
all the queries for a typical 'load' operation take 0.8 seconds
-- however overall time that the GUI user perceives is 8 seconds.
Out of that 8 seconds a big chunk is in the sending of the SQL
statements/receiving results back -- just network traffic, parsing/etc

There are total about 2400 queries that happen in that period of time
(just selects)




I am trying to figure out how can I optimize PG configuration
to suite such a contrived deployment of Postgres.

For example, we do not mind PG running on the same machine
as the Client app (it is connected via Qt Sql Pg plugin (so it uses
Pg native access library underneath).

Are there any optmization can be done for that?


Also this is a 'single' client/single connection system
what optimizations can be done for that?

and finally since most of the queries are very quick index-based
selects what can be done to optimize the traffic between pg and
the client?



thank you in advance for
any recommendations/pointers.




--
Vlad P
author of C++  ORM  http://github.com/vladp/CppOrm/tree/master


--
http://www.fastmail.fm - Send your email first class


Re: [Q] optmizing postgres for 'single client' / many small queries

От
Pavel Stehule
Дата:
Hello

Are you sure, so you have to use PostgreSQL - maybe SQLite or
memcached is better for your task.

regards
Pavel Stehule

2009/9/2 V S P <toreason@fastmail.fm>:
> Hi,
> our application is using Postgres in a rather unusuall way.
> It is used by a GUI application to store several hundred
> thousand 'parameters'.  Basically it is used like a big INI
> file.
>
> There are about 50 tables with various parameters.
>
> The application typicall goes like this
>
> select id, child_tb_key_id, <fields with parms> from tb1
>
> then for each selected row above
>     select from the child table do a select (like the above)
>
> and so on -- many levels deep
>
>
>
> I know that it is not a proper way to use SQL
> Instead we should be selecting many rows at once, joining them/etc
>
> But it is  what it is now...
>
> Queries are very fast though, Postgres reports that the
> all the queries for a typical 'load' operation take 0.8 seconds
> -- however overall time that the GUI user perceives is 8 seconds.
> Out of that 8 seconds a big chunk is in the sending of the SQL
> statements/receiving results back -- just network traffic, parsing/etc
>
> There are total about 2400 queries that happen in that period of time
> (just selects)
>
>
>
>
> I am trying to figure out how can I optimize PG configuration
> to suite such a contrived deployment of Postgres.
>
> For example, we do not mind PG running on the same machine
> as the Client app (it is connected via Qt Sql Pg plugin (so it uses
> Pg native access library underneath).
>
> Are there any optmization can be done for that?
>
>
> Also this is a 'single' client/single connection system
> what optimizations can be done for that?
>
> and finally since most of the queries are very quick index-based
> selects what can be done to optimize the traffic between pg and
> the client?
>
>
>
> thank you in advance for
> any recommendations/pointers.
>
>
>
>
> --
> Vlad P
> author of C++  ORM  http://github.com/vladp/CppOrm/tree/master
>
>
> --
> http://www.fastmail.fm - Send your email first class
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: [Q] optmizing postgres for 'single client' / many small queries

От
"V S P"
Дата:
Hi,
yes, I am sure I have to continue supporting Postgres
at this time, it would take enormous effor to change to something else
But yes, sqlite or tokiocabinet in my view would be good options (the
decison was made
some time ago, unfortunately).



On Wed, 02 Sep 2009 19:49 +0200, "Pavel Stehule"
<pavel.stehule@gmail.com> wrote:
> Hello
>
> Are you sure, so you have to use PostgreSQL - maybe SQLite or
> memcached is better for your task.
>
> regards
> Pavel Stehule
>
> 2009/9/2 V S P <toreason@fastmail.fm>:
> > Hi,
> > our application is using Postgres in a rather unusuall way.
> > It is used by a GUI application to store several hundred
> > thousand 'parameters'.  Basically it is used like a big INI
> > file.
> >
> > There are about 50 tables with various parameters.
> >
> > The application typicall goes like this
> >
> > select id, child_tb_key_id, <fields with parms> from tb1
> >
> > then for each selected row above
> >     select from the child table do a select (like the above)
> >
> > and so on -- many levels deep
> >
> >
> >
> > I know that it is not a proper way to use SQL
> > Instead we should be selecting many rows at once, joining them/etc
> >
> > But it is  what it is now...
> >
> > Queries are very fast though, Postgres reports that the
> > all the queries for a typical 'load' operation take 0.8 seconds
> > -- however overall time that the GUI user perceives is 8 seconds.
> > Out of that 8 seconds a big chunk is in the sending of the SQL
> > statements/receiving results back -- just network traffic, parsing/etc
> >
> > There are total about 2400 queries that happen in that period of time
> > (just selects)
> >
> >
> >
> >
> > I am trying to figure out how can I optimize PG configuration
> > to suite such a contrived deployment of Postgres.
> >
> > For example, we do not mind PG running on the same machine
> > as the Client app (it is connected via Qt Sql Pg plugin (so it uses
> > Pg native access library underneath).
> >
> > Are there any optmization can be done for that?
> >
> >
> > Also this is a 'single' client/single connection system
> > what optimizations can be done for that?
> >
> > and finally since most of the queries are very quick index-based
> > selects what can be done to optimize the traffic between pg and
> > the client?
> >
> >
> >
> > thank you in advance for
> > any recommendations/pointers.
> >
> >
> >
> >
> > --
> > Vlad P
> > author of C++  ORM  http://github.com/vladp/CppOrm/tree/master
> >
> >
> > --
> > http://www.fastmail.fm - Send your email first class
> >
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> >
--
Vlad P
author of C++  ORM  http://github.com/vladp/CppOrm/tree/master


--
http://www.fastmail.fm - Accessible with your email software
                          or over the web


Re: [Q] optmizing postgres for 'single client' / many small queries

От
Tom Lane
Дата:
"V S P" <toreason@fastmail.fm> writes:
> The application typicall goes like this

> select id, child_tb_key_id, <fields with parms> from tb1

> then for each selected row above
>      select from the child table do a select (like the above)

> and so on -- many levels deep

Seems like you need to fix your data representation so that this
operation can be collapsed into one query.  The main problem looks
to be a bogus decision to have separate child tables rather than
one big table with an extra key column.

            regards, tom lane

Re: [Q] optmizing postgres for 'single client' / many small queries

От
"V S P"
Дата:
Well, actually
somebody has written a C++ ORM
that allows to do things (just a sketch)

class L: CDbCapable
{
  public:
  int prop1;
  int prop2;
}


class A:  CDbCapable
{
  QArray<L>   list_of_props_xyz;

}


int main ()
{
   A inst1;

   inst1.create_or_update_DbSchemaIfNeeded();


   inst1.readFromDb();
   //modifying something

   inst1.writeToDb();

}



As it is well known C++ lack of Reflection prevents it from having
standardise  Data serialization
libraries to files or to Databases.

So in-house a mechanism was developed to do the above.  It took some
time and it is not possible to just yank it out.


Of course, internally in the ORM's implementation a somewhat
questionable decision was made that
to process arrays of 'children' for a given instance would require
separate SQL statements.


That's where the problem comes from, I understand what needs to be done
to redesign the approach/etc.  And that
will take more time than currently is available.

Therefore, I just wanted to ask if there there are some things in Pg
that can I can experiment with
(local client/server communcations via IPC,  reducing the speed of SQL
parses, any other possible tricks)



Thank you  in advance




On Wed, 02 Sep 2009 14:26 -0400, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> "V S P" <toreason@fastmail.fm> writes:
> > The application typicall goes like this
>
> > select id, child_tb_key_id, <fields with parms> from tb1
>
> > then for each selected row above
> >      select from the child table do a select (like the above)
>
> > and so on -- many levels deep
>
> Seems like you need to fix your data representation so that this
> operation can be collapsed into one query.  The main problem looks
> to be a bogus decision to have separate child tables rather than
> one big table with an extra key column.
>
>             regards, tom lane
--
Vlad P
author of C++  ORM  http://github.com/vladp/CppOrm/tree/master


--
http://www.fastmail.fm - mmm... Fastmail...


Re: [Q] optmizing postgres for 'single client' / many small queries

От
Tom Lane
Дата:
"V S P" <toreason@fastmail.fm> writes:
> Well, actually
> somebody has written a C++ ORM
> [ that is causing all your problems and you say you can't discard ]

Just out of curiosity, does anyone know of any ORM anywhere that doesn't
suck?  They seem to be uniformly awful, at least in terms of their
interfaces to SQL databases.  If there were some we could recommend,
maybe people would be less stuck with these bogus legacy architectures.

            regards, tom lane

Re: [Q] optmizing postgres for 'single client' / many small queries

От
"V S P"
Дата:
I do not know of any for C++.

That's why I started my own (which is not the one used for the problem I
am having  :- ) )
http://github.com/vladp/CppOrm
it works with Pg 8.3+ and VC++ compiler sofar (but support for more
platforms and Dbs will be added in the future).
My Orm is not really an ORM because I did not implement anything that
would 'traverse' object instance
relationships (which is what the ORM in question here is doing).
Instead I just automagically generate SQL code for insert/update/deletes
for classes that map to tables (one-to-one).


The basic problem is that C++ standards comittee in my view just
sucks... i do not have any better words
for it.  It is because of lack of reflection (ability to identify at
runtime variable names/functions names)
that an ORM, or HTTP session storage/retrival mechanism, JSON/XML
parsers that parse text right into class instances
-- cannot be implemented
Basically the things that are needed to deal with 'Typeless' data at
runtime (such that XML/JSON/Database queries)
and map that data to the C++ object instances.

Which is in the 'high-level view' why C++ is not used for web
development.


Yes there are 'attempts' in that area -- but all are different, require
quite a bit of sophistication
and are not complete (The reflection mechanism I implemented for my
cpporm is not complete either).

If C++ would have supported Reflection -- the there would be
C++_Hibernate, C++_LINQ, C++_json, C++_xml, C++_HTTP, C++_HTTPSession
and so on... (and no they would have been memory hogs -- thanks to now
standard reference counting in C++ via shared_ptr and good use of
allocators)


sorry for the rant,
still looking for any bright ideas on optimizing for many small
queries/local db host situations.

Thanks









On Wed, 02 Sep 2009 14:45 -0400, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> "V S P" <toreason@fastmail.fm> writes:
> > Well, actually
> > somebody has written a C++ ORM
> > [ that is causing all your problems and you say you can't discard ]
>
> Just out of curiosity, does anyone know of any ORM anywhere that doesn't
> suck?  They seem to be uniformly awful, at least in terms of their
> interfaces to SQL databases.  If there were some we could recommend,
> maybe people would be less stuck with these bogus legacy architectures.
>
>             regards, tom lane
--
Vlad P
author of C++  ORM  http://github.com/vladp/CppOrm/tree/master


--
http://www.fastmail.fm - Faster than the air-speed velocity of an
                          unladen european swallow


Re: [Q] optmizing postgres for 'single client' / many small queries

От
David Fetter
Дата:
On Wed, Sep 02, 2009 at 02:45:39PM -0400, Tom Lane wrote:
> "V S P" <toreason@fastmail.fm> writes:
> > Well, actually somebody has written a C++ ORM
> > [ that is causing all your problems and you say you can't discard ]
>
> Just out of curiosity, does anyone know of any ORM anywhere that
> doesn't suck?  They seem to be uniformly awful, at least in terms of
> their interfaces to SQL databases.  If there were some we could
> recommend, maybe people would be less stuck with these bogus legacy
> architectures.

Hibernate has the very nice feature of being able to get out of your
way.  Properly used, it can keep completely out of the business of
making (wrong) guesses based on DDL, which is what ORMs often do.
DBIx::Class <http://search.cpan.org/dist/DBIx-Class/> has gone a long
way in the right direction.

Ones which (attempt to) dictate decisions about DDL are just off the
map. :P

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: [Q] optmizing postgres for 'single client' / many small queries

От
Richard Broersma
Дата:
On Wed, Sep 2, 2009 at 4:35 PM, David Fetter<david@fetter.org> wrote:

> Hibernate has the very nice feature of being able to get out of your
> way.  Properly used, it can keep completely out of the business of
> making (wrong) guesses based on DDL, which is what ORMs often do.
> DBIx::Class <http://search.cpan.org/dist/DBIx-Class/> has gone a long
> way in the right direction.
>
> Ones which (attempt to) dictate decisions about DDL are just off the
> map. :P

David, do you know how well these kinds of ORMs work when it come to
mapping  non-trivial schema designs?  For example, how would these
work when creating a mapping for the multiple inheritance design that
you've blogged about earlier?


--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

Re: [Q] optmizing postgres for 'single client' / many small queries

От
David Fetter
Дата:
On Thu, Sep 03, 2009 at 07:24:50AM -0700, Richard Broersma wrote:
> On Wed, Sep 2, 2009 at 4:35 PM, David Fetter<david@fetter.org> wrote:
>
> > Hibernate has the very nice feature of being able to get out of
> > your way.  Properly used, it can keep completely out of the
> > business of making (wrong) guesses based on DDL, which is what
> > ORMs often do.  DBIx::Class
> > <http://search.cpan.org/dist/DBIx-Class/> has gone a long way in
> > the right direction.
> >
> > Ones which (attempt to) dictate decisions about DDL are just off
> > the map. :P
>
> David, do you know how well these kinds of ORMs work when it come to
> mapping  non-trivial schema designs?  For example, how would these
> work when creating a mapping for the multiple inheritance design
> that you've blogged about earlier?

If your mapper only does the job of mapping, you can choose
classes/objects and then map them to the appropriate, possibly
parameterized, SQL queries, which the DB people can then freely
rearrange.

One nice feature of such a system is that the DBA and/or DB developer
has a way to know what the client code expects.  In OO terms, there
are public interfaces--everything mentioned in the ORM layer--and
private interfaces--DDL, DML, and DCL--to the database.

Programmers who like to use object-oriented languages and methods
should be happy about this object-oriented approach to database
management, but for some reason, a lot of them don't understand that
the idea of public and private interfaces applies to what they (too
simplistically, much of the time) think of as "the persistence layer."

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: [Q] optmizing postgres for 'single client' / many small queries

От
Dimitri Fontaine
Дата:
Hi,

Tom Lane <tgl@sss.pgh.pa.us> writes:
> Just out of curiosity, does anyone know of any ORM anywhere that doesn't
> suck?  They seem to be uniformly awful, at least in terms of their
> interfaces to SQL databases.  If there were some we could recommend,
> maybe people would be less stuck with these bogus legacy architectures.

It seems like people interrested into ORMs are the one who do not want
to tackle SQL... and for people having some time to spend on the
possibility of finding a good ORM:
  http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx

A more practical discussion seems to be here (I still have to read it):
  http://omniti.com/seeds/orms-done-right

Regards,
--
dim

PS: In short my advice is always to choose an ORM where it's easy to
bypass query generation, and stick to not letting it generate
SQL. Sometime basic CRUD is ok though (INSERT/UPDATE/DELETE one
object/row at a time).

Re: [Q] optmizing postgres for 'single client' / many small queries

От
William Temperley
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Just out of curiosity, does anyone know of any ORM anywhere that doesn't
> suck?  They seem to be uniformly awful, at least in terms of their
> interfaces to SQL databases.  If there were some we could recommend,
> maybe people would be less stuck with these bogus legacy architectures.

I personally like sqlalchemy - http://www.sqlalchemy.org/.

Some of the good things:
1. Table reflection, i.e. dynamic generation of mapped python classes
- great at development time - no mismatch between db and client side
code.
2. Easily extensible - about 10 lines of code gives you an interface
to Geos geometries retrieved from PostGIS
3. It seems to be possible to produce arbitrarily complex sql
statements. OK, they're usually less readable than sql, but for
dynamic query generation it definitely beats manually chopping up
text.

Cheers, Will T

Re: [Q] optmizing postgres for 'single client' / many small queries

От
Jasen Betts
Дата:
On 2009-09-02, V S P <toreason@fastmail.fm> wrote:
> Hi,
> our application is using Postgres in a rather unusuall way.
> It is used by a GUI application to store several hundred
> thousand 'parameters'.  Basically it is used like a big INI
> file.
>
> There are about 50 tables with various parameters.
>
> The application typicall goes like this
>
> select id, child_tb_key_id, <fields with parms> from tb1
>
> then for each selected row above
>      select from the child table do a select (like the above)
>
> and so on -- many levels deep
>
> I know that it is not a proper way to use SQL
> Instead we should be selecting many rows at once, joining them/etc
>
> But it is  what it is now...

rewriting it as a single query will speed up operation.

> Queries are very fast though, Postgres reports that the
> all the queries for a typical 'load' operation take 0.8 seconds
> -- however overall time that the GUI user perceives is 8 seconds.
> Out of that 8 seconds a big chunk is in the sending of the SQL
> statements/receiving results back -- just network traffic, parsing/etc

0.8s processing 7.2s latency

> There are total about 2400 queries that happen in that period of time
> (just selects)

> I am trying to figure out how can I optimize PG configuration
> to suite such a contrived deployment of Postgres.

> For example, we do not mind PG running on the same machine
> as the Client app (it is connected via Qt Sql Pg plugin (so it uses
> Pg native access library underneath).
>
> Are there any optmization can be done for that?

try using libpq directly.

> Also this is a 'single' client/single connection system
> what optimizations can be done for that?

asynchronous queries if you can know what to ask second before the answer to
the first query

> and finally since most of the queries are very quick index-based
> selects what can be done to optimize the traffic between pg and
> the client?

unix domain sockets

> thank you in advance for
> any recommendations/pointers.

but serously bite the bullet and redo it a a single query that returns
2400 rows instead of 2400 queries that return one row.

or slurp all the tables into dictionaries (hash tables/arrays/whatever) in your
QT application. and write your queries in C++ instead of SQL.