Discussion: Getting time of a postgresql-request

From:
"Kai Behncke"
Date:

Dear users,

I am trying to optimize the time my PostgreSQL queries take, but as a first step
I of course need to measure that time.

I know that with:

EXPLAIN ANALYSE SELECT bundesland from
 bundesland WHERE ST_Contains(the_geom, $punktgeometrie_start) AND
 ST_Contains(the_geom, $punktgeometrie_ende)

I can get that time on the command line.

But I would like to get it in a php-script, like

$timerequest_result=pg_result($timerequest,0);

(well, that does not work).

I wonder: Is there another way to get the time a request needs?
How do you handle this?

Thank you very much, Kai


From:
Russell Smith
Date:

Kai Behncke wrote:
>
> But I would like to get it in a php-script, like
>
> $timerequest_result=pg_result($timerequest,0);
>
> (well, that does not work).
>
> I wonder: Is there another way to get the time a request needs?
> How do you handle this?
>
$time = microtime(true);
$result = pg_query($query);
echo "Time to run query and return result to PHP: ".(microtime(true) - $time);

Something like that.
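The same wall-clock measurement, sketched in Python against an in-memory SQLite database just to keep the example self-contained (the thread's context is PHP + PostgreSQL, where `microtime(true)` plays the role of the timer):

```python
import sqlite3
import time

# Stand-in data; the thread's actual query runs against PostGIS tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bundesland (name TEXT)")
conn.execute("INSERT INTO bundesland VALUES ('Bayern')")

# Wall-clock time around query + fetch, as in the PHP snippet above.
start = time.perf_counter()
rows = conn.execute("SELECT name FROM bundesland").fetchall()
elapsed = time.perf_counter() - start

print(rows)  # [('Bayern',)]
```

Note that this measures query, round-trip, and fetch together; the time reported by EXPLAIN ANALYZE is server-side execution only.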

Regards

Russell

From:
Pierre Frédéric Caillaud
Date:

On Tue, 18 Aug 2009 06:25:57 +0200, Russell Smith <>
wrote:

> Kai Behncke wrote:
>>
>> But I would like to get it in a php-script, like
>>
>> $timerequest_result=pg_result($timerequest,0);
>>
>> (well, that does not work).
>>
>> I wonder: Is there another way to get the time a request needs?
>> How do you handle this?
>>
> $time = microtime(true);
> $result = pg_query($query);
> echo "Time to run query and return result to PHP: ".(microtime(true) -
> $time);
>
> Something like that.
>
> Regards
>
> Russell
>

I use the following functions, which protect against SQL injection, make
using the db a lot easier, and log query times to display at the bottom of
the page.
It is much less cumbersome than PEAR::DB or PDO, which force you to use
prepared statements (slower if you throw them away after using them just
once).

db_query( "SELECT * FROM stuff WHERE a=%s AND b=%s", array( $a, $b ))

db_query( "SELECT * FROM stuff WHERE id IN (%s) AND b=%s", array(
$list_of_ints, $b ))

------------

function db_quote_query( $sql, $params=false )
{
    // if no params, send query raw
    if( $params === false )    return $sql;
    if( !is_array( $params )) $params = array( $params );

    // quote params
    foreach( $params as $key => $val )
    {
        if( is_array( $val ))
            $params[$key] = implode( ', ', array_map( 'intval', $val ));
        else
            $params[$key] = is_null($val) ? 'NULL' : ("'".pg_escape_string($val)."'");
    }
    return vsprintf( $sql, $params );
}

function db_query( $sql, $params=false )
{
    // it's already a query
    if( is_resource( $sql ))
        return $sql;

    $sql = db_quote_query( $sql, $params );

    $t = getmicrotime( true );
    if( DEBUG > 1 )    xdump( $sql );
    $r = pg_query( $sql );
    if( !$r )
    {
        if( DEBUG > 1 )
        {
            echo "<div class=bigerror><b>Erreur PostgreSQL :</b><br />"
                .htmlspecialchars(pg_last_error())
                ."<br /><br /><b>Requête</b> :<br />".$sql
                ."<br /><br /><b>Traceback </b>:<pre>";
            foreach( debug_backtrace() as $t ) xdump( $t );
            echo "</pre></div>";
        }
        die();
    }
    if( DEBUG > 1)    xdump( $r );
    global $_global_queries_log, $_mark_query_time;
    $_mark_query_time = getmicrotime( true );
    $_global_queries_log[] = array( $_mark_query_time-$t, $sql );
    return $r;
}
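The quote-then-substitute idea above can be sketched in Python (the function names here are hypothetical; this only mirrors the %s interpolation with escaping, and a driver's own parameter binding is usually the safer default):

```python
def quote_value(val):
    """Render one value as a safe SQL literal."""
    if val is None:
        return "NULL"
    if isinstance(val, list):
        # Lists become comma-separated integers, for IN (...) clauses.
        return ", ".join(str(int(v)) for v in val)
    # Double single quotes, the same effect pg_escape_string has on quotes.
    return "'" + str(val).replace("'", "''") + "'"

def quote_query(sql, params=None):
    """Substitute quoted params into %s placeholders, like db_quote_query."""
    if params is None:
        return sql
    if not isinstance(params, (list, tuple)):
        params = [params]
    return sql % tuple(quote_value(p) for p in params)

sql = quote_query("SELECT * FROM stuff WHERE id IN (%s) AND b=%s",
                  [[1, 2, 3], "o'hara"])
print(sql)  # SELECT * FROM stuff WHERE id IN (1, 2, 3) AND b='o''hara'
```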

From:
Fabio La Farcioli
Date:

Hi to all,

i am developing a web app for thousands of users (1.000/2.000).

Each user has 2 work tables... so in the end I have 2.000 (users) x 2
tables = 4.000 tables!

Does Postgres support such a high number of tables?
Will I have performance problems?


Thanks

Sorry for my english

From:
Jochen Erwied
Date:

Thursday, August 20, 2009, 9:01:30 AM you wrote:

> i am developing a web app for thousands users (1.000/2.000).

> Each user have a 2 table of work...I finally have 2.000 (users) x 2
> tables = 4.000 tables!

If all tables are created equal, I would rethink the design. Instead of
using 2 tables per user I'd use 2 tables with one column specifying the
user(-id).

Especially changes in table layout would require you to change up to 2000
tables, which is prone to errors...

--
Jochen Erwied     |   home:      +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 |   work:   +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile:        +49-173-5404164


From:
Craig Ringer
Date:

On Thu, 2009-08-20 at 09:01 +0200, Fabio La Farcioli wrote:

> Each user have a 2 table of work...I finally have 2.000 (users) x 2
> tables = 4.000 tables!

Hmm, ok. Does each user really need two tables each? Why?

Does the set of tables for each user have a different structure? Or are
you separating them so you can give each user a separate database role
and ownership of their own tables?


> Postgres support an elevate number of tables??

Thousands? Sure.

> i have problem of performance ???
>
Yes, you probably will. There is a cost to having _lots_ of tables in
PostgreSQL in terms of maintaining table statistics, autovacuum work,
etc. I doubt it'll be too bad at 4000 tables, but if your user numbers
keep growing it could become a problem.

Other concerns are that it'll also be hard to maintain your design,
difficult to write queries that read data from more than one user, etc.
If you need to change the schema of your user tables you're going to
have to write custom tools to automate it. It could get very clumsy.

Instead of one or two tables per user, perhaps you should keep the data
in just a few tables, with a composite primary key that includes the
user ID. eg given the users table ("user" by itself is a reserved word
in PostgreSQL):

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  name text
);

instead of:

CREATE TABLE user1_tablea(
  id INTEGER PRIMARY KEY,
  blah text,
  blah2 integer
);

CREATE TABLE user2_tablea(
  id INTEGER PRIMARY KEY,
  blah text,
  blah2 integer
);

... etc ...


you might write:

CREATE TABLE tablea  (
  user_id INTEGER REFERENCES users(id),
  id INTEGER,
  PRIMARY KEY(user_id, id),
  blah text,
  blah2 integer
);


You can, of course, partition this table into blocks of user-IDs behind
the scenes, but your partitioning is invisible to your web app and can
be done solely for performance reasons. You don't have to try juggling
all these little tables.
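To make the composite-key layout concrete, here is a small sketch (Python + SQLite purely for illustration; the table and column names are the ones from the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tablea (
        user_id INTEGER REFERENCES users(id),
        id INTEGER,
        blah TEXT,
        PRIMARY KEY (user_id, id)
    );
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO tablea VALUES (?, ?, ?)",
                 [(1, 1, "a-row"), (1, 2, "another"), (2, 1, "b-row")])

# One shared table serves every user; each sees only their own rows.
rows = conn.execute(
    "SELECT id, blah FROM tablea WHERE user_id = ? ORDER BY id", (1,)
).fetchall()
print(rows)  # [(1, 'a-row'), (2, 'another')]
```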


Note that whether this is a good idea DOES depend on how much data
you're going to have. If each user table will have _lots_ of data, then
individual tables might be a better approach after all. It's also a
benefit if you do intend to give each user their own database role.

--
Craig Ringer



From:
Fabio La Farcioli
Date:

Craig Ringer ha scritto:
> On Thu, 2009-08-20 at 09:01 +0200, Fabio La Farcioli wrote:
>
>> Each user have a 2 table of work...I finally have 2.000 (users) x 2
>> tables = 4.000 tables!
>
> Hmm, ok. Does each user really need two tables each? Why?
>
> Does the set of tables for each user have a different structure? Or are
> you separating them so you can give each user a separate database role
> and ownership of their own tables?
>
No no...

>> i have problem of performance ???
>>
> Yes, you probably will. There is a cost to having _lots_ of tables in
> PostgreSQL in terms of maintaining table statistics, autovacuum work,
> etc. I doubt it'll be too bad at 4000 tables, but if your user numbers
> keep growing it could become a problem.
>
The number of users will probably increase over time...

> Other concerns are that it'll also be hard to maintain your design,
> difficult to write queries that read data from more than one user, etc.
> If you need to change the schema of your user tables you're going to
> have to write custom tools to automate it. It could get very clumsy.
>
That's true... I hadn't thought about this problem...


> Note that whether this is a good idea DOES depend on how much data
> you're going to have. If each user table will have _lots_ of data, then
> individual tables might be a better approach after all. It's also a
> benefit if you do intend to give each user their own database role.

Every table has between 1.000 and 100.000 (MAX) records...

Do you think I will have performance problems?
Each user only views the records with their own user_id...

I am thinking of redesigning the DB


From:
Craig James
Date:

Fabio La Farcioli wrote:
> i am developing a web app for thousands users (1.000/2.000).
>
> Each user have a 2 table of work...I finally have 2.000 (users) x 2
> tables = 4.000 tables!
>
> Postgres support an elevate number of tables??
> i have problem of performance ???

We have run databases with over 100,000 tables with no problems.

However, we found that it's not a good idea to have a table-per-user design.  As you get more users, it is hard to
maintain the database.  Most of the time there are only a few users active.

So, we create a single large "archive" table, identical to the per-user table except that it also has a user-id column.
When a user hasn't logged in for a few hours, a cron process copies their tables into the large archive table, and
returns their personal tables to a "pool" of available tables.

When the user logs back in, a hidden part of the login process gets a table from the pool of available tables, assigns
it to this user, and copies the user's data from the archive into this personal table.  They are now ready to work.
This whole process takes just a fraction of a second for most users.

We keep a pool of about 200 tables, which automatically will expand (create more tables) if needed, but we've never had
more than 200 users active at one time.
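The checkout/archive cycle described above might be sketched like this (Python + SQLite; the table names and the single pool table are hypothetical simplifications of the scheme):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE archive (user_id INTEGER, row_id INTEGER, payload TEXT);
    CREATE TABLE pool_1 (row_id INTEGER, payload TEXT);  -- one table from the pool
""")
conn.executemany("INSERT INTO archive VALUES (?, ?, ?)",
                 [(7, 1, "x"), (7, 2, "y"), (8, 1, "z")])

def checkout(conn, user_id, pool_table):
    """On login: copy the user's archived rows into an assigned pool table."""
    conn.execute(f"DELETE FROM {pool_table}")
    conn.execute(
        f"INSERT INTO {pool_table} SELECT row_id, payload FROM archive WHERE user_id = ?",
        (user_id,))

def archive_back(conn, user_id, pool_table):
    """On idle timeout: move rows back to the archive and free the table."""
    conn.execute("DELETE FROM archive WHERE user_id = ?", (user_id,))
    conn.execute(
        f"INSERT INTO archive SELECT ?, row_id, payload FROM {pool_table}",
        (user_id,))
    conn.execute(f"DELETE FROM {pool_table}")

checkout(conn, 7, "pool_1")
print(conn.execute("SELECT COUNT(*) FROM pool_1").fetchone())  # (2,)
archive_back(conn, 7, "pool_1")
print(conn.execute("SELECT COUNT(*) FROM pool_1").fetchone())  # (0,)
```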

Craig

From:
Greg Stark
Date:

On Thu, Aug 20, 2009 at 9:16 PM, Craig James<> wrote:
> Fabio La Farcioli wrote:
>>
>> i am developing a web app for thousands users (1.000/2.000).
>>
>> Each user have a 2 table of work...I finally have 2.000 (users) x 2 tables
>> = 4.000 tables!
>>
>> Postgres support an elevate number of tables??
>> i have problem of performance ???

What you want is a multi-column primary key where userid is part of
the key. You don't want to have a separate table for each user unless
each user has their own unique set of columns.


> When the user logs back in, a hidden part of the login process gets a table
> from the pool of available tables, assigns it to this user, and copies the
> user's  data from the archive into this personal table.  They are now ready
> to work. This whole process takes just a fraction of a second for most
> users.

And what does all this accomplish?


--
greg
http://mit.edu/~gsstark/resume.pdf

From:
Craig James
Date:

Greg Stark wrote:
> What you want is a multi-column primary key where userid is part of
> the key. You don't want to have a separate table for each user unless
> each user has their own unique set of columns.

Not always true.

>> When the user logs back in, a hidden part of the login process gets a table
>> from the pool of available tables, assigns it to this user, and copies the
>> user's  data from the archive into this personal table.  They are now ready
>> to work. This whole process takes just a fraction of a second for most
>> users.
>
> And what does all this accomplish?

The primary difference is between

  delete from big_table where userid = xx

versus

  truncate user_table

There are also significant differences in performance for large inserts, because a single-user table almost never needs
indexes at all, whereas a big table for everyone has to have at least one user-id column that's indexed.

In our application, the per-user tables are "hitlists" -- scratch lists that are populated something like this:

   create sequence hitlist_seq_xxx;

   create table hitlist_xxx (
     row_id integer,
     sortorder integer default nextval('hitlist_seq_xxx')
   );


   truncate table hitlist_xxx;
   select setval('hitlist_seq_xxx', 1, false);
   insert into hitlist_xxx (row_id) (select some_id from ... where ... order by ...);

Once the hitlist is populated, the user can page through it quickly with no further searching, e.g. using a web app.
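Paging through a populated hitlist is then a cheap range scan on sortorder, with no searching or re-sorting at view time. A sketch with SQLite, using AUTOINCREMENT as a stand-in for the PostgreSQL sequence:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE hitlist (
        sortorder INTEGER PRIMARY KEY AUTOINCREMENT,
        row_id INTEGER
    )
""")
# Populate from a (hypothetical) search, already in the desired order.
conn.executemany("INSERT INTO hitlist (row_id) VALUES (?)",
                 [(i * 10,) for i in range(1, 26)])  # 25 hits

# Page 2 at 10 rows per page: a plain range scan on the primary key.
page = conn.execute(
    "SELECT row_id FROM hitlist WHERE sortorder BETWEEN 11 AND 20 ORDER BY sortorder"
).fetchall()
print(page[0], page[-1])  # (110,) (200,)
```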

We tested the performance using a single large table in Postgres, and it was not nearly what we needed.  These hitlists
tend to be transitory, and the typical operation is to discard the entire list and create a new one.  Sometimes the user
will sort the entire list based on some criterion, which also requires a copy/delete/re-insert using a new order-by.

With both Oracle and Postgres, truncate is MUCH faster than delete, and the added index needed for a single large table
only makes it worse.  With Postgres, the repeated large delete/insert makes for tables that need a lot of vacuuming and
index bloat, further hurting performance.

Craig

From:
Greg Stark
Date:

On Thu, Aug 20, 2009 at 11:18 PM, Craig James<> wrote:
> Greg Stark wrote:
>>
>> What you want is a multi-column primary key where userid is part of
>> the key. You don't want to have a separate table for each user unless
>> each user has their own unique set of columns.
> Not always true.
...
> The primary difference is between
>  delete from big_table where userid = xx
> vesus
>  truncate user_table


This is a valid point but it's a fairly special case. For most
applications the overhead of deleting records and having to run vacuum
will be manageable and a small contribution to the normal vacuum
traffic. Assuming the above is necessary is a premature optimization
which is probably unnecessary.


> There are also significant differences in performance for large inserts,
> because a single-user table almost never needs indexes at all, whereas a big
> table for everyone has to have at least one user-id column that's indexed.

Maintaining indexes isn't free but one index is hardly going to be a
dealbreaker.

> Once the hitlist is populated, the user can page through it quickly with no
> further searching, e.g. using a web app.

The "traditional" approach to this would be a temporary table. However
in the modern world of web applications where the user session does
not map directly to a database session that no longer works (well it
never really worked in Postgres where temporary tables are not so
lightweight :( ).

It would be nice to have a solution to that where you could create
lightweight temporary objects which belong to an "application session"
which can be picked up by a different database connection each go
around.
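The mismatch described here is easy to demonstrate: a temp table is scoped to one database connection, so a later web request served by a different pooled connection cannot see it. A minimal illustration with SQLite (the scoping rule is the same in PostgreSQL):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")
conn_a = sqlite3.connect(path)  # pretend: pooled connection serving request 1
conn_b = sqlite3.connect(path)  # pretend: pooled connection serving request 2

conn_a.execute("CREATE TABLE stuff (id INTEGER)")             # ordinary table
conn_a.execute("CREATE TEMP TABLE hitlist (row_id INTEGER)")  # session-local
conn_a.commit()

# The ordinary table is visible from the second connection...
print(conn_b.execute("SELECT COUNT(*) FROM stuff").fetchone())  # (0,)

# ...but the temp table is not: it belongs to conn_a's session only.
try:
    conn_b.execute("SELECT * FROM hitlist")
    temp_visible = True
except sqlite3.OperationalError:
    temp_visible = False
print(temp_visible)  # False
```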

--
greg
http://mit.edu/~gsstark/resume.pdf

From:
Alvaro Herrera
Date:

Greg Stark wrote:

> It would be nice to have a solution to that where you could create
> lightweight temporary objects which belong to an "application session"
> which can be picked up by a different database connection each go
> around.

It would be useful:

CREATE SCHEMA session1234 UNLOGGED
  CREATE TABLE hitlist ( ... );

Each table in the "session1234" schema would not be WAL-logged, and
would be automatically dropped on crash recovery (actually the whole
schema would be).  But while the server is live it behaves like a
regular schema/table and can be seen by all backends (i.e. not temp).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

From:
Greg Stark
Date:

On Fri, Aug 21, 2009 at 1:38 AM, Alvaro
Herrera<> wrote:
> Greg Stark wrote:
>
>> It would be nice to have a solution to that where you could create
>> lightweight temporary objects which belong to an "application session"
>> which can be picked up by a different database connection each go
>> around.
>
> It would be useful:
>
> CREATE SCHEMA session1234 UNLOGGED
>  CREATE TABLE hitlist ( ... );
>
> Each table in the "session1234" schema would not be WAL-logged, and
> would be automatically dropped on crash recovery (actually the whole
> schema would be).  But while the server is live it behaves like a
> regular schema/table and can be seen by all backends (i.e. not temp)

I don't think unlogged is the only, and perhaps not even the most
important, desirable property.

I would want these objects not to cause catalog churn. I might have
thousands of sessions being created all the time and creating new rows
and index pointers which have to be vacuumed would be a headache.

I would actually want the objects to be invisible to other sessions,
at least by default. You would have to have the handle for the
application session to put them into your scope, and then you would get
them all en masse. This isn't so much for security -- I would be fine
if there was a back door if you have the right privileges -- but for
application design, so application queries could use prepared plans
without having to hard-code the session information into the query text
and be replanned.

I'm not sure if they should use shared buffers or local buffers. As
long as only one backend at a time could access them it would be
possible to use local buffers and evict them all when the handle is
given up. But that means giving up any caching benefit across
sessions. On the other hand it means they'll be much lighter weight
and easier to make safely unlogged than if they lived in shared
buffers.

These are just some brainstorming ideas, I don't have a clear vision
of how to achieve all this yet. This does sound a lot like the SQL
standard temp table discussion and I think Tom and I are still at odds
on that. Creating new catalog entries for them gives up -- what I
think is the whole point of their design -- their lack of DDL
overhead. But my design above means problems for transactional
TRUNCATE and other DDL.


--
greg
http://mit.edu/~gsstark/resume.pdf

From:
Jerry Champlin
Date:

I think this requirement can be lumped into the category of "right
hammer, right nail" instead of the "one hammer, all nails" category.
There are many memory-only or disk-backed in-memory key-value stores
which meet your requirements, like Redis and memcached.
-Jerry

Jerry Champlin|Absolute Performance Inc.

On Aug 20, 2009, at 5:52 PM, Greg Stark <> wrote:

> On Thu, Aug 20, 2009 at 11:18 PM, Craig James<
> > wrote:
>> Greg Stark wrote:
>>>
>>> What you want is a multi-column primary key where userid is part of
>>> the key. You don't want to have a separate table for each user
>>> unless
>>> each user has their own unique set of columns.
>> Not always true.
> ...
>> The primary difference is between
>>  delete from big_table where userid = xx
>> vesus
>>  truncate user_table
>
>
> This is a valid point but it's a fairly special case. For most
> applications the overhead of deleting records and having to run vacuum
> will be manageable and a small contribution to the normal vacuum
> traffic. Assuming the above is necessary is a premature optimization
> which is probably unnecessary.
>
>
>> There are also significant differences in performance for large
>> inserts,
>> because a single-user table almost never needs indexes at all,
>> whereas a big
>> table for everyone has to have at least one user-id column that's
>> indexed.
>
> Maintaining indexes isn't free but one index is hardly going to be a
> dealbreaker.
>
>> Once the hitlist is populated, the user can page through it quickly
>> with no
>> further searching, e.g. using a web app.
>
> The "traditional" approach to this would be a temporary table. However
> in the modern world of web applications where the user session does
> not map directly to a database session that no longer works (well it
> never really worked in Postgres where temporary tables are not so
> lightweight :( ).
>
> It would be nice to have a solution to that where you could create
> lightweight temporary objects which belong to an "application session"
> which can be picked up by a different database connection each go
> around.
>
> --
> greg
> http://mit.edu/~gsstark/resume.pdf
>

From:
Robert Haas
Date:

On Thu, Aug 20, 2009 at 8:38 PM, Alvaro
Herrera<> wrote:
> Greg Stark wrote:
>
>> It would be nice to have a solution to that where you could create
>> lightweight temporary objects which belong to an "application session"
>> which can be picked up by a different database connection each go
>> around.
>
> It would be useful:
>
> CREATE SCHEMA session1234 UNLOGGED
>  CREATE TABLE hitlist ( ... );
>
> Each table in the "session1234" schema would not be WAL-logged, and
> would be automatically dropped on crash recovery (actually the whole
> schema would be).  But while the server is live it behaves like a
> regular schema/table and can be seen by all backends (i.e. not temp)

+1.  In fact, I don't even see why the "unlogged" property needs to be
a schema property.  I think you could just add a table reloption.
(There are some possible foot-gun scenarios if the option were changed
subsequent to table creation, so we'd either need to decide how to
deal with those, or decide not to allow it.)

...Robert

From:
"Kevin Grittner"
Date:

Greg Stark <> wrote:

> Creating new catalog entries for [temp tables] gives up -- what I
> think is the whole point of their design -- their lack of DDL
> overhead.

As long as we're brainstorming...  Would it make any sense for temp
tables to be created as in-memory tuplestores up to the point that we
hit the temp_buffers threshold?  Creating and deleting a whole set of
disk files per temp table is part of what makes them so heavy.
(There's still the issue of dealing with the catalogs, of course....)

-Kevin

From:
Mike Ivanov
Date:

Fabio La Farcioli wrote:
> Hi to all,
>
> i am developing a web app for thousands users (1.000/2.000).
>
> Each user have a 2 table of work...I finally have 2.000 (users) x 2
> tables = 4.000 tables!

As someone with a ~50K-table database, I can tell you it's definitely
possible to survive with such a layout :-)

However, expect very slow (hours) pg_dump, \dt and everything else that
requires reading schema information for the whole db.


Mike


From:
Greg Stark
Date:

On Tue, Sep 1, 2009 at 1:19 AM, Mike Ivanov<> wrote:
>> i am developing a web app for thousands users (1.000/2.000).
>>
>> Each user have a 2 table of work...I finally have 2.000 (users) x 2 tables
>> = 4.000 tables!
>
> As a someone with a ~50K-table database, I can tell you it's definitely
> possible to survive with such a layout :-)

The usual recommendation is to have a single table (or two tables in
this case) with userid forming part of the primary key, in addition to
whatever identifies the records within the user's set of data. You may
not expect to need to run queries which combine multiple users' data
now, but you will eventually.

This doesn't work so great when each user is going to be specifying
their own custom schema on the fly, but that's not really what
relational databases were designed for. For that you might want to
look into the hstore contrib module or something like CouchDB (which
can be combined with Postgres, I hear).
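The hstore idea -- one table where each row carries an open-ended bag of key/value attributes instead of fixed per-user columns -- can be sketched with a JSON text column standing in for hstore (Python + SQLite, illustrative only):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_data (user_id INTEGER, attrs TEXT)")

# Each user stores whatever "columns" they like, with no DDL required.
conn.execute("INSERT INTO user_data VALUES (?, ?)",
             (1, json.dumps({"color": "red", "size": 42})))
conn.execute("INSERT INTO user_data VALUES (?, ?)",
             (2, json.dumps({"nickname": "bob"})))

row = conn.execute(
    "SELECT attrs FROM user_data WHERE user_id = ?", (1,)).fetchone()
attrs = json.loads(row[0])
print(attrs["color"], attrs["size"])  # red 42
```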


--
greg
http://mit.edu/~gsstark/resume.pdf

From:
Mike Ivanov
Date:

Greg Stark wrote:
> You may
> not expect to be need to run queries which combine multiple users'
> data now but you will eventually.
>

We store cross-user data in a separate schema, which solves all *our*
problems.

> This doesn't work so great when each user is going to be specifying
> their own custom schema on the fly

This works fine, at least we didn't encounter any issues with that.

> but that's not really what
> relational databases were designed for.

Sometimes you have to do... you know, unusual things to meet some specific
requirements, like independent user schemas. It's not a conventional web
app we run :-)

I'm not denying this is a somewhat extreme approach, but if one is forced
to go down this path, it's quite walkable ;-)

Mike