Обсуждение: pgbench \for or similar loop

Поиск
Список
Период
Сортировка

pgbench \for or similar loop

От
Alvaro Herrera
Дата:
Hi,

Today (and previously) I wished that pgbench had a mechanism to help
create simple random databases.  For example, I could create a table
"tenk" and fill it with random stuff like

\setrandom foo 1 10000
insert into foo values (:foo)

Now I have to run this 10000 times or something like that.  But I don't
want a transaction for each of those, so I had a desire for something
like this:

begin;
\for iterator 1 10000\setrandom foo 1 :iteratorinsert into foo values (:foo);
\end
commit;

Would something like this be acceptable?

-- 
Álvaro Herrera <alvherre@alvh.no-ip.org>


Re: pgbench \for or similar loop

От
Merlin Moncure
Дата:
On Mon, Apr 18, 2011 at 4:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Hi,
>
> Today (and previously) I wished that pgbench had a mechanism to help
> create simple random databases.  For example, I could create a table
> "tenk" and fill it with random stuff like
>
> \setrandom foo 1 10000
> insert into foo values (:foo)
>
> Now I have to run this 10000 times or something like that.  But I don't
> want a transaction for each of those, so I had a desire for something
> like this:
>
> begin;
> \for iterator 1 10000
>  \setrandom foo 1 :iterator
>  insert into foo values (:foo);
> \end
> commit;
>
> Would something like this be acceptable?

*) what does this do that isn't already possible with 'DO' (not being
snarky, genuinely curious)?

*) should psql get some of these features?  simple logic and looping
would be a nice addition?

merlin


Re: pgbench \for or similar loop

От
Alvaro Herrera
Дата:
Excerpts from Merlin Moncure's message of lun abr 18 18:26:54 -0300 2011:
> On Mon, Apr 18, 2011 at 4:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> > begin;
> > \for iterator 1 10000
> >  \setrandom foo 1 :iterator
> >  insert into foo values (:foo);
> > \end
> > commit;
> >
> > Would something like this be acceptable?
> 
> *) what does this do that isn't already possible with 'DO' (not being
> snarky, genuinely curious)?

Uhm, not sure.  I'm not really used to having DO available so I didn't
think about it.  I'll look at it a bit more.

> *) should psql get some of these features?  simple logic and looping
> would be a nice addition?

I dunno.  They have been proposed and shot down in the past.  Apparently
people don't want psql to become too powerful.  ("But that would make
psql turing complete! Soon you're going to want to have \while on it!").
I think pgbench is supposed to be designed to handle data in bulk which
is why I started using it for this in the first place.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pgbench \for or similar loop

От
David Fetter
Дата:
On Mon, Apr 18, 2011 at 06:02:53PM -0300, Alvaro Herrera wrote:
> Hi,
> 
> Today (and previously) I wished that pgbench had a mechanism to help
> create simple random databases.  For example, I could create a table
> "tenk" and fill it with random stuff like
> 
> \setrandom foo 1 10000
> insert into foo values (:foo)
> 
> Now I have to run this 10000 times or something like that.  But I don't
> want a transaction for each of those, so I had a desire for something
> like this:
> 
> begin;
> \for iterator 1 10000
>  \setrandom foo 1 :iterator
>  insert into foo values (:foo);
> \end
> commit;
> 
> Would something like this be acceptable?

Are existing mechanisms (WITH and DO) insufficient for the purpose?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pgbench \for or similar loop

От
Robert Haas
Дата:
On Mon, Apr 18, 2011 at 5:37 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Merlin Moncure's message of lun abr 18 18:26:54 -0300 2011:
>> On Mon, Apr 18, 2011 at 4:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
>> > begin;
>> > \for iterator 1 10000
>> >  \setrandom foo 1 :iterator
>> >  insert into foo values (:foo);
>> > \end
>> > commit;
>> >
>> > Would something like this be acceptable?
>>
>> *) what does this do that isn't already possible with 'DO' (not being
>> snarky, genuinely curious)?
>
> Uhm, not sure.  I'm not really used to having DO available so I didn't
> think about it.  I'll look at it a bit more.
>
>> *) should psql get some of these features?  simple logic and looping
>> would be a nice addition?
>
> I dunno.  They have been proposed and shot down in the past.  Apparently
> people don't want psql to become too powerful.  ("But that would make
> psql turing complete! Soon you're going to want to have \while on it!").
> I think pgbench is supposed to be designed to handle data in bulk which
> is why I started using it for this in the first place.

[ woops, just realized that i sent this response off-list the first time ]

I do think that DO covers a lot of the same territory that could
usefully be addressed by a more powerful backslash-command language in
psql.  It's in some ways quite a bit more powerful, because it's
available from any client, and it knows about data types, which psql
doesn't, so things like \while are going to be awkward (what
comparison semantics will it use?).  The only advantage I can really
see to doing that stuff on the client side is that you could do things
like \connect and \prompt that wouldn't make sense on the server side.
Maybe that's useful enough to make it worth doing: I'm not sure.

Now pgbench is a bit of a different case, because presumably in that
case it matters somewhat whether the work happens on the client side
or the server side, though I think in your particular case not really.
Actually in that case it seems like you could do the whole thing in
one SQL statement even without DO, using INSERT INTO foo SELECT ...
FROM generate_series(1,10000) g.

If we are going to add more scripting capability, it would be nice to
have a bit of consistency between pgbench and psql in terms of these
backslash commands, and maybe some kind of sketch of what the overall
design looks like.  For example, if \for is defined as a loop over
integers, whatdya do if you want to loop over query results or arrays?
See recent discussions of these issues in relation to plpgsql.  I
don't really see a reason to oppose adding functionality like this
categorically, but I do think it should be carefully thought out and
well-designed so we don't box ourselves into a corner; and we should
know what use cases we are shooting at.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgbench \for or similar loop

От
Merlin Moncure
Дата:
On Tue, Apr 19, 2011 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 18, 2011 at 5:37 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>> Excerpts from Merlin Moncure's message of lun abr 18 18:26:54 -0300 2011:
>>> On Mon, Apr 18, 2011 at 4:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>>
>>> > begin;
>>> > \for iterator 1 10000
>>> >  \setrandom foo 1 :iterator
>>> >  insert into foo values (:foo);
>>> > \end
>>> > commit;
>>> >
>>> > Would something like this be acceptable?
>>>
>>> *) what does this do that isn't already possible with 'DO' (not being
>>> snarky, genuinely curious)?
>>
>> Uhm, not sure.  I'm not really used to having DO available so I didn't
>> think about it.  I'll look at it a bit more.
>>
>>> *) should psql get some of these features?  simple logic and looping
>>> would be a nice addition?
>>
>> I dunno.  They have been proposed and shot down in the past.  Apparently
>> people don't want psql to become too powerful.  ("But that would make
>> psql turing complete! Soon you're going to want to have \while on it!").
>> I think pgbench is supposed to be designed to handle data in bulk which
>> is why I started using it for this in the first place.
>
> [ woops, just realized that i sent this response off-list the first time ]
>
> I do think that DO covers a lot of the same territory that could
> usefully be addressed by a more powerful backslash-command language in
> psql.  It's in some ways quite a bit more powerful, because it's
> available from any client, and it knows about data types, which psql
> doesn't, so things like \while are going to be awkward (what
> comparison semantics will it use?).  The only advantage I can really
> see to doing that stuff on the client side is that you could do things
> like \connect and \prompt that wouldn't make sense on the server side.

Well, you missed one big advantage: with DO you are 'one transaction'
limited -- this makes it unsuitable for certain classes of maintenance
tasks (no vacuum etc), creating a lot of tables, long running (even
infinite) scripts etc.  It would remain a boutique feature more or
less, but would add a lot of value to psql scripting which is
particularly suffering from adequate error handling.

Ultimately if you have stored procedures then you have the best of
both worlds especially if you had a method of sending the procedure
body as you can with functions and DO.   A psql meta language would do
in a pinch though, and it would be a lot easier to implement -- don't
like the bifurcated implementation (psql and pgbench) though.

merlin


Re: pgbench \for or similar loop

От
Robert Haas
Дата:
On Tue, Apr 19, 2011 at 11:56 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> I do think that DO covers a lot of the same territory that could
>> usefully be addressed by a more powerful backslash-command language in
>> psql.  It's in some ways quite a bit more powerful, because it's
>> available from any client, and it knows about data types, which psql
>> doesn't, so things like \while are going to be awkward (what
>> comparison semantics will it use?).  The only advantage I can really
>> see to doing that stuff on the client side is that you could do things
>> like \connect and \prompt that wouldn't make sense on the server side.
>
> Well, you missed one big advantage: with DO you are 'one transaction'
> limited -- this makes it unsuitable for certain classes of maintenance
> tasks (no vacuum etc), creating a lot of tables, long running (even
> infinite) scripts etc.  It would remain a boutique feature more or
> less, but would add a lot of value to psql scripting which is
> particularly suffering from adequate error handling.

Ah, good point.  I assume you mean "inadequate" rather than "adequate".

> Ultimately if you have stored procedures then you have the best of
> both worlds especially if you had a method of sending the procedure
> body as you can with functions and DO.

Also true.

> A psql meta language would do
> in a pinch though, and it would be a lot easier to implement -- don't
> like the bifurcated implementation (psql and pgbench) though.

Yeah.  I was wondering if anyone was gung-ho enough about this to
implement some kind of library that both programs could draw on.

It probably wouldn't be super-hard, if we could agree on a rough design.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgbench \for or similar loop

От
Christopher Browne
Дата:
On Tue, Apr 19, 2011 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Yeah.  I was wondering if anyone was gung-ho enough about this to
> implement some kind of library that both programs could draw on.
>
> It probably wouldn't be super-hard, if we could agree on a rough design.

It seems to me that the Mo Betta answer would be to implement the
fabled "stored procedure" language, that has, as its distinctive, the
capability to control transactions.  That would have the capability of
being used in places other than just inside psql.

And it would be a good way for scripting things like specialized
vacuum and analyze regimens, which cannot be done inside stored
functions today.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: pgbench \for or similar loop

От
Robert Haas
Дата:
On Tue, Apr 19, 2011 at 12:27 PM, Christopher Browne <cbbrowne@gmail.com> wrote:
> On Tue, Apr 19, 2011 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Yeah.  I was wondering if anyone was gung-ho enough about this to
>> implement some kind of library that both programs could draw on.
>>
>> It probably wouldn't be super-hard, if we could agree on a rough design.
>
> It seems to me that the Mo Betta answer would be to implement the
> fabled "stored procedure" language, that has, as its distinctive, the
> capability to control transactions.  That would have the capability of
> being used in places other than just inside psql.
>
> And it would be a good way for scripting things like specialized
> vacuum and analyze regimens, which cannot be done inside stored
> functions today.

Well, I'm all good with that, too, but am not fired up about either
one to implement it myself.  So I think it's going to come down to
what the person doing the work feels most strongly about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgbench \for or similar loop

От
Alvaro Herrera
Дата:
Excerpts from Robert Haas's message of mar abr 19 13:33:27 -0300 2011:
> On Tue, Apr 19, 2011 at 12:27 PM, Christopher Browne <cbbrowne@gmail.com> wrote:
> > On Tue, Apr 19, 2011 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> Yeah.  I was wondering if anyone was gung-ho enough about this to
> >> implement some kind of library that both programs could draw on.
> >>
> >> It probably wouldn't be super-hard, if we could agree on a rough design.
> >
> > It seems to me that the Mo Betta answer would be to implement the
> > fabled "stored procedure" language, that has, as its distinctive, the
> > capability to control transactions.  That would have the capability of
> > being used in places other than just inside psql.
> >
> > And it would be a good way for scripting things like specialized
> > vacuum and analyze regimens, which cannot be done inside stored
> > functions today.
> 
> Well, I'm all good with that, too, but am not fired up about either
> one to implement it myself.  So I think it's going to come down to
> what the person doing the work feels most strongly about.

I'm not at all fired up about stored procedures.  The \for pgbench
feature I'm proposing is 2 orders of magnitude less code than that.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pgbench \for or similar loop

От
Merlin Moncure
Дата:
On Tue, Apr 19, 2011 at 11:49 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of mar abr 19 13:33:27 -0300 2011:
>> On Tue, Apr 19, 2011 at 12:27 PM, Christopher Browne <cbbrowne@gmail.com> wrote:
>> > On Tue, Apr 19, 2011 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> Yeah.  I was wondering if anyone was gung-ho enough about this to
>> >> implement some kind of library that both programs could draw on.
>> >>
>> >> It probably wouldn't be super-hard, if we could agree on a rough design.
>> >
>> > It seems to me that the Mo Betta answer would be to implement the
>> > fabled "stored procedure" language, that has, as its distinctive, the
>> > capability to control transactions.  That would have the capability of
>> > being used in places other than just inside psql.
>> >
>> > And it would be a good way for scripting things like specialized
>> > vacuum and analyze regimens, which cannot be done inside stored
>> > functions today.
>>
>> Well, I'm all good with that, too, but am not fired up about either
>> one to implement it myself.  So I think it's going to come down to
>> what the person doing the work feels most strongly about.
>
> I'm not at all fired up about stored procedures.  The \for pgbench
> feature I'm proposing is 2 orders of magnitude less code than that.

...and gets you some of the same benefits.  are you sure it belongs in
pgbench though, and not psql?  it would be pretty weird to have to
switch to pgbench to do fancy sql scripting...i'd happily do it
though.

merlin


Re: pgbench \for or similar loop

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Robert Haas's message of mar abr 19 13:33:27 -0300 2011:
>> Well, I'm all good with that, too, but am not fired up about either
>> one to implement it myself.  So I think it's going to come down to
>> what the person doing the work feels most strongly about.

> I'm not at all fired up about stored procedures.  The \for pgbench
> feature I'm proposing is 2 orders of magnitude less code than that.

I think what that really translates to is "I don't want to bother doing
the careful design work that Robert talked about". -1 for that approach.

I generally feel that such a feature would be better off done
server-side --- after all, there's more clients in the world than psql
and pgbench, and not all of them could use a C library even if we had
one.  But in either case the coding work is going to be dwarfed by the
design work, if it's done right and not just the-first-hack-that-
comes-to-mind.
        regards, tom lane


Re: pgbench \for or similar loop

От
Andres Freund
Дата:
On Tuesday, April 19, 2011 07:22:54 PM Tom Lane wrote:
> I generally feel that such a feature would be better off done
> server-side --- after all, there's more clients in the world than psql
> and pgbench, and not all of them could use a C library even if we had
> one.  But in either case the coding work is going to be dwarfed by the
> design work, if it's done right and not just the-first-hack-that-
> comes-to-mind.
On the other hand doing it client side stresses a different part of the code, 
so it might be pretty sensible for pgbench...

Andres


Re: pgbench \for or similar loop

От
Aidan Van Dyk
Дата:
On Tue, Apr 19, 2011 at 1:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I think what that really translates to is "I don't want to bother doing
> the careful design work that Robert talked about". -1 for that approach.

As someone not doing any of that work, agreed ;-)

> I generally feel that such a feature would be better off done
> server-side --- after all, there's more clients in the world than psql
> and pgbench, and not all of them could use a C library even if we had
> one.  But in either case the coding work is going to be dwarfed by the
> design work, if it's done right and not just the-first-hack-that-
> comes-to-mind.

And for the "first-hack-that-comes-to-mind", I find my self pulling
out the named fifo trick all the time, and just leaving my for/loop/if
logic  in bash writing SQL commands to the fifo, occasionally getting
psql to write an answer to a file that I then read back in bash....

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


Re: pgbench \for or similar loop

От
David Fetter
Дата:
On Tue, Apr 19, 2011 at 12:27:45PM -0400, Christopher Browne wrote:
> On Tue, Apr 19, 2011 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Yeah.  I was wondering if anyone was gung-ho enough about this to
> > implement some kind of library that both programs could draw on.
> >
> > It probably wouldn't be super-hard, if we could agree on a rough design.
> 
> It seems to me that the Mo Betta answer would be to implement the
> fabled "stored procedure" language, that has, as its distinctive, the
> capability to control transactions.  That would have the capability of
> being used in places other than just inside psql.
> 
> And it would be a good way for scripting things like specialized
> vacuum and analyze regimens, which cannot be done inside stored
> functions today.

This seems like a proposal that's evolving toward a long-standing
TODO, namely autonomous transactions.

At the time it was added, it went into the "Exotic Features" section,
which I believe needs extensive reworking, if not outright deletion.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pgbench \for or similar loop

От
"Kevin Grittner"
Дата:
Aidan Van Dyk <aidan@highrise.ca> wrote:
> And for the "first-hack-that-comes-to-mind", I find my self
> pulling out the named fifo trick all the time, and just leaving my
> for/loop/if logic  in bash writing SQL commands to the fifo,
> occasionally getting psql to write an answer to a file that I then
> read back in bash....
I'm not clear on exactly what you're proposing there, but the thing
I've considered doing is having threads to try to keep a FIFO queue
populated with a configurable transaction mix, while a configurable
number of worker threads pull those transactions off the queue and
submit them to the server.  The transactions would need to be
scripted in some way such that they could query a value and then use
it in another statement, or use flow control for conditional
execution or looping.  And, of course, there would need to be a way
do define conditions under which a transaction would roll back and
retry from the beginning -- with the retry being a separate count
and the failed attempt not counted in the TPS numbers.
It would take that much infrastructure to have a performance test
which would give numbers which would correspond well to an actual
production load in our environment.  It still wouldn't be quite as
good as actually logging production activity and playing it back,
but it would come pretty close with a lot less work per test.
-Kevin


Re: pgbench \for or similar loop

От
Aidan Van Dyk
Дата:
On Tue, Apr 19, 2011 at 1:57 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Aidan Van Dyk <aidan@highrise.ca> wrote:
>
>> And for the "first-hack-that-comes-to-mind", I find my self
>> pulling out the named fifo trick all the time, and just leaving my
>> for/loop/if logic  in bash writing SQL commands to the fifo,
>> occasionally getting psql to write an answer to a file that I then
>> read back in bash....
>
> I'm not clear on exactly what you're proposing there, but the thing
> I've considered doing is having threads to try to keep a FIFO queue
> populated with a configurable transaction mix, while a configurable
> number of worker threads pull those transactions off the queue and
> submit them to the server.  The transactions would need to be
> scripted in some way such that they could query a value and then use
> it in another statement, or use flow control for conditional
> execution or looping.  And, of course, there would need to be a way
> do define conditions under which a transaction would roll back and
> retry from the beginning -- with the retry being a separate count
> and the failed attempt not counted in the TPS numbers.
>
> It would take that much infrastructure to have a performance test
> which would give numbers which would correspond well to an actual
> production load in our environment.  It still wouldn't be quite as
> good as actually logging production activity and playing it back,
> but it would come pretty close with a lot less work per test.

Well, I don't think I'm doing anything nearly as complicated as what
your'e thinking...

I'm talking about simple stuff like:

mkfifo psql.fifo
exec 4> psql.fifo
psql < psql.fifo&
...
for i in $(seq 1 1000)
do   echo "SELECT 1;" >&4
done

Couple that with:  echo "\o /path/to/some/file" >&4
and other \settitngs, and I can use bash for all my logic, and just
feed lines/statements to psql to have them executed as I wish, with
output directed/formated as I wish...


--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


Re: pgbench \for or similar loop

От
Alvaro Herrera
Дата:
Excerpts from Tom Lane's message of mar abr 19 14:22:54 -0300 2011:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Excerpts from Robert Haas's message of mar abr 19 13:33:27 -0300 2011:
> >> Well, I'm all good with that, too, but am not fired up about either
> >> one to implement it myself.  So I think it's going to come down to
> >> what the person doing the work feels most strongly about.
> 
> > I'm not at all fired up about stored procedures.  The \for pgbench
> > feature I'm proposing is 2 orders of magnitude less code than that.
> 
> I think what that really translates to is "I don't want to bother doing
> the careful design work that Robert talked about". -1 for that approach.

No, actually it doesn't.  They're different features.  I don't have a
problem spending time designing it; I do have a problem with designing
something that I'm not interested in.

> I generally feel that such a feature would be better off done
> server-side --- after all, there's more clients in the world than psql
> and pgbench, and not all of them could use a C library even if we had
> one.

Why do we have pgbench at all in the first place?  Surely we could
rewrite it in plpgsql with proper stored procedures.

> But in either case the coding work is going to be dwarfed by the
> design work, if it's done right and not just the-first-hack-that-
> comes-to-mind.

If the feature we're talking about is \for and similar control
structures in pgbench, then no.  I'm not really interested in doing
server-side stuff for this kind of thing, mainly because I don't think
it belongs in the server in the first place.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pgbench \for or similar loop

От
Jeff Janes
Дата:
On Tue, Apr 19, 2011 at 11:22 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Tom Lane's message of mar abr 19 14:22:54 -0300 2011:
>> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> > Excerpts from Robert Haas's message of mar abr 19 13:33:27 -0300 2011:
>> >> Well, I'm all good with that, too, but am not fired up about either
>> >> one to implement it myself.  So I think it's going to come down to
>> >> what the person doing the work feels most strongly about.
>>
>> > I'm not at all fired up about stored procedures.  The \for pgbench
>> > feature I'm proposing is 2 orders of magnitude less code than that.
>>
>> I think what that really translates to is "I don't want to bother doing
>> the careful design work that Robert talked about". -1 for that approach.
>
> No, actually it doesn't.  They're different features.  I don't have a
> problem spending time designing it; I do have a problem with designing
> something that I'm not interested in.
>
>> I generally feel that such a feature would be better off done
>> server-side --- after all, there's more clients in the world than psql
>> and pgbench, and not all of them could use a C library even if we had
>> one.
>
> Why do we have pgbench at all in the first place?  Surely we could
> rewrite it in plpgsql with proper stored procedures.

By "proper", do you mean "with autonomous transactions"?  I don't see
how you could possibly get pgbench functionality into plgsql without
that.

Also, sometimes the libpq stuff is an important part of what you need
to be benchmarking, but I suspect that was part of your rhetorical
point.




Cheers,

Jeff


Re: pgbench \for or similar loop

От
Pavel Stehule
Дата:
Hello

I played with psql extensions two years ago - it can do it

http://okbob.blogspot.com/2009/03/experimental-psql.html

The source code is available on pgfoundry

Regards

Pavel Stehule

2011/4/19 David Fetter <david@fetter.org>:
> On Mon, Apr 18, 2011 at 06:02:53PM -0300, Alvaro Herrera wrote:
>> Hi,
>>
>> Today (and previously) I wished that pgbench had a mechanism to help
>> create simple random databases.  For example, I could create a table
>> "tenk" and fill it with random stuff like
>>
>> \setrandom foo 1 10000
>> insert into foo values (:foo)
>>
>> Now I have to run this 10000 times or something like that.  But I don't
>> want a transaction for each of those, so I had a desire for something
>> like this:
>>
>> begin;
>> \for iterator 1 10000
>>  \setrandom foo 1 :iterator
>>  insert into foo values (:foo);
>> \end
>> commit;
>>
>> Would something like this be acceptable?
>
> Are existing mechanisms (WITH and DO) insufficient for the purpose?
>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: pgbench \for or similar loop

От
David Fetter
Дата:
On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
> Hello
> 
> I played with psql extensions two years ago - it can do it

It's interesting, but it doesn't solve the fundamental problem, which
is to allow every client, not just psql, to do this.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pgbench \for or similar loop

От
Pavel Stehule
Дата:
2011/4/20 David Fetter <david@fetter.org>:
> On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
>> Hello
>>
>> I played with psql extensions two years ago - it can do it
>
> It's interesting, but it doesn't solve the fundamental problem, which
> is to allow every client, not just psql, to do this.

then you need a real procedures that allows a explicit transaction
handling. I am thinking so this task is feasible. We just need a
implementation of CALL statement that runs a PL handler outside
statement transaction. Probably some extension for PLPerl or PLPython
can be more simple then  PLpgSQL for begin.

This is exactly task for "procedures"

Regards

Pavel

>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: pgbench \for or similar loop

От
Alvaro Herrera
Дата:
Excerpts from David Fetter's message of mié abr 20 10:54:56 -0300 2011:
> On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
> > Hello
> > 
> > I played with psql extensions two years ago - it can do it
> 
> It's interesting, but it doesn't solve the fundamental problem, which
> is to allow every client, not just psql, to do this.

Why is this problem fundamental?

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pgbench \for or similar loop

От
Merlin Moncure
Дата:
On Wed, Apr 20, 2011 at 9:14 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from David Fetter's message of mié abr 20 10:54:56 -0300 2011:
>> On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
>> > Hello
>> >
>> > I played with psql extensions two years ago - it can do it
>>
>> It's interesting, but it doesn't solve the fundamental problem, which
>> is to allow every client, not just psql, to do this.
>
> Why is this problem fundamental?

I happen to like your idea, even if we had stored procedures...they
have a lot of overlap but so what?.  We have server side \copy and
client side COPY -- both are useful.  Likewise, (getting back to the
original point of the thread), bechmarking via client scripting and
via procedure are also both useful.  Nobody will gripe if psql gets
more features like this -- some people really want to do this on the
client side and there are valid reasons to do that, say, to intermix
client local shell commands between sql lines.

merlin


Re: pgbench \for or similar loop

От
David Fetter
Дата:
On Wed, Apr 20, 2011 at 04:00:12PM +0200, Pavel Stehule wrote:
> 2011/4/20 David Fetter <david@fetter.org>:
> > On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
> >> Hello
> >>
> >> I played with psql extensions two years ago - it can do it
> >
> > It's interesting, but it doesn't solve the fundamental problem, which
> > is to allow every client, not just psql, to do this.
> 
> then you need a real procedures that allows a explicit transaction
> handling. I am thinking so this task is feasible. We just need a
> implementation of CALL statement that runs a PL handler outside
> statement transaction. Probably some extension for PLPerl or PLPython
> can be more simple then  PLpgSQL for begin.
> 
> This is exactly task for "procedures"

Autonomous transactions allow for a broader scope than CALL in that
they do not require that we use a language other than SQL.

They'd make the CALL functionality a good deal easier to implement,
though.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pgbench \for or similar loop

От
Robert Haas
Дата:
On Wed, Apr 20, 2011 at 10:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Apr 20, 2011 at 9:14 AM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>> Excerpts from David Fetter's message of mié abr 20 10:54:56 -0300 2011:
>>> On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
>>> > Hello
>>> >
>>> > I played with psql extensions two years ago - it can do it
>>>
>>> It's interesting, but it doesn't solve the fundamental problem, which
>>> is to allow every client, not just psql, to do this.
>>
>> Why is this problem fundamental?
>
> I happen to like your idea, even if we had stored procedures...they
> have a lot of overlap but so what?.  We have server side \copy and
> client side COPY -- both are useful.  Likewise, (getting back to the
> original point of the thread), bechmarking via client scripting and
> via procedure are also both useful.  Nobody will gripe if psql gets
> more features like this -- some people really want to do this on the
> client side and there are valid reasons to do that, say, to intermix
> client local shell commands between sql lines.

Yep, I agree.  However, I think it's completely reasonable, as I said
upthread, to ask people not to implement \for as a loop over an
integer range in pgbench without answering questions like:

1. What happens if someone wants the other kind of for loop, that
iterates until a condition is met?

2. Are we going to get a different and incompatible implementation in psql?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgbench \for or similar loop

От
David Fetter
Дата:
On Wed, Apr 20, 2011 at 01:35:03PM -0400, Robert Haas wrote:
> On Wed, Apr 20, 2011 at 10:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> > On Wed, Apr 20, 2011 at 9:14 AM, Alvaro Herrera
> > <alvherre@commandprompt.com> wrote:
> >> Excerpts from David Fetter's message of mié abr 20 10:54:56 -0300 2011:
> >>> On Wed, Apr 20, 2011 at 08:05:07AM +0200, Pavel Stehule wrote:
> >>> > Hello
> >>> >
> >>> > I played with psql extensions two years ago - it can do it
> >>>
> >>> It's interesting, but it doesn't solve the fundamental problem, which
> >>> is to allow every client, not just psql, to do this.
> >>
> >> Why is this problem fundamental?
> >
> > I happen to like your idea, even if we had stored procedures...they
> > have a lot of overlap but so what?.  We have server side \copy and
> > client side COPY -- both are useful.  Likewise, (getting back to the
> > original point of the thread), bechmarking via client scripting and
> > via procedure are also both useful.  Nobody will gripe if psql gets
> > more features like this -- some people really want to do this on the
> > client side and there are valid reasons to do that, say, to intermix
> > client local shell commands between sql lines.
> 
> Yep, I agree.  However, I think it's completely reasonable, as I said
> upthread, to ask people not to implement \for as a loop over an
> integer range in pgbench without answering questions like:
> 
> 1. What happens if someone wants the other kind of for loop, that
> iterates until a condition is met?
> 
> 2. Are we going to get a different and incompatible implementation in psql?

It is precisely this kind of issue that leads me to believe it would
be counter-productive to come up with any client-specific hacks.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pgbench \for or similar loop

От
Robert Haas
Дата:
On Wed, Apr 20, 2011 at 2:10 PM, David Fetter <david@fetter.org> wrote:
> It is precisely this kind of issue that leads me to believe it would
> be counter-productive to come up with any client-specific hacks.

These definitional issues exist on the server, too, and weren't
considered early enough there either.

Preventing people from working on the things they care about is not a
good idea.  There is no guarantee they will work on the things you
care about instead.  They may just do nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgbench \for or similar loop

От
David Fetter
Дата:
On Wed, Apr 20, 2011 at 02:12:25PM -0400, Robert Haas wrote:
> On Wed, Apr 20, 2011 at 2:10 PM, David Fetter <david@fetter.org> wrote:
> > It is precisely this kind of issue that leads me to believe it would
> > be counter-productive to come up with any client-specific hacks.
> 
> These definitional issues exist on the server, too, and weren't
> considered early enough there either.
> 
> Preventing people from working on the things they care about is not a
> good idea.  There is no guarantee they will work on the things you
> care about instead.  They may just do nothing.

We have situations where the "fix it in one spot" approach has
resulted in real, serious problems.  Try explaining to someone new to
the project why pg_dump and pg_dumpall are separate programs, for
example.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pgbench \for or similar loop

От
Robert Haas
Дата:
On Wed, Apr 20, 2011 at 2:23 PM, David Fetter <david@fetter.org> wrote:
> On Wed, Apr 20, 2011 at 02:12:25PM -0400, Robert Haas wrote:
>> On Wed, Apr 20, 2011 at 2:10 PM, David Fetter <david@fetter.org> wrote:
>> > It is precisely this kind of issue that leads me to believe it would
>> > be counter-productive to come up with any client-specific hacks.
>>
>> These definitional issues exist on the server, too, and weren't
>> considered early enough there either.
>>
>> Preventing people from working on the things they care about is not a
>> good idea.  There is no guarantee they will work on the things you
>> care about instead.  They may just do nothing.
>
> We have situations where the "fix it in one spot" approach has
> resulted in real, serious problems.  Try explaining to someone new to
> the project why pg_dump and pg_dumpall are separate programs, for
> example.

True, but I thought I had addressed that point fairly thoroughly in my
various replies.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgbench \for or similar loop

От
Greg Smith
Дата:
Alvaro Herrera wrote:
> Why do we have pgbench at all in the first place?  Surely we could
> rewrite it in plpgsql with proper stored procedures.
>   

pgbench gives you a driver program with the following useful properties:

1) Multiple processes are spawned and each gets its own connection
2) A time/transaction limit is enforced across all of the connections at 
once
3) Timing information is written to a client-side log file
4) The work of running the clients can happen on a remote system, so 
that it's possible to just test the server-side performance
5) The program is similar enough to any other regular client, using the 
standard libpq interface, that connection-related overhead should be 
similar to a real workload.

All of those have some challenges before you could duplicate them in a 
stored procedure context.

My opinion of this feature is similar to the one Aiden already 
expressed:  there's already so many ways to do this sort of thing using 
shell-oriented approaches (as well as generate_series) that it's hard to 
get too excited about implementing it directly in pgbench.  Part of the 
reason for adding the \shell and \setshell commands way to make tricky 
things like this possible without having to touch the pgbench code 
further.  I for example would solve the problem you're facing like this:

1) Write a shell script that generates the file I need
2) Call it from pgbench using \shell, passing the size it needs.  Have 
that write a delimited file with the data required.
3) Import the whole thing with COPY.

And next thing you know you've even got the CREATE/COPY optimization as 
a possibility to avoid WAL, as well as the ability to avoid creating the 
data file more than once if the script is smart enough.

Sample data file generation can be difficult; most of the time I'd 
rather solve in a general programming language.  The fact that simple 
generation cases could be done with the mechanism you propose is true.  
However, this only really helps cases that are too complicated to 
express with generate_series, yet not so complicated that you really 
want a full programming language to generate the data.  I don't think 
there's that much middle ground in that use case.

But if this is what you think makes your life easier, I'm not going to 
tell you you're wrong.  And I don't feel that your desire for this 
features means you must tackle a more complicated thing instead--even 
though what I personally would much prefer is something making this sort 
of thing easier to do in regression tests, too.  That's a harder 
problem, though, and you're only volunteering to solve an easier one 
than that.

Stepping aside from debate over usefulness, my main code concern is that 
each time I look at the pgbench code for yet another tacked on bit, it's 
getting increasingly creakier and harder to maintain.  It's never going 
to be a good benchmark driver program capable of really complicated 
tasks.  And making it try keeps piling on the risk of breaking it for 
its intended purpose of doing simple tests.  If you can figure out how 
to keep the code contortions to implement the feature under control, 
there's some benefit there.  I can't think of a unique reason for it; 
again, lots of ways to solve this already.  But I'd probably use it if 
it were there.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: pgbench \for or similar loop

От
Greg Smith
Дата:
Kevin Grittner wrote:
> I'm not clear on exactly what you're proposing there, but the thing
> I've considered doing is having threads to try to keep a FIFO queue
> populated with a configurable transaction mix, while a configurable
> number of worker threads pull those transactions off the queue and...
>   

This is like the beginning of an advertisement for how Tsung is useful 
for simulating complicated workloads.  The thought of growing pgbench to 
reach that level of capabilities makes my head hurt.

When faced with this same issue, the sysbench team decided to embed Lua 
as their scripting language; sample scripts:  
http://bazaar.launchpad.net/~sysbench-developers/sysbench/0.5/files/head:/sysbench/tests/db/

This isn't very well known because the whole MySQL community fracturing 
has impacted their ability to actually release this overhaul--seems like 
they spend all their time just trying to add support for each new engine 
of the month.  I don't even like Lua, yet this still seems like a much 
better idea than trying to build on top of the existing pgbench codebase.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: pgbench \for or similar loop

От
Pavel Stehule
Дата:
Hello

>
> This isn't very well known because the whole MySQL community fracturing has
> impacted their ability to actually release this overhaul--seems like they
> spend all their time just trying to add support for each new engine of the
> month.  I don't even like Lua, yet this still seems like a much better idea
> than trying to build on top of the existing pgbench codebase.
>

Lua is probably the most light language designed for this purposes.

A integration of Lua to psql or pgbench can be useful

Pavel


> --
> Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: pgbench \for or similar loop

От
Dimitri Fontaine
Дата:
Greg Smith <greg@2ndquadrant.com> writes:
> Kevin Grittner wrote:
>> I'm not clear on exactly what you're proposing there, but the thing
>> I've considered doing is having threads to try to keep a FIFO queue
>> populated with a configurable transaction mix, while a configurable
>> number of worker threads pull those transactions off the queue and...
>
> This is like the beginning of an advertisement for how Tsung is useful for
> simulating complicated workloads.  The thought of growing pgbench to reach
> that level of capabilities makes my head hurt.

+1 for having a look at Tsung here.  You'll be glad not to have to
reinvent all what it already does.

> When faced with this same issue, the sysbench team decided to embed Lua as
> their scripting language; sample scripts:

I would tend to prefer some scheme (guile comes to the mind but that's
GPL), being an Emacs user.  Also I've seen projects pick lua then down
the road regret the choice (http://julien.danjou.info/blog/2008.html).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr