Обсуждение: SELECT's take a long time compared to other DBMS

Поиск
Список
Период
Сортировка

SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
I have a table with 102,384 records in it, each record is 934 bytes.

Using the follow select statement:
  SELECT * from <table>

PG Info: version 7.3.4 under cygwin on Windows 2000
ODBC: version 7.3.100

Machine: 500 Mhz/ 512MB RAM / IDE HDD


Under PG:  Data is returned in 26 secs!!
Under SQL Server:  Data is returned in 5 secs.
Under SQLBase:     Data is returned in 6 secs.
Under SAPDB:        Data is returned in 7 secs.

This is the ONLY table in the database and only 1 user.

And yes I did a vacuum.

Is this normal behavior for PG?

Thanks



Re: SELECT's take a long time compared to other DBMS

От
Rudi Starcevic
Дата:
Hi,


>And yes I did a vacuum.
>

Did you 'Analyze' too ?

Cheers
Rudi.


Re: SELECT's take a long time compared to other DBMS

От
Rod Taylor
Дата:
> Under PG:  Data is returned in 26 secs!!
> Under SQL Server:  Data is returned in 5 secs.
> Under SQLBase:     Data is returned in 6 secs.
> Under SAPDB:        Data is returned in 7 secs.

What did you use as the client? Do those times include ALL resulting
data or simply the first few lines?

PostgreSQL performance on windows (via Cygwin) is known to be poor.
Do you receive similar results with 7.4 beta 2?

Вложения

Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
Yes I Analyze also, but there was no need to because it was a fresh brand
new database.

"Rudi Starcevic" <rudi@oasis.net.au> wrote in message
news:3F569012.3090209@oasis.net.au...
> Hi,
>
>
> >And yes I did a vacuum.
> >
>
> Did you 'Analyze' too ?
>
> Cheers
> Rudi.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>



Re: SELECT's take a long time compared to other DBMS

От
Rudi Starcevic
Дата:
Hi,

>Yes I Analyze also, but there was no need to because it was a fresh brand
>new database.
>

Hmm ... Sorry I'm not sure then. I only use Linux with PG.
Even though it's 'brand new' you still need to Analyze so that any
Indexes etc. are built.

I'll keep an eye on this thread - Good luck.

Regards
Rudi.




Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
All queries were ran on the SERVER for all of the databases I tested.

This is all resulting data for all of the databases that I tested.


"Rod Taylor" <rbt@rbt.ca> wrote in message
news:1062637505.84923.7.camel@jester...



Re: SELECT's take a long time compared to other DBMS

От
Neil Conway
Дата:
On Wed, 2003-09-03 at 21:32, Rudi Starcevic wrote:
> Hmm ... Sorry I'm not sure then. I only use Linux with PG.
> Even though it's 'brand new' you still need to Analyze so that any
> Indexes etc. are built.

ANALYZE doesn't build indexes, it only updates the statistics used by
the query optimizer (and in any case, "select * from <foo>" has only one
reasonable query plan anyway).

-Neil


Re: SELECT's take a long time compared to other DBMS

От
Christopher Browne
Дата:
Quoth "Relaxin" <me@yourhouse.com>:
> Yes I Analyze also, but there was no need to because it was a fresh
> brand new database.

That is _absolutely not true_.

It is not true with any DBMS that uses a cost-based optimizer.
Cost-based optimizers need some equivalent to ANALYZE in order to
collect statistics to allow them to pick any path other than a
sequential scan.

In this particular case, a seq scan is pretty likely to be the best
answer when there is no WHERE clause on the query.

Actually, it doesn't make all that much sense that the other systems
would be terribly much faster, because they obviously need to do some
processing on 102,384 records.

Can you tell us what you were *actually* doing?  Somehow it sounds as
though the other databases were throwing away the data whereas
PostgreSQL was returning it all "kawhump!" in one batch.

What programs were you using to submit the queries?
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/oses.html
"Computers let you make more  mistakes faster than any other invention
in  human  history,  with  the  possible  exception  of  handguns  and
tequila."  -- Mitch Radcliffe

Re: SELECT's take a long time compared to other DBMS

От
Christopher Browne
Дата:
In the last exciting episode, "Relaxin" <me@yourhouse.com> wrote:
> All queries were ran on the SERVER for all of the databases I tested.

Queries obviously run "on the server."  That's kind of the point of
the database system being a "client/server" system.

The question is what client program(s) you used to process the result
sets.  I'd be surprised to see any client process 100K records in any
meaningful way in much less than 30 seconds.  Rendering that much data
into a console will take some time.  Drawing it into cells on a GUI
window will take a lot more time.

Supposing you were using a graphical client, it would be unsurprising
for it to have submitted something equivalent to "limit 30 rows" (or
whatever you can display on screen), and defer further processing 'til
later. If that were the case, then 26s to process the whole thing
would be a lot more efficient than 5-6s to process a mere 30 rows...

> This is all resulting data for all of the databases that I tested.

You seem to have omitted "all resulting data."
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www.ntlug.org/~cbbrowne/sap.html
"Women who seek to be equal to men lack ambition. "
-- Timothy Leary

Re: SELECT's take a long time compared to other DBMS

От
"Nick Fankhauser"
Дата:
> Yes I Analyze also, but there was no need to because it was a fresh brand
> new database.

This apparently wasn't the source of problem since he did an analyze anyway,
but my impression was that a fresh brand new database is exactly the
situation where an analyze is needed- ie: a batch of data has just been
loaded and stats haven't been collected yet.

Am I mistaken?

-Nick



Re: SELECT's take a long time compared to other DBMS

От
Tom Lane
Дата:
"Nick Fankhauser" <nickf@ontko.com> writes:
> This apparently wasn't the source of problem since he did an analyze anyway,
> but my impression was that a fresh brand new database is exactly the
> situation where an analyze is needed- ie: a batch of data has just been
> loaded and stats haven't been collected yet.

Indeed.  But as someone else already pointed out, a seqscan is the only
reasonable plan for an unqualified "SELECT whatever FROM table" query;
lack of stats wouldn't deter the planner from arriving at that
conclusion.

My guess is that the OP is failing to account for some client-side
inefficiency in absorbing a large query result.

            regards, tom lane

Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
> Can you tell us what you were *actually* doing?  Somehow it sounds as
> though the other databases were throwing away the data whereas
> PostgreSQL was returning it all "kawhump!" in one batch.

All of the databases that I tested the query against gave me immediate
access to ANY row of the resultset once the data had been returned.
Ex. If  I'm currently at the first row and then wanted to goto the 100,000
row, I would be there immediately, and if I wanted to then goto the 5
row...same thing, I have the record immediately!

The other databases I tested against stored the entire resultset on the
Server, I'm not sure what PG does...It seems that brings the entire
resultset client side.
If that is the case, how can I have PG store the resultset on the Server AND
still allow me immediate access to ANY row in the resultset?


> What programs were you using to submit the queries?
I used the same program for all of the database.  I was using ODBC as
connectivity.




Re: SELECT's take a long time compared to other DBMS

От
"Shridhar Daithankar"
Дата:
On 4 Sep 2003 at 0:48, Relaxin wrote:
> All of the databases that I tested the query against gave me immediate
> access to ANY row of the resultset once the data had been returned.
> Ex. If  I'm currently at the first row and then wanted to goto the 100,000
> row, I would be there immediately, and if I wanted to then goto the 5
> row...same thing, I have the record immediately!
>
> The other databases I tested against stored the entire resultset on the
> Server, I'm not sure what PG does...It seems that brings the entire
> resultset client side.
> If that is the case, how can I have PG store the resultset on the Server AND
> still allow me immediate access to ANY row in the resultset?

You can use a cursor and get only required rows.


Bye
 Shridhar

--
Nick the Greek's Law of Life:    All things considered, life is 9 to 5 against.


Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
All rows are required.

""Shridhar Daithankar"" <shridhar_daithankar@persistent.co.in> wrote in
message news:3F573E8B.31916.A1063F8@localhost...
> On 4 Sep 2003 at 0:48, Relaxin wrote:
> > All of the databases that I tested the query against gave me immediate
> > access to ANY row of the resultset once the data had been returned.
> > Ex. If  I'm currently at the first row and then wanted to goto the
100,000
> > row, I would be there immediately, and if I wanted to then goto the 5
> > row...same thing, I have the record immediately!
> >
> > The other databases I tested against stored the entire resultset on the
> > Server, I'm not sure what PG does...It seems that brings the entire
> > resultset client side.
> > If that is the case, how can I have PG store the resultset on the Server
AND
> > still allow me immediate access to ANY row in the resultset?
>
> You can use a cursor and get only required rows.
>
>
> Bye
>  Shridhar
>
> --
> Nick the Greek's Law of Life: All things considered, life is 9 to 5
against.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>



Re: SELECT's take a long time compared to other DBMS

От
Hannu Krosing
Дата:
Relaxin kirjutas N, 04.09.2003 kell 03:28:
> I have a table with 102,384 records in it, each record is 934 bytes.

I created a test database on my Linux (RH9) laptop with 30GB/4200RPM ide
drive and P3-1133Mhz, 768MB, populated it with 128000 rows of 930 bytes
each and did

[hannu@fuji hannu]$ time psql test100k -c 'select * from test' >
/dev/null

real    0m3.970s
user    0m0.980s
sys     0m0.570s

so it seems definitely not a problem with postgres as such, but perhaps
with Cygwin and/or ODBC driver

I also ran the same query  using the "standard" pg adapter:

>>> import pg, time
>>>
>>> con = pg.connect('test100k')
>>>
>>> def getall():
...     t1 = time.time()
...     res = con.query('select * from test')
...     t2 = time.time()
...     list = res.getresult()
...     t3 = time.time()
...     print t2 - t1, t3-t2
...
>>> getall()
3.27637195587 1.10105705261
>>> getall()
3.07413101196 0.996125936508
>>> getall()
3.03377199173 1.07322502136

which gave similar results

------------------------------
Hannu




Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
So after you did that, where able to position to ANY record within the
resultset?

Ex. Position 100,000; then  to Position 5; then to position 50,000, etc...

If you are able to do that and have your positioned row available to you
immediately, then I'll believe that it's the ODBC driver.

"Hannu Krosing" <hannu@tm.ee> wrote in message
news:1062673303.5200.135.camel@fuji.krosing.net...
> Relaxin kirjutas N, 04.09.2003 kell 03:28:
> > I have a table with 102,384 records in it, each record is 934 bytes.
>
> I created a test database on my Linux (RH9) laptop with 30GB/4200RPM ide
> drive and P3-1133Mhz, 768MB, populated it with 128000 rows of 930 bytes
> each and did
>
> [hannu@fuji hannu]$ time psql test100k -c 'select * from test' >
> /dev/null
>
> real    0m3.970s
> user    0m0.980s
> sys     0m0.570s
>
> so it seems definitely not a problem with postgres as such, but perhaps
> with Cygwin and/or ODBC driver
>
> I also ran the same query  using the "standard" pg adapter:
>
> >>> import pg, time
> >>>
> >>> con = pg.connect('test100k')
> >>>
> >>> def getall():
> ...     t1 = time.time()
> ...     res = con.query('select * from test')
> ...     t2 = time.time()
> ...     list = res.getresult()
> ...     t3 = time.time()
> ...     print t2 - t1, t3-t2
> ...
> >>> getall()
> 3.27637195587 1.10105705261
> >>> getall()
> 3.07413101196 0.996125936508
> >>> getall()
> 3.03377199173 1.07322502136
>
> which gave similar results
>
> ------------------------------
> Hannu
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>



Re: SELECT's take a long time compared to other DBMS

От
Jean-Luc Lachance
Дата:
You forgot that the original poster's query was:
  SELECT * from <table>

This should require a simple table scan.  NO need for stats.
Either the table has not been properly vacuumed or he's got seq_scan
off...

JLL


Nick Fankhauser wrote:
>
> > Yes I Analyze also, but there was no need to because it was a fresh brand
> > new database.
>
> This apparently wasn't the source of problem since he did an analyze anyway,
> but my impression was that a fresh brand new database is exactly the
> situation where an analyze is needed- ie: a batch of data has just been
> loaded and stats haven't been collected yet.
>
> Am I mistaken?
>
> -Nick
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
The table has been Vacuumed and seq_scan is turned on.

"Jean-Luc Lachance" <jllachan@nsd.ca> wrote in message
news:3F5753B9.F4A5A63F@nsd.ca...
> You forgot that the original poster's query was:
>   SELECT * from <table>
>
> This should require a simple table scan.  NO need for stats.
> Either the table has not been properly vacuumed or he's got seq_scan
> off...
>
> JLL
>
>
> Nick Fankhauser wrote:
> >
> > > Yes I Analyze also, but there was no need to because it was a fresh
brand
> > > new database.
> >
> > This apparently wasn't the source of problem since he did an analyze
anyway,
> > but my impression was that a fresh brand new database is exactly the
> > situation where an analyze is needed- ie: a batch of data has just been
> > loaded and stats haven't been collected yet.
> >
> > Am I mistaken?
> >
> > -Nick
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>



Re: SELECT's take a long time compared to other DBMS

От
Hannu Krosing
Дата:
Relaxin kirjutas N, 04.09.2003 kell 17:35:
> So after you did that, where able to position to ANY record within the
> resultset?
>
> Ex. Position 100,000; then  to Position 5; then to position 50,000, etc...

not in the case of :
   time psql test100k -c 'select * from test' > /dev/null
as the whole result would be written to dev null (i.e discarded)

Yes in case of python: after doing

res = con.query('select * from test') # 3 sec - perform query
list = res.getresult()                # 1 sec - construct list of tuples

the whole 128k records are in a python list ,
so that i can immediately access any record by python list syntax,
ie list[5], list[50000] etc.

> If you are able to do that and have your positioned row available to you
> immediately, then I'll believe that it's the ODBC driver.

It can also be the Cygwin port, which is known to have several problems,
and if you run both your client and server on the same machine, then it
can also be an interaction of the two processes (cygwin/pgsql server and
native win32 ODBC client) not playing together very well.

> "Hannu Krosing" <hannu@tm.ee> wrote in message
> news:1062673303.5200.135.camel@fuji.krosing.net...
> > Relaxin kirjutas N, 04.09.2003 kell 03:28:
> > > I have a table with 102,384 records in it, each record is 934 bytes.
> >
> > I created a test database on my Linux (RH9) laptop with 30GB/4200RPM ide
> > drive and P3-1133Mhz, 768MB, populated it with 128000 rows of 930 bytes
> > each and did
> >
> > [hannu@fuji hannu]$ time psql test100k -c 'select * from test' >
> > /dev/null
> >
> > real    0m3.970s
> > user    0m0.980s
> > sys     0m0.570s
> >
> > so it seems definitely not a problem with postgres as such, but perhaps
> > with Cygwin and/or ODBC driver
> >
> > I also ran the same query  using the "standard" pg adapter:
> >
> > >>> import pg, time
> > >>>
> > >>> con = pg.connect('test100k')
> > >>>
> > >>> def getall():
> > ...     t1 = time.time()
> > ...     res = con.query('select * from test')
> > ...     t2 = time.time()
> > ...     list = res.getresult()
> > ...     t3 = time.time()
> > ...     print t2 - t1, t3-t2
> > ...
> > >>> getall()
> > 3.27637195587 1.10105705261
> > >>> getall()
> > 3.07413101196 0.996125936508
> > >>> getall()
> > 3.03377199173 1.07322502136
> >
> > which gave similar results
-------------------
Hannu


Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
I had these same issues with the PeerDirect version also.
"Hannu Krosing" <hannu@tm.ee> wrote in message
news:1062693009.6174.21.camel@fuji.krosing.net...
> Relaxin kirjutas N, 04.09.2003 kell 17:35:
> > So after you did that, where able to position to ANY record within the
> > resultset?
> >
> > Ex. Position 100,000; then  to Position 5; then to position 50,000,
etc...
>
> not in the case of :
>    time psql test100k -c 'select * from test' > /dev/null
> as the whole result would be written to dev null (i.e discarded)
>
> Yes in case of python: after doing
>
> res = con.query('select * from test') # 3 sec - perform query
> list = res.getresult()                # 1 sec - construct list of tuples
>
> the whole 128k records are in a python list ,
> so that i can immediately access any record by python list syntax,
> ie list[5], list[50000] etc.
>
> > If you are able to do that and have your positioned row available to you
> > immediately, then I'll believe that it's the ODBC driver.
>
> It can also be the Cygwin port, which is known to have several problems,
> and if you run both your client and server on the same machine, then it
> can also be an interaction of the two processes (cygwin/pgsql server and
> native win32 ODBC client) not playing together very well.
>
> > "Hannu Krosing" <hannu@tm.ee> wrote in message
> > news:1062673303.5200.135.camel@fuji.krosing.net...
> > > Relaxin kirjutas N, 04.09.2003 kell 03:28:
> > > > I have a table with 102,384 records in it, each record is 934 bytes.
> > >
> > > I created a test database on my Linux (RH9) laptop with 30GB/4200RPM
ide
> > > drive and P3-1133Mhz, 768MB, populated it with 128000 rows of 930
bytes
> > > each and did
> > >
> > > [hannu@fuji hannu]$ time psql test100k -c 'select * from test' >
> > > /dev/null
> > >
> > > real    0m3.970s
> > > user    0m0.980s
> > > sys     0m0.570s
> > >
> > > so it seems definitely not a problem with postgres as such, but
perhaps
> > > with Cygwin and/or ODBC driver
> > >
> > > I also ran the same query  using the "standard" pg adapter:
> > >
> > > >>> import pg, time
> > > >>>
> > > >>> con = pg.connect('test100k')
> > > >>>
> > > >>> def getall():
> > > ...     t1 = time.time()
> > > ...     res = con.query('select * from test')
> > > ...     t2 = time.time()
> > > ...     list = res.getresult()
> > > ...     t3 = time.time()
> > > ...     print t2 - t1, t3-t2
> > > ...
> > > >>> getall()
> > > 3.27637195587 1.10105705261
> > > >>> getall()
> > > 3.07413101196 0.996125936508
> > > >>> getall()
> > > 3.03377199173 1.07322502136
> > >
> > > which gave similar results
> -------------------
> Hannu
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>



Re: SELECT's take a long time compared to other DBMS

От
William Yu
Дата:
Relaxin wrote:
> I have a table with 102,384 records in it, each record is 934 bytes.
>
> Using the follow select statement:
>   SELECT * from <table>
>
> PG Info: version 7.3.4 under cygwin on Windows 2000
> ODBC: version 7.3.100
>
> Machine: 500 Mhz/ 512MB RAM / IDE HDD
>
> Under PG:  Data is returned in 26 secs!!
> Under SQL Server:  Data is returned in 5 secs.
> Under SQLBase:     Data is returned in 6 secs.
> Under SAPDB:        Data is returned in 7 secs.

I created a similar table (934 bytes, 102K records) on a slightly faster
machine: P3/800 + 512MB RAM + IDE HD. The server OS is Solaris 8 x86 and
the version is 7.3.3.

On the server (via PSQL client) : 7.5 seconds
Using ODBC under VFPW: 10.5 seconds

How that translates to what you should see, I'm not sure. Assuming it
was just the CPU difference, you should see numbers of roughly 13
seconds. But the documentation says PG under CYGWIN is significantly
slower than PG under UNIX so your mileage may vary...

Have you changed any of the settings yet in postgresql.conf,
specifically the shared_buffers setting?


Re: SELECT's take a long time compared to other DBMS

От
"scott.marlowe"
Дата:
On Wed, 3 Sep 2003, Relaxin wrote:

> I have a table with 102,384 records in it, each record is 934 bytes.
>
> Using the follow select statement:
>   SELECT * from <table>
>
> PG Info: version 7.3.4 under cygwin on Windows 2000
> ODBC: version 7.3.100
>
> Machine: 500 Mhz/ 512MB RAM / IDE HDD
>
>
> Under PG:  Data is returned in 26 secs!!
> Under SQL Server:  Data is returned in 5 secs.
> Under SQLBase:     Data is returned in 6 secs.
> Under SAPDB:        Data is returned in 7 secs.

This is typical of postgresql under cygwin, it's much faster under a Unix
OS like Linux or BSD.  That said, you CAN do some things to help speed it
up, the biggest being tuning the shared_buffers to be something large
enough to hold a fair bit of data.  Set the shared_buffers to 1000,
restart, and see if things get better.

Running Postgresql in a unix emulation layer is guaranteed to make it
slow.  If you've got a spare P100 with 128 Meg of RAM you can throw redhat
9 or FreeBSD 4.7 on and run Postgresql on, it will likely outrun your
500MHZ cygwin box, and might even keep up with the other databases on that
machine as well.


Re: SELECT's take a long time compared to other DBMS

От
"scott.marlowe"
Дата:
You would "get" all rows, but they'd be stored server side until your
client asked for them.

I.e. a cursor would level the field here, since you say that the other
test cases stored the entire result set on the server.  Or did I
misunderstand what you meant there?

On Thu, 4 Sep 2003, Relaxin wrote:

> All rows are required.
>
> ""Shridhar Daithankar"" <shridhar_daithankar@persistent.co.in> wrote in
> message news:3F573E8B.31916.A1063F8@localhost...
> > On 4 Sep 2003 at 0:48, Relaxin wrote:
> > > All of the databases that I tested the query against gave me immediate
> > > access to ANY row of the resultset once the data had been returned.
> > > Ex. If  I'm currently at the first row and then wanted to goto the
> 100,000
> > > row, I would be there immediately, and if I wanted to then goto the 5
> > > row...same thing, I have the record immediately!
> > >
> > > The other databases I tested against stored the entire resultset on the
> > > Server, I'm not sure what PG does...It seems that brings the entire
> > > resultset client side.
> > > If that is the case, how can I have PG store the resultset on the Server
> AND
> > > still allow me immediate access to ANY row in the resultset?
> >
> > You can use a cursor and get only required rows.
> >
> >
> > Bye
> >  Shridhar
> >
> > --
> > Nick the Greek's Law of Life: All things considered, life is 9 to 5
> against.
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>


Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
>
> Have you changed any of the settings yet in postgresql.conf,
> specifically the shared_buffers setting?
>

fsync = false
tcpip_socket = true
shared_buffers = 128



Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
I reset the shared_buffers to 1000 from 128, but it made no difference.

""scott.marlowe"" <scott.marlowe@ihs.com> wrote in message
news:Pine.LNX.4.33.0309041625300.28714-100000@css120.ihs.com...
> On Wed, 3 Sep 2003, Relaxin wrote:
>
> > I have a table with 102,384 records in it, each record is 934 bytes.
> >
> > Using the follow select statement:
> >   SELECT * from <table>
> >
> > PG Info: version 7.3.4 under cygwin on Windows 2000
> > ODBC: version 7.3.100
> >
> > Machine: 500 Mhz/ 512MB RAM / IDE HDD
> >
> >
> > Under PG:  Data is returned in 26 secs!!
> > Under SQL Server:  Data is returned in 5 secs.
> > Under SQLBase:     Data is returned in 6 secs.
> > Under SAPDB:        Data is returned in 7 secs.
>
> This is typical of postgresql under cygwin, it's much faster under a Unix
> OS like Linux or BSD.  That said, you CAN do some things to help speed it
> up, the biggest being tuning the shared_buffers to be something large
> enough to hold a fair bit of data.  Set the shared_buffers to 1000,
> restart, and see if things get better.
>
> Running Postgresql in a unix emulation layer is guaranteed to make it
> slow.  If you've got a spare P100 with 128 Meg of RAM you can throw redhat
> 9 or FreeBSD 4.7 on and run Postgresql on, it will likely outrun your
> 500MHZ cygwin box, and might even keep up with the other databases on that
> machine as well.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>



Re: SELECT's take a long time compared to other DBMS

От
Christopher Browne
Дата:
A long time ago, in a galaxy far, far away, "Relaxin" <noname@spam.com> wrote:
>> Have you changed any of the settings yet in postgresql.conf,
>> specifically the shared_buffers setting?
>
> fsync = false
> tcpip_socket = true
> shared_buffers = 128

Change fsync to true (you want your data to survive, right?) and
increase shared buffers to something that represents ~10% of your
system memory, in blocks of 8K.

So, if you have 512MB of RAM, then the total blocks is 65536, and it
would likely be reasonable to increase shared_buffers to 1/10 of that,
or about 6500.

What is the value of effective_cache_size?  That should probably be
increased a whole lot, too.  If you are mainly just running the
database on your system, then it would be reasonable to set it to most
of memory, or
  (* 1/2 (/ (* 512 1024 1024) 8192))
32768.

None of this is likely to substantially change the result of that one
query, however, and it seems quite likely that it is because
PostgreSQL is honestly returning the whole result set of ~100K rows at
once, whereas the other DBMSes are probably using cursors to return
only the few rows of the result that you actually looked at.
--
"cbbrowne","@","cbbrowne.com"
http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
Rules of  the Evil Overlord #14. "The  hero is not entitled  to a last
kiss, a last cigarette, or any other form of last request."
<http://www.eviloverlord.com/>

Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
Thank you Christopher.

> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.

I turned it off just in the hope that things would run faster.

> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.

Finally, someone who will actually assume/admit that it is returning the
entire result set to the client.
Where as other DBMS manage the records at the server.

I hope PG could fix/enhance this issue.

There are several issues that's stopping our company from going with PG
(with paid support, if available), but this seems to big the one at the top
of the list.

The next one is the handling of BLOBS.  PG handles them like no other system
I have ever come across.

After that is a native Windows port, but we would deal cygwin (for a very
little while) if these other issues were handled.

Thanks





"Christopher Browne" <cbbrowne@acm.org> wrote in message
news:m3fzjc58ll.fsf@chvatal.cbbrowne.com...
> A long time ago, in a galaxy far, far away, "Relaxin" <noname@spam.com>
wrote:
> >> Have you changed any of the settings yet in postgresql.conf,
> >> specifically the shared_buffers setting?
> >
> > fsync = false
> > tcpip_socket = true
> > shared_buffers = 128
>
> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.
>
> So, if you have 512MB of RAM, then the total blocks is 65536, and it
> would likely be reasonable to increase shared_buffers to 1/10 of that,
> or about 6500.
>
> What is the value of effective_cache_size?  That should probably be
> increased a whole lot, too.  If you are mainly just running the
> database on your system, then it would be reasonable to set it to most
> of memory, or
>   (* 1/2 (/ (* 512 1024 1024) 8192))
> 32768.
>
> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.
> --
> "cbbrowne","@","cbbrowne.com"
> http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
> Rules of  the Evil Overlord #14. "The  hero is not entitled  to a last
> kiss, a last cigarette, or any other form of last request."
> <http://www.eviloverlord.com/>



Re: SELECT's take a long time compared to other DBMS

От
Neil Conway
Дата:
On Thu, 2003-09-04 at 22:13, Relaxin wrote:
> Finally, someone who will actually assume/admit that it is returning the
> entire result set to the client.
> Where as other DBMS manage the records at the server.

Is there a reason you can't use cursors (explicitely, or via ODBC if it
provides some glue on top of them) to keep the result set on the server?

http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-declare.html
http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-fetch.html

> The next one is the handling of BLOBS.  PG handles them like no other system
> I have ever come across.

Just FYI, you can use both the lo_*() functions, as well as simple
bytea/text columns (which can be very large in PostgreSQL).

-Neil



Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
> Is there a reason you can't use cursors (explicitely, or via ODBC if it
> provides some glue on top of them) to keep the result set on the server?
>
> http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-declare.html
> http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-fetch.html

I can only use generally accepted forms of connectivity (ie. ODBC, ADO or
OLEDB).
This is what many of the people on the Windows side are going to need,
because most of us are going to be converting from an existing already
established system, such as Oracle, SQL Server or DB2, all of which have 1
or more of the 3 mentioned above.


> > The next one is the handling of BLOBS.  PG handles them like no other
system
> > I have ever come across.
>
> Just FYI, you can use both the lo_*() functions, as well as simple
> bytea/text columns (which can be very large in PostgreSQL).
>
> -Neil

I know PG has a ODBC driver (that's all I've been using), but it or PG just
doesn't handle BLOBS the way people on the Windows side (don't know about
Unix) are use too.

There is this conversion to octet that must be performed on the data , I
don't understand why, but I guess there was a reason for it long ago, but it
seems that it can now be modified to just accept ANY byte you give it and
then store it without any manipulation of the data.
This will make Postgresql much more portable for the Windows developers...no
need for any special handling for a data type that all large RDBMS support.


Thanks



Re: SELECT's take a long time compared to other DBMS

От
Eirik Oeverby
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I think the problem is the ODBC driver NOT using cursors properly even
if it should. The database itself is not doing anything it shouldn't do,
in fact it has all the needed functionality to handle this request in a
fast and effective way - just like any other respectable RDBMS.

I don't know what ODBC driver you are using, and how it is configrued -
and I never actually used PostgreSQL with ODBC myself. However in the
applications I have developed we DO use 'standardized' DB access
libraries, which work on just about any DBMS you throw them at.
In our development system, which is running on a low-end dual P2-433mhz
box with IDE drives, we routinely test both simple queries as yours and
more complex ones, which at times returns several hundred
thousand (or sometimes even millions) of rows. And processing time is,
generally speaking, in range with what you are seeing on the other
DBMSes you have.

So if PG is indeed returning ALL the rows, it is because it is
explicitly told to by the ODBC driver, so you need to look there to find
the problem. Could there be some kind of connection parameters you are
overlooking, or is the driver too old? Just throwing out ideas here,
most likely you have already thought about it :)

Just thought I'd point out that this is NOT expected behaviour from PG
itself.

/Eirik

On Thu, 4 Sep 2003 21:59:01 -0700
"Relaxin" <noname@spam.com> wrote:

> > Is there a reason you can't use cursors (explicitely, or via ODBC if
> > it provides some glue on top of them) to keep the result set on the
> > server?
> >
> > http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-declare.html
> > http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-fetch.html
>
> I can only use generally accepted forms of connectivity (ie. ODBC, ADO
> or OLEDB).
> This is what many of the people on the Windows side are going to need,
> because most of us are going to be converting from an existing already
> established system, such as Oracle, SQL Server or DB2, all of which
> have 1 or more of the 3 mentioned above.
>
>
> > > The next one is the handling of BLOBS.  PG handles them like no
> > > other
> system
> > > I have ever come across.
> >
> > Just FYI, you can use both the lo_*() functions, as well as simple
> > bytea/text columns (which can be very large in PostgreSQL).
> >
> > -Neil
>
> I know PG has a ODBC driver (that's all I've been using), but it or PG
> just doesn't handle BLOBS the way people on the Windows side (don't
> know about Unix) are use too.
>
> There is this conversion to octet that must be performed on the data ,
> I don't understand why, but I guess there was a reason for it long
> ago, but it seems that it can now be modified to just accept ANY byte
> you give it and then store it without any manipulation of the data.
> This will make Postgresql much more portable for the Windows
> developers...no need for any special handling for a data type that all
> large RDBMS support.
>
>
> Thanks
>
>
>
> ---------------------------(end of
> broadcast)--------------------------- TIP 3: if posting/reading
> through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that
>       your message can get through to the mailing list cleanly


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQE/WBcAdAvR8ct7fEcRAvZPAJ9FgkYxck6Yh5gPeomk8QgWraeV0gCfQF/v
CjyihMwTdrEZo2Y5YBwLVrI=
=Ng2I
-----END PGP SIGNATURE-----

Re: SELECT's take a long time compared to other DBMS

От
"Patrick Hatcher"
Дата:
Relaxin,
I can't remember during this thread if you said you were using ODBC or not.
If you are, then your problem is with the ODBC driver.  You will need to
check the Declare/Fetch box or you will definitely bring back the entire
recordset.  For small a small recordset this is not a problem, but the
larger the recordset the slower the data is return to the client.  I played
around with the cache size on the driver and found a value between 100 to
200 provided good results.

HTH
Patrick Hatcher





                             
                    "Relaxin" <noname@spam.com>
                             
                    Sent by:                           To:     pgsql-performance@postgresql.org
                             
                    pgsql-performance-owner@post       cc:
                             
                    gresql.org                         Subject:     Re: [PERFORM] SELECT's take a long time compared to
otherDBMS                    

                             

                             
                    09/04/2003 07:13 PM
                             

                             




Thank you Christopher.

> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.

I turned it off just in the hope that things would run faster.

> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.

Finally, someone who will actually assume/admit that it is returning the
entire result set to the client.
Where as other DBMS manage the records at the server.

I hope PG could fix/enhance this issue.

There are several issues that's stopping our company from going with PG
(with paid support, if available), but this seems to big the one at the top
of the list.

The next one is the handling of BLOBS.  PG handles them like no other
system
I have ever come across.

After that is a native Windows port, but we would deal cygwin (for a very
little while) if these other issues were handled.

Thanks





"Christopher Browne" <cbbrowne@acm.org> wrote in message
news:m3fzjc58ll.fsf@chvatal.cbbrowne.com...
> A long time ago, in a galaxy far, far away, "Relaxin" <noname@spam.com>
wrote:
> >> Have you changed any of the settings yet in postgresql.conf,
> >> specifically the shared_buffers setting?
> >
> > fsync = false
> > tcpip_socket = true
> > shared_buffers = 128
>
> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.
>
> So, if you have 512MB of RAM, then the total blocks is 65536, and it
> would likely be reasonable to increase shared_buffers to 1/10 of that,
> or about 6500.
>
> What is the value of effective_cache_size?  That should probably be
> increased a whole lot, too.  If you are mainly just running the
> database on your system, then it would be reasonable to set it to most
> of memory, or
>   (* 1/2 (/ (* 512 1024 1024) 8192))
> 32768.
>
> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.
> --
> "cbbrowne","@","cbbrowne.com"
> http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
> Rules of  the Evil Overlord #14. "The  hero is not entitled  to a last
> kiss, a last cigarette, or any other form of last request."
> <http://www.eviloverlord.com/>



---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend




Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
Expect that the Declare/Fetch only creates a forwardonly cursor, you can go
backwards thru the result set.

""Patrick Hatcher"" <PHatcher@macys.com> wrote in message
news:OFAD2A2CF4.499F8F67-ON88256D98.00527BCB-88256D98.00538130@fds.com...
>
> Relaxin,
> I can't remember during this thread if you said you were using ODBC or
not.
> If you are, then your problem is with the ODBC driver.  You will need to
> check the Declare/Fetch box or you will definitely bring back the entire
> recordset.  For small a small recordset this is not a problem, but the
> larger the recordset the slower the data is return to the client.  I
played
> around with the cache size on the driver and found a value between 100 to
> 200 provided good results.
>
> HTH
> Patrick Hatcher
>
>
>
>
>
>                     "Relaxin" <noname@spam.com>
>                     Sent by:                           To:
pgsql-performance@postgresql.org
>                     pgsql-performance-owner@post       cc:
>                     gresql.org                         Subject:     Re:
[PERFORM] SELECT's take a long time compared to other DBMS
>
>
>                     09/04/2003 07:13 PM
>
>
>
>
>
> Thank you Christopher.
>
> > Change fsync to true (you want your data to survive, right?) and
> > increase shared buffers to something that represents ~10% of your
> > system memory, in blocks of 8K.
>
> I turned it off just in the hope that things would run faster.
>
> > None of this is likely to substantially change the result of that one
> > query, however, and it seems quite likely that it is because
> > PostgreSQL is honestly returning the whole result set of ~100K rows at
> > once, whereas the other DBMSes are probably using cursors to return
> > only the few rows of the result that you actually looked at.
>
> Finally, someone who will actually assume/admit that it is returning the
> entire result set to the client.
> Where as other DBMS manage the records at the server.
>
> I hope PG could fix/enhance this issue.
>
> There are several issues that's stopping our company from going with PG
> (with paid support, if available), but this seems to big the one at the
top
> of the list.
>
> The next one is the handling of BLOBS.  PG handles them like no other
> system
> I have ever come across.
>
> After that is a native Windows port, but we would deal cygwin (for a very
> little while) if these other issues were handled.
>
> Thanks
>
>
>
>
>
> "Christopher Browne" <cbbrowne@acm.org> wrote in message
> news:m3fzjc58ll.fsf@chvatal.cbbrowne.com...
> > A long time ago, in a galaxy far, far away, "Relaxin" <noname@spam.com>
> wrote:
> > >> Have you changed any of the settings yet in postgresql.conf,
> > >> specifically the shared_buffers setting?
> > >
> > > fsync = false
> > > tcpip_socket = true
> > > shared_buffers = 128
> >
> > Change fsync to true (you want your data to survive, right?) and
> > increase shared buffers to something that represents ~10% of your
> > system memory, in blocks of 8K.
> >
> > So, if you have 512MB of RAM, then the total blocks is 65536, and it
> > would likely be reasonable to increase shared_buffers to 1/10 of that,
> > or about 6500.
> >
> > What is the value of effective_cache_size?  That should probably be
> > increased a whole lot, too.  If you are mainly just running the
> > database on your system, then it would be reasonable to set it to most
> > of memory, or
> >   (* 1/2 (/ (* 512 1024 1024) 8192))
> > 32768.
> >
> > None of this is likely to substantially change the result of that one
> > query, however, and it seems quite likely that it is because
> > PostgreSQL is honestly returning the whole result set of ~100K rows at
> > once, whereas the other DBMSes are probably using cursors to return
> > only the few rows of the result that you actually looked at.
> > --
> > "cbbrowne","@","cbbrowne.com"
> > http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
> > Rules of  the Evil Overlord #14. "The  hero is not entitled  to a last
> > kiss, a last cigarette, or any other form of last request."
> > <http://www.eviloverlord.com/>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>



Re: SELECT's take a long time compared to other DBMS

От
Neil Conway
Дата:
On Fri, 2003-09-05 at 14:18, Relaxin wrote:
> Expect that the Declare/Fetch only creates a forwardonly cursor, you can go
> backwards thru the result set.

No, DECLARE can create scrollable cursors, read the ref page again. This
functionality is much improved in PostgreSQL 7.4, though.

-Neil



Re: SELECT's take a long time compared to other DBMS

От
"Relaxin"
Дата:
It is forward only in the ODBC driver.

"Neil Conway" <neilc@samurai.com> wrote in message
news:1062796189.447.9.camel@tokyo...
> On Fri, 2003-09-05 at 14:18, Relaxin wrote:
> > Expect that the Declare/Fetch only creates a forwardonly cursor, you can
go
> > backwards thru the result set.
>
> No, DECLARE can create scrollable cursors, read the ref page again. This
> functionality is much improved in PostgreSQL 7.4, though.
>
> -Neil
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
>