Обсуждение: Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)
Hi again, It seems unfortunate, but if I can't efficiently (i.e. no copying of whole table, no blocking and using index) browse the table, I might not be able to "sell" postgres into the commercial environment. :-( RAIMA Velocis might win :-( Please please help me solve this or make workarounds or anything. I would *really* like to see PosgreSQL to be playing against the Big (commercial) Boys ! What the whole problem really reduces to, is to be able to get the next / previous value in an index. If I can do that, I win. No SELECT (and thus no locking) and no huge copying !). It seems to me like something primitive, so it should be easy. But it isn't obvious to me. Please help. Below is more description and questions. On Thu, 22 Jan 1998, Dustin Sallings wrote: > > Well, first let the user look up the manufacturer's part, and then let > > him "browse" from there one (look at the next / previous ones), where > > "next" is defined as the one closest bigger than the current one (based on > > a sort attribute(s) on which I have an index). > > That's easy in a normal application type place where you keep state, if > you're doing it as a web based application, it's a different story... Assuming > you're writing a standalone application, you just keep a cursor to what you're > looking at, and move it back and forth as needed, otherwise, otherwise, you're > going to have to do the whole query again AFAIK. a few questions : 0. having a value of a field on which there is an index, how can I do : a) current_pointer = some_function("value_I_have"); b) next_pointer = some_other_function(current_pointer); c) one_tupple = yet_another_function(next_pointer); If I can accomplish a,b,c, then I win and I don't have to do questions 1..5 below. 1. I assume I have to say "declare cursor mycursor for SELECT field1,field2 FROM mytable ORDER BY one_of_fields_on_which_I_have_index" Am I right ? Or can I do it some other way ? 2. Will not the step (1.) *copy* the whole 40MB table ? (that would trash the system if several people at once want to "browse" through the table.) 3. Will not the step (1.) read-lock *lock* the whole table ? (i.e. while somebody is "browsing", I need other people to be able to update the database). 4. Will the step (1.) cause Postgres to use index for moving forth / back ? Because if it doesn't, it might take ages to either declare the cursor or to move forth / back. 5. If the step (1.) will not cause Postgres to use index, what will ? 6. How do I do it if I don't want to start "browsing" from the first row ? Thanx, Jan -- Gospel of Jesus is the saving power of God for all who believe -- Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza >>> Free Software Union President ... www.fslu.org <<< Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com
Re: [HACKERS] Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)
От
The Hermit Hacker
Дата:
On Fri, 23 Jan 1998, Jan Vicherek wrote: > Hi again, > > It seems unfortunate, but if I can't efficiently (i.e. no copying of > whole table, no blocking and using index) browse the table, I might not > be able to "sell" postgres into the commercial environment. :-( RAIMA > Velocis might win :-( > > Please please help me solve this or make workarounds or anything. I > would *really* like to see PosgreSQL to be playing against the Big > (commercial) Boys ! I'm curious, but can the "Big (commercial) Boys" do this? If so, can you please provide an example of which and how? Most of us here have access to an one or the other (me, Oracle) to use as a sample system...if we can prove that it does work on another system, then we have something to work with, but right now all I've seen is "I wish I could do this", and several examples on how to accomplish it using PostgreSQL, but that's it... > > What the whole problem really reduces to, is to be able to get the next > / previous value in an index. If I can do that, I win. No SELECT (and thus > no locking) and no huge copying !). It seems to me like something > primitive, so it should be easy. But it isn't obvious to me. Please help. If there is no SELECT, how do you get data out of an SQL database? *raised eyebrow* > 0. having a value of a field on which there is an index, how can I do : > a) current_pointer = some_function("value_I_have"); > b) next_pointer = some_other_function(current_pointer); > c) one_tupple = yet_another_function(next_pointer); > If I can accomplish a,b,c, then I win and I don't have to do questions > 1..5 below. Why not put a sequence field on the table so that you can do: select * from table where rowid = n; -or- select * from table where rowid = n - 1; -or- select * from table where rowid = n + 1; -or- select * from table where rowid >= n and rowid <= n+x; And create the index on rowid? Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Fri, 23 Jan 1998, The Hermit Hacker wrote: > I'm curious, but can the "Big (commercial) Boys" do this? If so, > can you please provide an example of which and how? Hmm, well, the one we are switching from does this ;-) (Informix 3.3 ALL-II C interface). It's not SQL, tho. > Most of us here have > access to an one or the other (me, Oracle) to use as a sample system...if > we can prove that it does work on another system, then we have something > to work with > all I've seen is ... and several examples on how to accomplish it using > PostgreSQL, but that's it... Wait, have you seen here an example that accomplishes this which wouldn't need the whole table copied and wouldn't lock the table against updates ? > > What the whole problem really reduces to, is to be able to get the next > > / previous value in an index. If I can do that, I win. No SELECT (and thus > > no locking) and no huge copying !). It seems to me like something > > primitive, so it should be easy. But it isn't obvious to me. Please help. > > If there is no SELECT, how do you get data out of an SQL database? > *raised eyebrow* Sorry, I meant no SELECT that couln't be done through index and so wouldn't copy huge amounts of data into a result table. And wouldn't lock the whole table. > > 0. having a value of a field on which there is an index, how can I do : > > a) current_pointer = some_function("value_I_have"); > > b) next_pointer = some_other_function(current_pointer); > > c) one_tupple = yet_another_function(next_pointer); > > If I can accomplish a,b,c, then I win and I don't have to do questions > > 1..5 below. > > Why not put a sequence field on the table so that you can do: > select * from table where rowid = n; -or- > select * from table where rowid = n - 1; -or- > select * from table where rowid = n + 1; -or- > select * from table where rowid >= n and rowid <= n+x; > > And create the index on rowid? Because I also need to be able to INSERT rows. That would require renumeration of half the table (remember, it's 40MB, 400,000 rows) every time I do an INSERT. I *still* think that there *has to* be a way to find a value that is immediatelly next to one I have. This seems like such a primitive operation. Even the backend must be doing it on some operations, it would seem. Maybe even in SQL. Maybe something like (I'm not an SQL expert) : "SELECT IndexField from MyTable where InxdexField > 'my_current_value' and IndexField < ("all IndexFields that are bigger than the IndexField searched for") Important : I'm not looking for a "pure SQL" solution. I'm writing a C emulation library, so if it can be achieved via a call to a C Postgres function, it would be great. Thanx, Jan -- Gospel of Jesus is the saving power of God for all who believe -- Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza >>> Free Software Union President ... www.fslu.org <<< Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com
Re: [HACKERS] Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)
От
The Hermit Hacker
Дата:
On Fri, 23 Jan 1998, Jan Vicherek wrote: > On Fri, 23 Jan 1998, The Hermit Hacker wrote: > > > I'm curious, but can the "Big (commercial) Boys" do this? If so, > > can you please provide an example of which and how? > > Hmm, well, the one we are switching from does this ;-) (Informix 3.3 > ALL-II C interface). It's not SQL, tho. Okay, well, ummm...now you are comparing apples to oranges then...if you wanted a non-SQL engine to replace Informix, PostgreSQL isn't what you are looking for :( > > all I've seen is ... and several examples on how to accomplish it using > > PostgreSQL, but that's it... > > Wait, have you seen here an example that accomplishes this which > wouldn't need the whole table copied and wouldn't lock the table against > updates ? First off, *nothing* you are going to be able to accomplish in *any* SQL engine is going to prevent locking the table against updates...the code that Bruce put in this afternoon for v6.3 is going to reduce the possibility of the lock causing a deadlock, but that is about it...the lock will still be created. > > Why not put a sequence field on the table so that you can do: > > select * from table where rowid = n; -or- > > select * from table where rowid = n - 1; -or- > > select * from table where rowid = n + 1; -or- > > select * from table where rowid >= n and rowid <= n+x; > > > > And create the index on rowid? > > Because I also need to be able to INSERT rows. That would require > renumeration of half the table (remember, it's 40MB, 400,000 rows) every > time I do an INSERT. Okay, you are confusing INSERT then...INSERT in SQL just adds a row, that's it...it doesn't perform any "sorting" on it...that's what the ORDER BY command does... ...but, I now understand what *you* mean by INSERT... > I *still* think that there *has to* be a way to find a value that is > immediatelly next to one I have. This seems like such a primitive > operation. Even the backend must be doing it on some operations, it would > seem. Not possible...INSERT into a table doesn't "merge" the record between its lower/higher bounds...it just adds it to the very end of the table. And an index doesn't "sort" the data either...that is what the ORDER BY clause is for... > Maybe even in SQL. Maybe something like (I'm not an SQL expert) : "SELECT > IndexField from MyTable where InxdexField > 'my_current_value' and > IndexField < ("all IndexFields that are bigger than the IndexField > searched for") From your sample above, is your first SQL call going to pull out all 40MB of data in one select statement, then your second 40MB minus the first X records? What you want to do is: begin declare mycursor cursor for select * from pg-user order by <somefield>; move $forward in FOO; fetch $retrieve in FOO; close foo; end; Basically, take your table, move to the $forward record in it, grab the next $retrieve records and then release the table. Your first time through, $forward might just equal 0...but when you run it the second time through, you pass it back a $forward value equal to $forward + $retrieve, so that you start at where you ended the first time. This is how I deal with one of my projects, where I have to do exactly what you are looking for... About the only part of the SELECT above is the ORDER BY, and alot of that is as much a restriction on your hardware then anything...the major performance boost in an ORDER BY is memory..keeping it off the hard drive. > Important : I'm not looking for a "pure SQL" solution. I'm writing a C > emulation library, so if it can be achieved via a call to a C Postgres > function, it would be great. You'd be better off looking at something like GDBM (which, by the way, also creates a lock against updates while another is reading the database)...unless I'm missing something, you aren't looking at doing anything that *requires* an SQL engine :( Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?)
От
Jan Vicherek
Дата:
On Fri, 23 Jan 1998, The Hermit Hacker wrote: > On Fri, 23 Jan 1998, Jan Vicherek wrote: > > > On Fri, 23 Jan 1998, The Hermit Hacker wrote: > > > > > I'm curious, but can the "Big (commercial) Boys" do this? If so, > > > can you please provide an example of which and how? > > > > Hmm, well, the one we are switching from does this ;-) (Informix 3.3 > > ALL-II C interface). It's not SQL, tho. > > Okay, well, ummm...now you are comparing apples to oranges > then...if you wanted a non-SQL engine to replace Informix, PostgreSQL > isn't what you are looking for :( Hmm, I thought that PostgreSQL would have two interfaces : an SQL interface (doing all the optimizing, joining and the complicated stuff) and a low(er)-level C/C++/whatever interface which would allow to use less of the intelligence of PostgreSQL and more of it's retireval system, including indexing. > > > all I've seen is ... and several examples on how to accomplish it using > > > PostgreSQL, but that's it... > > > > Wait, have you seen here an example that accomplishes this which > > wouldn't need the whole table copied and wouldn't lock the table against > > updates ? > > First off, *nothing* you are going to be able to accomplish in > *any* SQL engine is going to prevent locking the table against > updates... Sorry, I wasn't precise again : I mean I can't stop other people from updating a table just because one guy is out there "browsing" through it. > > > Why not put a sequence field on the table so that you can do: > > > select * from table where rowid = n; -or- > > > select * from table where rowid = n - 1; -or- > > > select * from table where rowid = n + 1; -or- > > > select * from table where rowid >= n and rowid <= n+x; > > > > > > And create the index on rowid? > > > > Because I also need to be able to INSERT rows. That would require > > renumeration of half the table (remember, it's 40MB, 400,000 rows) every > > time I do an INSERT. > > Okay, you are confusing INSERT then...INSERT in SQL just adds a > row, that's it...it doesn't perform any "sorting" on it...that's what the > ORDER BY command does... The INSERT causes all indices on that table to be updated. If I used a field in the table to serve as a pseudo-index (user-level index, not system-level index), *then* that would require "manual" renumeration of rowid field in half the table every time I do an INSERT. > ...but, I now understand what *you* mean by INSERT... good :) > > I *still* think that there *has to* be a way to find a value that is > > immediatelly next to one I have. This seems like such a primitive > > operation. Even the backend must be doing it on some operations, it would > > seem. > > Not possible... Remeber that I have an index on the field that I'm looking for the "next" value. Is it not possible in such case ? Even by calling some lower-level C function to give me the next value ? > INSERT into a table doesn't "merge" the record > between its lower/higher bounds...it just adds it to the very end of the > table. And an index doesn't "sort" the data either...that is what the > ORDER BY clause is for... I guess I wasn't precise enough. I was aware of these two facts when I was writing the previous msgs. Index doesn't "sort". Index gets sorted. But pseudo-index doesn't even get sorted, I would have to sort it "manually" after every INSERT I would make. > > Maybe even in SQL. Maybe something like (I'm not an SQL expert) : "SELECT > > IndexField from MyTable where InxdexField > 'my_current_value' and > > IndexField < ("all IndexFields that are bigger than the IndexField > > searched for") > > From your sample above, is your first SQL call going to pull out > all 40MB of data in one select statement, then your second 40MB minus the > first X records? Hmm, I could have hoped that the optimizer would recognize what I'm trying to do and would only pull out that one record, but I guess not. Never mind this attempt then. > What you want to do is: > > begin > declare mycursor cursor for select * from pg-user order by <somefield>; > move $forward in FOO; > fetch $retrieve in FOO; > close foo; > end; (I guess you meant "close mycursor", not "close foo".) I've just tried this on half-the-size table, and it took 70 seconds :-( meray=> begin; BEGIN meray=> declare mycursor cursor for select * from master order by mpartno; SELECT meray=> fetch forward 1 in mycursor; close mycursor; end; mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip -------------------+--------+--------+-----+--------+-------+-----+--------+---------------- F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG (1 row) meray=> close mycursor; CLOSE meray=> end; END meray=> \d masterind Table = masterind +----------------------------------+----------------------------------+-------+ | Field | Type | Length| +----------------------------------+----------------------------------+-------+ | mpartno | (bp)char | 19 | +----------------------------------+----------------------------------+-------+ Since the ORDER BY field (mpartno) has an index on it, why wouldn't Postgres just take that index and go with it one-by-one ?! Such approach wouldn't take 70 seconds on idle P200+ 64MB, but would take 1/2 a second. Is such approach pricipally/fundamentally wrong, or is this correct approach, and postgres just needs some more code written for this thing to work ? (I.e. would other things break if this was implemented ?) -- whereas single select (which uses index) takes half a second. meray=> select * from master where mpartno = 'F 0 AVZK '; mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip -------------------+--------+--------+-----+--------+-------+-----+--------+---------------- F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG (2 rows) meray=> Now I want do see the row that is in the masterind after the first, in this case it contains the same data. If we can get "this new approach" to work, > Basically, take your table, move to the $forward record in it, > grab the next $retrieve records and then release the table. But this creates 70-second lock on the table every time I want to see next record :-( > > Important : I'm not looking for a "pure SQL" solution. I'm writing a C > > emulation library, so if it can be achieved via a call to a C Postgres > > function, it would be great. > > You'd be better off looking at something like GDBM (which, by the > way, also creates a lock against updates while another is reading the > database)...unless I'm missing something, you aren't looking at doing > anything that *requires* an SQL engine :( This looks like a sad suggestion to me. I'll kick and scream just that I wouldn't have to go with Velocis RAIMA's lower-level C interface. (I'm just staring into its manual to figure out the sequence of C calls I would have to make to accomplish this.) Thanx to all, Jan -- Gospel of Jesus is the saving power of God for all who believe -- Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza >>> Free Software Union President ... www.fslu.org <<< Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com
> You'd be better off looking at something like GDBM (which, by the > way, also creates a lock against updates while another is reading the > database)...unless I'm missing something, you aren't looking at doing > anything that *requires* an SQL engine :( I agree. GDBM is a fine system for such uses. -- Bruce Momjian maillist@candle.pha.pa.us
> > On Fri, 23 Jan 1998, The Hermit Hacker wrote: > > > I'm curious, but can the "Big (commercial) Boys" do this? If so, > > can you please provide an example of which and how? > > Hmm, well, the one we are switching from does this ;-) (Informix 3.3 > ALL-II C interface). It's not SQL, tho. Ah, the Informix ISAM interface that used to sit underneath the Informix SQL engine. > Important : I'm not looking for a "pure SQL" solution. I'm writing a C > emulation library, so if it can be achieved via a call to a C Postgres > function, it would be great. > You can put an index on the table, and embed a function inside the engine to spin through the index, getting valid rows. The code is already there to spin through the index, so you could just hook on to the index, get the 'tid' and use that to get data from the physical table. No locking, but you may get in trouble if other people are reading the index. Maybe a quick lock-unlock would do it for you. Not easy, because the engine really doesn't provide a user-friendly C interface for this, but it could be done. -- Bruce Momjian maillist@candle.pha.pa.us
On Fri, 23 Jan 1998, Bruce Momjian wrote: > > You'd be better off looking at something like GDBM (which, by the > > way, also creates a lock against updates while another is reading the > > database)...unless I'm missing something, you aren't looking at doing > > anything that *requires* an SQL engine :( > > I agree. GDBM is a fine system for such uses. We would like to convert to something that has more perspective, like PostgreSQL does. We are not converting to PostgreSQL just to get this one job done, but to layout a good platform for the future ... Yes ! PostgreSQL is highly regarded ! Jan -- Gospel of Jesus is the saving power of God for all who believe -- Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza >>> Free Software Union President ... www.fslu.org <<< Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com
On Fri, 23 Jan 1998, Jan Vicherek wrote: [big chop] > This looks like a sad suggestion to me. I'll kick and scream just that > I wouldn't have to go with Velocis RAIMA's lower-level C interface. (I'm > just staring into its manual to figure out the sequence of C calls I would > have to make to accomplish this.) If it is still the same as when it was called DBVista... then d_keyfind(xx,x,x) while(db_status == S_OK) { d_recread(x,x,x,x); d_keynext(ssss); } (No error checking in that code...<g>) ------------ end Raima note ------------------------------------ BUT - the functionality you want could be a special case of the Postgres backend and would benefit a very important real world scenario (scrolling inquiries) IF - declare cursor and select 'whole table' and 'order by a field which is a key' Treat as special - dont handle ANY data until the FETCH statement. IE. Defer the IO until the fetch happens - then access the row(s) via the index This would give an enormous boost in effective performance with the common application task of a scrolling inquiry against a large table. Implementing it would be transparent to current application code and would offer a competitive edge over other products without it. Normally I'm not one to support 'special case' coding tricks, but the public impact on visible performance probably justifies looking into this a bit further. I suspect the query processor can identify the situation. It would need to remember that it was special and leave enough control info to allow the 'fetch' processing to respond to that special case. Comments from the guru's who are up to their necks in the code ? Regards Ron O'Hara > > > Thanx to all, > > Jan > > -- Gospel of Jesus is the saving power of God for all who believe -- > Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza > >>> Free Software Union President ... www.fslu.org <<< > Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com > >
Hey ! You sound like you know exactly what I mean ! ;-) On Fri, 23 Jan 1998, Bruce Momjian wrote: > > Important : I'm not looking for a "pure SQL" solution. I'm writing a C > > emulation library, so if it can be achieved via a call to a C Postgres > > function, it would be great. > > You can put an index on the table, and embed a function inside the > engine to spin through the index, getting valid rows. Aha, this implies that in the index there are valid and non-valid rows. I guess those that are to be "valid" (no current transactions on that row) and the non-valid "those that are subject to an update lock / transaction". > The code is > already there to spin through the index, so you could just hook on to > the index, get the 'tid' and use that to get data from the physical > table. This sounds like what I meant. Would you have any pointers (to source files / functions I should read / understand / call ) ? > No locking, but you may get in trouble if other people are > reading the index. I don't see what trouble I could get into. Any hints appreciated. > Maybe a quick lock-unlock would do it for you. Do you mean lock-unlock on the table or on the index ? > Not easy, because the engine really doesn't provide a user-friendly C > interface for this, but it could be done. Can you point me to some of the user-nonfriendly files / functions I should use then ? Thanx a bunch (I have to have the implementation by tomorrow 10am), Jan -- Gospel of Jesus is the saving power of God for all who believe -- Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza >>> Free Software Union President ... www.fslu.org <<< Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com
Here is another exact perfect descrtiption of what I mean : (you don't have to read it again, it's only repost) Jan On Sat, 24 Jan 1998, Ron O'Hara wrote: > On Fri, 23 Jan 1998, Jan Vicherek wrote: > [big chop] > > > This looks like a sad suggestion to me. I'll kick and scream just that > > I wouldn't have to go with Velocis RAIMA's lower-level C interface. (I'm > > just staring into its manual to figure out the sequence of C calls I would > > have to make to accomplish this.) > > If it is still the same as when it was called DBVista... then > > d_keyfind(xx,x,x) > while(db_status == S_OK) > { > d_recread(x,x,x,x); > d_keynext(ssss); > } > (No error checking in that code...<g>) > > ------------ end Raima note ------------------------------------ Yea, it seems like that would be still the same ... thanx for saving me the time ! :) v--- Yes, yes yes, this is exactly what I'm trying to accompliesh ---v > BUT - the functionality you want could be a special case of the Postgres > backend and would benefit a very important real world scenario > (scrolling inquiries) > > IF - declare cursor > and select 'whole table' > and 'order by a field which is a key' > > Treat as special - dont handle ANY data until the FETCH statement. > IE. Defer the IO until the fetch happens - then access the row(s) > via the index > > This would give an enormous boost in effective performance with the common > application task of a scrolling inquiry against a large table. > Implementing it would be transparent to current application code and would > offer a competitive edge over other products without it. yes yes yes > Normally I'm not one to support 'special case' coding tricks, but the > public impact on visible performance probably justifies looking into this > a bit further. > I suspect the query processor can identify the situation. It would need to > remember that it was special and leave enough control info to allow the > 'fetch' processing to respond to that special case. > > Comments from the guru's who are up to their necks in the code ? ( I am eager ;-) Jan -- Gospel of Jesus is the saving power of God for all who believe -- Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza >>> Free Software Union President ... www.fslu.org <<< Interactive Electronic Design Inc. -#- PGP: finger honza@ied.com
On Sat, 24 Jan 1998, Ron O'Hara wrote: [ description of spercial-case deleted ] > Comments from the guru's who are up to their necks in the code ? Hmm, I don't know how busy you developers are, so I don't know what should I tell the customer tomorrow. Any proximity guesses ? Thanx, Jan
Re: Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?)
От
The Hermit Hacker
Дата:
On Fri, 23 Jan 1998, Jan Vicherek wrote: > an SQL interface (doing all the optimizing, joining and the complicated > stuff) and > a low(er)-level C/C++/whatever interface which would allow to use less > of the intelligence of PostgreSQL and more of it's retireval system, > including indexing. *scratch head* indexing != sorting...indexing doesn't produce sorted data...indexing provides a means to very quickly retrieve *segments* of data from a large table of data... > > First off, *nothing* you are going to be able to accomplish in > > *any* SQL engine is going to prevent locking the table against > > updates... > > Sorry, I wasn't precise again : I mean I can't stop other people from > updating a table just because one guy is out there "browsing" through it. The new code that Bruce put in *should* provide for better schedualing of the locks on a table. Bruce can explain it alot better then me, but the way I understand it, before, if someone had a READ lock on the table and someone came in looking for a WRITE lock, they got held out. If, while the READ lock was still in effect, someone else came in to READ the table, they got in, effectively pushing the WRITE lock further down the queue. With Bruce's patches, that WRITE lock will prevent another READ from grabbing a "lock" until after the WRITE finishes. As I haven't had a chance to look at it yet, I'm curious as to how often someone sees a deadlock with the new code. I'm running an radiusd accounting system on v6.2.1 that has one table in it that is: postgres@zeus> ls -lt rad* -rw------- 1 postgres wheel 2678784 Jan 23 23:01 radlog_uniq_id -rw------- 1 postgres wheel 1564672 Jan 23 23:01 radlog_stop -rw------- 1 postgres wheel 2031616 Jan 23 23:01 radlog_start -rw------- 1 postgres wheel 6406144 Jan 23 23:01 radlog -rw------- 1 postgres wheel 1572864 Jan 23 23:01 radlog_userid -rw------- 1 postgres wheel 79519744 Jan 23 22:51 radhist -rw------- 1 postgres wheel 16818176 Jan 23 22:51 radhist_userid -rw------- 1 postgres wheel 16793600 Jan 23 22:51 radhist_uniq_id -rw------- 1 postgres wheel 11927552 Jan 23 22:51 radhist_stop -rw------- 1 postgres wheel 11919360 Jan 23 22:51 radhist_start Those are just two tables in it. The only problem that I've yet to have with it, as far as "deadlocks" are concerned (deadlocks mean that one lock is in place while a write lock is trying to happen) is during a vacuum...and that's running on an old 486 with one hard drive and *just upgraded to* 64Meg of RAM. Oh...and I'd say a good 75% of the activities on those above tables is multiple simultaneous updates...and the only thing that disrupts that is the vacuum process... Vadim...what is that humongous database you are running? And what are you running it on? > > > I *still* think that there *has to* be a way to find a value that is > > > immediatelly next to one I have. This seems like such a primitive > > > operation. Even the backend must be doing it on some operations, it would > > > seem. > > > > Not possible... > > Remeber that I have an index on the field that I'm looking for the > "next" value. > Is it not possible in such case ? No, as the index isn't "ordered"...there is no logical next/prev record in an index > Even by calling some lower-level C function to give me the next value ? there are no lower-level C functions...for that, you'd be looking at something like GDBM... > Index doesn't "sort". Index gets sorted. But pseudo-index doesn't even > get sorted, I would have to sort it "manually" after every INSERT I would > make. Index gets sorted by...? > I've just tried this on half-the-size table, and it took 70 seconds :-( > > meray=> begin; > BEGIN > meray=> declare mycursor cursor for select * from master order by mpartno; > SELECT > meray=> fetch forward 1 in mycursor; > close mycursor; > end; > mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip > -------------------+--------+--------+-----+--------+-------+-----+--------+---------------- > F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG > (1 row) > > meray=> close mycursor; > CLOSE > meray=> end; > END > meray=> \d masterind > > Table = masterind > +----------------------------------+----------------------------------+-------+ > | Field | Type | Length| > +----------------------------------+----------------------------------+-------+ > | mpartno | (bp)char | 19 | > +----------------------------------+----------------------------------+-------+ > > Since the ORDER BY field (mpartno) has an index on it, why wouldn't > Postgres just take that index and go with it one-by-one ?! Such approach An index isn't sorted. An index is only used with a WHERE clause (Bruce, I'm not misrepresenting that, am I?). A 'select * from table' will only do a Sequential Scan through the table for the data it needs. acctng=> select count(start) from radhist; count ------ 563258 (1 row) acctng=> explain select * from radhist; NOTICE:QUERY PLAN: Seq Scan on radhist (cost=27022.38 size=537648 width=72) EXPLAIN acctng=> explain select * from radhist where userid = 'scrappy'; NOTICE:QUERY PLAN: Index Scan on radhist (cost=2.05 size=1 width=72) EXPLAIN acctng=> Damn, I wasn't thinking. Okay...an index *is* sorted, in a sense. You index on *one* field of a table (with one exception that I know of, but I'm not getting into that here). The index only makes use of that if/when you do a WHERE clause on that field. acctng=> explain select * from radhist where port = 1; NOTICE:QUERY PLAN: Seq Scan on radhist (cost=27022.38 size=1 width=72) EXPLAIN Notice above it does a Seq Scan? I don't have an index on port, since I'll never do a WHERE "search" on it. > wouldn't take 70 seconds on idle P200+ 64MB, but would take 1/2 a second. How are you starting up postmaster? Take a look at the man page for postgres, which has an option: -S Specifies the amount of memory to be used by internal sorts before using disk files for sorting. This value is specified in 1k bytes, and defaults to 512. By default, very little memory is "reserved" for sorting (ORDER BY)...try increasing that and see if your performance improves any. My accounting server has it set for 10240 (10Meg), since it too had 64Meg of RAM and nobody expect as an accounting/RDBMS server. I could probably increase it past that, not sure... > Is such approach pricipally/fundamentally wrong, or is this correct > approach, and postgres just needs some more code written for this thing to > work ? (I.e. would other things break if this was implemented ?) See my above explanation...Bruce or Vadim could probably explain it better, as they are more solid in the internals themselves, but you are misunderstanding what an index is and what it is used for :( > -- whereas single select (which uses index) takes half a second. > meray=> select * from master where mpartno = 'F 0 AVZK '; > mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip > -------------------+--------+--------+-----+--------+-------+-----+--------+---------------- > F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG > F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG > (2 rows) > meray=> Assuming you have an index built on mpartno, then yes, this would use the index because (and only because) of the WHERE clause. Try using 'explain' at the beginning of your select statements...it will explain *what* it is going to do, but not actually do it. > Now I want do see the row that is in the masterind after the first, in > this case it contains the same data. > > If we can get "this new approach" to work, You can't, unless there is some logical "greater/less then" for mpartno, such that you could do something like (syntax will be wrong, its the concept that i'm showing): begin; declare cursor mycursor for select where mpartno >= 'F 0 AVZK ' ORDER by mpartno; fetch forward 1 from mycursor; close mycursor; end; That would speed up the search (should?) because now you are finding, in the index, all mpartno's that come after the one your stipulate, and then you are grabbing the first one off the stack...I don't guarantee it will be faster, because it also depends on how much data is returned. Just thought of it...after you load this table, you did do a 'vacuum analyze' on it, right? Just to get the stats up dated? > > Basically, take your table, move to the $forward record in it, > > grab the next $retrieve records and then release the table. > > But this creates 70-second lock on the table every time I want to see > next record :-( Most of that 70-seconds, I am guess, was spent in the ORDER by...try increasing -S to as much as you dare and see if that helps any... Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Re: [HACKERS] Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)
От
The Hermit Hacker
Дата:
On Fri, 23 Jan 1998, Jan Vicherek wrote: > > You can put an index on the table, and embed a function inside the > > engine to spin through the index, getting valid rows. > > Aha, this implies that in the index there are valid and non-valid rows. > I guess those that are to be "valid" (no current transactions on that > row) and the non-valid "those that are subject to an update lock / > transaction". There is no record level locking available at this point in time, only table level. Bruce, just curious...this whole discussion...would it be moot if we did have record level locking? Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Fri, 23 Jan 1998, Jan Vicherek wrote: > > > On Sat, 24 Jan 1998, Ron O'Hara wrote: > > [ description of spercial-case deleted ] > > > Comments from the guru's who are up to their necks in the code ? > > Hmm, I don't know how busy you developers are, so I don't know what > should I tell the customer tomorrow. Any proximity guesses ? IMHO...very little chance of this happening in the next several months...Bruce/Vadim might say differently, but somehow I doubt it :( There are too many things currently on the go that are needed for 99% of the users... Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Keep in mind that the entire database gets locked for as long as you've got it open in read/write mode, regardless of if you've read or written. On Fri, 23 January 1998, at 22:53:27, Bruce Momjian wrote: > > You'd be better off looking at something like GDBM (which, by the > > way, also creates a lock against updates while another is reading the > > database)...unless I'm missing something, you aren't looking at doing > > anything that *requires* an SQL engine :( > > I agree. GDBM is a fine system for such uses.