Re: Select where id in (LARGE LIST) ?
From | Csaba Nagy
---|---
Subject | Re: Select where id in (LARGE LIST) ?
Date |
Msg-id | 1122976336.2837.220.camel@coppola.muc.ecircle.de
In response to | Re: Select where id in (LARGE LIST) ? (Jasper Potts <jasper@jasperpotts.com>)
List | pgsql-jdbc
In this case I wouldn't go with a boolean flag: setting it on a row will
actually insert a new row in your table (that's what updates really do).
You're better off with the range approach, keeping your ranges in memory.
I have no idea how big the query can get; I guess it should be somewhere
in the documentation, in some chapter dealing with limitations...

Cheers,
Csaba.

On Tue, 2005-08-02 at 11:23, Jasper Potts wrote:
> We are working on a desktop (Swing) application; postgresql is running
> embedded in the application and is hence single user. We have a
> scrollable view of a grid of icons, much like Explorer. The data set is
> a list of item ids resulting from a complex query that can take 30
> seconds. The results of this query are stored in a table in the
> database. The user then needs to be able to browse (scroll) through the
> list of items and make a selection. Then they can perform operations on
> that selection. Those operations require a way for the selection to be
> used in a database query. The options are:
>
> (1) Store the selection in memory on the client.
> (2) Store the selection as a column in the query results table
>     (boolean maybe).
> (3) A hybrid solution where the selection is in memory and in the
>     database.
>
> The problem with (1) is how to transfer the selection to the db if it
> gets large. This is where the chunking comes in.
> The problem with the pure selection-in-the-database route (2) is that
> simple selection operations, like clicking on an item to select only
> it, are too slow.
> (3) could be good but is complex to implement with threading/locking
> etc.
>
> My current plan is to go with (1) but change the storage model for the
> selection. The first idea was just a set of ids of selected items; the
> problem is that if there are 100,000 of them, that is hard to transfer
> to the WHERE part of a query. The new idea is to store it as a list of
> ranges, e.g.
> (26-32, 143-198, 10922-10923); this could then be sent as "WHERE (row
> >= 26 AND row <= 32) OR (row >= 143 AND row <= 198) OR (row >= 10922
> AND row <= 10923)". As the user has to choose each range by hand, it is
> unlikely there could be more than 100 ranges.
>
> The big question here is: what's the max length of a query sent through
> JDBC to Postgresql???
>
> Many Thanks
>
> Jasper
>
> Csaba Nagy wrote:
>
> > I don't quite understand what you're doing here, but it smells to me
> > like something which shouldn't be interactive in the first place. If
> > some query is expected to exceed a few seconds, we make the operation
> > asynchronous, i.e. the user starts it, gets a page saying the thing
> > is currently being done, and then receives some kind of notification
> > when the thing is finished, either on the web page (by periodically
> > reloading) or by email for really long-lasting things.
> > Now the chunking method does have an overhead indeed, but it has a
> > set of good properties too: it's easily interruptible, it won't bog
> > your database down (it will let some other things run too between the
> > chunks), and you won't get connection timeouts for really long
> > operations... it's just more manageable in many respects.
> > And if you do have some long-lasting things, make sure you won't let
> > the user hit reload 50 times... an asynchronous way of doing it will
> > help in this regard too, because then you know what the user did and
> > reload will only show the status.
> >
> > HTH,
> > Csaba.
> >
> > On Fri, 2005-07-29 at 21:18, Jasper Potts wrote:
> >
> > > I have been trying the chunk method and it wins in cases where the
> > > number of ids is not too high. I was comparing:
> > >
> > > (1) joining the main table with a table that had a column with ids
> > >     and a boolean column for selected; both id columns have
> > >     indexes.
> > > (2) select over the main table where id in (...),
> > >     done in 100s with a prepared statement and results accumulated
> > >     in Java.
> > >
> > > With a select that did a sum() of another column the results were:
> > >
> > > No. of Selected Items | Join Select Time (s) | Chunk Select Time (s)
> > > ----------------------|----------------------|----------------------
> > >                    30 |                  0.4 |                 0.007
> > >                  4000 |                  0.5 |                  0.24
> > >                 30000 |                  0.7 |                  1.12
> > >
> > > All of these were with 30,000 rows in the main table, from JDBC.
> > >
> > > These results don't take into account the time it takes to clear
> > > and then populate the selection table with selection data from the
> > > client. At the moment this can take 1-2 seconds, which is far from
> > > interactive for the user. That is why I am looking for a better
> > > method.
> > >
> > > Any other suggestions? :-)
> > >
> > > I would like to push the data size up to a million, which means the
> > > user could in theory select a million rows and apply an operation.
> > > This is going to get very slow with the chunk method.
> > >
> > > Many Thanks
> > >
> > > Jasper
> > >
> > > Csaba Nagy wrote:
> > >
> > > > Jasper,
> > > >
> > > > You can chunk your operation. That means using only 100 entries
> > > > in one run, and repeating until all ids have been processed. Use
> > > > a prepared statement; that will save you some overhead. The last
> > > > chunk will have fewer entries than there are parameter
> > > > placeholders, so you will have to build a special last statement,
> > > > or set the superfluous parameters to null or to one of the values
> > > > from the last chunk, depending on what kind of query you have. We
> > > > do all our data import/export this way, and it works fine.
> > > >
> > > > Cheers,
> > > > Csaba.
> > > >
> > > > On Fri, 2005-07-29 at 17:30, Jasper Potts wrote:
> > > >
> > > > > I am working on a GUI application which has a list of ids of
> > > > > selected items. To perform an operation on the selected items I
> > > > > do a "select/update .. where id in (...)". There seems to be a
> > > > > limit of 100-200 items in the list. Is there a way of doing
> > > > > this with large lists, maybe 10,000 or 100,000 long?
> > > > >
> > > > > The best solution I have so far is to create a selection table,
> > > > > write the selection out to that and perform a join, but the
> > > > > 100,000 inserts are slow.
> > > > >
> > > > > The ids are int8 (long), so we are not talking large amounts of
> > > > > data, a couple of MB at most. Database and client are running
> > > > > on the same machine over localhost.
> > > > >
> > > > > Many Thanks
> > > > >
> > > > > Jasper
> > > > >
> > > > > ---------------------------(end of broadcast)---------------------------
> > > > > TIP 9: In versions below 8.0, the planner will ignore your
> > > > >        desire to choose an index scan if your joining column's
> > > > >        datatypes do not match
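Csaba's chunking recipe, including his trick of padding the short last chunk so one prepared statement serves every run, can be sketched in JDBC roughly as follows. This is a minimal sketch, not the posters' actual code: the table `items`, its columns `id` and `amount`, and the class and method names are hypothetical stand-ins. Padding by repeating the last id is safe for a `SELECT ... WHERE id IN (...)` because duplicates in an IN list do not change the result set.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class ChunkedSelect {
    static final int CHUNK_SIZE = 100;

    /** Builds "SELECT sum(amount) FROM items WHERE id IN (?,?,...)" with n placeholders. */
    static String buildInSql(int n) {
        StringBuilder sql = new StringBuilder("SELECT sum(amount) FROM items WHERE id IN (");
        for (int i = 0; i < n; i++) sql.append(i == 0 ? "?" : ",?");
        return sql.append(")").toString();
    }

    /** Returns the ids for the chunk starting at `start`, padded to `size`
     *  entries by repeating the last id (harmless in an IN list). */
    static List<Long> padChunk(List<Long> ids, int start, int size) {
        List<Long> chunk = new ArrayList<>(size);
        int end = Math.min(start + size, ids.size());
        for (int i = 0; i < size; i++) {
            chunk.add(ids.get(Math.min(start + i, end - 1)));
        }
        return chunk;
    }

    /** Sums `amount` over the selected ids, CHUNK_SIZE ids per query,
     *  reusing one prepared statement for all chunks. */
    static long sumByIds(Connection conn, List<Long> ids) throws SQLException {
        long total = 0;
        try (PreparedStatement ps = conn.prepareStatement(buildInSql(CHUNK_SIZE))) {
            for (int start = 0; start < ids.size(); start += CHUNK_SIZE) {
                List<Long> chunk = padChunk(ids, start, CHUNK_SIZE);
                for (int i = 0; i < chunk.size(); i++) {
                    ps.setLong(i + 1, chunk.get(i));
                }
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) total += rs.getLong(1);
                }
            }
        }
        return total;
    }
}
```

Note that padding only works for queries where repeated values are idempotent; for an aggregate that counts duplicates, the last chunk would need its own shorter statement, as Csaba says.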
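Jasper's range plan, sending the selection as a disjunction of closed ranges rather than 100,000 individual ids, comes down to building a WHERE fragment like the one quoted above. A minimal sketch, with the `row` column name taken from the quoted example and the class and method names my own invention:

```java
import java.util.List;

public class RangeWhere {
    /** A closed range of row numbers, e.g. 26-32. */
    record Range(long lo, long hi) {}

    /** Turns ranges like (26-32, 143-198) into
     *  "(row >= 26 AND row <= 32) OR (row >= 143 AND row <= 198)".
     *  Returns "" for an empty selection, which the caller must handle. */
    static String toWhereClause(List<Range> ranges) {
        StringBuilder sb = new StringBuilder();
        for (Range r : ranges) {
            if (sb.length() > 0) sb.append(" OR ");
            sb.append("(row >= ").append(r.lo)
              .append(" AND row <= ").append(r.hi).append(")");
        }
        return sb.toString();
    }
}
```

With the ~100 hand-chosen ranges Jasper expects, the resulting clause is a few kilobytes at most, nowhere near any practical query-length limit, which is what makes this workable where the raw id list was not.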