Discussion: Synchronized Scan benchmark results
I posted some fairly detailed benchmark results for my Synchronized Scan
patch and its interactions with Simon Riggs' Recycle Buffers patch
here:

http://j-davis.com/postgresql/patch15-results.html

The results are in the form of log files that contain lots of useful
debugging info:

* log_executor_stats is on (meaning it shows cache hit rate)
* the pid, timestamp, and page number being retrieved (for every 5k pages
  read)
* the duration of each scan

The results are very positive and quite conclusive.

However, the "sync_seqscan_offset" aspect of my patch, which attempts to
use pages that were cached before the scan began, did not show a lot of
promise. That aspect of my patch may end up being cut.

The primary aspect of my patch, the Synchronized Scanning, performed
great though. Even the CFQ scheduler, which does not appear to properly
read ahead, performed substantially better than plain 8.2.3. And even
better, Simon's patch didn't seem to hurt Synchronized Scans at all.

Out of the 36 runs I did, a couple appear anomalous. I will retest those
soon.

Note: I posted the versions of the patches that I used for the tests on
the page above. The version of Simon's patch that I used did not apply
cleanly to 8.2.3, but the only problem appeared to be in copy.c, so I
went ahead with the tests. If this somehow compromised the patch, then
let me know.

Regards,
Jeff Davis
Jeff,

Your conclusions sound great - can you perhaps put the timings in a column
in your table so we can confirm them?

- Luke
On Mon, 2007-04-02 at 16:14 -0700, Jeff Davis wrote:

> The results are very positive and quite conclusive.

Can we show some summary results?

I'm happy that the scans stay together all the way around, even handling
the max -> 0 blockid transition well. So definite winner for me.

> However, the "sync_seqscan_offset" aspect of my patch, which attempts to
> use pages that were cached before the scan began, did not show a lot of
> promise. That aspect of my patch may end up being cut.

Yes, please remove :-)

> The primary aspect of my patch, the Synchronized Scanning, performed
> great though. Even the CFQ scheduler, which does not appear to properly
> read ahead, performed substantially better than plain 8.2.3. And even
> better, Simon's patch didn't seem to hurt Synchronized Scans at all.
>
> Out of the 36 runs I did, a couple appear anomalous. I will retest those
> soon.

Which ones were they?

> Note: I posted the versions of the patches that I used for the tests on
> the page above. The version of Simon's patch that I used did not apply
> cleanly to 8.2.3, but the only problem appeared to be in copy.c, so I
> went ahead with the tests. If this somehow compromised the patch, then
> let me know.

[It was never designed to apply cleanly to 8.2.3, as we might guess]
That was v2; the current v3 should be OK because I removed the
experimental COPY interaction. That will not have affected the tests.

I would like to see some tests with different queries that have varying
I/O and CPU requirements to see if they stay together too. That won't
block the patch, but it will help everybody understand what the range of
real-world applicability is. I'd guess this can benefit us
sufficiently frequently in most cases that it's worth it.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Mon, 2007-04-02 at 21:38 -0700, Luke Lonergan wrote:
> Jeff,
>
> Your conclusions sound great - can you perhaps put the timings in a column
> in your table so we can confirm them?
>

I just threw the logs up, which contain the timings involved. I will try
to make graphs out of them, but the data is there.

The logs contain:

* The time a given backend fetches a page, if that page is an even
  multiple of 5k
* The duration of a scan
* The time the scan started
* The cache hit ratio as reported by log_executor_stats

You're right, I do need to summarize it and make it more visually
accessible.

Regards,
Jeff Davis
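The summary asked for above could be produced with something like the following. This is only a rough sketch: the thread does not show the layout of the posted logs, so the line shape assumed here (a leading pid, as a `log_line_prefix` starting with `%p` would give, followed by PostgreSQL's usual `duration: ... ms` text) is an assumption, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape (hypothetical, not taken from the posted logs):
#   "<pid> <timestamp> LOG:  duration: <milliseconds> ms  ..."
DURATION_RE = re.compile(
    r"^(?P<pid>\d+)\s.*?\bduration:\s+(?P<ms>\d+(?:\.\d+)?)\s+ms"
)


def scan_durations(lines):
    """Return {pid: [duration_in_seconds, ...]} for every matching log line."""
    durations = defaultdict(list)
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            seconds = float(match.group("ms")) / 1000.0
            durations[int(match.group("pid"))].append(seconds)
    return dict(durations)


def summary_table(durations):
    """Render one plain-text row per backend pid: scan count, min, max, mean."""
    rows = ["pid      scans  min_s     max_s     mean_s"]
    for pid in sorted(durations):
        values = durations[pid]
        mean = sum(values) / len(values)
        rows.append(
            f"{pid:<8d} {len(values):<6d} "
            f"{min(values):<9.1f} {max(values):<9.1f} {mean:<9.1f}"
        )
    return "\n".join(rows)
```

Feeding the lines of one log file to `scan_durations` and printing `summary_table` of the result would give one column-aligned row per backend, which is roughly the "timings in a column" form requested.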
On Tue, 2007-04-03 at 10:01 +0100, Simon Riggs wrote:
> On Mon, 2007-04-02 at 16:14 -0700, Jeff Davis wrote:
>
> > The results are very positive and quite conclusive.
>
> Can we show some summary results?

I should be able to make some graphs today.

> I'm happy that the scans stay together all the way around, even handling
> the max -> 0 blockid transition well. So definite winner for me.

Yes, I was happy with the results.

> > However, the "sync_seqscan_offset" aspect of my patch, which attempts to
> > use pages that were cached before the scan began, did not show a lot of
> > promise. That aspect of my patch may end up being cut.
>
> Yes, please remove :-)

Ok.

> > The primary aspect of my patch, the Synchronized Scanning, performed
> > great though. Even the CFQ scheduler, which does not appear to properly
> > read ahead, performed substantially better than plain 8.2.3. And even
> > better, Simon's patch didn't seem to hurt Synchronized Scans at all.
> >
> > Out of the 36 runs I did, a couple appear anomalous. I will retest those
> > soon.
>
> Which ones were they?

This one stood out to me:

* Machine 1, Linux AS, Sync Scan patch + Recycle Buffers patch, single
  scan: 204s

Compared to these tests:

* Machine 1, Linux AS, Sync Scan patch + Recycle Buffers patch, scan.rb:
  all 5 scans are below 170s.
* Machine 1, Linux AS, Sync Scan patch only, scan.rb: 165s.

That makes no sense to me, so it's probably a fluke (by which I mean
some other activity on the system, perhaps swapping some large
applications). The last two tests are consistent with all the other
numbers I got, but the first one took 40 seconds longer than I would
expect. I'll do a simple re-test tonight.

> > Note: I posted the versions of the patches that I used for the tests on
> > the page above. The version of Simon's patch that I used did not apply
> > cleanly to 8.2.3, but the only problem appeared to be in copy.c, so I
> > went ahead with the tests. If this somehow compromised the patch, then
> > let me know.
>
> [It was never designed to apply cleanly to 8.2.3, as we might guess]
> That was v2; the current v3 should be OK because I removed the
> experimental COPY interaction. That will not have affected the tests.

Good to know.

> I would like to see some tests with different queries that have varying
> I/O and CPU requirements to see if they stay together too. That won't
> block the patch, but it will help everybody understand what the range of
> real-world applicability is. I'd guess this can benefit us
> sufficiently frequently in most cases that it's worth it.

I'll do some more varied tests. The best idea I've come up with so far
is to do something that requires random seeking going concurrently with
the scans. Pgbench would probably be a good idea too, since it's more
standard.

Regards,
Jeff Davis
On Tue, 2007-04-03 at 10:37 -0700, Jeff Davis wrote:
> > > The primary aspect of my patch, the Synchronized Scanning, performed
> > > great though. Even the CFQ scheduler, which does not appear to properly
> > > read ahead, performed substantially better than plain 8.2.3. And even
> > > better, Simon's patch didn't seem to hurt Synchronized Scans at all.
> > >
> > > Out of the 36 runs I did, a couple appear anomalous. I will retest those
> > > soon.
> >
> > Which ones were they?
>
> This one stood out to me:
>
> * Machine 1, Linux AS, Sync Scan patch + Recycle Buffers patch, single
>   scan: 204s
>
> Compared to these tests:
>
> * Machine 1, Linux AS, Sync Scan patch + Recycle Buffers patch, scan.rb:
>   all 5 scans are below 170s.
>
> * Machine 1, Linux AS, Sync Scan patch only, scan.rb: 165s.
>
> That makes no sense to me, so it's probably a fluke (by which I mean
> some other activity on the system, perhaps swapping some large
> applications). The last two tests are consistent with all the other
> numbers I got, but the first one took 40 seconds longer than I would
> expect. I'll do a simple re-test tonight.

What did you set scan_recycle_buffers to? The default was zero.

I think v2 of the patch interpreted that setting as meaning "attempt to
reuse the same buffer again immediately", which probably wouldn't be
optimal. Which is why I issued v3... I think you'll need to set
scan_recycle_buffers = 0 (== off in v3) and scan_recycle_buffers = 32 to
get sensible comparison figures.

So please can you use v3 for any further testing. Thanks.

> > I would like to see some tests with different queries that have varying
> > I/O and CPU requirements to see if they stay together too. That won't
> > block the patch, but it will help everybody understand what the range of
> > real-world applicability is. I'd guess this can benefit us
> > sufficiently frequently in most cases that it's worth it.
>
> I'll do some more varied tests. The best idea I've come up with so far
> is to do something that requires random seeking going concurrently with
> the scans.

No, what I mean is different kinds of scans:
- a simple scan like count(*)
- a more complex one that does buckets of cycles per tuple
- a hash join

In particular, select count(*) isn't very representative. Using all Hash
Joins would be a much better test - since IMHO that case is the common
use case for this feature.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
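The three kinds of scans listed above might look roughly like the queries below. This is a sketch only: the schema (`big_table` with `small_id` and `payload` columns, plus `small_table`) and the use of `md5()` as the per-tuple CPU load are illustrative assumptions, not anything taken from the thread or from scan.rb.

```python
# Hypothetical schema, not from the thread:
#   big_table(id, small_id, payload text), small_table(id)
WORKLOADS = {
    # Cheapest possible per-tuple work: I/O dominates.
    "simple_scan": "SELECT count(*) FROM big_table",
    # Extra CPU per tuple; md5() stands in for "buckets of cycles".
    "cpu_per_tuple": "SELECT count(md5(payload)) FROM big_table",
    # Sequential scan of the big table feeding a join against a small one.
    "hash_join": (
        "SELECT count(*) FROM big_table b "
        "JOIN small_table s ON b.small_id = s.id"
    ),
}


def concurrent_batch(name, scans=5):
    """Return the same query repeated, one copy per concurrent backend."""
    return [WORKLOADS[name]] * scans
```

Running each batch with several backends at once and comparing the page numbers they report over time would show whether the scans stay together as the per-tuple cost changes.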
On Wed, 2007-04-04 at 10:40 +0100, Simon Riggs wrote:
> > That makes no sense to me, so it's probably a fluke (by which I mean
> > some other activity on the system, perhaps swapping some large
> > applications). The last two tests are consistent with all the other
> > numbers I got, but the first one took 40 seconds longer than I would
> > expect. I'll do a simple re-test tonight.
>
> What did you set scan_recycle_buffers to? The default was zero.
>
> I think v2 of the patch interpreted that setting as meaning "attempt to
> reuse the same buffer again immediately", which probably wouldn't be
> optimal. Which is why I issued v3... I think you'll need to set
> scan_recycle_buffers = 0 (== off in v3) and scan_recycle_buffers = 32 to
> get sensible comparison figures.
>

I used v2 with the default in those tests, so I think that means it used
the same buffer.

By the way, on another test I did, the results came out at 165s, which
is consistent with the other results. I think the time I ran that the
machine must have been swapping out applications or something... who
knows.

> So please can you use v3 for any further testing. Thanks.

I'll use v3 of the patch as located here:

http://archives.postgresql.org/pgsql-hackers/2007-03/msg00709.php

By the way, it might be easier to find the right one if the archives
contained filenames for the attachments. Am I missing something obvious?

> > > I would like to see some tests with different queries that have varying
> > > I/O and CPU requirements to see if they stay together too. That won't
> > > block the patch, but it will help everybody understand what the range of
> > > real-world applicability is. I'd guess this can benefit us
> > > sufficiently frequently in most cases that it's worth it.
> >
> > I'll do some more varied tests. The best idea I've come up with so far
> > is to do something that requires random seeking going concurrently with
> > the scans.
>
> No, what I mean is different kinds of scans:
> - a simple scan like count(*)

Will use my same "scan.rb" benchmark.

> - a more complex one that does buckets of cycles per tuple

I'll use a modified "scan.rb" that does a computation in the select list
(I'll declare the function volatile so that it recomputes with each
tuple).

> - a hash join

This is where I got stuck.

* If it's one big ( > NBuffers/2 ) table and one small table, the small
  table will only serve to occupy some shared_buffers (right?)
* If it's two big tables, a join would be a major operation. I don't
  think it would even choose a hash join in that situation, right?

To summarize, in the next round of testing, I will
* disable sync_seqscan_offset completely
* use recycle_buffers=0 and 32
* I'll still test against 8.2.3 for consistency in case you suggest
  otherwise.

Regards,
Jeff Davis
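One way to read the plan summarized above is as a small matrix of runs. The sketch below enumerates it under stated assumptions: the build labels are made up for illustration, the setting name follows the `scan_recycle_buffers` spelling used earlier in the thread, and treating unpatched 8.2.3 as a baseline for every workload is an interpretation rather than something spelled out.

```python
# Workload labels follow the three kinds of scans discussed in the thread.
WORKLOADS = ("count(*)", "per-tuple computation", "hash join")

# Hypothetical labels for the two builds being compared.
BASELINE = "8.2.3"
PATCHED = "8.2.3 + sync scan + recycle buffers v3"


def planned_runs():
    """List every (build, workload, scan_recycle_buffers) combination."""
    runs = []
    for workload in WORKLOADS:
        # Unpatched baseline: the setting does not exist there.
        runs.append(
            {"build": BASELINE, "workload": workload,
             "scan_recycle_buffers": None}
        )
        # Patched build: 0 means off in v3, 32 is the suggested comparison value.
        for setting in (0, 32):
            runs.append(
                {"build": PATCHED, "workload": workload,
                 "scan_recycle_buffers": setting}
            )
    return runs
```

Under this reading, sync_seqscan_offset stays disabled in every patched run, so it does not appear as a dimension of the matrix.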
On Wed, 2007-04-04 at 10:23 -0700, Jeff Davis wrote:
> > - a hash join
>
> This is where I got stuck.
>
> * If it's one big ( > NBuffers/2 ) table and one small table, the small
>   table will only serve to occupy some shared_buffers (right?)
> * If it's two big tables, a join would be a major operation. I don't
>   think it would even choose a hash join in that situation, right?

The large table will do a SeqScan though, so should hit your code. Just
look at the EXPLAIN first.

> To summarize, in the next round of testing, I will
> * disable sync_seqscan_offset completely
> * use recycle_buffers=0 and 32
> * I'll still test against 8.2.3 for consistency in case you suggest
>   otherwise.

Sounds OK.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
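The "look at the EXPLAIN first" check could be automated along these lines. This is a minimal sketch that inspects the plain-text output of EXPLAIN, where node names such as `Hash Join` and `Seq Scan on <table>` appear at the start of plan lines; the helper names are hypothetical and the table names in the usage note are placeholders.

```python
import re


def has_hash_join(plan_text):
    """True if any plan node is a hash join (including Left/Right variants)."""
    pattern = r"^\s*(?:->\s*)?Hash (?:\w+ )?Join\b"
    return re.search(pattern, plan_text, re.MULTILINE) is not None


def seq_scanned_tables(plan_text):
    """Return the table names that the plan reads with a sequential scan."""
    return re.findall(r"\bSeq Scan on (\w+)", plan_text)


def exercises_sync_scan(plan_text, big_table):
    """True if the plan hash-joins and sequentially scans the big table."""
    return has_hash_join(plan_text) and big_table in seq_scanned_tables(plan_text)
```

For example, `exercises_sync_scan(plan_text, "big_table")` would be expected to return True only when the planner both chose a hash join and reads `big_table` with a Seq Scan, which is the case the benchmark is meant to cover.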