Synchronized Scan preliminary results
From | Jeff Davis |
---|---|
Subject | Synchronized Scan preliminary results |
Date | |
Msg-id | 457C87C9.9010100@j-davis.com |
Responses |
Re: Synchronized Scan preliminary results
(Josh Berkus <josh@agliodbs.com>)
|
List | pgsql-hackers |
I posted a patch on -patches for reference. The preliminary results I got against that patch are mixed.

I did 4 runs, each with an 11GB table on a machine with 1GB of RAM. Yes, I know this is bad, consumer-grade hardware, but it controls for variables in the I/O layer (like the controller's read-ahead or I/O scheduling) and simplifies the test. Each test was 4 threads doing two iterations of a COUNT(*). The threads were started 1 minute apart. A single-threaded COUNT(*) takes about 334 seconds.

My findings:

(1) In the test with shared_buffers=24MB, plain 8.2 took 2227 seconds to complete all threads (each thread finished 2 scans in about 2100 seconds), whereas with my patch it took 901 seconds (about 721 seconds per thread).

(2) With shared_buffers=128MB, plain 8.2 took 2842 seconds (about 2600 seconds per thread), whereas with my patch it took 899 seconds (about 718 seconds per thread).

Conclusions from the tests: first, my patch can be effective. Second, the normal behavior is quite unpredictable itself: why did increasing shared_buffers destroy the performance? It didn't seem like enough of an increase to wipe out the OS buffer cache. Also, the scans were quite stable within a test; there weren't wild variations from one scan to the next.

Luke also provided me with results of his own, but he tested on much better hardware (the patch he used is identical from a technical standpoint, but may have some cosmetic differences):

"It uses five simultaneous scans as before, but this time the table is 120GB on a machine with 8GB of RAM. The data is stored on one non-RAID disk, so a single scan should take about 33 minutes. The scans are started 5 minutes apart, so the last scan would end in about 3 hours if they were independent (they're not). 8.2 unmodified runs the test in about 4 hours and 20 minutes. With the first patch, it runs in about 5 hours and 30 minutes."

So, Luke experienced an actual slowdown.
I think there is a lot of room for improvement over the normal behavior. First, I think we can make the normal behavior more predictable; and second, I think we can make the average case better when the scans are larger than the available memory. However, my patch has a long way to go. I need to figure out what causes the performance degradation in Luke's case.

My plan is:

(1) Try to add some better instrumentation to the patch, as Simon suggested.

(2) I'll allocate a new machine, put Solaris on it, and try to use DTrace. Maybe that will tell me something useful. I have limited experience with DTrace, so if someone else wants to help, let me know.

(3) I'll make sure the patch only turns on if the table size is greater than some multiple of effective_cache_size.

(4) Jim had the idea to start the scan before the hint, to take advantage of the already-existing cache trail. This can be done by only storing the hint if the scan's position is greater than the start location plus the amount we're subtracting from the hint. We can make that amount some fraction of effective_cache_size.

(5) Should I issue a warning if there is a collision in the table? Florian Pflug raised the concern of mysterious performance regressions.

(6) Heikki had the idea of each backend holding, in local memory, a bitmap of the blocks it has read. I like this idea a lot, but there are a lot of considerations.

Others on this list have suggested that some level of enforced synchronization might help. Tom was interested in packs of scans moving together, and there was a lot of discussion along those lines. I think the results show that my patch needs to do something along those lines, because we don't want to actually lose performance in any case. However, I think it's too early to say what we need to do without tests. The challenge for me is that each of these tests takes hours, and I can't necessarily reproduce problems.

Thanks to everyone for your input so far.

Regards,
Jeff Davis