Serial vs. Parallel Queries
From | Paul D. Boyle |
---|---|
Subject | Serial vs. Parallel Queries |
Date | |
Msg-id | 199806112003.QAA19591@laue.chem.ncsu.edu |
List | pgsql-general |
Hello,

I wrote a PostgreSQL client (using libpq) which queries 7 different tables in a database I have constructed using PostgreSQL 6.2.1. The queries are more or less independent of one another, so I thought it would be more efficient to do some parallel or multiprocess programming. The program forks off a child for each table searched, and if necessary the child and parent use SysV message queues to communicate.

To see how much of an improvement this gave over a serialized search, I added some conditional compilation directives which build a single-process version that searches the seven tables sequentially. I then did some timings using the 'time' command. The difference was not as big as I had hoped, and there were several curiosities on which I'd like people's opinions.

I ran the searches in parallel/serial pairs. The client and backend were on different Linux boxes connected by our departmental subnet Ethernet. Here are the timings (10 runs each):

/* Parallel Search version: */

    time sinfo x97101
    0.010u 0.000s 0:04.55 0.2% 0+0k 0+0io 128pf+0w
    0.010u 0.010s 0:03.03 0.6% 0+0k 0+0io 128pf+0w
    0.010u 0.000s 0:05.44 0.1% 0+0k 0+0io 128pf+0w
    0.010u 0.000s 0:03.55 0.2% 0+0k 0+0io 128pf+0w
    0.010u 0.000s 0:05.56 0.1% 0+0k 0+0io 128pf+0w
    0.000u 0.010s 0:04.08 0.2% 0+0k 0+0io 128pf+0w
    0.010u 0.010s 0:04.77 0.4% 0+0k 0+0io 128pf+0w   <- *.txt deleted before run
    0.020u 0.010s 0:04.66 0.6% 0+0k 0+0io 128pf+0w   <- *.txt deleted before run
    0.010u 0.010s 0:05.93 0.3% 0+0k 0+0io 128pf+0w   <- *.txt deleted before run
    0.010u 0.010s 0:03.36 0.5% 0+0k 0+0io 128pf+0w   <- *.txt deleted before run

    Averages:
    0.01   0.006  0:04.47 0.32%

/* Serialized Search Version: */

    time sinfo_serial x97101
    0.070u 0.020s 0:06.02 1.4% 0+0k 0+0io 180pf+0w
    0.040u 0.030s 0:06.04 1.1% 0+0k 0+0io 180pf+0w
    0.050u 0.030s 0:06.04 1.3% 0+0k 0+0io 180pf+0w
    0.040u 0.020s 0:06.13 0.9% 0+0k 0+0io 180pf+0w
    0.070u 0.000s 0:06.07 1.1% 0+0k 0+0io 180pf+0w
    0.060u 0.020s 0:06.05 1.3% 0+0k 0+0io 180pf+0w
    0.090u 0.040s 0:06.05 2.1% 0+0k 0+0io 180pf+0w   <- *.txt deleted before run
    0.060u 0.020s 0:06.04 1.3% 0+0k 0+0io 180pf+0w   <- *.txt deleted before run
    0.060u 0.030s 0:06.10 1.4% 0+0k 0+0io 180pf+0w   <- *.txt deleted before run
    0.050u 0.020s 0:06.11 1.1% 0+0k 0+0io 180pf+0w   <- *.txt deleted before run

    Averages:
    0.059  0.023  0:06.07 1.3%

The *.txt files mentioned are the text files produced by the program.

I was expecting the wall clock time (col 3) to be much shorter for the parallel search than for the serialized search. I was also expecting the wall clock time to be less than it is in either case. The user CPU and system CPU times are both much less than the wall clock time, so the processes spend most of their time waiting. My hypothesis is that the wall clock time is determined mainly by the network latency (probably not much) and/or the granularity of the record locking done by PostgreSQL. My guess is that PostgreSQL uses a fairly coarse locking mechanism.

I would like to know if my "explanation" is correct. In any case, I would appreciate it if someone could supply a discussion of how PostgreSQL locks records during a query. The queries I am doing with this client are "read-only" (i.e. SELECTs). Is there any way to improve performance, say with the -F switch invoked during the clients' query?

Thanks,
Paul

--
Paul D. Boyle                        | boyle@laue.chem.ncsu.edu
Director, X-ray Structural Facility  | phone: (919) 515-7362
Department of Chemistry - Box 8204   | FAX: (919) 515-5079
North Carolina State University      |
Raleigh, NC, 27695-8204
http://laue.chem.ncsu.edu/web/xray.welcome.html
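
For anyone who wants to reproduce the comparison, the fork-per-table design described in the message can be sketched roughly as below. This is not the sinfo source, which was not posted. The names simulated_query, run_serial and run_parallel are hypothetical, and the "query" is only a sleep standing in for one libpq round trip, on the assumption that each query costs pure waiting time. In a real client each child would presumably open its own libpq connection after the fork rather than share the parent's.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for "run one SELECT against one table and fetch
 * the rows". It only sleeps, i.e. it models a query whose cost is latency. */
static void simulated_query(long latency_ms)
{
    struct timespec ts = { latency_ms / 1000, (latency_ms % 1000) * 1000000L };
    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;   /* interrupted by a signal: sleep for the remaining time */
}

/* Wall-clock time in seconds from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Serial version: one process runs the n queries back to back.
 * Returns the elapsed wall-clock time in seconds. */
double run_serial(int n, long latency_ms)
{
    assert(n > 0);
    double start = now_seconds();
    for (int i = 0; i < n; i++)
        simulated_query(latency_ms);
    return now_seconds() - start;
}

/* Parallel version: fork one child per query, then wait for all of them.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on failure. */
double run_parallel(int n, long latency_ms)
{
    assert(n > 0);
    double start = now_seconds();
    int started = 0;
    int failed = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {                 /* child: do one query and exit */
            simulated_query(latency_ms);
            _exit(EXIT_SUCCESS);
        }
        if (pid < 0) {                  /* fork failed: stop starting children */
            failed = 1;
            break;
        }
        started++;
    }

    /* Parent: reap every child that was started. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    return failed ? -1.0 : now_seconds() - start;
}
```

With seven simulated queries of 500 ms each, run_serial(7, 500) cannot finish in less than 3.5 s, while run_parallel(7, 500) should take little more than 0.5 s if nothing serializes the children. The averages reported in the message (about 4.5 s parallel against 6.1 s serial) are far from that ideal ratio, which is what motivates the question about locking.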