Discussion: Join with lower/upper limits doesn't scale well
I have the same schema in two different databases. In "smalldb", the two tables of interest have about 430,000 rows,
in"bigdb", the two tables each contain about 5.5 million rows. I'm processing the data, and for various reasons it
worksout well to process it in 100,000 row chunks. However, it turns out for the big schema, selecting 100,000 rows is
thelongest single step of the processing.
Below is the explain/analyze output of the query from each database. Since both tables are indexed on the joined
columns, I don't understand why the big table should be so much slower -- I hoped this would scale well, or at least
O(log(N)), not O(N).
What's going on here? I don't know if I'm reading this right, but it looks like the sort is taking all the time, but
that doesn't make sense because in both cases it's sorting 100,000 rows.
Thanks,
Craig
bigdb=> explain analyze
bigdb-> select r.row_num, m.molkeys from my_rownum r
bigdb-> join my_molkeys m on (r.version_id = m.version_id)
bigdb-> where r.row_num >= 100000 AND r.row_num < 200000
bigdb-> order by r.row_num;
 Sort  (cost=431000.85..431248.23 rows=98951 width=363) (actual time=46306.748..46417.448 rows=100000 loops=1)
   Sort Key: r.row_num
   ->  Hash Join  (cost=2583.59..422790.68 rows=98951 width=363) (actual time=469.010..45752.131 rows=100000 loops=1)
         Hash Cond: ("outer".version_id = "inner".version_id)
         ->  Seq Scan on my_molkeys m  (cost=0.00..323448.30 rows=5472530 width=363) (actual time=11.243..33299.933 rows=5472532 loops=1)
         ->  Hash  (cost=2336.21..2336.21 rows=98951 width=8) (actual time=442.260..442.260 rows=100000 loops=1)
               ->  Index Scan using i_chm_rownum_row_num on my_rownum r  (cost=0.00..2336.21 rows=98951 width=8) (actual time=47.551..278.736 rows=100000 loops=1)
                     Index Cond: ((row_num >= 100000) AND (row_num < 200000))
 Total runtime: 46543.163 ms
smalldb=> explain analyze
smalldb-> select r.row_num, m.molkeys from my_rownum r
smalldb-> join my_molkeys m on (r.version_id = m.version_id)
smalldb-> where r.row_num >= 100000 AND r.row_num < 200000
smalldb-> order by r.row_num;
 Sort  (cost=43598.23..43853.38 rows=102059 width=295) (actual time=4097.180..4207.733 rows=100000 loops=1)
   Sort Key: r.row_num
   ->  Hash Join  (cost=2665.09..35107.41 rows=102059 width=295) (actual time=411.635..3629.756 rows=100000 loops=1)
         Hash Cond: ("outer".version_id = "inner".version_id)
         ->  Seq Scan on my_molkeys m  (cost=0.00..23378.90 rows=459590 width=295) (actual time=8.563..2011.455 rows=459590 loops=1)
         ->  Hash  (cost=2409.95..2409.95 rows=102059 width=8) (actual time=402.867..402.867 rows=100000 loops=1)
               ->  Index Scan using i_chm_rownum_row_num_8525 on my_rownum r  (cost=0.00..2409.95 rows=102059 width=8) (actual time=37.122..242.528 rows=100000 loops=1)
                     Index Cond: ((row_num >= 100000) AND (row_num < 200000))
 Total runtime: 4333.501 ms
Table "bigdb.my_rownum"
Column | Type | Modifiers
------------+---------+-----------
version_id | integer |
parent_id | integer |
row_num | integer |
Indexes:
"i_chm_rownum_row_num" UNIQUE, btree (row_num)
"i_chm_rownum_version_id" UNIQUE, btree (version_id)
"i_chm_rownum_parent_id" btree (parent_id)
Table "bigdb.my_molkeys"
Column | Type | Modifiers
------------+---------+-----------
version_id | integer |
molkeys | text |
Indexes:
"i_chm_molkeys_version_id" UNIQUE, btree (version_id)
"Craig James" <craig_james@emolecules.com> writes:
> Below is the explain/analyze output of the query from each database. Since
> both tables are indexed on the joined columns, I don't understand why the
> big table should be so much slower -- I hoped this would scale well, or at
> least O(log(N)), not O(N).
...
> Sort  (cost=431000.85..431248.23 rows=98951 width=363) (actual time=46306.748..46417.448 rows=100000 loops=1)
>   Sort Key: r.row_num
>   ->  Hash Join  (cost=2583.59..422790.68 rows=98951 width=363) (actual time=469.010..45752.131 rows=100000 loops=1)
>         Hash Cond: ("outer".version_id = "inner".version_id)
>         ->  Seq Scan on my_molkeys m  (cost=0.00..323448.30 rows=5472530 width=363) (actual time=11.243..33299.933 rows=5472532 loops=1)
>         ->  Hash  (cost=2336.21..2336.21 rows=98951 width=8) (actual time=442.260..442.260 rows=100000 loops=1)
>               ->  Index Scan using i_chm_rownum_row_num on my_rownum r  (cost=0.00..2336.21 rows=98951 width=8) (actual time=47.551..278.736 rows=100000 loops=1)
>                     Index Cond: ((row_num >= 100000) AND (row_num < 200000))
> Total runtime: 46543.163 ms
It looks like most of the time is being spent in the sequential scan of
my_molkeys: at least 33 seconds out of the 46. The actual sort is taking
under a second (the hash join finishes after 45.7s and the sort finishes after
46.4s).
The rest of the time (about 13s) is actually being spent in the hash join,
which makes me think it's overflowing work_mem and having to process the join
in two batches. You might be able to speed it up by raising work_mem for this
query (you can set work_mem locally using SET).
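For instance, a rough sketch (the 64MB figure is purely illustrative, not a
tuned recommendation; on 8.1 the value is given in kB, since unit strings like
'64MB' are only accepted on 8.2 and later):

-- Raise work_mem for this session only, run the chunk query, then restore it.
SET work_mem = 65536;   -- 64MB, expressed in kB

SELECT r.row_num, m.molkeys
  FROM my_rownum r
  JOIN my_molkeys m ON (r.version_id = m.version_id)
 WHERE r.row_num >= 100000 AND r.row_num < 200000
 ORDER BY r.row_num;

RESET work_mem;         -- back to the server default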
The row_num where clause only narrows down the set of rows coming from the
my_rownum table. If you want sublinear performance you would have to provide
some way for Postgres to narrow down the rows from my_molkeys without actually
having to read them all in.
With the query as is, the only way I can see that happening would be if you had
an index on "my_molkeys(version_id)" and on "my_rownum(version_id) WHERE row_num
between 100000 and 200000". Then it could do a merge join between two index
scans.
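Something along these lines might work (a sketch only; the index name is
invented, and a partial index with a hard-coded range would of course only
help the one chunk it covers):

-- Hypothetical partial index; my_molkeys(version_id) is already covered
-- by the existing i_chm_molkeys_version_id index.
CREATE INDEX i_chm_rownum_version_id_chunk
    ON my_rownum (version_id)
    WHERE row_num >= 100000 AND row_num < 200000;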
Note that even then I'm surprised the optimizer is bothering with the index
for these queries, at least for the 400k case. Do you have enable_seqscan=off
or random_page_cost dialled way down?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
The two queries below produce different plans.
select r.version_id, r.row_num, m.molkeys from my_rownum r
join my_molkeys m on (r.version_id = m.version_id)
where r.version_id >= 3200000
and r.version_id < 3300000
order by r.version_id;
select r.version_id, r.row_num, m.molkeys from my_rownum r
join my_molkeys m on (r.version_id = m.version_id)
where r.version_id >= 3200000
and r.version_id < 3300000
and m.version_id >= 3200000
and m.version_id < 3300000
order by r.version_id;
I discovered this while looking at the plans for the first query. It seemed to be ignoring the fact that it could push
the "between" condition along to the second table, since the condition and the join are on the same indexed columns.
So, I added a redundant condition, and bingo, it was a lot faster. In the analysis shown below, the timings (about 1.0
and 1.5 seconds respectively) are for a "hot" database that's been queried a couple of times. In real life on a "cold"
database, the times are more like 10 seconds and 21 seconds, so it's quite significant.
Thanks,
Craig
db=> explain analyze
db-> select r.version_id, r.row_num, m.molkeys from my_rownum r
db-> join my_molkeys m on (r.version_id = m.version_id)
db-> where r.version_id >= 3200000
db-> and r.version_id < 3300000
db-> order by r.version_id;
 Sort  (cost=264979.51..265091.06 rows=44620 width=366) (actual time=1424.126..1476.048 rows=46947 loops=1)
   Sort Key: r.version_id
   ->  Nested Loop  (cost=366.72..261533.64 rows=44620 width=366) (actual time=41.649..1186.331 rows=46947 loops=1)
         ->  Bitmap Heap Scan on my_rownum r  (cost=366.72..41168.37 rows=44620 width=8) (actual time=41.616..431.783 rows=46947 loops=1)
               Recheck Cond: ((version_id >= 3200000) AND (version_id < 3300000))
               ->  Bitmap Index Scan on i_chm_rownum_version_id_4998  (cost=0.00..366.72 rows=44620 width=0) (actual time=21.244..21.244 rows=46947 loops=1)
                     Index Cond: ((version_id >= 3200000) AND (version_id < 3300000))
         ->  Index Scan using i_chm_molkeys_version_id on my_molkeys m  (cost=0.00..4.93 rows=1 width=362) (actual time=0.009..0.010 rows=1 loops=46947)
               Index Cond: ("outer".version_id = m.version_id)
 Total runtime: 1534.638 ms
(10 rows)
db=> explain analyze
db-> select r.version_id, r.row_num, m.molkeys from my_rownum r
db-> join my_molkeys m on (r.version_id = m.version_id)
db-> where r.version_id >= 3200000
db-> and r.version_id < 3300000
db-> and m.version_id >= 3200000
db-> and m.version_id < 3300000
db-> order by r.version_id;
 Sort  (cost=157732.20..157732.95 rows=298 width=366) (actual time=985.383..1037.423 rows=46947 loops=1)
   Sort Key: r.version_id
   ->  Hash Join  (cost=41279.92..157719.95 rows=298 width=366) (actual time=502.875..805.402 rows=46947 loops=1)
         Hash Cond: ("outer".version_id = "inner".version_id)
         ->  Index Scan using i_chm_molkeys_version_id on my_molkeys m  (cost=0.00..115717.85 rows=47947 width=362) (actual time=0.023..117.270 rows=46947 loops=1)
               Index Cond: ((version_id >= 3200000) AND (version_id < 3300000))
         ->  Hash  (cost=41168.37..41168.37 rows=44620 width=8) (actual time=502.813..502.813 rows=46947 loops=1)
               ->  Bitmap Heap Scan on my_rownum r  (cost=366.72..41168.37 rows=44620 width=8) (actual time=41.621..417.508 rows=46947 loops=1)
                     Recheck Cond: ((version_id >= 3200000) AND (version_id < 3300000))
                     ->  Bitmap Index Scan on i_chm_rownum_version_id_4998  (cost=0.00..366.72 rows=44620 width=0) (actual time=21.174..21.174 rows=46947 loops=1)
                           Index Cond: ((version_id >= 3200000) AND (version_id < 3300000))
 Total runtime: 1096.031 ms
(12 rows)
Sorry, I forgot to mention: This is 8.1.4, with a fairly ordinary configuration on a 4 GB system.
Craig
Craig James <craig_james@emolecules.com> writes:
> The two queries below produce different plans.
> select r.version_id, r.row_num, m.molkeys from my_rownum r
> join my_molkeys m on (r.version_id = m.version_id)
> where r.version_id >= 3200000
> and r.version_id < 3300000
> order by r.version_id;
> select r.version_id, r.row_num, m.molkeys from my_rownum r
> join my_molkeys m on (r.version_id = m.version_id)
> where r.version_id >= 3200000
> and r.version_id < 3300000
> and m.version_id >= 3200000
> and m.version_id < 3300000
> order by r.version_id;
Yeah, the planner does not make any attempt to infer implied
inequalities, so it will not generate the last two clauses for you.
There is machinery in there to infer implied *equalities*, which
is cheaper (fewer operators to consider) and much more useful across
typical queries such as multiway joins on the same keys. I'm pretty
dubious that it'd be worth the cycles to search for implied
inequalities.
regards, tom lane
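Put concretely (an illustrative sketch using the tables from this thread):

-- Implied equality: from r.version_id = m.version_id and
-- m.version_id = x.version_id, the planner will also consider
-- r.version_id = x.version_id when choosing join order.
--
-- Implied inequality: from r.version_id = m.version_id and
-- r.version_id >= 3200000, the planner does NOT derive
-- m.version_id >= 3200000; the range has to be written out on
-- both tables by hand, as in the faster query above:
SELECT r.version_id, r.row_num, m.molkeys
  FROM my_rownum r
  JOIN my_molkeys m ON (r.version_id = m.version_id)
 WHERE r.version_id >= 3200000 AND r.version_id < 3300000
   AND m.version_id >= 3200000 AND m.version_id < 3300000
 ORDER BY r.version_id;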
Here's an oddity. I have 10 databases, each with about a dozen connections to Postgres (about 120 connections total),
and at midnight they're all idle. These are mod_perl programs (like a FastCGI -- they stay connected so they're ready
for instant service). So using "ps -ef" and grep, we find one of the databases looks like this:
postgres 22708 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(46915) idle
postgres 22709 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(46916) idle
postgres 22710 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(46917) idle
postgres 22711 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(46918) idle
postgres 22712 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(46919) idle
postgres 22724 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(42440) idle
postgres 22725 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(42441) idle
postgres 22726 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(42442) idle
postgres 22727 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(42443) idle
postgres 22728 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(42444) idle
postgres 22731 7619 0 Jul11 ? 00:00:06 postgres: chemmega chemmega 192.168.10.192(42447) idle
Now here's the weird thing. I'm running a pg_restore of a database (on the order of 4GB compressed, maybe 34M rows of
ordinary data, and 15M rows in one BLOB table that's typically 2K per blob). When I do this, ALL of the postgres
backends start working at about 1% CPU apiece. This means that the 120 "idle" postgres backends are together using
almost 100% of one CPU on top of the 100% CPU being used by pg_restore. See the output of top(1) below.
Is this normal? All I can guess at is that something's going on in shared memory that every Postgres backend has to
respond to.
Thanks,
Craig
Tasks: 305 total, 1 running, 304 sleeping, 0 stopped, 0 zombie
Cpu(s): 33.5% us, 1.5% sy, 0.0% ni, 57.8% id, 6.6% wa, 0.2% hi, 0.4% si
Mem: 4151456k total, 4011020k used, 140436k free, 10096k buffers
Swap: 2104504k total, 94136k used, 2010368k free, 3168596k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6681 postgres 16 0 217m 188m 161m D 50.4 4.6 4:29.30 postmaster
1577 root 10 -5 0 0 0 S 1.0 0.0 108:01.97 md0_raid1
8487 postgres 15 0 187m 8704 4996 S 1.0 0.2 0:06.56 postmaster
8506 postgres 15 0 187m 8604 4892 S 1.0 0.2 0:06.37 postmaster
8507 postgres 15 0 187m 8708 5004 S 1.0 0.2 0:06.42 postmaster
8512 postgres 15 0 187m 8612 4904 S 1.0 0.2 0:06.65 postmaster
8751 postgres 15 0 187m 10m 7520 S 1.0 0.3 0:07.95 postmaster
8752 postgres 15 0 187m 10m 7492 S 1.0 0.3 0:07.84 postmaster
14053 postgres 15 0 187m 8752 5044 S 1.0 0.2 0:06.53 postmaster
16515 postgres 15 0 187m 8156 4452 S 1.0 0.2 0:06.33 postmaster
25351 postgres 15 0 187m 9772 6064 S 1.0 0.2 0:06.75 postmaster
25387 postgres 15 0 187m 8444 4752 S 1.0 0.2 0:06.45 postmaster
25425 postgres 15 0 187m 9.8m 6340 S 1.0 0.2 0:06.75 postmaster
30626 postgres 15 0 187m 8472 4792 S 1.0 0.2 0:06.52 postmaster
30628 postgres 15 0 187m 8536 4840 S 1.0 0.2 0:06.50 postmaster
30630 postgres 15 0 187m 8524 4844 S 1.0 0.2 0:06.49 postmaster
30637 postgres 15 0 187m 8692 4880 S 1.0 0.2 0:06.25 postmaster
31679 postgres 15 0 187m 8544 4860 S 1.0 0.2 0:06.39 postmaster
31681 postgres 15 0 187m 8528 4848 S 1.0 0.2 0:06.25 postmaster
1751 postgres 15 0 187m 8432 4748 S 1.0 0.2 0:06.26 postmaster
11620 postgres 15 0 187m 8344 4644 S 1.0 0.2 0:06.23 postmaster
11654 postgres 15 0 187m 8316 4624 S 1.0 0.2 0:06.36 postmaster
19173 postgres 15 0 187m 9372 5668 S 1.0 0.2 0:06.49 postmaster
19670 postgres 15 0 187m 9236 5528 S 1.0 0.2 0:06.29 postmaster
20380 postgres 15 0 187m 8656 4956 S 1.0 0.2 0:06.20 postmaster
20649 postgres 15 0 187m 8280 4584 S 1.0 0.2 0:06.16 postmaster
22731 postgres 15 0 187m 8408 4700 S 1.0 0.2 0:06.03 postmaster
11045 postgres 15 0 185m 71m 68m S 0.7 1.8 0:19.35 postmaster
6408 postgres 15 0 187m 11m 7520 S 0.7 0.3 0:07.89 postmaster
6410 postgres 15 0 187m 10m 7348 S 0.7 0.3 0:07.53 postmaster
6411 postgres 15 0 187m 10m 7380 S 0.7 0.3 0:07.83 postmaster
6904 postgres 15 0 187m 8644 4788 S 0.7 0.2 0:06.15 postmaster
6905 postgres 15 0 187m 8288 4596 S 0.7 0.2 0:06.15 postmaster
6906 postgres 15 0 187m 8488 4764 S 0.7 0.2 0:06.18 postmaster
6907 postgres 15 0 187m 8580 4856 S 0.7 0.2 0:06.37 postmaster
7049 postgres 15 0 187m 8488 4800 S 0.7 0.2 0:06.07 postmaster
7054 postgres 15 0 187m 8376 4672 S 0.7 0.2 0:06.28 postmaster
7188 postgres 15 0 187m 8588 4868 S 0.7 0.2 0:06.39 postmaster
7190 postgres 15 0 187m 8832 5120 S 0.7 0.2 0:06.52 postmaster
7191 postgres 15 0 187m 8632 4916 S 0.7 0.2 0:06.48 postmaster
7192 postgres 15 0 187m 8884 5176 S 0.7 0.2 0:06.55 postmaster
8511 postgres 15 0 187m 8612 4904 S 0.7 0.2 0:06.39 postmaster
8513 postgres 15 0 187m 8776 5064 S 0.7 0.2 0:06.60 postmaster
8750 postgres 15 0 187m 10m 7220 S 0.7 0.3 0:07.72 postmaster
8768 postgres 15 0 187m 10m 7508 S 0.7 0.3 0:07.77 postmaster
8769 postgres 15 0 187m 10m 7448 S 0.7 0.3 0:07.81 postmaster
8775 postgres 15 0 187m 10m 7064 S 0.7 0.3 0:07.72 postmaster
8782 postgres 15 0 187m 10m 7316 S 0.7 0.3 0:07.84 postmaster
13947 postgres 15 0 187m 8500 4780 S 0.7 0.2 0:06.36 postmaster
13949 postgres 15 0 187m 8536 4824 S 0.7 0.2 0:06.36 postmaster
13951 postgres 15 0 187m 8504 4804 S 0.7 0.2 0:06.35 postmaster
14041 postgres 15 0 187m 8548 4828 S 0.7 0.2 0:06.45 postmaster
14046 postgres 15 0 187m 8560 4812 S 0.7 0.2 0:06.39 postmaster
14052 postgres 15 0 187m 8744 5024 S 0.7 0.2 0:06.54 postmaster
14055 postgres 15 0 187m 8580 4868 S 0.7 0.2 0:06.52 postmaster
14061 postgres 15 0 187m 8464 4760 S 0.7 0.2 0:06.45 postmaster
14092 postgres 15 0 187m 8624 4920 S 0.7 0.2 0:06.52 postmaster
16358 postgres 15 0 187m 8284 4596 S 0.7 0.2 0:06.54 postmaster
16367 postgres 15 0 187m 8392 4568 S 0.7 0.2 0:06.24 postmaster
Craig James <craig_james@emolecules.com> writes:
> Now here's the weird thing. I'm running a pg_restore of a database
> (on the order of 4GB compressed, maybe 34M rows of ordinary data, and
> 15M rows in one BLOB table that's typically 2K per blob). When I do
> this, ALL of the postgress backends start working at about 1% CPU
> apiece.
It's not surprising that they'd all start eating some CPU, if that's a
schema restore and not just bulk data loading. Any catalog change is
going to broadcast "shared cache inval" messages that all the backends
have to process to make sure they get rid of any now-obsolete cached
catalog information.
> This means that the 120 "idle" postgres backends are together
> using almost 100% of one CPU on top of the 100% CPU being used by
> pg_restore. See the output of top(1) below.
Perhaps you need to try to cut down the number of idle processes ...
I don't think anyone's ever spent any time trying to micro-optimize
the shared cache inval code paths. It could be we could cut your
1% figure some, if we were willing to put effort into that. But it's
not going to go to zero.
regards, tom lane
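As a rough way to see how many of those connections really are sitting idle
per database (a sketch; on 8.1 this relies on stats_command_string being
enabled so that idle sessions report '<IDLE>' in pg_stat_activity):

-- Count idle backends per database.
SELECT datname, count(*) AS idle_backends
  FROM pg_stat_activity
 WHERE current_query = '<IDLE>'
 GROUP BY datname
 ORDER BY idle_backends DESC;

If most of them are idle most of the time, one common way to cut the count is
to put a connection pooler between the mod_perl processes and Postgres, so
far fewer backends have to chew on those cache-invalidation messages.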