merging HashJoin and Hash nodes

Поиск
Список
Период
Сортировка
От Andres Freund
Тема merging HashJoin and Hash nodes
Дата
Msg-id 20191028231526.wcnwag7lllkra4qt@alap3.anarazel.de
обсуждение исходный текст
Ответы Re: merging HashJoin and Hash nodes
Список pgsql-hackers
Hi,

I've groused about this a few times, but to me it seems wrong that
HashJoin and Hash are separate nodes. They're so tightly bound together
that keeping them separate just doesn't architecturally makes sense,
imo. So I wrote a prototype.

Evidence of being tightly bound together:
- functions in nodeHash.h that take a HashJoinState etc
- how many functions in nodeHash.h and nodeHashjoin.h are purely exposed
  so the other side can call them
- there's basically no meaningful separation of concerns between code in
  nodeHash.c and nodeHashjoin.c
- the Hash node doesn't really exist during most of the planning, it's
  kind of faked up in create_hashjoin_plan().
- HashJoin knows that the inner node is always going to be a Hash node.
- HashJoinState and HashState both have pointers to HashJoinTable, etc

Besides violating some aesthetical concerns, I think it also causes
practical issues:

- code being in different translation code prevents the compiler from
  inlining etc. There's a lot of HOT calls going between both. For each
  new outer tuple we e.g. call, from nodeHashjoin.c separately into
  nodeHash.c for ExecHashGetHashValue(), ExecHashGetBucketAndBatch(),
  ExecHashGetSkewBucket(), ExecScanHashBucket(). They each have to
  do memory loads from HashJoinState/HashJoinTable, even though previous
  code *just* has done so.
- a separate executor node, and all the ancillary data (slots,
  expression context, target lists etc) is far from free
- instead of just applying an "empty outer" style optimization to both
  sides of the HashJoin, we have to choose. Once unified it's fairly
  easy to just use it on both.
- generally, a lot of improvements are harder to develop because of the
  odd separation.


Does anybody have good arguments for keeping them separate? The only
real one I can see is that it's not a small change, and will make
bugfixes etc a bit harder.  Personally I think that's outweighed by the
disadvantages.

Attached is a quick prototype that unifies them. It's not actually that hard,
I think? Obviously this is far from ready, but I thought it'd be a good
basis to get a discussion started?

Comments on the prototype:

- I've hacked EXPLAIN to still show the Hash node, to reduce the size of
  the diffs. I'm doubtful that that's the right approach (and I'm sure
  it's not the right approach to do so with the code I injected) - I
  think the Hash node in the explain doesn't really help users, and just
  makes the explain bigger (except for making it clearer which side is
  hashed)
- currently I applied a very ugly hack to distinguish the parallel
  shm_toc key for the data previously in hash from the data previously
  in HashJoin. Clearly that'd need to be done properly.
- obviously we'd have to work a lot more on comments, function ordering,
  docs etc. if we wanted to actually apply this.

FWIW, it's much easier to look at the patch if you use
--color-moved --color-moved-ws=allow-indentation-change

as parameters, as that will color code that's moved without any changes
(except for indentation), differently from modified code.


One thing I noticed is that create_hashjoin_plan() currently says:

    /*
     * Set Hash node's startup & total costs equal to total cost of input
     * plan; this only affects EXPLAIN display not decisions.
     */
    copy_plan_costsize(&hash_plan->plan, inner_plan);
    hash_plan->plan.startup_cost = hash_plan->plan.total_cost;

which I don't think is actually true? We use that for:
                else if (HJ_FILL_OUTER(node) ||
                         (outerNode->plan->startup_cost < hashNode->ps.plan->total_cost &&
                          !node->hj_OuterNotEmpty))

Leaving the inaccurate (outdated?) comment aside, it's not clear to me
why we should ignore the cost of hashing?

It also seems like we ought actually charge the cost of hashing to the
hash node, given that we actually apply some hashing cost
(c.f. initial_cost_hashjoin).

Greetings,

Andres Freund

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: WIP: expression evaluation improvements
Следующее
От: Andres Freund
Дата:
Сообщение: Re: JIT performance bug/regression & JIT EXPLAIN