I've been reading all this with interest, even though I know nothing
about distributed database design. I've used Tandems a bit though, and
they do a rather good job of parallelising queries. A key part of
building an efficient database system on a Tandem is figuring out how
the database is distributed over the disks, which used to correspond (on
the K10000 anyway) to processors. This partitioning is explicitly
declared. I believe the query optimizer used this information to figure
out where it had to go for data. If your partitioning was wrong,
performance would be dismal; if it was right -- phew, it would fly.
A bit more onus on the DBA or application developer, but, having worked
on lots of parallel applications, it is my experience that a completely
automatic solution is never terribly good. Distributing work/data
optimally is just too complex a problem to automate.
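
To make the idea of explicitly declared partitioning concrete, here is a toy sketch in Python. It is not Tandem's actual mechanism or syntax. The key ranges, the disk names and the helper functions are all made up for illustration. It only shows how a planner that knows the declared boundaries can work out which partitions a key-range query has to visit and skip the rest.

```python
from bisect import bisect_right

# Hypothetical declaration: partition i holds keys below PARTITION_BOUNDS[i]
# (exclusive); the final partition takes every key at or above the last bound.
PARTITION_BOUNDS = [1000, 2000, 3000]
PARTITION_NAMES = ["disk0", "disk1", "disk2", "disk3"]  # illustrative names


def partition_for(key):
    """Index of the partition that holds a single key."""
    return bisect_right(PARTITION_BOUNDS, key)


def partitions_for_range(lo, hi):
    """Names of the partitions a query on lo <= key <= hi must visit."""
    if lo > hi:
        return []
    first = partition_for(lo)
    last = partition_for(hi)
    return PARTITION_NAMES[first:last + 1]
```

With these made-up boundaries, a query on keys 1500 to 1700 touches only disk1, while a query on keys 0 to 5000 has to go to all four. A declaration that matches the common queries keeps most of them on one or two disks. A poor one spreads every query across all of them.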
Adriaan