I tried a couple of alternative methods over the weekend in the hope of improving performance, but unfortunately to no avail. One of these was to share the processing of the 500K XML files between multiple threads (multiple connections). In an attempt to "force" the dropping of the temp tables, each thread creates its own connection, runs the function with the XML payload, and then disconnects. The impression I got was that the average time per transaction still increases as the process progresses.
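The per-thread connect/run/disconnect pattern described above can be sketched as follows. This is a minimal illustration, not the original code: `connect` stands for whatever connection factory is actually used (e.g. `psycopg2.connect` — an assumption), and `run_load_function` is a hypothetical placeholder for the call that passes the XML payload to the database function.

```python
import threading
import queue

def process_batch(batch, connect):
    """Open a fresh connection, load each document, then disconnect.

    A new connection per batch guarantees that session-local temp
    tables are dropped when the backend exits, without relying on
    explicit DROP statements.
    """
    conn = connect()
    try:
        for doc in batch:
            # Placeholder for the real call, e.g.
            # cur.execute("SELECT load_xml(%s)", (doc,)) with an
            # actual driver; the original post does not name one.
            conn.run_load_function(doc)
    finally:
        conn.close()

def run_workers(batches, connect, n_threads=4):
    """Share the batches between n_threads worker threads."""
    q = queue.Queue()
    for batch in batches:
        q.put(batch)

    def worker():
        while True:
            try:
                batch = q.get_nowait()
            except queue.Empty:
                return
            process_batch(batch, connect)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Note that each thread holds its own connection for a whole batch rather than one per document, so the connect/disconnect overhead is amortized while the temp tables are still dropped at the end of each batch.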
The default temp_buffers is 8MB, so with ~10..20 clients everything goes through I/O, which is relatively slow. Changes to system tables are not fast either on a system under high load.
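If the default is the bottleneck, temp_buffers can be raised per session. One PostgreSQL caveat: it can only be changed before the session first touches a temporary table, so it has to be the first thing run on a new connection. A minimal sketch, assuming a generic DB-API connection object (the original post does not say which driver is used):

```python
def configure_session(conn):
    """Raise the per-session temp-table cache right after connecting.

    temp_buffers (default 8MB) cannot be changed after the session's
    first access to a temporary table, so call this immediately after
    opening the connection. `conn` is any DB-API style connection; the
    64MB figure is purely illustrative and should be tuned against the
    available RAM and the number of concurrent clients.
    """
    cur = conn.cursor()
    cur.execute("SET temp_buffers = '64MB'")  # illustrative value
    cur.close()
```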
My one concern with this method was locking, which I'm unfortunately quite unfamiliar with.
Is it possible that locking could be a key problem with this multi-threaded approach?
You can write a PostgreSQL extension in C and store the XML only in memory.
Temp tables are best when you run queries against the data or when you need indexes, but they are a terribly slow cache.
Otherwise, Postgres is good as a database and very slow as a cache. It is fine for prototyping and for low- to medium-load servers. For any other use, pick different software.