I have not found any means to force bgwriter to send writes when it can. (Well, I have: create a process which sends "CHECKPOINT" every 0.2 seconds... it works more or less, but this is not my point :-)
There is scan_whole_pool_milliseconds, which currently forces bgwriter to circle the buffer pool at least once every 2 minutes. It is currently hard-coded, but it should be trivial to turn it into an experimental GUC that you could use to test your hypothesis.
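To make the effect of that constant concrete, here is a rough sketch of the arithmetic as I understand it from BgBufferSync(): the sweep period is divided by the bgwriter delay to get the number of rounds per sweep, and the pool size is divided by that to get a minimum number of buffers to scan per round. The helper name min_scan_buffers, its parameters, and the clamp at the end are my own for illustration and are not the actual source; check bufmgr.c before relying on the details.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the real PostgreSQL function): minimum number
 * of buffers bgwriter has to scan per round so that the whole pool is
 * swept once every scan_whole_pool_ms.
 */
static int
min_scan_buffers(int nbuffers, int scan_whole_pool_ms, int bgwriter_delay_ms)
{
    assert(nbuffers > 0 && scan_whole_pool_ms > 0 && bgwriter_delay_ms > 0);

    /* Number of bgwriter rounds that fit into one full sweep. */
    float rounds = (float) scan_whole_pool_ms / (float) bgwriter_delay_ms;
    int   result = (int) (nbuffers / rounds);

    /* Illustrative clamp: one round cannot usefully cover more than the pool. */
    return result > nbuffers ? nbuffers : result;
}
```

Assuming 16384 shared buffers and a bgwriter delay of 200 ms (both values are only examples), the stock 120000 gives 27 buffers per round, 1000 gives 3276, and 200 means the whole pool every round. Note this only bounds how much gets scanned; whether a scanned buffer is actually written still depends on the other bgwriter settings.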
I recompiled with the variable hard-coded to 1000 instead of 120000. The situation is slightly degraded (15% of transactions were more than 200 ms late). However, it seems that bgwriter did not write many more pages:
You should probably try it set to 200 rather than 1000, to put it on an equal footing with the CHECKPOINT every 0.2 seconds you reported on.
Not that I think this will improve the situation. After all, my theory is that it does not matter who *writes* the pages; it only matters how they get fsynced.