Multiprocessing in standalone mode, poor speed-up

Unfortunately, there is not much general advice I can give; it all depends on the details of the model. The compilation has a more or less fixed cost (it basically depends on the number of objects) and does not depend on the size of the network, the complexity of the model, or the length of the simulation. The time it takes therefore becomes negligible when you scale up the network or simulate it for a longer time. If you have a small/simple/short-running network where the compilation takes up a large part of the total time, you might even be better off not using code generation at all (i.e. without set_device and with prefs.codegen.target='numpy')!
We actually discussed this a bit in our Brian2GeNN paper, where this is even more of an issue since the code generation for GeNN plus the CUDA compilation takes a very long time.

Having said all that, the most important approach to reducing the compilation time in standalone mode is to reduce the number of objects and simulations. In the first toy example, you could obviously simulate all the neurons with their different time constants in a single network, which would need only a single compilation. In the Vogels 2011 network there is a more subtle issue: if you use it as is, with its two run statements, these will effectively double the number of objects that have to be compiled. It would be more efficient for the compilation time to use a single run and change the eta variable with a run_regularly operation, for example.

Finally, some Brian-independent approaches to decrease the compilation time might work. For example, using less aggressive compiler optimization (set as part of the preferences, e.g. codegen.cpp.extra_compile_args_gcc for gcc) should make compilation a bit faster, at the cost of a potentially somewhat slower simulation. If you have a lot of memory, you could also try pointing the code directory to a ramdisk, which should be considerably faster. You could also do a single compilation first and copy its directory over to the directory you create for each new process. Some parts will have to be recompiled, depending on what differs between the compilations, but most of the files will be unchanged and will therefore be skipped. You can also automate this by using a tool like ccache. If you try this out, please let us know about your experience!
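
The copying approach could look roughly like the sketch below. The helper name seed_build_dir and the directory names are hypothetical, and I have not benchmarked how much of the build actually gets skipped. It relies on shutil.copytree preserving the file modification times, so that the build only sees files as changed if Brian rewrites them.

```python
import shutil
from pathlib import Path

def seed_build_dir(template_dir, new_dir):
    """Copy an already compiled standalone directory to a new location.

    shutil.copytree copies files with copy2 by default, which keeps the
    modification times of sources and object files.
    """
    template_dir, new_dir = Path(template_dir), Path(new_dir)
    if not new_dir.exists():
        shutil.copytree(template_dir, new_dir)
    return new_dir

# Hypothetical usage in each worker process (directory names are made up):
#   from brian2 import set_device
#   build_dir = seed_build_dir('standalone_template', 'standalone_run_1')
#   set_device('cpp_standalone', directory=str(build_dir))
```

Each process needs its own target directory, otherwise parallel builds would overwrite each other's files.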