Description of problem
I'm using Brian for optimizing a neuron model. It seems that each thread creates one or more files in the Cython cache directory (`~/.cython/brian_extensions/`), maybe even one for each parameter set. After a couple of hours of optimization I had more than 75,000 cache files, which filled up my whole quota. I later tried to clear the cache after every run with the `clear_cache` command. The problem seems to be that it clears the cache for all the nodes, not only the files created by the calling thread, so the other threads then give an error. Is this the case? Is there a way to fix the issue?
Hi @Tuoma. I agree that we are not handling this situation very well. The idea behind the cache is that compiling code is quite costly (in particular with Cython, which generates huge source code files for even the simplest model), so we don’t want to do it repeatedly for the same model. In your case, you seem to be running the same model with different parameters, which ideally should only generate as many files as for exploring a single parameter combination. The devil is in the detail, though, and Brian may generate slightly different code for the same model, which then needs to be compiled again. I see three ways to deal with your issue:
- You change the code so that it only generates a single set of code files. This might require changes in how you set the parameters, and some details like using the `name` argument so that objects do not use auto-generated names. The details might also depend on how you parallelize your simulation – if you want to go this route, could you maybe share some details of your code (ideally a minimal example)?
- Do not use Cython, but rather `numpy` or the C++ standalone mode (if possible). The `numpy` mode is in general about half as fast as Cython, but if your simulations are rather simple/short this might not matter much, since it does not have to compile code (it could even be faster in total). For the C++ standalone mode, you'd have to make sure that each of the runs uses its own directory, either by setting a unique directory name (`set_device('cpp_standalone', directory='unique_name')`) or by letting Brian create a new unique directory automatically (`set_device('cpp_standalone', directory=None)`). You can then delete the directory afterwards; see the "Computational methods and efficiency" chapter of the Brian 2 documentation.
- You use a different cache directory for each of your runs, by setting `prefs.codegen.runtime.cython.cache_dir` to a unique value for each process. This requires that each of the runs uses its own process, not just a thread, because otherwise all runs would share the same global preferences – but I guess this might already be the case? The `clear_cache` call should then only apply to this directory as well, but I am not 100% sure that works correctly if you set the preference after the import of `brian2` (i.e. not in a preference file). A safer alternative would be to delete the directory manually.
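The third option can be sketched as follows. This is a pure-Python sketch: the worker function, directory prefix, and placeholder computation are all illustrative; only the commented-out `prefs` line is the actual Brian setting.

```python
# Sketch of the per-process cache directory idea: every worker process
# gets a private Cython cache directory and cleans up only its own.
# Names and the placeholder computation are illustrative, not Brian API.
import shutil
import tempfile

def run_one(params):
    # A cache directory that belongs to this process alone
    cache_dir = tempfile.mkdtemp(prefix='brian_cache_')
    try:
        # In a real worker, set the preference before building the model:
        #   from brian2 import prefs
        #   prefs.codegen.runtime.cython.cache_dir = cache_dir
        # ... build and run the simulation with `params` here ...
        result = sum(params)  # placeholder for the simulation result
    finally:
        # Deleting only this directory cannot disturb other workers
        shutil.rmtree(cache_dir, ignore_errors=True)
    return result

# Dispatch one process (not thread) per parameter set, e.g.:
#   from multiprocessing import Pool
#   with Pool(4) as pool:
#       results = pool.map(run_one, parameter_sets)
```

Using processes rather than threads matters here because Brian's preferences are global state within a process; separate processes each get their own copy.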
Thanks for the reply! I tried setting `prefs.codegen.target = 'numpy'`, but it made my simulations approximately 6 times slower. `set_device('cpp_standalone', directory='unique_name')` either didn't work for me or made the simulations very slow. But manually setting `prefs.codegen.runtime.cython.cache_dir` and deleting the contents every now and then seems to work very well.
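The directory management described here can be sketched like this. The helper names and the `cythontmp` base directory are made up for illustration; only the commented `prefs` assignment is the Brian setting.

```python
# Sketch of per-run cache directories with manual cleanup.
# Helper names and the 'cythontmp' base are illustrative, not Brian API.
import os
import shutil

def cache_dir_for(node, seed, base='cythontmp'):
    """Return (and create) a cache directory unique to one node/seed pair."""
    path = os.path.join(base, f'node{node}_seed{seed}')
    os.makedirs(path, exist_ok=True)
    return path

def clear_own_cache(path):
    """Remove only this run's cache files, leaving other runs untouched."""
    shutil.rmtree(path, ignore_errors=True)

path = cache_dir_for(node=1, seed=1)
# In Brian, point the code generator at it before the run:
#   prefs.codegen.runtime.cython.cache_dir = path
# ... run the simulations ...
clear_own_cache(path)  # every now and then, or after the run
```

Because each run only ever deletes its own directory, the original problem (one thread's `clear_cache` wiping the files of all the others) cannot occur.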
One more thing: it seems that Brian does some weird recursion with the cache_dir. When I e.g. set `prefs.codegen.runtime.cython.cache_dir = 'cythontmp/node1_seed1'`, most of the cache files (`*.so`, `*.lock`) are saved into the correct directory, but some files (`*.o`) are saved into the subdirectory `cythontmp/node1_seed1/cythontmp/node1_seed1`.
Yes, I never looked into the details, but this is Cython's default behaviour, unrelated to Brian. If I run for example:

```
python -c 'import cython; print(cython.inline("return x + 1", x=1))'
```

it will use `~/.cython/inline` as the cache directory and recreate the full directory structure to save the compiled files:

```
$ tree -a ~/.cython/inline/
```