MemoryError while using probability arrays for synaptic connections


This isn’t a Brian2 error, but I was wondering if you might have some advice or insights into how I might be able to get around this memory issue.

I’m using large matrices to calculate the probability of connecting all the neurons from one group to another based on distance. And I’m doing this multiple times to get the connectivity between different groups of neurons. I don’t believe I need to store the probability matrix after I create the connections, so in theory I should be able to overwrite or delete the matrix variable ‘p’.

However, I run into this memory error when creating the probability matrix after going through 5 iterations (out of 54 connections).

I’ve tried calling gc.collect() and using del, but neither seems to change anything.

I’m using a computer with 32GB RAM. Attached is a minimal script to reproduce the error.

Thanks for your help!


Hi @lmun373 . One general remark: I am not sure whether this exact connection method (a fixed total number of connections) is really worth all the effort – if you used Brian’s built-in connection method with the probability depending on the distance (with an appropriately tweaked scaling factor for the probability), your code would be much simpler. The downside would be that the total number of connections would no longer be constant, but given the massive number of connections that you have, the variation should be small in relative terms. But that is of course up to you :blush:

Regarding the memory issue itself, in general deleting/overwriting p should work, but I think the problem here is that you also store the returned result in the prob_array variable:

prob_array = find_p(src, tgt, n_dict, n_groups, p)

This in itself does not change the memory usage (prob_array is only a second reference to the memory of p), but it means that deleting p is not enough to free the memory, you also have to delete prob_array.
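A tiny illustration of the two-references situation (the variable names just mirror your script, the sizes are made up):

```python
import numpy as np

p = np.ones((1000, 1000))   # ~8 MB buffer
prob_array = p              # a second name for the *same* buffer, nothing is copied
assert prob_array is p

del p                       # the buffer survives: prob_array still references it
print(prob_array.nbytes)    # 8000000 -- the memory is still in use
del prob_array              # only now does the reference count drop to zero and NumPy frees it
```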

But the biggest problem might be the find_p function itself. To calculate the pairwise distances, it creates big matrices that repeat the X and Y coordinates to get appropriately shaped matrices. With numpy, you do not actually have to create these matrices, instead you can use broadcasting (see e.g. 1.4.2. Numerical operations on arrays — Scipy lecture notes – the examples actually calculate pairwise distances). This makes the code much more compact, but should also save quite a bit of memory (it will still create intermediate matrices, but not all at the same time). Your function could look something like this:

def find_p(src, tgt, n_dict, n_groups, p):
    radius = 0.000175
    l_name = list(n_dict.keys())
    # (N_src, 1) column vectors broadcast against (N_tgt,) row vectors
    array_X_src = n_groups[l_name.index(src)].X_[:][:, np.newaxis]
    array_X_tgt = n_groups[l_name.index(tgt)].X_[:]
    array_Y_src = n_groups[l_name.index(src)].Y_[:][:, np.newaxis]
    array_Y_tgt = n_groups[l_name.index(tgt)].Y_[:]
    # Fill the pre-allocated p in place instead of returning a new matrix
    p[:] = np.exp(-((array_X_src - array_X_tgt)**2 +
                    (array_Y_src - array_Y_tgt)**2) / (2*radius**2))
    np.fill_diagonal(p, 0)
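Just to convince yourself that broadcasting gives exactly the same result as explicitly repeating the coordinate vectors, here is a toy comparison with made-up coordinates:

```python
import numpy as np

rng = np.random.default_rng(0)
x_src, y_src = rng.random(4), rng.random(4)   # 4 source neurons
x_tgt, y_tgt = rng.random(3), rng.random(3)   # 3 target neurons

# Explicit repetition: builds two full (4, 3) matrices per coordinate
d2_tiled = (np.tile(x_src[:, None], (1, 3)) - np.tile(x_tgt, (4, 1)))**2 \
         + (np.tile(y_src[:, None], (1, 3)) - np.tile(y_tgt, (4, 1)))**2

# Broadcasting: a (4, 1) column against a (3,) row gives the same (4, 3) result
d2_bcast = (x_src[:, None] - x_tgt)**2 + (y_src[:, None] - y_tgt)**2

assert np.allclose(d2_tiled, d2_bcast)
```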

The slightly weird n_groups[...].X_[:] means to get the value in X without units and as a standard numpy array (instead of as Brian’s own VariableView which adds additional functionality on top). Importantly, this gets you a view on the values, i.e. they are not copied in memory. This does not seem to be enough, though, so in addition you can change the dtype for the p and the X and Y variables to float32:

 p = np.empty((n_groups[l_name.index(src)].N, n_groups[l_name.index(tgt)].N),
              dtype=np.float32)
# ....
neurons = b2.NeuronGroup(N, eqs, threshold='v>theta', reset=eqs_reset,
                         method='Linear', refractory=tau_ref,
                         dtype={'X': np.float32, 'Y': np.float32})
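The effect of the dtype change is easy to quantify – float32 halves the memory needed for p (toy sizes here, your populations are of course much bigger):

```python
import numpy as np

n_src, n_tgt = 2000, 2000  # toy sizes
p64 = np.empty((n_src, n_tgt))                    # default float64
p32 = np.empty((n_src, n_tgt), dtype=np.float32)  # half the memory
print(p64.nbytes, p32.nbytes)  # 32000000 vs 16000000 bytes
assert p64.nbytes == 2 * p32.nbytes
```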

I did not measure/verify things in detail, but with this change I am at least able to get through most of the connection creation process on my machine (also 32GB RAM) – it stops at 2/E → 5E, which connects two very big populations and therefore a huge number of potential connections. I did not try this out, but if the intermediate arrays in find_p are the problem, you could also split things up, e.g. divide the whole matrix into four quadrants: first calculate the pairwise distances between the first half of the source neurons and the first half of the target neurons, then between the second half of the source neurons and the first half of the target neurons, and so on. The final matrix will of course still have the same size, but the intermediate matrices will be smaller.
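A rough sketch of that block-wise idea (the function name and arguments are my own invention, not from your script – here the splitting is only along the source dimension, which is usually enough):

```python
import numpy as np

def fill_p_blockwise(p, x_src, y_src, x_tgt, y_tgt, radius, n_blocks=4):
    # Process one block of source neurons at a time, so the intermediate
    # difference matrices are (block_size, n_tgt) instead of (n_src, n_tgt)
    for rows in np.array_split(np.arange(len(x_src)), n_blocks):
        dx = x_src[rows, None] - x_tgt
        dy = y_src[rows, None] - y_tgt
        p[rows, :] = np.exp(-(dx**2 + dy**2) / (2 * radius**2))
```

The result is identical to the all-at-once version, only the peak memory of the intermediates shrinks roughly by a factor of n_blocks.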
There are more sophisticated ways of optimizing pairwise distance calculations, but I am not sure you want to get into that… Either way, may I suggest storing the generated indices to disk so that you don’t have to recreate them from scratch every time you want to run the simulation :slight_smile: ?
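Caching the indices could be as simple as the following (file name and the syn object are hypothetical; Synapses.connect accepts plain index arrays via its i and j arguments):

```python
import numpy as np
import os, tempfile

# Toy pre-/post-synaptic indices standing in for your generated connections
sources = np.array([0, 0, 1, 2], dtype=np.int32)
targets = np.array([1, 2, 2, 0], dtype=np.int32)

# After generating the connections once, store them compressed:
fname = os.path.join(tempfile.mkdtemp(), 'connections.npz')
np.savez_compressed(fname, i=sources, j=targets)

# On subsequent runs, skip the expensive generation step entirely:
data = np.load(fname)
# In Brian2 you could then do: syn.connect(i=data['i'], j=data['j'])
assert np.array_equal(data['i'], sources)
assert np.array_equal(data['j'], targets)
```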

Hope that helps a bit, best


Thank you so much, that is incredibly helpful.