Execution speed of the network

Description of problem

Hello,
I have written Python code for a neural network with a large number of neurons (3000 neurons).
The network has two stages, where the second stage is a fully connected network (with 1300 neurons).
The fully connected network alone takes 4 hours for one iteration.
I am hoping to find a way to speed up the execution, as I need to train the entire network.
Any help in this direction would be really helpful.

Thank you!

Minimal code to reproduce problem

What you have already tried

I am also using multiprocessing to speed things up. I have been trying standalone mode as well, but it has not generated any output in the last 12 hours.

Expected output (if relevant)

Actual output (if relevant)

Full traceback of error (if relevant)

Hi @amruthark. Could you give some more information on the model, e.g. what the neurons and the synapses are like? When you say “list of neurons”, do you mean a single NeuronGroup with 3000 neurons, or several NeuronGroups? What kind of simulated time does “one iteration” represent? Regarding standalone mode, when you say it hasn’t generated any output, do you mean that it hasn’t even started the simulation?
To get some more information about where the time is spent, you can run a (short) simulation with profile=True, and then do

print(profiling_summary())

to get information about where the time was spent. See Running a simulation — Brian 2 2.5.4 documentation
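
For example, a short profiling run could look like this (just a sketch, reusing whatever network you have defined):

run(1*second, profile=True)   # a short run is enough to see where time goes
print(profiling_summary(show=10))  # print the ten most time-consuming operations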

Hi @mstimberg,
Thank you for suggesting the profiling step; it brought clarity.
The profiling summary showed that for the 1300 fully connected neurons, simulating all the neurons takes around 2.5 hours (the other 2 hours are spent calculating the weights in the backpropagation algorithm).

The neurons and synapses are straightforward. For example:

G = NeuronGroup(100, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
S_0 = Synapses(G, G, 'W : 1', on_pre='v_post += W')

And I have also observed, through trial and error, that I seem to need separate Synapses objects for connecting all the neurons in the first layer to each specific neuron in the second layer (e.g. S_0 for connecting all the neurons in the first layer to the first neuron in the second layer, and similarly S_1 for connecting all the neurons in the first layer to the second neuron in the second layer).
for example,
S_1 = Synapses(G, G, 'W : 1', on_pre='v_post += W')
for count in range(100):  # if you consider 100 neurons in the first layer
    S_1.connect(i=[count], j=[101])  # 101 is the first neuron in the second layer
    S_1.W = W

I have two neuron groups: one with a large number of neurons, and another with just 10 neurons.

One iteration is forward propagation and updating of weights (backpropagation algorithm) based on the membrane potentials of the neurons.

I tried standalone mode, but I don't think it even started the simulation.

So is it possible to speed up the execution in this situation?

Hi @amruthark. I am still a bit unsure about several aspects of the model.

How much is this in simulated (“biological”) time, i.e. what is the argument of the run call? Here is a simple network which uses the same neuron/synapse model as in your post and has 2×1300 neurons with all-to-all connections (since you are talking about a forward pass, I assume you are not actually setting up a recurrent network connecting to itself, as in your example code?).

from brian2 import *
tau = 10*ms
eqs = 'dv/dt = (-v + 0.45)/tau : 1'  # constant input current to make neurons spike
G1 = NeuronGroup(1300, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
G2 = NeuronGroup(1300, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
S = Synapses(G1, G2, 'W : 1', on_pre='v_post += W')
S.connect()  # all-to-all
G1.v = 'rand()*0.4'
S.W = 'rand()*0.0001'

run(10*second, report='text')

This takes about 3 seconds to simulate the 10 seconds of biological time. So this would mean that in your code, you are running the forward pass for something on the order of 8 hours of biological time, is that true?

No, you definitely don't have to do this, and it would slow everything down quite a bit. You can call connect several times on a single Synapses object (as you did in the example code you just posted) – can you tell us more about why you think you need several Synapses objects to do this? Also, can you say more about your connection pattern – this is not about all-to-all connectivity, right?

This loop could be replaced by:

S_1.connect(i=np.arange(100), j=101)

This probably also explains why standalone does not start (or takes a very long time to start). Each single connect call will be converted into C++ code and compiled – having many trivial connect calls in a loop will therefore take a long time just to create/compile very simple and mostly identical code.
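
For illustration, a minimal standalone setup with a single vectorized connect call could look like this (just a sketch; the group size and directory name are arbitrary):

import numpy as np
from brian2 import *

set_device('cpp_standalone', directory='standalone_sim')  # enable standalone once, at the top of the script

tau = 10*ms
G = NeuronGroup(300, 'dv/dt = -v/tau : 1', threshold='v>0.4', reset='v=0', method='exact')
S_1 = Synapses(G, G, 'W : 1', on_pre='v_post += W')
S_1.connect(i=np.arange(100), j=101)  # one connect call, i.e. one code object to generate and compile
S_1.W = 0.01
run(10*ms, report='text')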

Hope that helps a bit!

Hi @mstimberg,
Thank you for your response; it has been really helpful.

I apologize for not mentioning it earlier. I am using the N-MNIST dataset and attempting to use the temporal information as well. Hence, I will be executing run multiple times.

eqs = """
dv/dt = (I-v)/tau : 1 (unless refractory)
I : 1
tau : second
"""
G_input = NeuronGroup(1156, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
G_hidden = NeuronGroup(128, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
G_out = NeuronGroup(10, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
k = 0
while k < size:  # size is the total number of events for a single N-MNIST image
    if xytp[k][3] == True:  # xytp is a single event: [x coordinate, y coordinate, time, polarity]
        m = (xytp[k][0]*34) + (xytp[k][1])  # to activate a specific neuron in G_input
        G_input.I[m] = 0.6

        spikemon_input = SpikeMonitor(G_input)
        spikemon_hidden = SpikeMonitor(G_hidden)
        spikemon_out = SpikeMonitor(G_out)
        M_input = StateMonitor(G_input, 'v', record=True)
        M_hidden = StateMonitor(G_hidden, 'v', record=True)
        M_out = StateMonitor(G_out, 'v', record=True)

        run(30*second, report='text')
        k += 1
    else:
        k += 1

It's an attempt to present the input at varying times.
Hence, for all the events together, it takes 2–3 hours.
Is there any workaround to speed up a situation like this?

Initially, I tried using a single Synapses object to connect all the neurons between the two layers (it is all-to-all connectivity), something like this:

S_inputlayer_hiddenlayer = Synapses(G_inputlayer, G_hiddenlayer, 'W : 1', on_pre='v_post += W')
for count_1 in range(128):
    for count in range(1156):
        S_inputlayer_hiddenlayer.connect(i=[count], j=[count_1])
        S_inputlayer_hiddenlayer.W = Whid[count_1][count]

But this results in the same voltage values for all the neurons in G_hiddenlayer. I am using a multidimensional list for the weights so that I can manipulate the values later on.

Ok, I see. This is in general less efficient than having a single long run call (you can do things with TimedArray and run_regularly, for example), but if your individual runs are long (as in your example), it won’t make much of a difference. This isn’t compatible with standalone mode, though (although this feature in Brian’s current development version would actually allow a variant of this).
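
For instance, a single long run driven by a TimedArray could look roughly like this (just a sketch: stim_values is an assumed 2D array with one row of input currents per 30-second block, and n_steps is its number of rows):

stim = TimedArray(stim_values, dt=30*second)  # row k provides the input currents during the k-th 30 s block
eqs_input = '''
dv/dt = (I - v)/tau : 1 (unless refractory)
I = stim(t, i) : 1
'''
tau = 10*ms
G_input = NeuronGroup(1156, eqs_input, threshold='v>0.4', reset='v=0',
                      refractory=5*ms, method='euler')  # 'exact' cannot integrate the time-dependent input
run(n_steps*30*second, report='text')  # one long run instead of many short ones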

I am a bit confused by your code, though. You do not feed in the spikes of the image as events, but rather make the neuron representing each pixel fire continuously for 30 s? And most importantly, you seem to be running one simulation per event (not per image), and these “stack up”? I.e., for the first simulation there is one neuron that is active all the time, for the second simulation there are two neurons that are active, etc.? This seems wrong, no?
If you wanted to actually feed in events at certain times, you could use a SpikeGeneratorGroup.
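
A rough sketch of that approach, assuming each entry of xytp is [x, y, t, polarity] with t in milliseconds:

import numpy as np
indices, times = [], []
for x, y, t, p in xytp:
    if p:  # keep only the positive-polarity events
        indices.append(x*34 + y)  # same pixel-to-neuron mapping as in your code
        times.append(t)
G_input = SpikeGeneratorGroup(1156, indices, np.array(times)*ms)
run((max(times) + 1)*ms, report='text')  # a single run covering all event times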

Regarding the synapses, the problem in your code is that

S_inputlayer_hiddenlayer.W = Whid[count_1][count]

will set the value of all weights in each loop iteration. In the end, all weights will therefore be the same. You could fix this by using something like S_inputlayer_hiddenlayer.W[-1] = Whid[count_1][count] (which sets only the most recently created weight each time), but for a weight matrix, you should really use the approach described in the documentation. I am not 100% sure about the transpose in the last line, but I think the code you posted can be replaced by:

S_inputlayer_hiddenlayer = Synapses(G_inputlayer, G_hiddenlayer, 'W : 1', on_pre='v_post += W')
S_inputlayer_hiddenlayer.connect()  # connect all-to-all
S_inputlayer_hiddenlayer.W[:] = np.array(Whid).T.flatten()  # set weights from the matrix

This should be orders of magnitude faster to set up the connections, and will also allow you to use a single Synapses object (which should make the simulation much faster as well).
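
One way to double-check the ordering would be to compare a single synapse against the matrix (with hypothetical indices):

# weight of the synapse from input neuron 5 to hidden neuron 3: should equal Whid[3][5]
print(S_inputlayer_hiddenlayer.W[5, 3], Whid[3][5])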

Let me know whether this works for you!

Hi @mstimberg,

Thank you for your response, it's really helpful!

No, previously, for the first simulation a specific neuron would be active, and for the next simulations it would not be active again. But as you correctly suggested, SpikeGeneratorGroup fits my requirement perfectly.
And I have corrected the synapses as well.

I will proceed with these corrections, thank you very much for your timely support! :)

Ok, great, SpikeGeneratorGroup should be what you need. But just to make clear why I made my comment: in this code (I removed most lines to focus on the important bits)

G_input = NeuronGroup(1156, eqs, threshold='v>0.4', reset='v=0', refractory=5*ms, method='exact')
# ...
k = 0
while k < size:  # size is the total number of events for a single N-MNIST image
    if xytp[k][3] == True:  # xytp is a single event: [x coordinate, y coordinate, time, polarity]
        m = (xytp[k][0]*34) + (xytp[k][1])  # to activate a specific neuron in G_input
        G_input.I[m] = 0.6
        # ...
        run(30*second, report='text')
        k += 1
    else:
        k += 1

you are setting up a group, and with each loop iteration where there is a positive event, you are setting I[m] for a different m. You never reset/delete the old inputs, though, so each simulation will continue with the previous inputs still active. But maybe this was not the complete code? Either way, SpikeGeneratorGroup is what you want :blush:
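
For completeness: if you did want to keep the per-event runs, resetting the inputs before each run would avoid this stacking, e.g.:

G_input.I = 0        # clear the inputs left over from the previous event
G_input.I[m] = 0.6   # activate only the neuron for the current event
run(30*second, report='text')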