Unsupervised learning of digit recognition using spike-timing-dependent plasticity

I have been using the code from this webpage (GitHub - zxzhijia/Brian2STDPMNIST: Brian 2 version of Paper "Unsupervised Learning of digit recognition using STDP"). I have a few doubts about the concepts behind this code.

  1. The inhibitory neurons’ membrane potential has a component named I_synI, given by the equation I_synI = gi * nS * (-85.*mV - v). I think only the one-to-one connections from the excitatory to the inhibitory layer should contribute to the membrane potential of the inhibitory neurons (this is handled by I_synE). Could you please explain the role of I_synI? Which part of the network does it model? (A rough sketch of how I understand the two conductance terms follows this list.)
  2. There is a variable named STDP_offset which is never used. What is the purpose of initialising this variable?
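
To illustrate what I mean, here is a rough Brian 2 sketch with assumed parameter values (my own illustration, not the repository's actual code). Excitatory connections increment ge, which feeds I_synE, and inhibitory connections increment gi, which feeds I_synI, so I_synI would only change if some inhibitory population projected onto the group in question:

from brian2 import *

# Assumed parameter values, loosely modelled on the equation quoted above.
eqs = '''
dv/dt = ((-60.*mV - v) + (I_synE + I_synI) / nS) / (10*ms) : volt
I_synE = ge * nS * (0.*mV - v)   : amp
I_synI = gi * nS * (-85.*mV - v) : amp
dge/dt = -ge / (1.0*ms) : 1
dgi/dt = -gi / (2.0*ms) : 1
'''
exc = NeuronGroup(400, eqs, threshold='v > -52.*mV', reset='v = -65.*mV', method='euler')
inh = NeuronGroup(400, eqs, threshold='v > -40.*mV', reset='v = -45.*mV', method='euler')

# Excitatory synapses target ge (and therefore I_synE) of the postsynaptic group ...
S_ei = Synapses(exc, inh, 'w : 1', on_pre='ge_post += w')
S_ei.connect(j='i')            # one-to-one, exc -> inh
# ... while inhibitory synapses target gi (and therefore I_synI).
S_ie = Synapses(inh, exc, 'w : 1', on_pre='gi_post += w')
S_ie.connect('i != j')         # inh -> exc, all except the matching index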

Hello,

I have reproduced a similar experiment
https://www.kaggle.com/dlarionov/mnist-spiking-neural-network

Regards,
Denis


Hello, I ran your code and encountered some problems:

  1. When I turn on the debug parameter, it takes up a lot of memory
  2. How can I reconstruct the weight parameters as shown in the attached figure? (A rough sketch of one approach is below.)
    (attached screenshot)
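
For reference, one common way to view learned input-to-excitatory weights as a grid of 28x28 receptive fields is sketched below; this is not necessarily the script's own plotting code, and the random matrix is just a stand-in for the trained XeAe weights.

import numpy as np
import matplotlib.pyplot as plt

n_e = 400
XeAe = np.random.random((784, n_e))      # stand-in for the trained weights
side = int(np.sqrt(n_e))                 # 20 x 20 grid of neurons
canvas = np.zeros((side * 28, side * 28))
for k in range(n_e):
    r, c = divmod(k, side)
    canvas[r*28:(r+1)*28, c*28:(c+1)*28] = XeAe[:, k].reshape(28, 28)
plt.imshow(canvas, cmap='hot_r', interpolation='nearest')
plt.colorbar()
plt.show()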

Hi @assergazina and other followers of this topic.

I’ve been struggling for a couple of months to get even a small, simple pattern recognition task working with an SNN and STDP (as presented in the Brian samples), and I haven’t managed it yet. The simulation runs, but it cannot classify the patterns: after training, my output neurons show roughly the same firing pattern for every input.

I’ve numbered my questions for easy recognition.

Since you’ve worked on this paper, I wanted to ask you: 1. could you implement this paper using Brian2?
and 2. is it possible to use only simple STDP (as presented in the Brian2 samples) instead of the adapted STDP introduced in the paper for pattern recognition, even with lower accuracy?
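
To make question 2 concrete, this is the kind of simple rule I mean: a minimal pair-based STDP sketch along the lines of the Brian 2 documentation examples. The parameter values and the toy input/output groups are placeholders, not the settings from the paper.

from brian2 import *

taupre = taupost = 20*ms
wmax = 1.0
Apre = 0.01
Apost = -Apre * 1.05

inputs = PoissonGroup(400, rates=15*Hz)
outputs = NeuronGroup(10,
                      '''dv/dt = (ge*(0*mV - v) + (-65*mV - v)) / (20*ms) : volt
                         dge/dt = -ge / (5*ms) : 1''',
                      threshold='v > -50*mV', reset='v = -65*mV', method='euler')
outputs.v = -65*mV

# Standard pair-based STDP with traces apre/apost and hard weight bounds.
S = Synapses(inputs, outputs,
             '''w : 1
                dapre/dt = -apre/taupre : 1 (event-driven)
                dapost/dt = -apost/taupost : 1 (event-driven)''',
             on_pre='''ge_post += w
                       apre += Apre
                       w = clip(w + apost, 0, wmax)''',
             on_post='''apost += Apost
                        w = clip(w + apre, 0, wmax)''')
S.connect()
S.w = 'rand() * wmax'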

I’ve implemented a 2-layer SNN with STDP on the last layer (the first layer contains 500 neurons: 400 excitatory and 100 inhibitory; the last layer has 10 neurons for classifying 3 digits). The neurons of the first layer are fully connected to the last layer, and the connection probability between the excitatory and inhibitory neurons is 0.2.
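
A stripped-down sketch of that wiring (the neuron models and weight handling here are dummies, just to show the connectivity I described):

from brian2 import *

lif = 'dv/dt = -v / (20*ms) : volt'
exc = NeuronGroup(400, lif, threshold='v > 10*mV', reset='v = 0*mV', method='exact')
inh = NeuronGroup(100, lif, threshold='v > 10*mV', reset='v = 0*mV', method='exact')
out = NeuronGroup(10, lif, threshold='v > 10*mV', reset='v = 0*mV', method='exact')

S_ei = Synapses(exc, inh, 'w : 1', on_pre='v_post += w*mV')
S_ei.connect(p=0.2)     # exc -> inh with connection probability 0.2
S_ie = Synapses(inh, exc, 'w : 1', on_pre='v_post -= w*mV')
S_ie.connect(p=0.2)     # inh -> exc with connection probability 0.2
S_eo = Synapses(exc, out, 'w : 1', on_pre='v_post += w*mV')
S_eo.connect()          # first layer fully connected to the 10 output neurons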

I use an interval of 100 ms between input spike trains. 3. My question is: why should we apply this interval, and what factors should we consider when choosing its duration?
With an interval of 150 ms between inputs, I can see that some neurons have not returned to their resting potential before the next input is fed into the network. 4. Is it necessary for all neurons to be back at rest?
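
For reference, the presentation scheme in the paper is roughly 350 ms of stimulus followed by a 150 ms blank interval so that activity can decay; a sketch with dummy rate values:

from brian2 import *
import numpy as np

inputs = PoissonGroup(784, rates=0*Hz)
dummy_examples = np.random.random((3, 784)) * 63.75   # stand-in pixel rates in Hz

for pixel_rates in dummy_examples:
    inputs.rates = pixel_rates * Hz   # present one example
    run(350*ms)
    inputs.rates = 0*Hz               # blank interval between inputs
    run(150*ms)
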
Another challenge is the balance of excitation and inhibition in the first layer. 5. Roughly how many spikes should the excitatory and inhibitory neurons fire for each input?

  6. How important are the membrane time constants of the excitatory and inhibitory neurons for training? In the above paper the authors mention (a quick check of these numbers follows after this list):

longer neuron membrane constants allow for better estimation of the input spiking rate. For example, if the recognition neuron can only integrate inputs over 20 ms at a maximum input rate of 63.75 Hz, the neuron will only integrate over 1.275 spikes on average, which means that a single noise spike would have a large influence. By increasing the membrane time constant to 100 ms, a neuron can integrate over 6.375 spikes on average, reducing the effects of noise

  7. If we set an interval between inputs, what difference does this interval make compared with the membrane time constant mentioned above?
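
A quick check of the numbers quoted in question 6:

# Expected input spikes integrated over one membrane time constant
# at the maximum input rate of 63.75 Hz.
max_rate = 63.75                 # Hz
for tau in (0.020, 0.100):       # membrane time constants in seconds
    print(f"tau = {tau*1e3:.0f} ms -> {max_rate * tau:.3f} spikes on average")
# tau = 20 ms  -> 1.275 spikes on average
# tau = 100 ms -> 6.375 spikes on average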

thanks in advance.

Hi @mstimberg and others, could you please help me with replicating the work done in the paper? When I change the number of inhibitory/excitatory neurons in the code from 400 to 1600, I get a traceback error saying “could not broadcast input array from shape (400,) into shape (1600,)”. Where else in the code should I make changes to avoid that error?

Thanks

@sumanmahato027 Which code are you referring to? Note that we can only give limited support for code that someone else wrote, and cannot give step-by-step instructions on how to adapt it for your use case.

The full error message should show in which line of your code this problem actually occurs. Did you maybe try to load trained weights from a network with the smaller size?

@mstimberg Thanks, Sir, for your response. I am referring to the code for the paper, in the repository by Xu Zhang (the GitHub repository you shared in one of your responses). In the README file, it is mentioned that for a different number of neurons we just need to change the number of neurons.
When changing the number of neurons from 400 to 1600, it shows an error in this line while testing:
neuron_groups['e'].theta = np.load(weight_path + 'theta_' + name + ending + '.npy') * b2.volt
saying:
ValueError: could not broadcast input array from shape (400,) into shape (1600,)

When changing the number of neurons from 400 to 100, it throws an error during training itself, in this line:
value_arr[np.int32(readout[:,0]), np.int32(readout[:,1])] = readout[:,2]
saying:
IndexError: index 100 is out of bounds for axis 0 with size 100

I am using 1000 training and testing examples

As I said before, this seems to indicate that you are loading stored weights which have been trained with a network of a different size. I don’t know the code well enough to say whether it is enough to delete/rename the old weight files, or whether you have to do something else to trigger a new training from scratch.
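
Something along these lines might help to check; the file name is only a placeholder for whichever stored array the script loads:

import numpy as np

# If the stored shape is (400,) or (400, 400) while the network now has 1600
# neurons, the file comes from the smaller network and has to be regenerated
# (i.e. retrain, or recreate the random initial weights at the new size).
stored = np.load('weights/theta_A.npy')   # placeholder file name
print(stored.shape)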

@mstimberg got it what I was doing wrong. Thanks a lot for your quick response.


@assergazina @Jul94 When I use the pre-trained weights, I get the expected results. But when I generate the weights by training the network myself, I get errors. Please help me figure out what is going wrong and where.
Thanks.

Hi, @sumanmahato027
Sorry for the late reply.
Could you please share your code through Google Colab or as a file via Google drive? You just need to give the link with full access.

Thanks


@assergazina Here is the Google Colab link:

Hope you can access it.
Thanks for your time.

Hi @assergazina, have you gone through the code? Could you please point out what I should do to make this code work with different numbers of neurons?

Hello to all Brian users ,
I have recently started reading the paper “Unsupervised learning of digit recognition using spike-timing-dependent plasticity” and tried implementing this code using the Brian2 library. Could you please help me understand why the paper only mentions using 100, 400, 1600 or 6400 neurons in the excitatory layer? Can we try some other number, such as 150, 200 or any other arbitrary number of neurons, instead?

Thank you

Hi anish,

I’m new to this community but I’ve been working on the experiment in this paper for a little while now. The authors don’t explain in the paper exactly why they chose these numbers of neurons, but you can use other numbers of neurons as well. Note that if you do so, you need to initialise the weights for all of the synapses yourself, as the original code loads numpy arrays to set the weights and only works for the 400-neuron network.
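
For example, something along these lines could replace the loaded file for the input-to-excitatory weights. The 0.3 scale is chosen so that the mean is roughly 0.15, which is what the shipped random weights appear to use, but treat the exact values and file name as assumptions:

import numpy as np

n_input = 784        # 28x28 MNIST pixels
n_e = 200            # your chosen number of excitatory neurons
XeAe_init = np.random.random((n_input, n_e)) * 0.3   # mean ~0.15
np.save('random_XeAe_custom.npy', XeAe_init)         # hypothetical file name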

I know some questions in this thread are quite old, but it might still be useful for future readers to answer them. I converted Diehl and Cook’s source code to Python 3 and Brian 2 myself and recreated their study; I didn’t use the converted code linked here, but it looks very similar to mine.

The source code implements the triplet STDP rule from their paper, not the power-law rule for which they reported the 95.0% accuracy with 6400 neurons.

Also, question 6 from assergazina’s question:

In Diehl & Cook’s actual source code they did not use a one-to-one connection; each inhibitory neuron connects to all excitatory neurons except the one from which it receives a connection, and there are only all-to-all connections between the layers. However, in the paper they say they used a one-to-one connection between exc. and inh. How and why is it different in the source code?

The way their code works is that they have connected the exc. to inh. neurons in all-to-all fashion, and the inh. to exc. in all-to-all as well, but then they load in weight matrices to assign weights to these connections.
The weight matrix for exc. to inh. is the file random/AeAi.npy. It contains the one-to-one index values and the weight for each connection. All other connections I assume were initialised at zero.
The weight matrix for inh. to exc. is the file random/AiAe.npy. It contains every index combination with a constant weight for each except for i==j where the weight is zero.

This is how they implemented the one-to-one and i!=j connections using structure=dense.

weights/XeAe.npy contains their trained weights, used during testing to reproduce the results noted in their repo.
random/XeAe.npy appears to be randomly initialised weights with a mean value of 0.15. This weight file is used when you change the settings to train the network instead of testing it.
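
Put another way, a dense-matrix view of those two connectivity patterns can be built like this; the weight values are placeholders, so read the real ones from the provided .npy files:

import numpy as np

n = 400
w_ei = 10.4   # placeholder exc -> inh weight
w_ie = 17.0   # placeholder inh -> exc weight

AeAi = np.zeros((n, n))
np.fill_diagonal(AeAi, w_ei)      # exc -> inh: one-to-one (diagonal only)

AiAe = np.full((n, n), w_ie)      # inh -> exc: all-to-all, constant weight ...
np.fill_diagonal(AiAe, 0.0)       # ... except i == j, which is zero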

I hope this information is helpful.


Thanks @genevievefahey,
The information you provided is useful. As you have recreated the code from scratch, I want to ask you a few questions:

  1. If you have generated the exc. to inh. and inh. to exc. weight values on your own, how did you decide upon the values to be used, since this affects how the inhibitory neurons interact with the excitatory neurons?
  2. For the STDP parameters (e.g. η_post, w_max, μ), how did you decide on their values? If the values are too high, the neurons become sensitive to all inputs; if they are too low, learning takes a long time.
  3. This question is related to the paper: in the paper the equation for the LIF neuron is given as
    τ dV/dt = (E_rest − V) + g_e(E_exc − V) + g_i(E_inh − V)
    But in this equation the dimensions do not match. In the code this is handled by dividing the g_e(E_exc − V) + g_i(E_inh − V) terms by 1 nS. I am not able to tell whether the 1 nS is implied in the paper, or whether the parameters g_e and g_i are taken as dimensionless (even though the paper treats them as conductances).

I am really grateful for this thread and all of its contributors, as it has answered a lot of my questions.

Thank you

Glad I could be of help.

In answer to your questions:

  1. I used the same weights as in the AeAi.npy and AiAe.npy files provided. If you open them with numpy you can read the values. I then just set the weights of each layer to the same values when I create the layers. Hint: they use the same weight value for all weights of the layer except where the values are zero for no connection.

  2. I have only used the triplet STDP rule so far, which doesn’t use μ. I’m using w_max = 1.0 and η_post = 0.01 as these were the values listed in their code. I’m assuming μ in their code is exp_ee_pre, which is 0.2, but I haven’t tested it yet, so I can’t confirm it’s what they used when they ran their experiment.
    Are you using the STDP rule as found in the code or trying to implement one of the rules from their paper? If so, which of the rules are you looking at? You can give me an equation number if you like. What values have you tried so far?

  3. ge and gi are conductances in siemens, but in the code they are defined as dimensionless. If the dimensionless values are used directly in the current and voltage equations, Brian reports a dimension mismatch. The * nS is included to avoid that; note that * nS is used in the current equations and the currents are then divided by nS in the voltage equation, so the two factors cancel each other out. They could just as easily have used * siemens instead, as it gives the same result; I’m not sure why they chose nS rather than plain siemens, but it cancels out anyway. I checked both methods and they returned the same results.
    So ge and gi are conductances in siemens.
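
A tiny check of that cancellation with Brian's unit system (the numbers are arbitrary):

from brian2 import *

ge = 1.5                      # dimensionless conductance value, as in the code
v = -70*mV
I = ge * nS * (0*mV - v)      # multiplying by nS gives a real current (~105 pA)
print(I)
print(I / nS)                 # dividing by nS again gives ge * (0*mV - v) (~105 mV)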

Good luck and I hope this helps!

Hello @mstimberg and others. I wanted to retrain with 100 neurons instead of the 400 used in the paper, using the code from GitHub - sdpenguin/Brian2STDPMNIST: Brian 2 version of Paper "Unsupervised Learning of digit recognition using STDP", but it is running slowly: after one day of training it has only completed 6600 runs out of 180000 (which I assume to be 3 epochs of the MNIST training set). Is there something I missed, is there a faster reimplementation of the original code, or is there anything I could do to make this run faster on my machine (16 GB RAM, Intel Core i7-8700 CPU @ 3.20 GHz, Python 3.18.18, Brian 2.5.0.2)?
I would appreciate any help I can get; thank you in advance.