Unsupervised learning of digit recognition using spike-timing-dependent plasticity

Good afternoon, all Brian users

I am reading this paper “Unsupervised learning of digit recognition using spike-timing-dependent plasticity” ( https://www.frontiersin.org/articles/10.3389/fncom.2015.00099/full). I have several questions regarding this article.

  1. Is there any Brian2 implementation of the STDP rules together with lateral inhibition and homeostasis? Or just an example of STDP with lateral inhibition?

  2. As far as I understand, the “adaptive spiking threshold” in this work is implemented via a condition in the “reset”. If that is so, what is the function of “homeostasis”, and where does it appear in the equations?

Thank you,

As a quick remark, there is a Brian2 implementation of this specific paper by Xu Zhang: https://github.com/zxzhijia/Brian2STDPMNIST. The code is a couple of years old and no longer maintained (e.g. it is not compatible with Python 3 as it is). A number of people have forked and adapted it, though (e.g. https://github.com/sdpenguin/Brian2STDPMNIST); these later versions might have fixed some of the issues.

Quite a few people seem to be interested in this example, so it would be great if some of you could get together and turn it into a working example that can be shared with the community. This is somewhat similar in scope to Pattern recognition in Spiking Neural Nets using Brian2, so maybe @touches and @Ziaeemehr would be interested?


Thank you, @mstimberg

@touches and @Ziaeemehr, could you look at this paper, please?

I would be glad to work on this new project.
Thank you very much.


I will give it a try. These days I am a bit busy preparing results for a Neuromatch 3 talk.

Hello Assel,

This paper is quite frequently referenced, and the code is a tad too difficult to decipher; I believe the style it is written in could be made a little clearer. One dark spot that I could not figure out from the code was the synaptic connections: in the paper there is lateral inhibition (i != j), but in the code the synapses seem to be connected in a one-to-one fashion. Like Marcel said, this would be an excellent exercise: implement the paper in a more understandable manner and get it to produce results that can be derived easily. Let me know if you want me to start a project for this, or go ahead and create one yourself.


Good afternoon,

@touches, yes, that would be excellent. I really want to understand this article and work on it.

I could not understand the implementation of “lateral inhibition” and “homeostasis” in the code. Moreover, the paper says that the network structure consists of 2 layers, but isn’t it 3 layers?
Thank you!

Good day,
Dear Brian2 Users
With the help of @touches, I have been working on implementing the above-mentioned paper from scratch. So far, the 3 layers described in the paper have been created and connected as follows:

Poisson → Exc [784-100] (All to all)

Exc → Inh [100-100] (One-to-one)

Inh → Exc [100-100] (each inhibitory neuron is connected to all excitatory neurons, except for the one it receives a connection from.)

Poisson → Exc connected with the STDP rule.

Poisson input rates were taken from MNIST, but reduced to a single sample of the digit 5.

The next step will be to implement homeostasis and lateral inhibition.

Problems are:

  1. statemon_inh_exc shows zero.
    Also, could you look at the initializations of the neuron parameters and of “Syn_poi_exc” and “Syn_exc_inh”? I’m not able to solve these issues and move on to the next step. The code and output plots are here: https://colab.research.google.com/drive/1YRVVRs74GlFd_pSQ7TEm6USS6fN-nGvb?usp=sharing

Thank you. I will look forward to any guidelines and corrections.

Hi. I had a quick look at the code and ran the simulation. I think the main problem is that the excitatory weights from the Poisson neurons to the excitatory neurons are too strong. Each neuron receives input from 784 Poisson neurons, and each of these fires on average at something like 50Hz; in total, each neuron therefore receives several tens of thousands of spikes per second. You initialize the weights to rand()*gmax, so on average the weights will be 0.5nS, which seems too high for that many inputs. You can see that your neurons are firing as fast as the refractory period allows them, which is certainly not what you want. So try initializing the weights with much smaller values, e.g. 0.1*rand(). Maybe even use smaller values still; I’d aim for something that is as low as possible while still evoking a couple of spikes per second on average with the initial weights (note that you can test this with very short simulations, i.e. there is no need to run things for 50s).

I’d also add (unless refractory) to the equation for v in your neuron models – this will clamp the membrane potential to the reset value during the refractory period, which is what models usually do. It also makes it clearer whether the neuron is in its refractory period or not. As a general rule, I’d say that a neuron limited by its refractory period should really be the exception: most neurons should receive input that does not drive them that strongly.

The inhibitory neurons are not getting any input since the weights from the excitatory population to the inhibitory population are all 0. Setting them to something > 0 should make the inhibitory population active. And then you can close the feedback loop for the lateral inhibition by also setting the inhibitory → excitatory weights to > 0.

Regarding the initialization of v for exc_layer and inh_layer. Instead of e.g.

exc_layer.v = 'v_rest_e + rand()*(v-v_rest_e)'

you probably meant to write

exc_layer.v = 'v_rest_e + rand()*(v_thresh_e-v_rest_e)'

to initialize the membrane potential between the rest and the threshold. You do not necessarily have to initialize g_exc and g_inh at all, for them 0 is a good initial value.

Hope that helps you advancing!


Hi Marcel,

This looks like an interesting topic for learning how to implement digit recognition using STDP in Brian2. However, do you have any recommendations on where beginners could learn how to code this in Python in a step-by-step approach (training, assigning class labels, and testing)? Thank you!


Hi Henry,
I am afraid I cannot point you to any tutorial that covers this kind of machine learning approach in Brian, but I agree that having something like this could be useful. Maybe at the end of this project (if the code works and reproduces the results from the original paper) someone could be motivated to turn this into a step-by-step tutorial?

Hi all,

I am trying to train an STDP network to perform classification on the iris dataset. I am using 4 Poisson input neurons connected to 1 output neuron, and just train this network to recognise class 0 of the sklearn iris dataset. However, with the current STDP parameters, there seems to be no spiking activity when training on the normalized dataset. Moreover, when testing the trained network on class 1 (by repeating the code, but turning off the weight updates), there is no spiking either. What could be wrong here? I would appreciate any feedback – I suspect it has something to do with the STDP parameters. How could I edit the code to make it a minimum viable example? Thanks!

Link is https://colab.research.google.com/drive/1sYEvxjZ2W1zfZOaT0zA8Y3KbJ0zEhS2S?usp=sharing

Hi @Henry. I only had a cursory look at the code, but this should not be related to the STDP parameters – if there are no spikes, the STDP rule cannot work. If I understand your code correctly, you feed in 4 Poisson groups with spike rates of only 0 to 1Hz, and a maximum weight of 0.3mV (equal to your threshold). This will practically never evoke a spike, because you will almost never get any input spike in your 20ms interval, and even a single input spike would not be strong enough to evoke an output spike. Before looking further into the learning, you’ll have to decide how many input spikes you want during each 20ms interval and set the firing rates accordingly. I’d also suggest plotting the membrane potential recorded by the StateMonitor to get a better idea of what is going on.

Oh, and just one remark for a more efficient implementation: instead of recreating the network from scratch in every iteration of your loop, you could create it once and use the store()/restore() mechanism to reset it: Running a simulation — Brian 2 2.4.2 documentation


Good day,

Thank you for clarifying the steps @Jul94, but my struggle wasn’t that. I wasn’t able to understand some parts of this article.

( https://www.frontiersin.org/articles/10.3389/fncom.2015.00099/full ). So, I have several questions regarding this article. I’m referring to the implementation you mentioned, https://github.com/sdpenguin/Brian2STDPMNIST (a Brian 2 version of the paper “Unsupervised learning of digit recognition using STDP”).

  1. As far as I understand, the “adaptive spiking threshold” in this work is implemented via a condition in the “reset”. If that is so, what is the function of “homeostasis”, and where does it appear in the equations?
  2. «The performance of this approach scales well with the number of neurons in the network, and achieves an accuracy of 95% using 6400 learning neurons.» How is this 95% accuracy obtained? Where is it implemented in the code (could you tell me the file name and line number, please)? I am asking because I do not know what the loss (error) function is in this work’s SNN. How is the accuracy checked?
  3. How is the evaluation process done (for SNNs) in this article?
  4. How does lateral inhibition generate competition? Where in the code can we see that lateral inhibition leads to competition (file name and line number, please)?
  5. How does homeostasis function?
  6. In Diehl & Cook’s actual source code, they did not use the one-to-one connection, nor the “connected to all excitatory neurons except the one it receives a connection from” pattern; there is only an all-to-all connection between the layers. However, in the paper they say they used a one-to-one connection between the exc. and inh. layers. How and why is the source code different?
  7. Where are labels assigned (in the implemented code)? How is it implemented? Where are the excitatory neurons assigned to classes after training (file name and line number, please)?
  8. «The learning rule is similar to the one used in Querlioz et al. (2013) but here we use an exponential time dependence which is more biologically plausible (Abbott and Song, 1999) than a time-independent weight change.» How should I understand this «exponential time dependence»?
  9. What is the loss (error) function of the SNN in this paper?

Thank you very much. I’m new to this field, so I’m sorry if I asked easy questions. I would be glad to read papers, analyze them, and work on their implementations in Brian2. Thank you!

Hi @assergazina, I’ll do my best to answer your questions, but I’m also fairly new to the field, so take everything with a grain of salt. Let’s see:

  1. I think the “adaptive spiking threshold” has to do with the addition of θ to the membrane threshold. As they say in the paper:

    Specifically, each excitatory neuron’s membrane threshold is not only determined by v_thresh but by the sum v_thresh + θ, where θ is increased every time the neuron fires and is exponentially decaying (Querlioz et al., 2013). Therefore, the more a neuron fires, the higher will be its membrane threshold and in turn the neuron requires more input to spike in the near future. Using this mechanism, the firing rate of the neurons is limited because the conductance-based synapse model limits the maximum membrane potential to the excitatory reversal potential E_exc, i.e., once the neuron membrane threshold is close to E_exc (or higher) it will fire less often (or even stop firing completely) until θ decreases sufficiently.

    In the code, this is reflected by the expression in line 262 (notice the theta there):
    v_thresh_e_str = '(v>(theta - offset + v_thresh_e)) and (timer>refrac_e)'

  2. For that, you have to change the number of neurons from 400 (line 205, n_e = 400) to 6400, and train the network (set the flag to False, as they say in line 179: test_mode = True # Change this to False to retrain the network). Once you have trained the 6400-neuron network, change the test mode back to True to record its performance on the test set of the data (this is done in lines 465-467 of the code, by the way). Training a 6400-neuron network will take a very long time, though. I’m not sure whether you can skip that step and use their pre-trained weights to assess performance (that is, keeping the test mode True but only modifying the number of neurons), but you can try and see if it works without raising an error.
    Then, you have to run the file Diehl&Cook_MNIST_evaluation.py (changing the number of neurons accordingly, this time in line 56). In this file, the performance of the network is assessed from the recordings made with the test set of the data (that is, this is where they check the accuracy). I discuss the loss function in the reply to question 9.

  3. I’m not exactly sure what you mean by that. To evaluate the performance and get the accuracy, follow the steps I described above. If you mean “how is that accuracy value obtained”, I think this excerpt from the paper might answer your question:

    After training is done, we set the learning rate to zero, fix each neuron’s spiking threshold, and assign a class to each neuron, based on its highest response to the ten classes of digits over one presentation of the training set. This is the only step where labels are used, i.e., for the training of the synaptic weights we do not use labels.
    The response of the class-assigned neurons is then used to measure the classification accuracy of the network on the MNIST test set (10,000 examples). The predicted digit is determined by averaging the responses of each neuron per class and then choosing the class with the highest average firing rate.

  4. Well, I think this is a very deep question and therefore not easily answered; there is no single line in the code where we can “see” this either. As they describe in the paper, they create an inhibitory layer of the same size as the excitatory one (400, 6400…). The excitatory layer is connected to the inhibitory one in a 1-to-1 fashion: each excitatory neuron is connected to one inhibitory neuron, the one in the same position in the layer. The inhibitory layer is connected back to the excitatory one in an almost all-to-all fashion: each inhibitory neuron is connected to every excitatory neuron except its corresponding one. See Figure 1 in the paper for this. The idea behind it is simple: when an excitatory neuron fires, it triggers the corresponding inhibitory neuron, which then inhibits all the other excitatory neurons. As they say in the paper:

    This connectivity provides lateral inhibition and leads to competition among excitatory neurons. The maximum conductance of an inhibitory to excitatory synapse is fixed at 10
    nS. However, the exact value did not have a big influence on the results of the simulation, instead the ratio between inhibitory and excitatory synaptic conductance has to be balanced to ensure that lateral inhibition is neither too weak, which would mean that it does not have any influence, nor too strong, which would mean that once a winner was chosen that winner prevents other neurons from firing.
    and, a few sections later:
    In our network this means that every time a neuron spikes, because an example is similar enough to its receptive field, it will make its receptive field more similar to the example. The lateral inhibition prevents the prototypes from becoming too similar to each other (which means that they spread out in the input space), since only a few different neurons will be able to respond to each example and in turn only a few neurons can adapt their receptive fields toward it. Homoeostasis can be thought of as a tool to keep an approximately constant number of examples within range of the prototype.

  5. In this paper, homeostasis is implemented by means of the adaptive spiking threshold (as explained in the quote in the reply to 1., which was taken from the Homeostasis section of the paper). In the last words from the quote from 4., they further reason about its utility.

  6. I have no idea about this, actually, I wonder the same. I hope someone can give an answer to this.

  7. If I’m not mistaken, this is implemented by the function get_new_assignments(result_monitor, input_numbers) in line 138. As I understand it, it is majority voting: for each neuron, the digit that causes the most spikes is the one assigned as that neuron’s label.

  8. I think the exponential time dependence is given by x_{pre} in the equation (3):

    \Delta w = η(x_{pre} − x_{tar} )(w_{max} − w)^μ

    As they explain:

    This means that, besides the synaptic weight, each synapse keeps track of another value, namely the presynaptic trace x_{pre}, which models the recent presynaptic spike history. Every time a presynaptic spike arrives at the synapse, the trace is increased by 1, otherwise x_{pre} decays exponentially.

    I’m not entirely sure where exactly they implement this in the code, though. I see the variable exp_ee_pre in line 251, but it is actually never used (neither in the Brian2 code nor the original one). Alternatively, it might be defined in the equations, but I can’t see how. If anyone finds out, I’m all ears!

  9. This question would make for a long answer as well, but I’ll keep it short: the loss function has nothing to do with unsupervised learning. In simple terms, the loss function quantifies the error between the output from the network and the true label for any given sample. That error is what typically drives the weight change in standard ANNs. But by that definition you can see that you need to know the true label while training, which is indicative of supervised learning, and as you can read in the title of the paper, they attempt unsupervised learning. So there is no loss function in this paper, they use unsupervised learning rules, namely STDP.

I hope this helps! I’m not sure about some of the things I wrote, by the way, so as I said, don’t hesitate to double-check anything. If I find any mistakes, I will update the post and add an edit note. And of course, if anyone knows the answers to the questions still unsolved, I’m eager to know!



First of all, thank you for your considerations and for your valuable time. I really appreciate it.
Regarding questions no. 2, 3, and 9: I meant “how does this model predict the input data”, i.e. during testing, how does it recognize that a 5 is a 5 (and that a 4 is not a 5)? Based on that we evaluate the accuracy, right? If it makes all predictions correctly, its accuracy will be higher.
I did not get no. 8.
Thank you for all the other answers – I understood them all (except 6 and 8).


No problem! That’s what this forum is for :grinning_face_with_smiling_eyes:.

About your clarification for questions no. 2,3, and 9:

During learning, the receptive fields of the neurons converge towards the characteristic inputs of different digits. That basically means that some neurons will become sensitive to the characteristic input of a 7 (and fire more often when presented with a 7), others to that of a 3, and so on… But all of this happens in an unsupervised manner, so the neurons “don’t know” which digit they are converging to. After training, there is a labeling phase, where each neuron is assigned to a digit based on which digit it converged to. This is done as follows: digits are shown to the network to see which digit causes the strongest response in each neuron, and that digit is then assigned to that neuron (because that is the digit the neuron has converged to, or learnt, best).
Then, during testing, you just check which neurons fire the most for any given digit presentation, and check whether the label you assigned to those neurons matches the digit you are presenting (in which case it is a correct classification) or not. That is what I understand from this, at least:

After training is done, we set the learning rate to zero, fix each neuron’s spiking threshold, and assign a class to each neuron, based on its highest response to the ten classes of digits over one presentation of the training set. This is the only step where labels are used, i.e., for the training of the synaptic weights we do not use labels.
The response of the class-assigned neurons is then used to measure the classification accuracy of the network on the MNIST test set (10,000 examples). The predicted digit is determined by averaging the responses of each neuron per class and then choosing the class with the highest average firing rate.

About question no. 8:

They indeed mention an “exponential time dependence”, so I checked the paper and the code to see how it is done. In the paper, the only reference I could find relates to the presynaptic trace, x_{pre}. That trace keeps track of the recent presynaptic spike history of a neuron and is used in the computation of the STDP learning rule (as shown in the equation I posted). This is how the exponential time dependence enters through x_{pre}: when a presynaptic spike arrives at the synapse, the trace is increased by 1; otherwise, x_{pre} decays exponentially. It is that exponential decay of x_{pre} that they refer to, or at least that is what I got from reading their paper.
Interestingly enough, as I said in the previous comment, I couldn’t locate where and how this is implemented in their code…
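The labeling phase described above can be illustrated with a small NumPy sketch (the function name, data, and array shapes are invented for illustration, not taken from the actual code):

```python
import numpy as np

def get_assignments(result_monitor, input_numbers, n_classes=10):
    """Assign to each neuron the digit that evoked its highest average response."""
    n_neurons = result_monitor.shape[1]
    rates = np.zeros((n_classes, n_neurons))
    for digit in range(n_classes):
        mask = input_numbers == digit
        if mask.any():
            rates[digit] = result_monitor[mask].mean(axis=0)
    return rates.argmax(axis=0)

# Toy data: spike counts per (example, neuron) and the true label of each example
result_monitor = np.array([[5, 0],
                           [4, 1],
                           [0, 6],
                           [1, 5]])
input_numbers = np.array([3, 3, 7, 7])
assignments = get_assignments(result_monitor, input_numbers)
# Neuron 0 responds most strongly to 3s, neuron 1 to 7s
```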


Re 8., x_pre: Indeed, x_ee_pre and its fellow x_ee_post seem to be unused, both in the original code and in the Brian2 port.

Judging from the STDP equations from line 278 onwards, the pre- and postsynaptic traces (pre, post1, post2) are set to 1 upon the respective spikes, and decay with their time constants tc_*_ee. It’s perhaps a little unusual that the trace is simply set to one, rather than incremented, when a spike occurs, although this will obviously only start to make a difference at high firing rates.


Just a quick comment on that: while it is less common, setting the trace to a fixed value is sometimes used and corresponds to a “symmetric nearest-neighbour scheme” (see section 4.1.2 in this review by Morrison et al. 2008). This means that it only takes into account the spike that was just preceding instead of all the spike history.


Good day! @mstimberg , @kernfel , @Jul94 , Thank you for your considerations and time.
It helped me a lot. I hope I will do great stuff with Brian2.
Thank you,



Good day,

I have been analyzing this paper and the code’s implementation for a long time :)) But in the last 2-3 weeks I have started discussing this paper with my professor.
I asked questions on all the forums and contacted the authors, but didn’t ask my own professor :grimacing: because it was my task to understand this paper and to learn SNNs and Brian2 myself.
I am sorry if I asked a lot of questions here, but I know that one day I will be good in my research field, which is still new to me.

My professor gave me the answer. I decided to write it here; I hope it will be helpful to other learners like me and to those who still couldn’t understand this part.

The answer: the implementation is correct. The paper specifies a one-to-one connection between the exc. and inh. layers, and the implementation follows that, even if it looks different.
As we know, a 1-to-1 connection is made via S.connect(j='i'), see the Brian2 docs.
But in the code we only see S.connect(True), which stands for all-to-all connectivity.

But we can also implement the 1-to-1 connection via a probability, i.e. if exc_neurons = inh_neurons = 400,
then p = 0.0025, and we can see this implementation here:
stdp-mnist/Diehl&Cook_MNIST_random_conn_generator.py at master · peter-u-diehl/stdp-mnist · GitHub

Thank you, everyone