Learning SNN temporal encoding (MNIST)

Hi everyone, not sure if the title is appropriate… but let’s see -

I am new to AI and SNNs, but I am a strong supporter of the learning-while-doing approach.

Thus, I started building a 2-layer SNN using Brian2 to solve the handwritten digit recognition task using the MNIST dataset.

What I’ve done is:

  1. Converting each image into a stream of spikes coming from 28x28 = 784 neurons using temporal encoding. Each pixel with value x greater than 0 is translated into a single spike fired at t = (256 - x)/255 * 10ms.
    So if pixel 645 has value 255 (white), neuron 645 spikes at about 0.04ms; if it has value 1 (almost black),
    it fires at 10ms. It does not fire at all if the value is 0.

  2. This stream of spikes is fed into layer 1, which is a fully connected layer made of 100 Neurons.
    Each neuron implements a LIF with a dynamic threshold (homeostasis) and refractory period of 10ms.
    I can give more details if needed.
    The synapses implement STDP with traces where Apost has a higher absolute value than Apre.

  3. Layer 1 is connected to the output layer made of 10 Neurons. Neuron “i” is meant to fire first when the i-th digit is presented to the network.
    The synapses implement reward-modulated symmetric STDP (supervised learning), i.e. both pre and post spikes are rewarded on the correct output neuron and punished on the others. So if the i-th digit is presented to the network, the reward goes to the synapses from layer 1 to neuron “i”, whereas the punishment goes to the synapses from layer 1 to the other 9 neurons. Here too, the punishment is more severe than the reward.

  4. Layer 1 and the output are trained separately, i.e. the output layer is used only after layer 1 learned to extract features in an unsupervised fashion.
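
To make the encoding in step 1 concrete, here is a minimal sketch in plain Python. It is not my actual Brian2 code: the function names `spike_time_ms` and `encode_image` and the `t_max_ms` parameter are made up for illustration, and the image is assumed to be a flat list of 784 integers in the range 0-255.

```python
def spike_time_ms(pixel, t_max_ms=10.0):
    """Return the spike time in ms for one pixel, or None if it stays silent.

    Implements t = (256 - x) / 255 * t_max_ms for x > 0.
    Brighter pixels fire earlier; a pixel of value 0 never fires.
    """
    if not 0 <= pixel <= 255:
        raise ValueError("pixel value must be in 0..255")
    if pixel == 0:
        return None
    return (256 - pixel) / 255 * t_max_ms


def encode_image(pixels, t_max_ms=10.0):
    """Turn a flat list of pixel values into (neuron indices, spike times in ms).

    Only pixels greater than 0 produce a spike, so both lists can be
    shorter than the number of pixels.
    """
    indices, times = [], []
    for idx, value in enumerate(pixels):
        t = spike_time_ms(value, t_max_ms)
        if t is not None:
            indices.append(idx)
            times.append(t)
    return indices, times


if __name__ == "__main__":
    # Tiny 4-pixel "image": silent, white, almost black, mid-grey.
    idx, t = encode_image([0, 255, 1, 128])
    for i, ti in zip(idx, t):
        print(f"neuron {i} fires at {ti:.3f} ms")
```

In Brian2, index and time lists like these could then be passed to a `SpikeGeneratorGroup`, with the times multiplied by `ms`.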

I have some questions:

a) Does the setup make sense? Do the experts here immediately see any obvious reason why the network won’t work? (It does not work so far; I am still exploring hyperparameters.)

b) What should I expect from the layer 1 weight distribution? What I think (here I need you to confirm or explain better) is that I should get separable clusters of neurons, where the separating hyperplane is then found by the output layer. So I guess I should expect, for example, that when images of “0” are shown, only a subset of L1 neurons fires (let’s say neurons 0-9), then for images of “2” only neurons 20-29, and so on. Of course, my example is simplistic, as I imagined L1 firing in a sorted manner with always 10 neurons per class.

Is this something I should expect? (not the simplistic example, I mean a separable cluster of neurons)

c) If I plot the L1-L0 weights as 100 images of 28x28 pixels (one for each L1 neuron), what should I see? Blurred digits, or something else? What is the expected layer visualization?
The weights shown in the YouTube video “Neural Network Weight Visualization - MNIST Dataset” seem quite uniform; I guess STDP will find a distribution more like “either 0 or 1”.
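
For reference, this is roughly how I would build the picture for question c). It is a sketch using NumPy, under the assumption that the weights are available as an array of shape (100, 784) with one row per L1 neuron; the function name `tile_receptive_fields` and that layout are my own choices, not something Brian2 gives directly.

```python
import numpy as np


def tile_receptive_fields(weights, grid=(10, 10), side=28):
    """Arrange per-neuron weight vectors into one big 2D mosaic.

    weights: array-like of shape (rows * cols, side * side),
             one flattened side x side receptive field per neuron.
    Returns an array of shape (rows * side, cols * side) in which neuron k
    occupies the tile at grid row k // cols, grid column k % cols.
    """
    weights = np.asarray(weights, dtype=float)
    rows, cols = grid
    if weights.shape != (rows * cols, side * side):
        raise ValueError(
            f"expected shape {(rows * cols, side * side)}, got {weights.shape}"
        )
    # (rows, cols, y, x) -> (rows, y, cols, x) -> (rows * side, cols * side)
    return (
        weights.reshape(rows, cols, side, side)
        .transpose(0, 2, 1, 3)
        .reshape(rows * side, cols * side)
    )


if __name__ == "__main__":
    # Random stand-in weights; real ones would come from the trained synapses.
    rng = np.random.default_rng(0)
    mosaic = tile_receptive_fields(rng.random((100, 784)))
    print(mosaic.shape)
```

The resulting 280x280 array can be shown with something like matplotlib's `imshow`, so all 100 receptive fields can be compared at a glance.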

d) Could the output layer learning strategy converge?

Thanks guys,