Parallelization of Optimization Procedure For Network Fitting

Description of problem

Hi everyone,

In a follow-up to my question regarding fitting network models to exhibit realistic spontaneous mean firing rates, I tried to implement an optimization loop similar to what @mstimberg proposed in my other post, using the differential evolution algorithm in scipy. The code runs well when the 'workers' parameter is set to 1, so that only one core is used. However, this leads to very long computation times for any optimization procedure, and I would like to increase the number of workers used by the differential_evolution function. When I increase the number of workers (setting it to 3, 5, or -1), the code starts running and I see the prints corresponding to each core: with workers=3, for example, I see the output of a single run three times, so the optimization function appears to be working. After some time, however, the process slows down dramatically and stops running. I believe this is because I have not managed to implement a correct parallelization process in Brian2.

Minimal code to reproduce problem

My script is part of a larger repository and it uses another script to construct the network, as well as various files for the neuron models I'm using. Essentially, the network represents a 17-population cortical patch with recurrent connections between populations, as well as connections from background Poisson Groups, each with a particular firing rate (17 background rates in total). Each of these Poisson Groups targets a particular population with a certain synaptic weight, which I am trying to fit in order to obtain experimentally realistic spontaneous mean firing rates across the 17 populations.

I cannot post all of my code, so the snippet below will not be enough to reproduce the problem, but I believe it illustrates what I am trying to do. In essence, I need to make the optimization loop work across multiple cores so that the procedure runs faster (and so that I can eventually run it on a cluster). A few lines of code are irrelevant in this context, but please let me know if I can provide any additional details.

from brian2 import *
import network_main
from network_parameters import net_dict
from utils.utils import save_config, create_run_folder, setup_loggers
import csv
import pandas as pd
from collections import defaultdict
import hydra
from omegaconf import DictConfig
from omegaconf import OmegaConf
import os
import logging
from scipy.optimize import differential_evolution
prefs.codegen.target = 'numpy'
defaultclock.dt = 0.1*ms # Time resolution of the simulation
np.random.seed(net_dict['Seed']) 

# Target firing rates in Hz
target_rates = [0.18, 0.27, 3.10, 3.63, 7.97, 1.06, 3.89, 1.89, 0.85, 2.22, 4.58, 3.78, 6.50, 0.93, 4.61, 6.30, 2.90]

def evaluate_network(background_weights, target_rates, network, cfg, metrics_dir, t_sim, log):
    # Scale the integer weights back to (0.1, 0.2, ..., 1.5)
    background_weights = [round(w) / 10 for w in background_weights]
    log.info(f"Rounded background weights: {background_weights}")
    net = network
    net.create(background_weights) # Instantiating the network with 17 populations, as well as the Poisson Groups responsible for 'background' stimulation
    net.connect(cfg) # Setting up recurrent connections and connections from background to each population
    net.simulate(t_sim) # Simulating activity in spontaneous conditions, default 500ms
    mean_rates = net.firing_rates(t_sim, metrics_dir) # Returns list with average firing rates across the 17 populations
    log.info(mean_rates)
    # Mean squared error between target and actual rates
    return np.mean((np.array(mean_rates) - np.array(target_rates))**2)

@hydra.main(version_base=None, config_path='config', config_name='config.yaml')
def main(cfg: DictConfig):
    # Setting up files, folders, paths etc...
    t_sim = cfg.t_sim * ms # Simulation time - needs to be in ms in config file
    EXPERIMENT_PATH = os.environ.get('EXPERIMENT_PATH') # Path of directory where experiments are saved, named runs
    experiment_dir = EXPERIMENT_PATH
    run_dir, raster_plots_dir, spike_times_dir, metrics_dir = create_run_folder(experiment_dir, cfg.experiment_type)
    save_config(cfg, run_dir)
    log = setup_loggers(run_dir)

    net = network_main.Network_main(net_dict, log) # Instantiating the network that's going to be used in all runs within the optimization function

    bounds = [(1, 15)] * len(target_rates) # Integer bounds for 0.1 to 1.5 (scaled by 10), (min, max) for each background rate
    best_parameters = differential_evolution(evaluate_network, bounds, popsize=5,
                                             args=(target_rates, net, cfg, metrics_dir, t_sim, log), workers=1, maxiter=2) # Optimization procedure, here on 1 core
    
if __name__ == "__main__":
    main()
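
As an aside to the integer-bounds trick above (rounding inside evaluate_network and scaling by 10): if I understand the scipy docs correctly, recent scipy versions (1.9 and later) can handle integer parameters directly via the `integrality` argument of `differential_evolution`, which would make the rounding step unnecessary. A minimal sketch with a toy objective (the `loss` function and the target weight of 0.7 are made up for illustration, not part of my network code):

```python
import numpy as np
from scipy.optimize import differential_evolution

def loss(x):
    # Toy objective: scale the integer parameters back to 0.1..1.5,
    # then measure the squared distance to a made-up target weight of 0.7
    weights = np.asarray(x) / 10.0
    return np.sum((weights - 0.7) ** 2)

result = differential_evolution(
    loss,
    bounds=[(1, 15)] * 3,     # integer bounds, as in the script above
    integrality=[True] * 3,   # ask scipy to keep these parameters integral
    seed=0,
)
print(result.x)  # each entry should end up at 7 (i.e. a weight of 0.7)
```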

I have read the documentation on parallelization with Brian2, but so far I have not managed to make it work; it is a bit hard for me to understand how to implement the required changes. I am not sure whether my implementation even allows for parallelization, and I would greatly appreciate any help or direction on how to make this work across multiple cores (and on how to save each output to a different file/folder).

Many thanks in advance :blush:

Best wishes,
Rares

Hi @raresdorcioman. I am actually not sure that parallelization is the issue here. You are using the numpy code generation target, which keeps all information in memory, and scipy's parallelization uses multiprocessing, which means that each simulation runs in a new process with its own memory. The instructions in Computational methods and efficiency — Brian 2 2.7.1 documentation are only relevant for the C++ standalone mode, where data is exchanged via the disk and additional care has to be taken not to overwrite the results of other processes.
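To illustrate the "own memory" point with a small stdlib-only sketch (no Brian2 involved, and using the Unix-only `fork` start method that multiprocessing traditionally defaults to on Linux): each forked worker gets its own copy of the parent's state, so changes made in one process are invisible to the others and to the parent.

```python
import multiprocessing as mp
import os

counter = [0]  # state living in the parent's memory

def bump_and_report(q):
    # Each forked child mutates its own copy of `counter`;
    # the change is invisible to the parent and to the other children
    counter[0] += 1
    q.put((os.getpid(), counter[0]))

ctx = mp.get_context("fork")  # Unix-only; each child inherits a memory snapshot
q = ctx.Queue()
procs = [ctx.Process(target=bump_and_report, args=(q,)) for _ in range(3)]
for p in procs:
    p.start()
results = [q.get() for _ in range(3)]  # drain before joining
for p in procs:
    p.join()
print(results)     # three distinct pids, each saw counter == 1
print(counter[0])  # still 0 in the parent
```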

Are you sure that this only happens when you run the simulation with multiple workers? If a simulation slows down a lot, it could be that the network goes into some pathological activity (a lot of spiking), which can slow things down to the point where it looks as if it has stopped. Another possibility (which would indeed be more likely with more workers) is that your machine runs out of memory; did you check for that?
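One simple way to check for the out-of-memory possibility is to log each process's peak memory use from inside the objective function. A minimal Unix-only sketch using the standard library (the `log_peak_memory` helper and the place you'd call it from are of course up to you):

```python
import os
import resource

def log_peak_memory(tag):
    # Peak resident set size of the calling process; ru_maxrss is
    # reported in kilobytes on Linux (but in bytes on macOS)
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"[pid {os.getpid()}] {tag}: peak RSS ~ {peak} kB")
    return peak

# e.g. call this at the end of evaluate_network:
log_peak_memory("after simulation")
```

If the logged values keep growing across evaluations within one worker, memory is accumulating somewhere; if they are stable but large, the machine may simply not fit several simulations at once.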

The only thing that looks problematic in your code above is the metrics_dir, which seems to be shared between all processes, but I am not quite sure what it is used for. Also, I don't know what functions like net.create actually do: do they create the NeuronGroup objects, or have they been created previously?

1 Like

Dear @mstimberg ,

I apologise for the very long delay in my response, I took a break and had some urgent responsibilities to attend to before picking up the project again.

Many thanks for your reply. What you mentioned makes sense; I mistakenly believed the issue was linked to parallelization. The problem seems to be related to pathological activity, and to the fact that my script attempts to write the outputs of all processes to the same metrics_dir, which leads to conflicts. I will try to solve these issues and constrain my parameter space so as to avoid pathological activity as much as I can.

net.create does indeed create the NeuronGroup objects used during the simulation, including the neuronal populations as well as the Poisson Groups that represent 'background activity'. The connect function then creates Synapses objects and connects these populations recurrently (as well as connecting the background Poisson Groups to their targets). I will rewrite my script and try to run the optimization procedure again; I hope that this time I can proceed with the parallelization.

Thanks a lot for your time and help again!

Best wishes,
Rares

1 Like