GSoC 2025 – Replacing Brian's Just-in-Time Compilation Mechanism

Hi everyone!

I’m Mrigesh Thakur, a GSoC 2025 contributor, and I’m incredibly excited to work on Project #8: Replace Brian’s Just-In-Time Compilation Mechanism under the mentorship of Marcel Stimberg, Dan Goodman, and Benjamin Evans.

What This Project Is About

Brian currently uses Cython for runtime JIT compilation, but this approach introduces significant compilation delays and maintenance overhead. The project aims to:

  • Replace Python/Cython-based runtime data structures (like dynamic arrays, spike queues) with their C++ counterparts.
  • Enable direct memory sharing between Python and C++ with minimal overhead.
  • Optimize performance by eliminating redundant Python/C++ boundary crossings.
  • Potentially explore alternatives to Cython like lightweight weave-like compilation backends.

The goal is to streamline the runtime backend, reduce compilation bottlenecks, and improve simulation performance while preserving the flexibility and interactivity that Brian is known for.

More Details

You can find a detailed write-up here:
:backhand_index_pointing_right: Proof of Concept + Research Repo

Would love to hear any suggestions, feedback, or ideas you might have related to this direction!

Looking forward to learning from and collaborating with this amazing community :smile:



Also, just putting this down here — here’s the full proposal for the project:
:backhand_index_pointing_right: Link to Proposal

If you have any thoughts, suggestions, or feedback on it, I’d really appreciate it. Always happy to iterate and improve based on your insights!

Hey everyone,

Excited to share the first blog post for the project! While it’s not project-specific yet, it lays the foundation by walking through how Brian2 actually works under the hood.

It’s a deep-dive — about an hour’s read — because I didn’t want it to be just another high-level overview. My aim was to make it intuitive, practical, and beginner-friendly, so that anyone jumping into Brian (especially future contributors) can build a solid mental model of what’s happening behind the scenes :)

Would love your feedback or thoughts if you give it a read!

Cheers!

Hey everyone :)

Continuing my journey into understanding and contributing to Brian2, I’ve been digging into the internals of the SpikeQueue — and here’s what I’m working on right now:

Currently, here’s how spike pushing works in Brian2:

# 1. Generated Cython code calls a Python method
{% block maincode %}
    _owner.push_spikes()  # Python method call!
{% endblock %}

# 2. Python method extracts spikes and calls another Python method  
def push_spikes(self):
    events = self.eventspace[: self.eventspace[len(self.eventspace) - 1]]
    if len(events):
        self.queue.push(events)  # Another Python call!

# 3. Finally calls the C++ implementation through Cython
def push(self, np.ndarray[int32_t, ndim=1, mode='c'] spikes):
    self.thisptr.push(<int32_t*>spikes.data, spikes.shape[0])

So essentially:
:right_arrow: Compiled Cython → Python → Python → Cython → C++
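
To make the slicing in step 2 concrete: the expression reads the last element of eventspace as the number of events and returns that many leading entries. Here is a minimal pure-Python sketch of that convention, using a plain list in place of the NumPy array. The function name extract_events and the sample values are made up for illustration and are not part of Brian2.

```python
def extract_events(eventspace):
    """Return the leading entries of eventspace that hold this step's events.

    Mirrors the slicing in push_spikes: the last element stores the count.
    """
    n_events = eventspace[len(eventspace) - 1]
    return eventspace[:n_events]


# Hypothetical buffer for 5 neurons: 5 index slots plus 1 counter slot.
# Neurons 1 and 4 spiked, so the counter (last element) is 2.
eventspace = [1, 4, 0, 0, 0, 2]
events = extract_events(eventspace)
print(events)  # [1, 4]
```

Every time step pays for this slice plus two Python method calls before any C++ code runs, which is the overhead the PR is meant to remove.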

I’m currently working on this PR to streamline the flow and reduce this overhead — more soon!

Also, this week’s blog continues my deep dive into Brian2 internals — this time on how code generation works and how equations are transformed into executable code.

Read it here: Blog



Hi everyone! :waving_hand:

Quick update on my GSoC work:

Over the past week, I focused on removing Python overhead from the SpikeQueue data structure and started integrating the C++ version of DynamicArrays into Brian2 for improved runtime performance.

Here are the key PRs from this phase:

  1. Eliminate Python Layer Calls in Synapses Codegen Templates for SpikeQueue Operations
    :backhand_index_pointing_right: https://github.com/brian-team/brian2/pull/1643

  2. Remove Python SpikeQueue Fallback and Make Cython Version Mandatory
    :backhand_index_pointing_right: https://github.com/brian-team/brian2/pull/1649

  3. Add C++ DynamicArrays in Brian2 for Runtime Mode
    :backhand_index_pointing_right: https://github.com/brian-team/brian2/pull/1650

And as always, here’s my weekly blog post documenting the journey. This week, I even added a fun little animation showing how SpikeQueue works — feel free to check it out!

:link: Brian2: How We Made Spike Processing Faster by Eliminating Python Overhead

Thanks!

Hi everyone!

Sorry for the radio silence over the last few weeks. We’ve been deep in the code, and I’m thrilled to share some exciting progress on our project :smile:

Removing Python Overheads: We’re Almost There!

Our main goal has been to eliminate Python indirection from our most performance-critical code by calling C++ directly from our Cython templates.

We started this with SpikeQueue, and I’m happy to announce that the work on DynamicArrays is now nearly complete (check here)! The heavy lifting for array resizing and access is now handled by pure C++, which is a massive step forward.

We’re currently wrestling with a few stubborn CI test failures (they’re hard to get past, believe me), but we’re hoping to merge this very soon.

The Future: Beyond Cython?

First, we removed Python indirection from Cython. The next logical, albeit ambitious, step is to explore removing Cython from the runtime path altogether …

This won’t happen suddenly, as it will require extensive testing and will likely debut as an optional, experimental mode. But we are actively investigating technologies like cppyy or CFFI to create a bridge that allows for a true dynamic C++ runtime. This would give us the raw power of compiled C++ even in interactive, runtime sessions.

As always, here’s the blog you’ve all been waiting for :)

https://understandbrian2.hashnode.dev/dynamic-arrays-memory-access-patterns-from-python-indirection-to-direct-c-speed

Thanks for following our progress!

For anyone following along and wondering why we’re starting our exploration with cppyy instead of diving into manual C-extensions, I wanted to share my “aha!” moment.

The crux of it is that we realized we were trying to solve the problem while being trapped in “Ahead-of-Time” (AOT) thinking.

Our current approach with Cython is AOT. If we switched to manual C-extensions, the workflow would still be AOT:

  1. Generate source code (C++ files, which would still be large).

  2. Write to disk (a requirement for an external compiler).

  3. Launch external compiler (g++, clang, etc.).

  4. Wait for compilation (this is the big bottleneck, still 15-40 seconds).

  5. Load result (same complex process).

We’d just be swapping one AOT toolchain for another. The fundamental bottleneck isn’t Cython itself—it’s the entire file-based, external-compiler, AOT workflow.
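
The difference between the two workflows can be sketched with an analogy in plain Python. This is only an analogy under stated assumptions: Python’s own bytecode compiler stands in for the C++ toolchain, and the names SOURCE, build_via_files and build_in_memory are invented for this example. The file-based path walks through the steps listed above (write to disk, launch a separate compiler process, load the result), while the in-memory path compiles the same source without touching the file system.

```python
import importlib.util
import subprocess
import sys
import tempfile
from pathlib import Path

# Step 1: "generated" source code (a toy stand-in for generated C++).
SOURCE = "def step(v, dt):\n    return v + dt * (-v)\n"


def build_via_files(source):
    """File-based workflow: write, compile in a separate process, load."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "generated_step.py"
        path.write_text(source)  # step 2: write to disk
        # Steps 3 and 4: launch an external compiler process and wait for it.
        subprocess.run(
            [sys.executable, "-m", "py_compile", str(path)], check=True
        )
        # Step 5: load the compiled result back into this process.
        spec = importlib.util.spec_from_file_location("generated_step", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module.step


def build_in_memory(source):
    """In-memory workflow: compile and load without files or subprocesses."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["step"]


file_step = build_via_files(SOURCE)
memory_step = build_in_memory(SOURCE)
print(file_step(1.0, 0.1) == memory_step(1.0, 0.1))  # True
```

Both paths produce the same function. What differs is how much machinery sits between the generated source and the callable, and with a real C++ compiler that machinery is where the seconds go.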

cppyy is exciting because it represents a complete break from this paradigm. It’s a truly Just-in-Time (JIT) binding approach that compiles code in-memory, avoiding the slow steps that have been holding us back.

I wrote a whole new blog post detailing this journey from AOT-despair to JIT-hope. It explains why we believe this is the right path forward.

For the full story, have a read (PS: I know you love reading my blogs… hopefully :wink:):

Escaping the AOT Trap: Why Brian2 is Exploring a Cppyy-Powered Runtime

And here’s an in-depth blog post that explains how cppyy works—from LLVM and Clang, to Cling—in a way that’s actually interesting and jargon-free. I promise it’s super easy to follow (and trust me—I’m not a fan of complicated technical language either):

Understanding cppyy: A True Automatic Python-C++ Binding

Hi everyone! Hard to believe it’s already been 4 months — time really flew by. We’ve made some solid progress along the way, and I’m excited to share the detailed report I prepared for GSoC.

This definitely isn’t the end — I’ll keep contributing and doing my part to make Brian2 even better. :rocket:

Report: GSoC 2025

 

Google Summer of Code 2025: Work Product Submission

INCF: Brian2 Simulator
Replace Brian's just-in-time compilation mechanism

Mrigesh Thakur (@Legend101Zz)

Project Abstract

Brian2 is a free, open-source simulator for spiking neural networks written in Python. It allows users to describe complex neural and synaptic models using intuitive mathematical notation without requiring knowledge of lower-level programming languages. Brian2’s power comes from its code generation framework that transforms high-level model descriptions into efficient compiled code.

However, Brian2’s current just-in-time (JIT) compilation mechanism faces significant performance bottlenecks. In “runtime mode,” Brian2 uses Cython as a bridge between Python and C++, which creates two major issues:

  1. Compilation Overhead: Cython generates extensive boilerplate code for error checking and Python API integration, leading to slow compilation times (25-45 seconds per code object)
  2. Runtime Inefficiencies: Generated code routes through multiple Python method calls before reaching C++ implementations, adding substantial overhead

This project aimed to replace Brian2’s Cython-based JIT compilation with a more efficient system that enables direct memory access between Python and C++ while eliminating redundant boundary crossings.

GitHub Repository: brian-team/brian2

Mentors: Marcel Stimberg (@mstimberg), Dan Goodman (@thesamovar), Benjamin Evans (@bdevans)

Project Goals

The project was structured in two main phases:

Phase 1: Direct Access Implementation

  • Replace Python-based runtime data structures (SpikeQueue, Dynamic Arrays) with direct C++ equivalents
  • Enable direct memory sharing between Python and C++ objects with minimal overhead
  • Optimize performance by eliminating redundant Python/C++ boundary crossings

Phase 2: Alternative Backend Research

  • Research alternatives to Cython for actual JIT compilation
  • Explore lightweight compilation backends like cppyy
  • Prototype and evaluate new approaches for runtime code generation

Work Accomplished

Phase 1: Core Data Structure Optimization

SpikeQueue Direct Access Implementation

The SpikeQueue is a critical component that manages spike delivery between neurons. Previously, spike processing went through several layers of Python method calls. For example:

# Original inefficient flow:
# Compiled Cython → Python → Python → Cython → C++
{% block maincode %}
    _owner.push_spikes()  # Python method call!
{% endblock %}

def push_spikes(self):
    events = self.eventspace[: self.eventspace[len(self.eventspace) - 1]]
    if len(events):
        self.queue.push(events)  # Another Python call!

Solution: Implemented a C++ SpikeQueue data structure with direct access that bypasses the Python intermediaries entirely, so compiled code can call the C++ SpikeQueue methods directly.
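
For readers who have not met a spike queue before, here is a simplified pure-Python model of the idea: spikes are pushed into a circular buffer of time slots according to each synapse’s delay, and each time step reads and then clears the current slot. This is a sketch under assumptions, not Brian2’s C++ class: the name DelayQueue, its constructor arguments and the integer delays are illustrative only.

```python
class DelayQueue:
    """Toy circular buffer that delivers synaptic events after a delay."""

    def __init__(self, synapse_sources, synapse_delays):
        # synapse_sources[i]: index of the presynaptic neuron of synapse i
        # synapse_delays[i]: delay of synapse i, in whole time steps
        self.sources = list(synapse_sources)
        self.delays = list(synapse_delays)
        self.slots = [[] for _ in range(max(self.delays) + 1)]
        self.current = 0

    def push(self, spiking_neurons):
        """Schedule every synapse whose source neuron just spiked."""
        spiking = set(spiking_neurons)
        for synapse, source in enumerate(self.sources):
            if source in spiking:
                slot = (self.current + self.delays[synapse]) % len(self.slots)
                self.slots[slot].append(synapse)

    def peek(self):
        """Synapses whose events are due in the current time step."""
        return self.slots[self.current]

    def advance(self):
        """Clear the current slot and move on to the next time step."""
        self.slots[self.current] = []
        self.current = (self.current + 1) % len(self.slots)


# Three synapses: two from neuron 0 (delays 0 and 2), one from neuron 1 (delay 1).
queue = DelayQueue(synapse_sources=[0, 0, 1], synapse_delays=[0, 2, 1])
delivered = []
for spikes in ([0], [1], []):
    queue.push(spikes)
    delivered.append(list(queue.peek()))
    queue.advance()
print(delivered)  # [[0], [], [1, 2]]
```

The push, peek and advance steps run on every time step, which is why routing them through extra Python layers was costly.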

C++ Dynamic Arrays Implementation

Dynamic arrays hold variables whose size can change during a run (such as synaptic variables and monitor recordings), and previously they were accessed through Python-mediated calls. The new implementation:

  • Replaced Python-based dynamic arrays with pure C++ implementations adapted from standalone mode
  • Created efficient Cython wrappers that expose raw C++ pointers
  • Enabled direct memory access from generated code without Python method calls
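
A small pure-Python analogy may help picture what direct memory access involves. The sketch below is not the C++ class from the PR: DynamicArray1D, its growth factor and its view method are assumptions made for illustration, with array.array standing in for the C++ storage and memoryview standing in for a raw pointer. It shows two things: writes through a view land in the shared buffer without copying, and a view taken before a reallocating resize goes stale, much like a pointer into a reallocated std::vector.

```python
from array import array


class DynamicArray1D:
    """Toy resizable array of doubles with over-allocated capacity."""

    def __init__(self, size=0, growth_factor=2.0):
        self._size = size
        self._growth_factor = growth_factor
        self._buffer = array("d", [0.0] * max(size, 1))

    def __len__(self):
        return self._size

    def resize(self, new_size):
        if new_size > len(self._buffer):
            # Grow geometrically so repeated appends stay cheap on average.
            capacity = max(new_size, int(len(self._buffer) * self._growth_factor))
            grown = array("d", [0.0] * capacity)
            grown[: self._size] = self._buffer[: self._size]
            self._buffer = grown  # old storage is no longer used by the array
        elif new_size < self._size:
            # Zero the dropped tail so a later regrow starts from zeros.
            for i in range(new_size, self._size):
                self._buffer[i] = 0.0
        self._size = new_size

    def view(self):
        """Zero-copy view of the live elements (the 'raw pointer' analogue)."""
        return memoryview(self._buffer)[: self._size]


arr = DynamicArray1D(3)
old_view = arr.view()
old_view[0] = 1.5   # written straight into the shared buffer
arr.resize(10)      # exceeds capacity, so storage is reallocated
new_view = arr.view()
old_view[1] = 2.0   # lands in the old storage only
print(len(new_view), new_view[0], new_view[1])  # 10 1.5 0.0
```

The stale old_view is the reason generated code has to fetch the data pointer again after any operation that may resize the array.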

Phase 2: Next-Generation Backend Research

The AOT Compilation Problem

Through extensive research and prototyping, we identified that the fundamental issue wasn’t Cython specifically, but rather doing Ahead-of-Time (AOT) compilation at runtime. Traditional approaches still required:

  • Writing large generated files to disk
  • Launching external compiler processes
  • Waiting for full compilation cycles (15-40 seconds)
  • Managing temporary files and linking

cppyy Integration

After evaluating multiple alternatives, we identified cppyy as the most promising solution for true just-in-time compilation. cppyy provides:

  • Real JIT compilation: Compiles C++ code directly in memory without file I/O
  • Automatic Python bindings: No manual wrapper code required
  • Interactive development: Code can be compiled and executed immediately
  • Performance: Near-native C++ speed with minimal overhead
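
As a rough illustration of the in-memory workflow, here is a minimal sketch using cppyy’s cppdef function and its gbl namespace. The C++ namespace brian_sketch and the function euler_step are hypothetical and unrelated to the actual backend in the PR. The example also checks whether cppyy is installed and skips the demo if it is not.

```python
import importlib.util

# Hypothetical C++ kernel: one forward-Euler step of dv/dt = -v / tau.
CPP_SOURCE = """
namespace brian_sketch {
    double euler_step(double v, double dt, double tau) {
        return v + dt * (-v / tau);
    }
}
"""


def jit_euler_step():
    """Return the JIT-compiled C++ function, or None if cppyy is missing."""
    if importlib.util.find_spec("cppyy") is None:
        return None
    import cppyy

    # Compiled in memory; no source file or compiler process is created here.
    cppyy.cppdef(CPP_SOURCE)
    # The binding is generated automatically; no wrapper code is written.
    return cppyy.gbl.brian_sketch.euler_step


step = jit_euler_step()
if step is not None:
    print(step(1.0, 0.1, 10.0))
else:
    print("cppyy is not installed; skipping the JIT demo")
```

Note that the first import of cppyy has its own startup cost, so any comparison with the Cython path should measure that as well.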

Implementation: Started integrating cppyy as a new code generation backend for Brian2’s runtime mode (PR #1674, still open), with the aim of true JIT compilation that removes the file-based compilation bottleneck.

Contributions During GSoC Period

Core Implementation Pull Requests

| Pull Request | Description | Status |
|---|---|---|
| #1643 | SpikeQueue optimization: Remove Python overhead in spike processing | Merged |
| #1649 | Implement direct C++ path for SpikeQueue operations | Merged |
| #1650 | Add C++ DynamicArrays for runtime mode, replace Python arrays | Open |
| #1674 | Add cppyy as new codegen backend for runtime mode | Open |

Additional Contributions (not related to the project)

| Pull Request | Description | Status |
|---|---|---|
| #1644 | Remove runtime dependency-version checks from `__init__.py` | Merged |
| #1657 | Introduce abstract RateMonitor class for unified rate analysis | Open |

Technical Documentation and Blog Posts

Throughout the project, I kept a series of blog posts documenting what I learned and what we did:

| Blog Post | Topic | Link |
|---|---|---|
| Understanding Brian2 Codebase | Comprehensive guide to Brian2’s architecture and code generation | Read Here |
| Code Generation Pipeline | From Math to Machine Code: How Brian2 transforms equations | Read Here |
| SpikeQueue Optimization | Eliminating Python overhead in spike processing | Read Here |
| Dynamic Arrays Optimization | From Python indirection to direct C++ speed | Read Here |
| The AOT Trap | Why Brian2 is exploring cppyy-powered runtime | Read Here |
| Understanding cppyy | A true automatic Python-C++ binding system | Read Here |

Current State

Successfully Implemented

  • :white_check_mark: SpikeQueue Direct Access: Eliminated Python overhead in neural spike processing
  • :white_check_mark: C++ Dynamic Arrays: Replaced Python-based arrays with efficient C++ implementations
  • :white_check_mark: cppyy Prototype: Initial integration of cppyy as alternative compilation backend

Performance Improvements Achieved

  • SpikeQueue Operations: Eliminated multiple Python method calls in critical spike processing path
  • Dynamic Array Access: Direct C++ memory access without Python intermediaries
  • Compilation Research: Identified path forward with cppyy for true JIT compilation

What’s Next

Short-term Goals

  • Complete cppyy Integration: Finalize the cppyy backend implementation and comprehensive testing
  • Benchmarking Suite: Develop thorough performance comparison between Cython and cppyy backends
  • Documentation: Complete user-facing documentation for new features

Long-term Vision

  • Full Cython Replacement: Gradually migrate Brian2’s runtime mode to use cppyy exclusively
  • Common set of C++ templates: Once Cython is removed, we’d like to reduce the burden of maintaining separate templates for standalone and runtime modes. Since both the cppyy backend and standalone mode would use C++ directly, the idea is to have a common set of templates that both share.

Challenges and Key Learnings

The Fundamental Paradigm Shift

The most significant insight was recognizing that the problem wasn’t Cython itself, but the entire AOT-at-runtime paradigm. This realization led us to cppyy, which represents a true paradigm shift to in-memory JIT compilation.

Technical Challenges Overcome

  • Memory Management: Safely sharing memory between Python and C++ with proper lifetime management
  • Code Generation: Adapting Brian2’s template system to work with direct pointer access

Impact and Future Contributions

This work represents a significant step toward making Brian2 more efficient and maintainable. The direct access implementations provide immediate performance benefits, while the cppyy research establishes a foundation for revolutionary improvements to Brian2’s compilation system.

The project has opened new possibilities for Brian2’s architecture and demonstrated the potential for dramatic performance improvements through careful systems-level optimization.

Acknowledgments

I’m incredibly grateful to my mentor @mstimberg for his guidance, patience, and technical brilliance throughout this project. What really stood out to me was not just how deeply he understands Brian2, but also how carefully he thinks about the ripple effects of even the smallest change in the codebase. The fact that he could casually bring up things like “what if this change affects making Brian2 run on Arduino?” honestly blew my mind — it showed me how much vision and creativity go into maintaining a project like this.

A big thanks as well to the Brian2 community for making me feel welcome and part of something so meaningful. It’s been a joy working on such a beautiful piece of code.

This project has been one of the most rewarding learning experiences I’ve had — it really pushed me to understand Python on a deeper level. And funny enough, it made me realize that when we run python3 main.py, we’re not actually running Python itself, but just one of its implementations. That little detail still fascinates me.

I’m excited to keep contributing to Brian2 well beyond GSoC — there’s so much more to explore :wink:


There is also a gist of the same report: Google Summer of Code 2025: Work Product Submission