From a CNN model to a DYNAP-CNN DevKit#

This tutorial explains all the steps necessary to convert a torch CNN model into a configuration for the DYNAP-CNN chip. We will first convert the network to a spiking neural network (SNN) and then to a DynapcnnNetwork – a model that is compatible with the chip and simulates its behavior. Finally, we port the model to a DYNAP-CNN DevKit.

Import libraries#

Before we start, we import the libraries needed to define a torch model, convert it to an SNN, and convert the SNN to a DynapcnnNetwork.

%%capture

# Suppress warnings (this only keeps the notebook tidy; you may want to comment out the two lines below)
import warnings

warnings.filterwarnings("ignore")

# - Import statements
import torch
import samna
import time
from tqdm.auto import tqdm
import numpy as np
import torch.nn as nn
from torchvision import datasets
import sinabs
from sinabs.from_torch import from_model
from sinabs.backend.dynapcnn import io
from sinabs.backend.dynapcnn import DynapcnnNetwork
from sinabs.backend.dynapcnn.chip_factory import ChipFactory

CNN definition#

First we will define a sequential CNN model.

Note that although non-sequential models are supported by the hardware, they are not yet supported by this library.

# - Define CNN model

ann = nn.Sequential(
    nn.Conv2d(1, 20, 5, 1, bias=False),
    nn.ReLU(),
    nn.AvgPool2d(2, 2),
    nn.Conv2d(20, 32, 5, 1, bias=False),
    nn.ReLU(),
    nn.AvgPool2d(2, 2),
    nn.Conv2d(32, 128, 3, 1, bias=False),
    nn.ReLU(),
    nn.AvgPool2d(2, 2),
    nn.Flatten(),
    nn.Linear(128, 500, bias=False),
    nn.ReLU(),
    nn.Linear(500, 10, bias=False),
)

# Load pre-trained weights
ann.load_state_dict(torch.load("../../../examples/mnist_params.pt", map_location="cpu"))
<All keys matched successfully>

Model evaluation#

Let's prepare some input data so we can see how the converted spiking model performs.

# Define custom dataset for spiking input data
class MNIST_Dataset(datasets.MNIST):
    def __init__(self, root, train=True, spiking=False, tWindow=100):
        super().__init__(root, train=train, download=True)
        self.spiking = spiking
        self.tWindow = tWindow

    def __getitem__(self, index):
        img, target = self.data[index], self.targets[index]

        if self.spiking:
            # Bernoulli encoding: at each of the tWindow time steps, a pixel
            # spikes with probability proportional to its intensity.
            img = (np.random.rand(self.tWindow, 1, *img.size()) < img.numpy() / 255.0).astype(float)
            img = torch.from_numpy(img).float()
        else:
            # Convert image to a (1, 28, 28) float tensor
            img = torch.from_numpy(img.numpy()).float()
            img.unsqueeze_(0)

        return img, target
from torch.utils.data import DataLoader

# Define dataloader
tWindow = 200  # length of the spike raster in time steps (interpreted as ms)

# Define test dataset
test_dataset = MNIST_Dataset("./data", train=False, spiking=True, tWindow=tWindow)
test_dataloader = DataLoader(test_dataset, batch_size=1)
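
As a quick sanity check, we can draw one sample and confirm that the spiking encoding produces a raster of shape (tWindow, 1, 28, 28), whose mean reflects the average firing probability per pixel.

raster, label = test_dataset[0]
print(raster.shape)  # expected: torch.Size([200, 1, 28, 28])
print(raster.mean())  # average firing probability over the time window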

We scale the parameters in each layer so as to normalise the output activations to a certain percentile. This helps performance on the chip considerably, because the chip's output is heavily quantized. More information about this can be found in Rueckauer et al. 2017.

param_layers = [name for name, child in ann.named_children() if isinstance(child, (nn.Conv2d, nn.Linear))]
output_layers = [name for name, child in ann.named_children() if isinstance(child, nn.ReLU)]
output_layers += [param_layers[-1]]
normalise_loader = DataLoader(dataset=test_dataset, batch_size=10, shuffle=True)
sample_data = next(iter(normalise_loader))[0].flatten(0, 1)
percentile = 99.99


sinabs.utils.normalize_weights(ann, sample_data, output_layers=output_layers, param_layers=param_layers, percentile=percentile)

Note that there are 5 parameter layers (3 Conv2d and 2 Linear) in the model defined above. You will see this come into play later.
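
We can verify this by printing the lists we just built:

print(param_layers)   # 3 Conv2d + 2 Linear layers -> 5 entries
print(output_layers)  # the four ReLU layers plus the final Linear layer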

Convert to Spiking CNN#

We can use the from_torch method from SINABS to convert our CNN to an SNN. The returned object contains the original CNN as analog_model and the newly generated SNN as spiking_model. The ReLUs have been converted to spiking layers (IAFSqueeze). In addition, a spiking layer has been appended at the end. This is for compatibility with the chip later on, since the chip can only produce spikes, not activations.

sinabs_model = from_model(ann, add_spiking_output=True, min_v_mem=-1, batch_size=1)
print(sinabs_model.spiking_model)

print(sinabs_model.spiking_model[1].min_v_mem)
Sequential(
  (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1), bias=False)
  (1): IAFSqueeze(spike_threshold=1.0, min_v_mem=-1, batch_size=1, num_timesteps=-1)
  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (3): Conv2d(20, 32, kernel_size=(5, 5), stride=(1, 1), bias=False)
  (4): IAFSqueeze(spike_threshold=1.0, min_v_mem=-1, batch_size=1, num_timesteps=-1)
  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (6): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), bias=False)
  (7): IAFSqueeze(spike_threshold=1.0, min_v_mem=-1, batch_size=1, num_timesteps=-1)
  (8): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (9): Flatten(start_dim=1, end_dim=-1)
  (10): Linear(in_features=128, out_features=500, bias=False)
  (11): IAFSqueeze(spike_threshold=1.0, min_v_mem=-1, batch_size=1, num_timesteps=-1)
  (12): Linear(in_features=500, out_features=10, bias=False)
  (spike_output): IAFSqueeze(spike_threshold=1.0, min_v_mem=-1, batch_size=1, num_timesteps=-1)
)
Parameter containing:
tensor(-1)

DYNAP-CNN compatible network#

The next step is to convert the SNN to a DynapcnnNetwork. This way we can be sure that all functionalities of our network are supported by the hardware and we can simulate the expected hardware output for testing purposes. This object will also generate the configuration objects to set up the chip.

We need to tell the chip the dimensions of the input data. This can be done either by specifying an input_shape argument in the constructor or including a SINABS InputLayer at the beginning of the model.

The class will convert the parameters (weights, biases, and thresholds) to discrete values that are supported by DYNAP-CNN. For testing purposes this can be disabled by setting discretize to False.

We can use the dvs_input flag to determine whether the chip should process data coming from the on-chip dynamic vision sensor (DVS).

# - Input dimensions
input_shape = (1, 28, 28)

# - DYNAP-CNN compatible network
dynapcnn_net = DynapcnnNetwork(
    sinabs_model.spiking_model,
    input_shape=input_shape,
    discretize=True,
    dvs_input=False,
)
print(dynapcnn_net)
DynapcnnNetwork(
  (sequence): Sequential(
    (0): DynapcnnLayer(
      (conv_layer): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=799.0, min_v_mem=-799.0, batch_size=1, num_timesteps=-1)
      (pool_layer): SumPool2d(norm_type=1, kernel_size=(2, 2), stride=None, ceil_mode=False)
    )
    (1): DynapcnnLayer(
      (conv_layer): Conv2d(20, 32, kernel_size=(5, 5), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=2396.0, min_v_mem=-2396.0, batch_size=1, num_timesteps=-1)
      (pool_layer): SumPool2d(norm_type=1, kernel_size=(2, 2), stride=None, ceil_mode=False)
    )
    (2): DynapcnnLayer(
      (conv_layer): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=2260.0, min_v_mem=-2260.0, batch_size=1, num_timesteps=-1)
      (pool_layer): SumPool2d(norm_type=1, kernel_size=(2, 2), stride=None, ceil_mode=False)
    )
    (3): DynapcnnLayer(
      (conv_layer): Conv2d(128, 500, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=1518.0, min_v_mem=-1518.0, batch_size=1, num_timesteps=-1)
    )
    (4): DynapcnnLayer(
      (conv_layer): Conv2d(500, 10, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=1065.0, min_v_mem=-1065.0, batch_size=1, num_timesteps=-1)
    )
  )
)

The resulting model consists of 5 DynapcnnLayer objects, each containing a convolutional layer, a spiking layer, and possibly a pooling layer. Because there were 5 parameter layers (as pointed out earlier), each is mapped to a separate DynapcnnLayer.
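
As mentioned above, discretization can be disabled for testing. A minimal sketch (the same constructor call, with only the discretize flag changed) lets you compare the floating-point thresholds with the discretized values printed above:

# Sketch: without discretization the parameters keep their floating-point
# values, which is useful for debugging before deploying to hardware.
dynapcnn_net_float = DynapcnnNetwork(
    sinabs_model.spiking_model,
    input_shape=input_shape,
    discretize=False,
    dvs_input=False,
)
# The spike threshold of the first layer should stay at its original
# value (1.0) instead of the discretized 799.0.
print(dynapcnn_net_float.sequence[0].spk_layer.spike_threshold)

We now evaluate the discretized network on (part of) the test set: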

with torch.no_grad():
    correct = 0
    samples = 0
    # we will only use the first 100 samples to save some time
    pbar = tqdm(test_dataset, total=100)
    for data, label in pbar:
        dynapcnn_net.reset_states()
        out = dynapcnn_net(data)

        # Calculate total number of spikes out
        pred = out.squeeze().sum(0)
        
        # Check if the prediction matches the label
        if pred.argmax() == label:
            correct += 1
        samples += 1
        pbar.set_postfix(acc=100*correct/samples)
        if samples >= 100:
            break
data.shape
torch.Size([200, 1, 28, 28])

The final accuracy of this model can now be evaluated.

f"Accuracy of the  dynapcnn_net is: {100*correct/samples}%"
'Accuracy of the  dynapcnn_net is: 99.0%'

Porting model to DYNAP-CNN DevKit#

Similar to porting a model to CPU or GPU in PyTorch, DynapcnnNetwork is a special class that supports porting a model to hardware based on DYNAP-CNN technology.
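
Before porting, it can be useful to check which devices are actually connected. A minimal sketch using samna's device discovery (assuming your samna version provides samna.device.get_all_devices()):

# List the devices samna can see; each entry describes one connected board.
for device_info in samna.device.get_all_devices():
    print(device_info)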

# Port the model to a device such as dynapcnndevkit, speck2, speck2b
device = "dynapcnndevkit:0"
dynapcnn_net.to(device)
Network is valid
[2022-12-22 14:25:22.331] [Default] [warning] The write method of a model is DEPRECATED, please use BasicSourceNode to write events.
DynapcnnNetwork(
  (sequence): Sequential(
    (0): DynapcnnLayer(
      (conv_layer): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=799.0, min_v_mem=-799.0, batch_size=1, num_timesteps=-1)
      (pool_layer): SumPool2d(norm_type=1, kernel_size=(2, 2), stride=None, ceil_mode=False)
    )
    (1): DynapcnnLayer(
      (conv_layer): Conv2d(20, 32, kernel_size=(5, 5), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=2396.0, min_v_mem=-2396.0, batch_size=1, num_timesteps=-1)
      (pool_layer): SumPool2d(norm_type=1, kernel_size=(2, 2), stride=None, ceil_mode=False)
    )
    (2): DynapcnnLayer(
      (conv_layer): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=2260.0, min_v_mem=-2260.0, batch_size=1, num_timesteps=-1)
      (pool_layer): SumPool2d(norm_type=1, kernel_size=(2, 2), stride=None, ceil_mode=False)
    )
    (3): DynapcnnLayer(
      (conv_layer): Conv2d(128, 500, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=1518.0, min_v_mem=-1518.0, batch_size=1, num_timesteps=-1)
    )
    (4): DynapcnnLayer(
      (conv_layer): Conv2d(500, 10, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (spk_layer): IAFSqueeze(spike_threshold=1065.0, min_v_mem=-1065.0, batch_size=1, num_timesteps=-1)
    )
  )
)

We can find out where the individual layers have been placed on the chip by looking at the attribute chip_layers_ordering. This information will come in handy when sending events to or receiving events from the chip.

print(dynapcnn_net.chip_layers_ordering)
[0, 3, 5, 6, 1]

Inspecting memory#

For the model described above, we were able to find a sequence of layers on the chip that could be utilized. If we had a larger model that did not fit on the chip, the to method call would have raised a ValueError. In that case, it is handy to understand the model's memory requirements so that it can be modified to fit the chip. A quick overview of the model's memory requirements can be obtained by calling the method memory_summary().

dynapcnn_net.memory_summary()
{'kernel': [1024.0, 20480.0, 65536.0, 65536.0, 8000.0],
 'neuron': [20480.0, 2048.0, 512.0, 500.0, 10.0],
 'bias': [0, 0, 0, 0, 0]}

Note that this memory is not the same as the total number of kernel parameters. See the memory_summary documentation for more details on this.

The memory_summary() method returns the kernel, neuron, and bias memory required for each layer of the DynapcnnNetwork. The corresponding values for the 5 layers of our current model are shown above.
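
The kernel values above are consistent with the kernel area and the output channel count each being padded to the next power of two. The helper below is a hypothetical reconstruction that happens to reproduce the numbers above; consult the memory_summary documentation for the authoritative formula.

import math

def kernel_memory(c_in, c_out, k):
    # Hypothetical: pad kernel area (k*k) and output channel count to the
    # next power of two, then multiply by the input channel count.
    return c_in * 2 ** math.ceil(math.log2(k * k)) * 2 ** math.ceil(math.log2(c_out))

print([kernel_memory(1, 20, 5), kernel_memory(20, 32, 5),
       kernel_memory(32, 128, 3), kernel_memory(128, 500, 1),
       kernel_memory(500, 10, 1)])
# -> [1024, 20480, 65536, 65536, 8000], matching the 'kernel' entry above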

You can compare these values to the constraints of the chip. For instance, the constraints for your device can be retrieved with DynapcnnConfigBuilder.get_constraints().

builder = ChipFactory(device).get_config_builder()
builder.get_constraints()
[LayerConstraints(kernel_memory=16384, neuron_memory=65536, bias_memory=1024),
 LayerConstraints(kernel_memory=16384, neuron_memory=65536, bias_memory=1024),
 LayerConstraints(kernel_memory=16384, neuron_memory=65536, bias_memory=1024),
 LayerConstraints(kernel_memory=32768, neuron_memory=32768, bias_memory=1024),
 LayerConstraints(kernel_memory=32768, neuron_memory=32768, bias_memory=1024),
 LayerConstraints(kernel_memory=65536, neuron_memory=16384, bias_memory=1024),
 LayerConstraints(kernel_memory=65536, neuron_memory=16384, bias_memory=1024),
 LayerConstraints(kernel_memory=16384, neuron_memory=16384, bias_memory=1024),
 LayerConstraints(kernel_memory=16384, neuron_memory=16384, bias_memory=1024)]
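
Combining the two, we can check layer by layer that the model fits, using the chip_layers_ordering from above to line up each model layer with the chip layer it was placed on (a small sketch, assuming LayerConstraints exposes the fields shown in its repr):

summary = dynapcnn_net.memory_summary()
constraints = builder.get_constraints()

for i, chip_layer in enumerate(dynapcnn_net.chip_layers_ordering):
    c = constraints[chip_layer]
    # A layer fits if kernel, neuron, and bias memory are all within bounds
    fits = (
        summary["kernel"][i] <= c.kernel_memory
        and summary["neuron"][i] <= c.neuron_memory
        and summary["bias"][i] <= c.bias_memory
    )
    print(f"model layer {i} -> chip layer {chip_layer}: fits={fits}")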

Sending and receiving spikes#

Last but not least, we want to be able to send spikes to and receive spikes from the chip.

So first we generate some spike events. Unlike in simulation, where spikes are presented as a tensor (a raster of spikes), the chip expects a sequence of event objects. We can convert a spike raster tensor to a list of events using the utility function ChipFactory.raster_to_events().

raster, label = test_dataset[0]

factory = ChipFactory(device)
first_layer_idx = dynapcnn_net.chip_layers_ordering[0] 
events_in = factory.raster_to_events(raster, layer=first_layer_idx)
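
Each entry in events_in is a samna spike event addressed to the chosen input layer. Inspecting the first event gives a feel for the format (the exact fields depend on your samna version):

print(len(events_in))  # number of input spikes generated from the raster
print(events_in[0])    # a single spike event (timestamp, layer, feature, x, y, ...)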

We make sure that the neuron states are reset to zero before sending these events to the chip. This works just like resetting a PyTorch model on GPU or CPU.

dynapcnn_net.reset_states()
time.sleep(6.0)  # wait until the configuration takes effect; this step is necessary when using a Speck2e DevKit!
events_out = dynapcnn_net(events_in)

We receive spike objects from the hardware as output. They come from the last layer of the model (chip_layers_ordering[-1]), and for the data we sent in, we expect most spikes to come from the neuron whose index equals label. Let's convert those to a tensor/raster format for easy slicing. Since the last layer has dimensions (10, 1, 1) for (feature, y, x), we are interested in the feature index of the neurons.

output = factory.events_to_raster(events_out)
output.sum([0,2,3])
tensor([  3.,   0.,  48.,  34.,   0.,   0.,   0., 152.,   0.,  28.])

We can see above that neuron 7 produced the most spikes. Let's see what the original label of our data was.

label
tensor(7)

Yay! Success. Our model on the chip identified the input data correctly.