Reproducing Kilosort4 benchmarks

1. Download data

First, choose a simulated dataset to use for the comparison. You’ll need to download and extract the data before running the rest of the notebook. For this demo, I chose the medium drift dataset and saved it to D:/sim_med_drift.

All datasets: https://janelia.figshare.com/articles/dataset/Simulations_from_kilosort4_paper/25298815/1

Medium drift: https://janelia.figshare.com/ndownloader/files/44729869

2. Decompress data

You will also need to decompress the binary file, which has been compressed with mtscomp. You can see details about the package here: https://github.com/int-brain-lab/mtscomp

Install with:

pip install mtscomp

Then decompress the .cbin file in the unzipped directory downloaded in the previous step:

mtsdecomp sim.imec0.ap.cbin -o data.bin

3. Get probe information

The probe file for the recording can be parsed from sim.imec0.ap.meta using a Spike GLX script available here:

https://github.com/jenniferColonell/SGLXMetaToCoords/blob/main/SGLXMetaToCoords.py

Simply download the script and run it in your Kilosort environment with python SGLXMetaToCoords.py, then follow the prompt to select the .meta file. This will save the probe file in the same directory as sim.imec0.ap_kilosortChanMap.mat.

4. Run Kilosort4

[1]:
from pathlib import Path

from kilosort import run_kilosort
from kilosort.io import load_probe, load_ops
from kilosort.bench import load_GT, load_phy, compare_recordings

# Specify paths
# NOTE: Make sure to update `download_path` to match your local directory.
download_path = Path('D:/sim_med_drift')
results_dir = download_path / 'kilosort4'
filename = download_path / 'data.bin'
probe_path = download_path / 'sim.imec0.ap_kilosortChanMap.mat'
gt_path = download_path / 'sim.imec0.ap_params.npz'
[ ]:
# Run Kilosort4 on downloaded dataset
# You can also run it without drift correction by setting `nblocks = 0`,
# or only rigid drift correction by setting `nblocks = 1`.
probe = load_probe(probe_path)
settings = {'n_chan_bin': 385, 'filename': filename, 'nblocks': 5}
_ = run_kilosort(settings, probe=probe)
[2]:
# Load `ops`
# NOTE: If you've already sorted the data, you can skip the previous step.
ops = load_ops(results_dir / 'ops.npy')

5. Compute ground truth comparison

[3]:
# NOTE: Using this with the current version of Kilosort4 requires updates
#       to `kilosort.bench.py`` added in v4.0.32

# Load ground truth results
st_gt, clu_gt, yclu_gt, mu_gt, Wsub, nsp = load_GT(filename, ops, gt_path)
# Load sorting results
st_new, clu_new, yclu_new, Wsub = load_phy(filename, results_dir, ops)
# Compare
fmax, fmiss, fpos, best_ind, matched_all, top_inds = \
    compare_recordings(st_gt, clu_gt, yclu_gt, st_new, clu_new, yclu_new)
100%|██████████| 600/600 [04:25<00:00,  2.26it/s]
[5]:
def num_correct(fpos, fmiss):
    score = 1 - fpos - fmiss
    num_correct = (score >= 0.8).sum()
    return num_correct

n = num_correct(fpos, fmiss)
print(f'number correct: {n}')
print(f'number missed: {600 - n}')      # 600 ground truth units
print(f'number correct in paper: 555')  # for medium drift dataset, from fig 4j
number correct: 553
number missed: 47
[ ]: