Loading other data formats with SpikeInterface

Kilosort4 natively supports data in binary (.bin) format. The simplest way to save your data in this format is to load it into memory one chunk at a time and write each chunk to a .bin file through NumPy’s memmap class (np.memmap). However, if you aren’t comfortable with that process, the SpikeInterface package can load most common electrophysiology formats in a standardized way that makes it easy to extract the data.
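As a rough illustration of the chunked approach described above, the sketch below copies a recording into a .bin file through np.memmap. The `read_chunk` function, the array sizes, and the file location are hypothetical stand-ins for however your own format is read. It also assumes a samples-by-channels int16 layout, which is an assumption you should check against your data.

```python
import tempfile
from pathlib import Path

import numpy as np

n_samples, n_channels = 10_000, 4      # hypothetical recording size
chunk_size = 3_000                     # samples copied per iteration


def read_chunk(start, stop):
    """Hypothetical loader: returns (stop - start, n_channels) int16 data."""
    t = np.arange(start, stop, dtype=np.int16)[:, None]
    return t + np.arange(n_channels, dtype=np.int16)[None, :]


# Write to a temporary location; replace with your own output path.
out_path = Path(tempfile.mkdtemp()) / 'data.bin'
out = np.memmap(out_path, dtype=np.int16, mode='w+',
                shape=(n_samples, n_channels))
for start in range(0, n_samples, chunk_size):
    stop = min(start + chunk_size, n_samples)
    out[start:stop] = read_chunk(start, stop)   # only one chunk in memory
out.flush()   # make sure everything is written to disk
del out       # release the memmap
```

Only one chunk is held in memory at a time, so `chunk_size` can be tuned to the available RAM.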

To follow the steps in this notebook, you will first need to install SpikeInterface:

pip install "spikeinterface[full]"

For each data format, SpikeInterface has a read_<format> utility that loads the data as a RecordingExtractor object, which we can use to extract the data and relevant meta information like sampling frequency. The following example shows the steps for the NWB data format. At the bottom of the notebook, there are notes on how to load several other common formats. For all cells after the first, all steps should be the same regardless of format.

Load NWB data

[ ]:

from pathlib import Path
import numpy as np
from spikeinterface.extractors import read_nwb_recording

# Specify the path where the data will be copied to, and where Kilosort 4
# results will be saved.
DATA_DIRECTORY = Path('/home/example_path')  # NOTE: You should change this
# Create path if it doesn't exist
DATA_DIRECTORY.mkdir(parents=True, exist_ok=True)

# Specify path to your existing data
filepath = Path(".../my_data.nwb")       # NOTE: You must change this
# Load existing data with spikeinterface
# NOTE: You may need to specify additional keyword arguments for
#       `read_nwb_recording`, such as `electrical_series_name`. Any required
#       arguments should be clearly spelled out by an error message.
recording = read_nwb_recording(filepath)
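
Before copying, it can help to confirm what was actually loaded. This is a minimal sketch, assuming `recording` exposes the usual SpikeInterface recording accessors (`get_sampling_frequency`, `get_num_channels`, `get_num_samples`, `get_dtype`) and has a single segment. The `summarize_recording` helper is hypothetical and not part of SpikeInterface or Kilosort4.

```python
def summarize_recording(recording):
    """Collect basic metadata from a single-segment recording object."""
    fs = recording.get_sampling_frequency()
    n_samples = recording.get_num_samples()
    return {
        'fs': fs,                                   # sampling rate in Hz
        'n_channels': recording.get_num_channels(),
        'n_samples': n_samples,
        'duration_s': n_samples / fs,               # recording length in seconds
        'dtype': recording.get_dtype(),
    }
```

Calling `summarize_recording(recording)` after the cell above should return values that match what you expect from your acquisition system.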

Create a new binary file and copy the data to it 60,000 samples at a time. Depending on your system’s memory, you could increase or decrease the number of samples loaded on each iteration. This will also export the associated probe information as a ‘.prb’ file, if present.

[ ]:

from kilosort import io

# NOTE: Data will be saved as np.int16 by default since that is the standard
#       for ephys data. If you need a different data type for whatever reason
#       such as `np.uint16`, be sure to update this.
dtype = np.int16
filename, N, c, s, fs, probe_path = io.spikeinterface_to_binary(
    recording, DATA_DIRECTORY, data_name='data.bin', dtype=dtype,
    chunksize=60000, export_probe=True, probe_name='probe.prb'
    )
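
To judge whether a chunk size suits your system, a back-of-the-envelope estimate is samples times channels times bytes per sample. The sketch below uses a 385-channel int16 recording purely as an illustrative case, and `chunk_megabytes` is a hypothetical helper. Real peak memory use will be somewhat higher than the raw chunk size because of temporary copies.

```python
import numpy as np


def chunk_megabytes(chunksize, n_channels, dtype=np.int16):
    """Approximate size in MB of one chunk of raw samples."""
    return chunksize * n_channels * np.dtype(dtype).itemsize / 1e6


# 60,000 samples of 385-channel int16 data is roughly 46 MB per chunk.
mb_per_chunk = chunk_megabytes(60000, 385)
```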

If no probe information was loaded through SpikeInterface, you will need to specify the probe yourself, either as a .prb file or as a .json file in Kilosort4’s expected format. Follow the steps at the bottom of this notebook, or see the tutorial notebook titled ‘Creating a Kilosort4 probe dictionary’.
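
For orientation, a probe specified by hand is essentially a dictionary of per-contact arrays. The sketch below assumes the key names used in the Kilosort4 probe tutorial (`chanMap`, `xc`, `yc`, `kcoords`, `n_chan`); verify them against that tutorial for your installed version. The 4-contact geometry is entirely hypothetical.

```python
import numpy as np

n_chan = 4                              # hypothetical 4-contact linear probe
probe = {
    'chanMap': np.arange(n_chan),       # channel index in the binary file per contact
    'xc': np.zeros(n_chan),             # x positions in microns
    'yc': np.arange(n_chan) * 20.0,     # y positions in microns, 20 um spacing
    'kcoords': np.zeros(n_chan),        # shank / group index per contact
    'n_chan': n_chan,                   # number of contacts with ephys data
}
```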

At this point, it’s a good idea to open the Kilosort GUI and check that the data and probe appear to have been loaded correctly and that no settings need to be tweaked. You will need to input the path to the binary data file, choose the folder where results should be saved, and select a probe file.

python -m kilosort

From there, you can either launch Kilosort using the GUI or run the next notebook cell to run it through the API.

Run Kilosort (API)

Note that in this case, we don’t actually need to specify a probe since it’s the same as the default Neuropixels 1 configuration. For handling different probe layouts, provide your own .prb file and/or see the tutorial on creating a new probe file from scratch.

[ ]:

from kilosort import run_kilosort

# NOTE: 'n_chan_bin' is a required setting, and should reflect the total number
#       of channels in the binary file, while probe['n_chan'] should reflect
#       the number of channels that contain ephys data. In many cases these will
#       be the same, but not always. For example, Neuropixels data often contains
#       385 channels, where 384 channels are for ephys traces and 1 channel is
#       for some other variable. In that case, you would specify
#       'n_chan_bin': 385.
settings = {'fs': fs, 'n_chan_bin': c}

# Specify probe configuration.
assert probe_path is not None, 'No probe information exported by SpikeInterface'
probe = io.load_probe(probe_path)

# This command will both run the spike-sorting analysis and save the results to
# `DATA_DIRECTORY`.
# NOTE: The data type is passed through the `data_dtype` keyword argument.
ops, st, clu, tF, Wall, similar_templates, is_ref, est_contam_rate = run_kilosort(
    settings=settings, probe=probe, filename=filename, data_dtype=dtype
    )
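
One way to sanity-check `n_chan_bin` before sorting is to confirm that the binary file size divides evenly into samples of that many channels. This is a minimal sketch: `n_samples_in_binary` is a hypothetical helper, the demo file is synthetic, and a clean division is necessary but not sufficient evidence that the channel count is right.

```python
import tempfile
from pathlib import Path

import numpy as np


def n_samples_in_binary(path, n_chan_bin, dtype=np.int16):
    """Return the number of samples in a flat binary file.

    Raises ValueError if the file size is not a whole number of
    n_chan_bin-channel samples of the given dtype.
    """
    bytes_per_sample = n_chan_bin * np.dtype(dtype).itemsize
    n_bytes = Path(path).stat().st_size
    if n_bytes % bytes_per_sample != 0:
        raise ValueError(
            f'{n_bytes} bytes is not a multiple of {bytes_per_sample}; '
            'check n_chan_bin and dtype.'
        )
    return n_bytes // bytes_per_sample


# Demo on a small synthetic file: 100 samples x 385 channels of int16.
demo_path = Path(tempfile.mkdtemp()) / 'demo.bin'
np.zeros((100, 385), dtype=np.int16).tofile(demo_path)
n_demo = n_samples_in_binary(demo_path, n_chan_bin=385)
```

With your own data, you would call `n_samples_in_binary(filename, c, dtype)` using the values returned by the export cell above.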

Whether you used the GUI or the API, the results can now be browsed in Phy from a terminal with:

phy template-gui <DATA_DIRECTORY>/kilosort4/params.py

(replacing DATA_DIRECTORY with the appropriate path)

Using the API to load data through SpikeInterface without copying

We also provide a wrapper for SpikeInterface recordings that allows them to be read by Kilosort4 without first copying the data to binary. However, in most cases the copy-to-binary approach is recommended, since the binary file can be read by the Kilosort4 GUI and Phy, while other data formats cannot. To use this option, you will still need to provide the probe configuration and the filename of the source file.

[ ]:

# First get `recording` through SpikeInterface and specify probe & settings,
# as described above.
wrapper = io.RecordingExtractorAsArray(recording)
ops, st, clu, tF, Wall, similar_templates, is_ref, est_contam_rate = run_kilosort(
    settings=settings, probe=probe, filename=filepath, file_object=wrapper
)

Instructions for additional data formats

The following cells demonstrate how to load other data formats using SpikeInterface. Use these code snippets to modify the first cell of this notebook to work with different datasets.

See SpikeInterface’s documentation for additional details.

SpikeGLX

[ ]:

# NOTE: You do not need to load SpikeGLX data this way. It is already saved in
#       binary format, so you should just point Kilosort 4 to the .bin file.
from spikeinterface.extractors import read_spikeglx
# Provide path to directory containing .bin file.
filepath = Path(".../TEST_20210920_0_g0/")
recording = read_spikeglx(filepath)

Blackrock

[ ]:

from spikeinterface.extractors import read_blackrock
# Provide path to nsX file, not nev file.
filepath = Path(".../file_spec_3_0.ns6")
recording = read_blackrock(filepath)

Neuralynx

[ ]:

from spikeinterface.extractors import read_neuralynx
# Provide path to directory containing .Ncs file(s).
filepath = Path("C:/code/ephy_testing_data/neuralynx/BML/original_data/")
recording = read_neuralynx(filepath)

Open Ephys

[ ]:

from spikeinterface.extractors import read_openephys
filepath = Path(".../ecephys_tutorial_v2.5.0.nwb")
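# NOTE: `read_openephys` is documented as taking the path to an Open Ephys
#       recording folder, so adjust the path above to match your data.
# NOTE: Open Ephys data can have multiple streams. Specify `stream_id` to
#       load different ones.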
recording = read_openephys(filepath)

Intan

[ ]:

from spikeinterface.extractors import read_intan
# NOTE: You will need to select the appropriate data stream. If you run without
#       specifying `stream_id`, you will get an error message explaining what
#       each stream corresponds to.
filepath = Path(".../intan_rhs_test_1.rhs")
recording = read_intan(filepath, stream_id='0')

Exporting probes from SpikeInterface

To create a new probe file, we can use ProbeInterface, a companion package that SpikeInterface depends on. You will also need matplotlib if you want to visualize the probe geometry (recommended).

You can follow the steps in this ProbeInterface tutorial to create a new probe from scratch, or to plot a probe to check that it is configured correctly.

Then use the following steps to export to a .prb file that can be read by Kilosort4.

[ ]:

from probeinterface import ProbeGroup, write_prb

probe = ...  # From the ProbeInterface tutorial, or recording.get_probe()

# Multiple probes can be added to a ProbeGroup. We only have one, but a
# ProbeGroup wrapper is still necessary for `write_prb` to work.
pg = ProbeGroup()
pg.add_probe(probe)
# CHANGE THIS PATH to wherever you want to save your probe file.
write_prb('.../test_prb.prb', pg)

Note that the probe object must have channel indices specified in order to save to a .prb file. If write_prb results in an error indicating these are not set, you can use the probe.set_device_channel_indices method to set them. For example, for a 24-channel probe with all contacts connected:

[ ]:

# Must set channel indices for .prb files.
# Indicate "not connected" with a value of -1.
probe.set_device_channel_indices(np.arange(24))
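
If some contacts are not wired to a channel, the index array can be built with NumPy first. This is a sketch under stated assumptions: a hypothetical 24-contact probe where contacts 3 and 17 are unconnected and the remaining contacts map to consecutive device channels. Your actual wiring may differ, so check it against your hardware documentation.

```python
import numpy as np

n_contacts = 24
unconnected = [3, 17]                    # hypothetical unconnected contacts

# Start with every contact marked "not connected" (-1).
indices = np.full(n_contacts, -1)
# Assign consecutive device channels to the connected contacts.
connected = np.setdiff1d(np.arange(n_contacts), unconnected)
indices[connected] = np.arange(connected.size)

# Then pass the array to the probe defined earlier:
# probe.set_device_channel_indices(indices)
```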