IO¶

Save¶

naplib.io.save(filename, obj, makedirs=False)[source]¶

Save object with pickle.

Parameters:

filename (string) -- File to save to. If it doesn't end with .pkl, the extension is added automatically.
obj (Object) -- Data to save.
makedirs (bool, default=False) -- Whether to create parent directories if they do not exist.

Examples

>>> from naplib.io import save, load
>>> arr = [1, 2, 3]
>>> save('data.pkl', arr)
>>> arr_loaded = load('data.pkl')
>>> arr_loaded
[1, 2, 3]
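
The filename and makedirs behavior described above can be illustrated with a standard-library sketch. The helper name save_sketch is hypothetical and this is not naplib's implementation; it only mirrors the documented behavior (append .pkl when missing, optionally create parent directories, then pickle the object).

```python
import os
import pickle
import tempfile


def save_sketch(filename, obj, makedirs=False):
    """Hypothetical stand-in for the documented behavior; not naplib's code."""
    # Append the .pkl extension when it is missing.
    if not filename.endswith('.pkl'):
        filename += '.pkl'
    parent = os.path.dirname(filename)
    # Create missing parent directories only when asked to.
    if makedirs and parent:
        os.makedirs(parent, exist_ok=True)
    with open(filename, 'wb') as f:
        pickle.dump(obj, f)
    return filename


with tempfile.TemporaryDirectory() as tmp:
    # No extension and two directory levels that do not exist yet.
    target = os.path.join(tmp, 'results', 'run1', 'data')
    written = save_sketch(target, [1, 2, 3], makedirs=True)
    with open(written, 'rb') as f:
        restored = pickle.load(f)
```

With makedirs=False, the same call would fail in this sketch because the parent directories are absent.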

Load¶

naplib.io.load(filename)[source]¶

Load object from saved file.

Parameters:

filename (string) -- File to load. If it doesn't end with .pkl, the extension is added automatically.

Returns:

output -- Loaded object.

Return type:

Object

Raises:

FileNotFoundError -- If the file cannot be found.

Examples

>>> from naplib.io import save, load
>>> arr = [1, 2, 3]
>>> save('data.pkl', arr)
>>> arr_loaded = load('data.pkl')
>>> arr_loaded
[1, 2, 3]
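
Because a missing file raises FileNotFoundError, callers may want a fallback. The following is a minimal standard-library sketch of that pattern; load_sketch and load_or_default are hypothetical helpers that only mimic the documented behavior and are not part of naplib.

```python
import os
import pickle
import tempfile


def load_sketch(filename):
    """Hypothetical stand-in for the documented behavior; not naplib's code."""
    if not filename.endswith('.pkl'):
        filename += '.pkl'
    # open() raises FileNotFoundError when the file does not exist.
    with open(filename, 'rb') as f:
        return pickle.load(f)


def load_or_default(filename, default=None):
    """Return the loaded object, or `default` when the file is missing."""
    try:
        return load_sketch(filename)
    except FileNotFoundError:
        return default


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'data.pkl'), 'wb') as f:
        pickle.dump([1, 2, 3], f)
    # The extension is added automatically, so 'data' resolves to 'data.pkl'.
    found = load_or_default(os.path.join(tmp, 'data'))
    missing = load_or_default(os.path.join(tmp, 'absent'), default=[])
```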

Import Data¶

naplib.io.import_data(filepath, strict=True, useloadmat=True, varname='out')[source]¶

Import Data object from MATLAB (.mat) format. This will automatically transpose the 'resp' and 'aud' fields so that they are shape (time, channels) for each trial. The MATLAB equivalent structure is a 1xN struct with N trials and some number of fields, and this is stored in the .mat file under the variable name "out".

Parameters:

filepath (string) -- Path to .mat file.
strict (bool, default=True) -- If True, requires strict adherence to the following standards: 1) each trial must contain at least the fields ['name','sound','soundf','resp','dataf']; 2) every trial must contain exactly the same set of fields.
useloadmat (boolean, default=True) -- If True, use hdf5storage.loadmat; otherwise use a custom h5py loader.
varname (string, default='out') -- Name of the variable containing the out structure to load.
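
The two rules enforced by strict=True can be sketched in plain Python. This is only an illustration under the assumption that each trial is represented as a dict of fields; check_strict is a hypothetical helper, and the real checks inside import_data may differ.

```python
REQUIRED_FIELDS = ('name', 'sound', 'soundf', 'resp', 'dataf')


def check_strict(trials):
    """Raise ValueError if a list of trial dicts breaks either strict rule."""
    if not trials:
        return
    reference = set(trials[0])
    for i, trial in enumerate(trials):
        # Rule 1: every required field must be present.
        missing = [f for f in REQUIRED_FIELDS if f not in trial]
        if missing:
            raise ValueError(f'trial {i} is missing required fields: {missing}')
        # Rule 2: all trials must share exactly the same set of fields.
        if set(trial) != reference:
            raise ValueError(f'trial {i} has a different set of fields than trial 0')


base = {'name': 't0', 'sound': [], 'soundf': 16000, 'resp': [], 'dataf': 100}
good_trials = [dict(base), dict(base, name='t1')]
bad_trials = [dict(base), dict(base, name='t1', extra=1)]

check_strict(good_trials)  # passes silently
try:
    check_strict(bad_trials)
    bad_rejected = False
except ValueError:
    bad_rejected = True
```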

Returns:

data

Return type:

naplib.Data object

Notes

Given the highly-specific nature of the Data object Matlab format, this function is mostly used internally by Neural Acoustic Processing Lab members.

Export Data¶

naplib.io.export_data(filepath, data, fmt='7.3')[source]¶

Export a naplib.Data instance to the MATLAB-compatible equivalent (.mat file). The MATLAB equivalent structure is a 1xN struct with N trials and some number of fields, and this is stored in the .mat file under the variable name "out". This function will automatically transpose the 'resp' and 'aud' fields for each trial in the .mat file, thus undoing the actions of import_data.

Parameters:

filepath (string) -- Filename or path-like specifying where to save the file.
data (Data instance) -- Data to export.
fmt (str, default='7.3') -- MATLAB file format. Options are {'7.3','7','6'}

Load EDF¶

naplib.io.load_edf(path: str, t1: float = 0, t2: float = 0) → Dict[source]¶

Load data from EDF file (*.edf).

Notes

This function supports the original EDF format and EDF+C. For the EDF+D format it may be better to use PyEDFlib or mne.io.

Parameters:

path (str, path-like) -- Path to EDF data file
t1 (float, default=0) -- Starting time to extract
t2 (float, default=0) -- Ending time to extract until (default of 0 extracts until end of file)

Returns:

loaded_dict -- Keys: 'data' - loaded neural recording (time*channels), 'data_f' - sampling rate of data, 'wav' - loaded audio recording (time*channels), 'wav_f' - sampling rate of sound, 'labels_data' - array of labels for the channel streams, 'labels_wav' - array of labels for the audio streams

Return type:

dict from string to numpy array or float
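
The t1 and t2 convention (with t2=0 meaning "until end of file") can be made concrete by converting a time window into sample indices. This is a sketch under assumptions: time_window_to_samples is a hypothetical helper, and the rounding and clipping choices here are illustrative rather than taken from load_edf.

```python
def time_window_to_samples(t1, t2, fs, n_samples):
    """Convert a (t1, t2) window in seconds into (start, stop) sample indices.

    t2 == 0 is treated as "extract until the end of the recording".
    """
    start = int(round(t1 * fs))
    if t2 == 0:
        stop = n_samples
    else:
        # Clip so the window never runs past the end of the recording.
        stop = min(int(round(t2 * fs)), n_samples)
    return start, stop


# A 10-second recording sampled at 100 Hz has 1000 samples.
whole_file = time_window_to_samples(0, 0, fs=100, n_samples=1000)
middle_part = time_window_to_samples(2, 5, fs=100, n_samples=1000)
```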

Load NWB¶

naplib.io.load_nwb(filepath: str) → Dict[source]¶

Load data from an NWB file. The filepath argument should be a string path to a .nwb file.

Parameters:

filepath (str, path-like) -- Path to NWB data file.

Returns:

loaded_dict -- Keys: 'data' - loaded neural recording (time*channels), 'data_f' - sampling rate of data, 'wav' - loaded audio recording (time*channels), 'wav_f' - sampling rate of sound, 'labels' - array of labels for the channel streams

Return type:

dict from string to numpy array or float

Load TDT¶

naplib.io.load_tdt(directory: str, t1: float = 0, t2: float = 0, wav_stream: str = 'Wav5') → Dict[source]¶

Load data from TDT structure. Directory should contain .sev files and a .tev file, as well as other metadata files.

Parameters:

directory (str, path-like) -- Directory containing TDT data files (tev, sev, and/or tin files, etc.)
t1 (float, default=0) -- Starting time to extract
t2 (float, default=0) -- Ending time to extract until (default of 0 extracts until end of file)
wav_stream (str, default='Wav5') -- The name of the stream containing the audio recording to extract.

Returns:

loaded_dict -- Keys: 'data' - loaded neural recording (time*channels), 'data_f' - sampling rate of data, 'wav' - loaded audio recording (time*channels), 'wav_f' - sampling rate of sound, 'labels_data' - array of labels for the channel streams, 'labels_wav' - array of labels for the audio streams

Return type:

dict from string to numpy array or float

Load BIDS¶

naplib.io.load_bids(root, subject, datatype, task, suffix, run=None, session=None, extension=None, check=True, befaft=[0, 0], crop_by='onset', info_include=['sfreq', 'ch_names'], resp_channels=None)[source]¶

Load data from the BIDS file structure [1] to create a Data object. The BIDS file structure is a commonly used structure for storing neural recordings such as EEG, MEG, or iEEG.

The channels in the BIDS files are either stored in the 'resp' field of the Data object or the 'stim' field, depending on whether the channel_type is 'stim'.

Parameters:

root (string, path-like) -- Root directory of BIDS file structure.
datatype (string) -- Likely one of ['meg','eeg','ieeg'].
task (string) -- Task name.
suffix (string) -- Suffix name in file naming. This is often the same as datatype.
run (string) -- Run name.
session (string) -- Session name.
extension (string) -- The extension of the filename. E.g., '.tsv'.
check (bool) -- If True, enforces BIDS conformity. Defaults to True.
befaft (list or array-like of length 2, default=[0, 0]) -- Amount of time (in seconds) to include before and after each trial's true duration. For example, if befaft=[1,1] and each trial's recording is 10 seconds long, each trial in the resulting Data object will contain 12 seconds of data: 1 second of recording before the onset of the event and 1 second after the end of the event.
crop_by (string, default='onset') -- One of ['onset', 'durations']. If crop by 'onset', each trial is split by the onset of each event defined in the BIDS file structure and each trial ends when the next trial begins. If crop by 'durations', each trial is split by the onset of each event defined in the BIDS file structure and each trial lasts the duration specified by the event. This is typically not desired when the events are momentary stimulus presentations that have very short duration because only the responses during the short duration of the event will be saved, and all of the following responses are truncated.
info_include (list of strings, default=['sfreq', 'ch_names']) -- List of metadata info to include from the raw info. For example, you may wish to include other items such as 'file_id', 'line_freq', etc., for later use, if they are stored in the BIDS data.
resp_channels (list, default=None) -- List of channel names to select as response channels to be put in the 'resp' field of the Data object. By default, all channels which are not of type 'stim' will be included. Note, the order of these channels may not be conserved.
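
How crop_by and befaft interact can be sketched by computing trial windows from event onsets and durations. This is an illustration only: trial_windows is a hypothetical helper, and the handling of the last trial and the clipping to the recording bounds are assumptions, not documented behavior of load_bids.

```python
def trial_windows(onsets, durations, recording_end, befaft=(0, 0), crop_by='onset'):
    """Return (start, stop) times in seconds for each trial."""
    bef, aft = befaft
    windows = []
    for i, onset in enumerate(onsets):
        if crop_by == 'onset':
            # Each trial runs until the next event begins
            # (assumed: the last trial runs to the end of the recording).
            end = onsets[i + 1] if i + 1 < len(onsets) else recording_end
        elif crop_by == 'durations':
            # Each trial lasts exactly as long as its event.
            end = onset + durations[i]
        else:
            raise ValueError("crop_by must be 'onset' or 'durations'")
        # Pad with befaft, clipped to the recording (an assumption).
        windows.append((max(onset - bef, 0), min(end + aft, recording_end)))
    return windows


onsets = [5, 20, 35]
durations = [10, 10, 10]
by_duration = trial_windows(onsets, durations, 60, befaft=(1, 1), crop_by='durations')
by_onset = trial_windows(onsets, durations, 60, crop_by='onset')
```

In this sketch each 10-second event cropped by 'durations' with befaft=(1, 1) spans 12 seconds, matching the example in the befaft description.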

Returns:

out -- Event/trial responses, stim, and other basic data in naplib.Data format.

Return type:

Data

Notes

The measurement information that is read-in by this function is stored in the Data.mne_info attribute. This info can be used in conjunction with mne's visualization functions.

References

Load CND¶

naplib.io.load_cnd(filepath: str, load_stims: bool | str = True, truncate_lengths: bool = True, connectivity: str | Sequence | float | None = None)[source]¶

Load continuous neural data (CND) file used in the mTRF-Toolbox.

Parameters:

filepath (str) -- Path to the data file (*.mat). This can be either the stim data or the eeg data.
load_stims (Union[bool, str], default=True) -- If True (default), try to load stimuli from an inferred filepath by looking for dataStimXX.mat, where XX is the subject number parsed from filepath, or fall back on dataStim.mat, under the same directory as filepath. Optionally, the exact path to the stim file can be specified. If False, only the file specified by filepath is loaded. This argument is ignored if stim is contained in the data loaded from filepath.
truncate_lengths (bool, default=True) -- If True, and there are both eeg and stim data loaded, truncate the lengths of the eeg and all the stimuli to match each other. The beginnings of all features and eeg are assumed to be aligned, and the end are truncated to the same length on a trial-by-trial basis.
connectivity (Optional[Union[str, Sequence, float]], default=None) -- Sensor adjacency graph for EEG sensors. By default, the function tries to use the deviceName entry and falls back on distance-based connectivity for unknown devices. Can be explicitly specified as a FieldTrip neighbor file (e.g., 'biosemi64'). Use a float for distance-based connectivity. This connectivity info will be put into the info attribute of the naplib.Data instance returned.
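
The trial-by-trial truncation described for truncate_lengths can be sketched as follows. This assumes time is the first axis and uses plain lists in place of arrays; truncate_trial is a hypothetical helper, not the loader's actual code.

```python
def truncate_trial(eeg, stims):
    """Truncate one trial's eeg and all of its stimuli to a common length.

    eeg: sequence of time samples.
    stims: dict mapping stimulus name to a sequence of time samples.
    Beginnings are assumed aligned, so only the ends are cut.
    """
    n = min([len(eeg)] + [len(s) for s in stims.values()])
    return eeg[:n], {name: s[:n] for name, s in stims.items()}


eeg = list(range(10))                      # 10 time samples
stims = {'envelope': list(range(8)),       # 8 time samples
         'onsets': list(range(12))}        # 12 time samples
eeg_cut, stims_cut = truncate_trial(eeg, stims)
lengths = [len(eeg_cut)] + [len(s) for s in stims_cut.values()]
```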

Returns:

data -- Data containing the various trials loaded from the file, as well as all associated metadata for each trial. Some metadata, including connectivity, is located in the info attribute of the Data object.

Return type:

Data

Notes

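If stimuli and eeg are not the same length, it will be assumed that they are aligned at the beginning, and (when truncate_lengths=True) their ends are truncated to a common length.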

This loading function is modified from the read_cnd function found in Eelbrain (https://eelbrain.readthedocs.io/en/stable/index.html).

Read HTK¶

naplib.io.read_htk(filename, return_codes=False)[source]¶

Read an HTK file.

Parameters:

filename (str, path-like) -- Path to file or filename to read (should end in .htk)
return_codes (bool, default=False) -- If True, also return type_code and data_type.

Returns:

data (np.ndarray) -- Data array in the file (time * channels)
fs (int) -- Sampling rate (Hz)
type_code (int) -- Type code of the file. Only returned if return_codes=True. See voicebox's readhtk (http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/mdoc/v_mfiles/v_readhtk.html) for details.
data_type (str) -- Data type of the file. Only returned if return_codes=True.

Examples

>>> from naplib.io import read_htk
>>> data, fs = read_htk('example.htk')
>>> data.shape
(200000, 1)
>>> fs
2400
>>> data, fs, tc, dt = read_htk('example.htk', return_codes=True)
>>> tc
8971
>>> dt
'PLP_D_A_0'

Notes

Not implemented types:

CRC checking -- Files can have a CRC, but it won't be checked for correctness.
VQ -- Vector features are not implemented.
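
The relationship between type_code and data_type in the example above (8971 and 'PLP_D_A_0') can be reproduced with a short decoder. This sketch assumes the standard HTK parameter-kind encoding, where the low 6 bits select the base kind and higher bits are qualifier flags; decode_type_code is a hypothetical helper and is not necessarily how read_htk derives the string.

```python
# Base parameter kinds, indexed by the low 6 bits of the type code.
BASE_KINDS = ['WAVEFORM', 'LPC', 'LPREFC', 'LPCEPSTRA', 'LPDELCEP', 'IREFC',
              'MFCC', 'FBANK', 'MELSPEC', 'USER', 'DISCRETE', 'PLP', 'ANON']

# Qualifier bits (octal, as in the HTK format) and their suffixes.
QUALIFIERS = [(0o100, '_E'), (0o200, '_N'), (0o400, '_D'), (0o1000, '_A'),
              (0o2000, '_C'), (0o4000, '_Z'), (0o10000, '_K'), (0o20000, '_0')]


def decode_type_code(type_code):
    """Build a data-type string such as 'PLP_D_A_0' from an integer type code."""
    name = BASE_KINDS[type_code & 0o77]
    for bit, suffix in QUALIFIERS:
        if type_code & bit:
            name += suffix
    return name


decoded = decode_type_code(8971)  # 8971 = 8192 (_0) + 512 (_A) + 256 (_D) + 11 (PLP)
```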

Load Directory of Wav Files¶

naplib.io.load_wav_dir(directory: str, pattern: str | None = None, rescale: bool = True, subset: Set[str] | None = None) → Dict[str, Tuple[float, ndarray]][source]¶

Load a set of wav files in a directory and return them in a dict mapping from filename (without the .wav suffix) to tuples of floats and numpy arrays containing the sampling rate and wav data.

Parameters:

directory (str, path-like) -- Directory containing wav files. All wav files will be loaded and all other files will be ignored
pattern (str, optional) -- If provided, should be a regex pattern which will be used to match against the wav files found in the directory. For example, if pattern=r".*_stim.*", then only the wav files whose base name contains "_stim" will be loaded.
rescale (bool, default=True) -- If True, convert each input to a float in the range -1 to 1 based on the max value of the loaded dtype. For example, a wav file stored as 16-bit integers will be rescaled to np.float32 between -1 and 1 by dividing by 32768.0. This is only done on wav files that are integer types. If True, output is always of type np.float32.
subset (Set[str], default=None) -- If provided, only this subset of files will be loaded.
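
The file selection and rescaling rules above can be sketched without reading any audio. Both helpers are hypothetical, and two details are assumptions: that pattern is applied with re.match to the base name, and that subset holds base names without the .wav suffix.

```python
import re


def select_wavs(filenames, pattern=None, subset=None):
    """Return the base names of the wav files that would be loaded."""
    selected = []
    for fname in sorted(filenames):
        # Non-wav files are ignored.
        if not fname.lower().endswith('.wav'):
            continue
        base = fname[:-4]
        if pattern is not None and re.match(pattern, base) is None:
            continue
        if subset is not None and base not in subset:
            continue
        selected.append(base)
    return selected


def rescale_int16(samples):
    """Map 16-bit integer samples into the range -1 to 1."""
    return [s / 32768.0 for s in samples]


files = ['a_stim_1.wav', 'b_resp.wav', 'notes.txt', 'c_stim_2.wav']
stim_only = select_wavs(files, pattern=r'.*_stim.*')
one_file = select_wavs(files, subset={'b_resp'})
scaled = rescale_int16([-32768, 0, 16384])
```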

Returns:

loaded_dict

Return type:

dict from string to tuple of float (fs) and numpy array (wav data)

Load Sample Speech Task Dataset¶

naplib.io.load_speech_task_data()[source]¶

Load a sample Data object containing simulated intracranial EEG data from a speech task where a human subject listened to audiobook excerpts. The data contains 10 trials, each with a single-channel audio waveform, 10 simulated channels of electrodes, a transcript of the speech in the audio, and a 128-channel auditory spectrogram.

The electrode responses were simulated by adding noise to the predictions of 10 different spectro-temporal receptive field models.

Returns:

data -- Task data containing 10 trials of stimuli, responses, and metadata for a simulated intracranial EEG recording.

Return type:

naplib.Data instance

Examples

>>> from naplib.io import load_speech_task_data
>>> data = load_speech_task_data()
>>> type(data)
naplib.data.Data
>>> len(data)
10
>>> data.fields
['name',
 'sound',
 'soundf',
 'dataf',
 'duration',
 'befaft',
 'resp',
 'aud',
 'script',
 'chname']