Array Operations

Sliding Window

naplib.array_ops.sliding_window(arr, window_len, window_key_idx=0, fill_out_of_bounds=True, fill_value=0)

Extract windows of length window_len and put them into an array. Can be used for causal, anticausal, or noncausal windowing.

Parameters:

arr (np.ndarray, shape (time, feature_dims...)) -- Data to be windowed. Windowing is applied only across the first dimension, which is assumed to be time. All other dimensions are kept the same in the output.
window_len (int) -- Length of the sliding window.
window_key_idx (int, default=0, must be from 0 to window_len-1) -- Key point of a given sliding window. A value of 0 corresponds to causal sliding windows, where the first window_len-1 values in the nth window come before the nth point in arr and the last value is arr[n]. A value of window_len-1 corresponds to anti-causal sliding windows, where the first value in the nth window is arr[n] and the remaining window_len-1 values come after that point. A value of 1 returns windows where the nth window starts at arr[n-(window_len-2)] and ends at (and includes) arr[n+1].
fill_out_of_bounds (bool, default=True) -- If True, pads with fill_value wherever a window extends past the bounds of the array (before the beginning for causal windows, after the end for anti-causal windows), across all feature dimensions, so that the output has the same length as the input (i.e. there is one window for each time point in the original array; for causal windows, the first window contains only fill_value except for its last value). If False, no padding is applied, so the output has fewer windows than the input has time points.
fill_value (default=0) -- Value used to fill window positions that fall outside the bounds of arr when fill_out_of_bounds is True.

Returns:

windows -- Windowed array segments.

Return type:

np.ndarray, shape (n_samples, window_len, feature_dims...)

Examples

>>> import numpy as np
>>> from naplib.array_ops import sliding_window
>>> arr = np.arange(1,5)
>>> slide1 = sliding_window(arr, 3)
>>> slide2 = sliding_window(arr, 3, 0, False)
>>> slide3 = sliding_window(arr, 3, 2)
>>> slide4 = sliding_window(arr, 3, 1)
>>> print(slide1)
[[0. 0. 1.]
 [0. 1. 2.]
 [1. 2. 3.]
 [2. 3. 4.]]
>>> print(slide2)
[[1 2 3]
 [2 3 4]]
>>> print(slide3)
[[1. 2. 3.]
 [2. 3. 4.]
 [3. 4. 0.]
 [4. 0. 0.]]
>>> print(slide4)
[[0. 1. 2.]
 [1. 2. 3.]
 [2. 3. 4.]
 [3. 4. 0.]]
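
The three padded modes above differ only in how much padding goes before and after the array. The following is a minimal NumPy-only sketch of that idea. The helper sliding_window_sketch is hypothetical and written for illustration; it is not naplib's implementation, it covers only the fill_out_of_bounds=True case, and it assumes NumPy 1.20 or later for sliding_window_view.

```python
import numpy as np

def sliding_window_sketch(arr, window_len, window_key_idx=0, fill_value=0):
    """Illustrative stand-in (not naplib's code) for padded sliding windows."""
    arr = np.asarray(arr, dtype=float)
    # Number of samples in each window that lie before / after the key point.
    n_before = window_len - 1 - window_key_idx
    n_after = window_key_idx
    # Pad only the time axis (axis 0); leave feature dimensions untouched.
    pad_width = [(n_before, n_after)] + [(0, 0)] * (arr.ndim - 1)
    padded = np.pad(arr, pad_width, mode="constant", constant_values=fill_value)
    # sliding_window_view appends the window axis last; move it to position 1
    # so the result has shape (time, window_len, feature_dims...).
    windows = np.lib.stride_tricks.sliding_window_view(padded, window_len, axis=0)
    return np.moveaxis(windows, -1, 1)

arr = np.arange(1, 5)
causal = sliding_window_sketch(arr, 3)          # window_key_idx=0
anticausal = sliding_window_sketch(arr, 3, 2)   # window_key_idx=window_len-1
centered = sliding_window_sketch(arr, 3, 1)
print(causal)
```

With window_key_idx=0 all padding goes in front (causal), with window_key_idx=window_len-1 all padding goes at the end (anti-causal), and intermediate values split the padding between the two sides.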

Concatenate and Apply

naplib.array_ops.concat_apply(data_list, function, axis=0, function_kwargs=None)

Apply a function to a list of data by first concatenating the list into a single array along the axis dimension, passing it to the function, and then splitting the result back into a list of arrays with the same lengths as the inputs. The function must return an array whose size along the axis dimension is unchanged.

Parameters:

data_list (list of np.array's) -- Each array in the list must match in all dimensions except for axis so that they can be concatenated along that dimension.
function (Callable) -- A function which operates on an array. It must return an array whose size along the axis dimension is unchanged. For example, this could be something like sklearn.manifold.TSNE().fit_transform if axis=0, or your own custom function.
axis (int, default=0) -- Axis over which to concatenate and then re-split the data_list before and after applying the function.
function_kwargs (dict, default=None) -- If provided, a dict of keyword arguments to pass to the function.

Returns:

output -- List of arrays after chopping up the output of the function into arrays of the same length as the original input.

Return type:

list of np.ndarray's

Raises:

RuntimeError -- If the callable function changes the size of the concatenation/splitting axis.

Examples

>>> import numpy as np
>>> from naplib.array_ops import concat_apply
>>> data = [np.arange(20).reshape((5,4)), np.arange(20, 40).reshape((5,4))] # 2 trials, 5 samples with 4 channels
>>> data
[array([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]]),
 array([[20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35],
        [36, 37, 38, 39]])]

>>> # We can use PCA to reduce the channel dimensionality by fitting PCA on the
>>> # concatenated data, transforming it, and then splitting it back into 2 trials
>>> from sklearn.decomposition import PCA
>>> data_pca = concat_apply(data, PCA(2).fit_transform)
>>> data_pca
[array([[-3.60000000e+01,  8.63623587e-15],
        [-2.80000000e+01, -2.36903429e-15],
        [-2.00000000e+01, -1.34899193e-15],
        [-1.20000000e+01, -5.15542367e-16],
        [-4.00000000e+00, -4.16724783e-16]]),
 array([[4.00000000e+00, 4.16724783e-16],
        [1.20000000e+01, 5.15542367e-16],
        [2.00000000e+01, 1.34899193e-15],
        [2.80000000e+01, 2.36903429e-15],
        [3.60000000e+01, 3.01589107e-15]])]

>>> # We can downsample the channel dimension, making use of
>>> # the function_kwargs parameter
>>> from scipy.signal import resample
>>> downsampled_channels = concat_apply(data, resample, function_kwargs={'num': 3, 'axis': 1})
>>> downsampled_channels
[array([[ 0.5      ,  1.1339746,  2.8660254],
        [ 4.5      ,  5.1339746,  6.8660254],
        [ 8.5      ,  9.1339746, 10.8660254],
        [12.5      , 13.1339746, 14.8660254],
        [16.5      , 17.1339746, 18.8660254]]),
 array([[20.5      , 21.1339746, 22.8660254],
        [24.5      , 25.1339746, 26.8660254],
        [28.5      , 29.1339746, 30.8660254],
        [32.5      , 33.1339746, 34.8660254],
        [36.5      , 37.1339746, 38.8660254]])]
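
The concatenate, apply, and re-split pattern described above can be sketched with plain NumPy. The helper concat_apply_sketch and the zscore function below are hypothetical names used for illustration, and the sketch is not naplib's implementation. It shows why the function must preserve the size of the concatenation axis: the split points are computed from the original trial lengths.

```python
import numpy as np

def concat_apply_sketch(data_list, function, axis=0, function_kwargs=None):
    """Illustrative stand-in (not naplib's code) for concatenate-apply-split."""
    function_kwargs = function_kwargs or {}
    # Remember each trial's length along the concatenation axis.
    lengths = [x.shape[axis] for x in data_list]
    concatenated = np.concatenate(data_list, axis=axis)
    result = function(concatenated, **function_kwargs)
    if result.shape[axis] != concatenated.shape[axis]:
        raise RuntimeError("function changed the size of the concatenation axis")
    # Split at the cumulative trial boundaries (the last boundary is the end).
    split_points = np.cumsum(lengths)[:-1]
    return np.split(result, split_points, axis=axis)

def zscore(x):
    # Normalize each channel using statistics pooled over all trials.
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Two trials of different lengths (5 and 6 samples), 4 channels each.
trials = [np.arange(20.0).reshape(5, 4), np.arange(20.0, 44.0).reshape(6, 4)]
normalized = concat_apply_sketch(trials, zscore)
print([t.shape for t in normalized])
```

Because the statistics are computed on the concatenated array, every trial is normalized with the same mean and standard deviation, which is the main reason to use this pattern instead of applying the function trial by trial.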

Resample Categorical Data

naplib.array_ops.resample_categorical(arr, num)

Resample categorical data (i.e. integer labels) to a new number of samples.

Parameters:

arr (np.ndarray) -- Array to be resampled. Either shape (time,) or shape (time, features). Will resample along axis=0. Each feature is resampled independently.
num (int) -- Number of desired samples. Output will be of shape (num,) or (num, features), matching the input.

Returns:

resamp_arr -- Resampled data. Length = num

Return type:

np.ndarray

Examples

>>> from naplib.array_ops import resample_categorical
>>> import numpy as np
>>> # array of length 16 containing categorical values
>>> x = np.array([1,1,1,1,2,2,3,3,4,4,4,4,5,5,5,5])
>>> resample_categorical(x, num=8) # downsample
array([1., 1., 2., 3., 4., 4., 5., 5.])
>>> resample_categorical(x, num=20) # upsample
array([1., 1., 1., 1., 1., 2., 2., 2., 3., 3., 4., 4., 4., 4., 4., 5., 5.,
       5., 5., 5.])
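
Categorical labels should not be interpolated, since averaging two labels can produce a value that is not a valid category. One simple approach, shown here as an assumption about how such resampling can be done and not as naplib's actual algorithm, is to select for each output sample the input sample at the proportionally scaled index. The helper resample_categorical_sketch is a hypothetical name.

```python
import numpy as np

def resample_categorical_sketch(arr, num):
    """Illustrative index-selection resampling along axis 0 (not naplib's code)."""
    arr = np.asarray(arr)
    n = arr.shape[0]
    # Map output position i to input position floor(i * n / num), using
    # integer arithmetic to avoid floating-point rounding at the boundaries.
    idx = (np.arange(num) * n) // num
    # Indexing along axis 0 only ever copies existing labels.
    return arr[idx]

x = np.array([1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5])
down = resample_categorical_sketch(x, 8)    # downsample 16 -> 8
up = resample_categorical_sketch(x, 20)     # upsample 16 -> 20
print(down)
print(up)
```

Every output value is guaranteed to be one of the input labels, and the same indexing works for inputs of shape (time, features) because each feature column is sampled at the same time indices.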

Forward Fill

naplib.array_ops.forward_fill(arr, axis=0)

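Forward fill a numpy array along an axis, replacing each NaN with the most recent preceding non-NaN value along that axis. NaNs that have no preceding non-NaN value are left as NaN.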

Note, only 2-dimensional inputs are currently supported.

Parameters:

arr (np.ndarray) -- Array to forward fill.
axis (int, default=0) -- Axis over which to forward fill.

Returns:

filled_arr -- Array which is now forward filled

Return type:

np.ndarray

Examples

>>> import numpy as np
>>> from naplib.array_ops import forward_fill
>>> arr = np.nan*np.ones((5,4))
>>> arr[0,1] = 1
>>> arr[2,0] = 2
>>> arr[2,2] = 3
>>> arr
array([[nan,  1., nan, nan],
       [nan, nan, nan, nan],
       [ 2., nan,  3., nan],
       [nan, nan, nan, nan],
       [nan, nan, nan, nan]])
>>> # forward fill along axis=0
>>> forward_fill(arr, axis=0)
array([[nan,  1., nan, nan],
       [nan,  1., nan, nan],
       [ 2.,  1.,  3., nan],
       [ 2.,  1.,  3., nan],
       [ 2.,  1.,  3., nan]])
>>> # forward fill along axis=1
>>> forward_fill(arr, axis=1)
array([[nan,  1.,  1.,  1.],
       [nan, nan, nan, nan],
       [ 2.,  2.,  3.,  3.],
       [nan, nan, nan, nan],
       [nan, nan, nan, nan]])
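
Forward filling can be expressed without loops by tracking, for every position, the index of the most recent non-NaN sample. The sketch below uses that running-maximum trick. The helper forward_fill_sketch is a hypothetical name, and this is one common NumPy idiom rather than a description of naplib's implementation.

```python
import numpy as np

def forward_fill_sketch(arr, axis=0):
    """Illustrative forward fill along one axis (not naplib's code)."""
    arr = np.asarray(arr, dtype=float)
    moved = np.moveaxis(arr, axis, 0)  # work along the first axis
    n = moved.shape[0]
    # Column vector of positions 0..n-1, broadcastable against `moved`.
    positions = np.arange(n).reshape((n,) + (1,) * (moved.ndim - 1))
    # Keep the position where data is valid, otherwise 0.
    idx = np.where(~np.isnan(moved), positions, 0)
    # The running maximum yields the index of the latest valid sample so far.
    idx = np.maximum.accumulate(idx, axis=0)
    filled = np.take_along_axis(moved, idx, axis=0)
    return np.moveaxis(filled, 0, axis)

arr = np.nan * np.ones((5, 4))
arr[0, 1] = 1
arr[2, 0] = 2
arr[2, 2] = 3
filled_rows = forward_fill_sketch(arr, axis=0)
filled_cols = forward_fill_sketch(arr, axis=1)
print(filled_rows)
```

Positions before the first valid sample keep index 0, so they are filled with the first element itself and therefore stay NaN when that element is NaN.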

Center of Mass

naplib.array_ops.center_of_mass(*args, axis=0, interp_n=None)

Compute Center of Mass over an axis in an array.

Parameters:

x (np.ndarray, optional) -- Sorted 1D array giving the current, possibly uneven, sampling of x values for the axis over which to compute the center of mass. For example, the current sampling may be at x values [1,2,5,10], in which case the data are interpolated along that axis before the center of mass is computed. If this is not provided, the x values are assumed to be the integers 0, 1, ..., N-1, where N is the length of the axis.
y (np.ndarray) -- Array to compute center of mass over an axis.
axis (int, default=0) -- Axis over which to compute center of mass.
interp_n (int, default=None) -- Number of values to interpolate if currently using uneven sampling. If None, then new sampling will be every integer between the minimum x and the maximum x, so that the computed center of mass is valid automatically. However, if the current x values are floats and do not span a good range for interpolation (e.g. the x values are [0.2, 0.3, 0.6]), then this should be an integer like 10 instead to produce 10 samples between 0.2 and 0.6.

Returns:

x_vals (np.ndarray) -- X values with respect to which the center of mass is expressed. This is useful when the original, unevenly sampled x values are floats that do not translate well to integer samples. For example, if the output com_indices is 1.333 and x_vals is [0.1,0.2,0.3,0.4,0.5], then the center of mass is a third of the way between 0.2 and 0.3, i.e. approximately 0.233.
com_indices (np.ndarray) -- Array with same shape as y except missing the axis dimension which has been collapsed into a single center of mass value.
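
The relationship between com_indices and x_vals can be illustrated with a short NumPy sketch. It assumes the standard definition of center of mass as the weighted mean of the sample indices, with the array values as weights; whether naplib computes it in exactly this way is an assumption, and center_of_mass_sketch is a hypothetical helper.

```python
import numpy as np

def center_of_mass_sketch(y, axis=0):
    """Weighted mean of integer indices along `axis`, using y as weights."""
    y = np.asarray(y, dtype=float)
    # Shape the index vector so it broadcasts along the chosen axis.
    shape = [1] * y.ndim
    shape[axis] = y.shape[axis]
    indices = np.arange(y.shape[axis]).reshape(shape)
    return (indices * y).sum(axis=axis) / y.sum(axis=axis)

y = np.array([0.0, 2.0, 1.0, 0.0, 0.0])
com_index = center_of_mass_sketch(y)       # (1*2 + 2*1) / 3 = 1.333...

# Convert the fractional index into the units of the x values.
x_vals = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
com_x = np.interp(com_index, np.arange(len(x_vals)), x_vals)
print(com_index, com_x)
```

Here the fractional index 1.333 lies a third of the way between x_vals[1] and x_vals[2], which corresponds to roughly 0.233 in the units of x.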

Interpolate Along Axis

naplib.array_ops.interp_axis(new_x, x, y, axis=0)

Perform 1D interpolation along a specified axis of a multidimensional array.

Parameters:

new_x (np.ndarray) -- 1D array of new x values for interpolation.
x (np.ndarray) -- 1D array of original x values.
y (np.ndarray) -- Array of shape (..., N, ...) containing original data.
axis (int, default=0) -- Axis along which to perform interpolation.

Returns:

y_interp -- New y array with the axis having been interpolated

Return type:

np.ndarray
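
Interpolating along one axis of a multidimensional array amounts to applying a 1D interpolation to every 1D slice along that axis. The sketch below does this with np.apply_along_axis and np.interp, assuming linear interpolation and increasing x values. The helper interp_axis_sketch is hypothetical and is not naplib's implementation.

```python
import numpy as np

def interp_axis_sketch(new_x, x, y, axis=0):
    """Linear 1D interpolation of every slice of y along `axis`."""
    new_x = np.asarray(new_x, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.interp handles one 1D slice; apply_along_axis repeats it for all
    # slices and replaces the length-N axis with a length-len(new_x) axis.
    return np.apply_along_axis(lambda col: np.interp(new_x, x, col), axis, y)

x = np.array([0.0, 1.0, 2.0])
y = np.array([[0.0, 10.0],
              [2.0, 20.0],
              [4.0, 40.0]])          # shape (3, 2): 3 samples, 2 features
new_x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y_interp = interp_axis_sketch(new_x, x, y, axis=0)
print(y_interp.shape)
```

Only the interpolated axis changes length; all other dimensions of y are preserved.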