Code Reference
kymata
kymata.plot.plot
Functions:
-
expression_plot
–Generates a plot of transform expressions over time with optional display customizations.
-
hide_axes
–Hide all axes markings from a pyplot.Axes.
-
legend_display_dict
–Creates a dictionary for the legend_display parameter of expression_plot().
-
plot_top_five_channels_of_gridsearch
–Generates correlation and p-value plots showing the top five channels of the gridsearch.
expression_plot
expression_plot(expression_set: ExpressionSet, show_only: Optional[str | Sequence[str]] = None, paired_axes: bool = True, alpha: float = 1 - cdf(5), color: Optional[str | dict[str, str] | list[str]] = None, ylim: Optional[float] = None, xlims: Optional[tuple[float | None, float | None]] = None, hidden_transforms_in_legend: bool = True, title: str = None, fig_size: tuple[float, float] = (12, 7), minimap: bool = False, minimap_view: str = 'lateral', minimap_surface: str = 'inflated', show_only_sensors: Optional[Literal['eeg', 'meg']] = None, minimap_latency_range: Optional[tuple[float | None, float | None]] = None, save_to: Optional[Path] = None, overwrite: bool = True, show_legend: bool = True, legend_display: dict[str, str] | None = None) -> Figure
Generates a plot of transform expressions over time with optional display customizations.
Parameters:
-
expression_set
ExpressionSet
) –The set of expressions to plot, containing transforms and associated data.
-
show_only
Optional[str | Sequence[str]]
, default:None
) –A string or a sequence of strings specifying which transforms to plot. If None, all transforms in the expression_set will be plotted. Default is None.
-
paired_axes
bool
, default:True
) –When True, shows the expression plot split into left and right axes. When False, all points are shown on the same axis. Default is True.
-
alpha
float
, default:1 - cdf(5)
) –Significance level for statistical tests, defaulting to a 5-sigma threshold.
-
color
Optional[str | dict[str, str] | list[str]]
, default:None
) –Color settings for the plot. Can be a single color, a dictionary mapping transform names to colors, or a list of colors. Default is None.
-
ylim
Optional[float]
, default:None
) –The y-axis limit (p-value). Use log10 of the desired value — e.g. if the desired limit is 10^-100, supply ylim=-100. If None, it will be determined automatically. Default is None.
-
xlims
tuple[Optional[float], Optional[float]]
, default:None
) –The x-axis limits as a tuple (in ms). Pass None to use the default limits of (-100, 800), or set either entry to None to use the default for that entry. Default is None.
-
hidden_transforms_in_legend
bool
, default:True
) –If True, includes non-plotted transforms in the legend. Default is True.
-
title
str
, default:None
) –Title over the top axis in the figure. Default is None.
-
fig_size
tuple[float, float]
, default:(12, 7)
) –Figure size in inches. Default is (12, 7).
-
minimap
bool
, default:False
) –If True, displays a minimap of the expression data. Default is False.
-
minimap_view
str
, default:'lateral'
) –The view type for the minimap. Valid options are:
"lateral": From the left or right side such that the lateral (outside) surface of the given hemisphere is visible.
"medial": From the left or right side such that the medial (inside) surface of the given hemisphere is visible (at least when in split or single-hemi mode).
"rostral": From the front.
"caudal": From the rear.
"dorsal": From above, with the front of the brain pointing up.
"ventral": From below, with the front of the brain pointing up.
"frontal": From the front and slightly lateral, with the brain slightly tilted forward (yielding a view from slightly above).
"parietal": From the rear and slightly lateral, with the brain slightly tilted backward (yielding a view from slightly above).
"axial": From above with the brain pointing up (same as 'dorsal').
"sagittal": From the right side.
"coronal": From the rear.
Default is "lateral".
-
minimap_surface
str
, default:'inflated'
) –The surface type for the minimap, such as "inflated". Default is "inflated".
-
show_only_sensors
str
, default:None
) –Show only one type of sensor: "meg" for MEG sensors, "eeg" for EEG sensors, or None to show all sensors. Supplying this value with anything other than a SensorExpressionSet raises an exception. Default is None.
-
minimap_latency_range
Optional[tuple[float | None, float | None]]
, default:None
) –Supply (start_time, stop_time) to restrict the minimap view to the specified time window and highlight that window on the expression plot. Both start_time and stop_time are in seconds. Set start_time or stop_time to None for half-open intervals.
-
save_to
Optional[Path]
, default:None
) –Path to save the generated plot. If None, the plot is not saved. Default is None.
-
overwrite
bool
, default:True
) –If True, overwrite the existing file if it exists. Default is True.
-
show_legend
bool
, default:True
) –If True, displays the legend. Default is True.
-
legend_display
dict[str, str] | None
, default:None
) –Allows grouping of multiple transforms under the same legend item. Provide a dictionary mapping true transform names to display names. None applies no grouping. Default is None.
Returns:
-
Figure
–pyplot.Figure: The matplotlib figure object containing the generated plot.
Raises:
-
FileExistsError
–If the file already exists at save_to and overwrite is set to False.
Notes
The function plots the expression data with options to customize the appearance and statistical significance thresholds. It supports different data types (e.g., HexelExpressionSet, SensorExpressionSet) and can handle paired axes for left/right hemisphere data.
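The default alpha of 1 - cdf(5) and the log10 convention for ylim can be checked numerically. The following is a minimal sketch using only the standard library. It assumes that cdf is the standard normal CDF (suggested by the "5-sigma" wording above); n_sigma_alpha is a hypothetical helper and is not part of kymata.

```python
import math
from statistics import NormalDist


def n_sigma_alpha(n_sigma: float) -> float:
    """One-sided tail probability beyond n_sigma standard deviations.

    Hypothetical helper for illustration; assumes a standard normal CDF.
    """
    return 1 - NormalDist().cdf(n_sigma)


alpha = n_sigma_alpha(5)         # roughly 2.87e-07
log10_alpha = math.log10(alpha)  # position of the threshold on a log10 p-value axis

# ylim is supplied as log10 of the p-value limit: 10^-100 becomes -100.
ylim = -100
```

On a log10 p-value axis, the 5-sigma threshold therefore sits at about -6.5, far above a ylim of -100.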
legend_display_dict
legend_display_dict(transforms: list[str], display_name) -> dict[str, str]
Creates a dictionary for the legend_display parameter of expression_plot().
This function maps each transform name in the provided list to a single display name, which can be used to group multiple transforms under one legend item in the plot.
Parameters:
-
transforms
list[str]
) –A list of transform names to be grouped under the same display name.
-
display_name
str
) –The display name to be used for all transforms in the list.
Returns:
-
dict[str, str]
–dict[str, str]: A dictionary mapping each transform name to the provided display name.
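The mapping described above can be illustrated with a one-line dictionary comprehension. This is a sketch of the documented behaviour, not the library's source; legend_display_dict_sketch and the transform names are hypothetical.

```python
def legend_display_dict_sketch(transforms: list[str], display_name: str) -> dict[str, str]:
    """Map every transform name to the same display name (illustrative stand-in)."""
    return {transform: display_name for transform in transforms}


# Hypothetical transform names, grouped under one legend entry.
grouping = legend_display_dict_sketch(["IL1", "IL2", "IL3"], "Loudness")
```

A dictionary shaped like grouping is what the legend_display parameter of expression_plot() is documented to accept.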
plot_top_five_channels_of_gridsearch
plot_top_five_channels_of_gridsearch(latencies: NDArray, corrs: NDArray, transform: Transform, n_samples_per_split: int, n_reps: int, n_splits: int, auto_corrs: NDArray, log_pvalues: any, save_to: Optional[Path] = None, overwrite: bool = True)
Generates correlation and p-value plots showing the top five channels of the gridsearch.
Parameters:
-
latencies
NDArray[any]
) –Array of latency values (e.g., time points in milliseconds) for the x-axis of the plots.
-
corrs
NDArray[any]
) –Correlation coefficients array with shape (n_channels, n_conditions, n_splits, n_time_steps).
-
transform
Transform
) –The transform object whose name attribute will be used in the plot title.
-
n_samples_per_split
int
) –Number of samples per split used in the grid search.
-
n_reps
int
) –Number of repetitions in the grid search.
-
n_splits
int
) –Number of splits in the grid search.
-
auto_corrs
NDArray[any]
) –Auto-correlation values array used for plotting the transform auto-correlation.
-
log_pvalues
any
) –Array of log-transformed p-values for each channel and time point.
-
save_to
Optional[Path]
, default:None
) –Path to save the generated plot. If None, the plot is not saved. Default is None.
-
overwrite
bool
, default:True
) –If True, overwrite the existing file if it exists. Default is True.
Raises:
-
FileExistsError
–If the file already exists at save_to and overwrite is set to False.
Notes
The function generates two subplots:
- The first subplot shows the correlation coefficients over latencies for the top five channels.
- The second subplot shows the corresponding p-values for these channels.
kymata.gridsearch.plain
Functions:
-
do_gridsearch
–Perform a grid search over all hexels for all latencies using EMEG data and a given transform.
do_gridsearch
do_gridsearch(emeg_values: NDArray, transform: Transform, channel_names: list, channel_space: str, start_latency: float, emeg_t_start: float, stimulus_shift_correction: float, stimulus_delivery_latency: float, emeg_sample_rate: float, plot_location: Optional[Path] = None, n_derangements: int = 1, seconds_per_split: float = 1, n_splits: int = 400, n_reps: int = 1, plot_top_five_channels: bool = False, overwrite: bool = True) -> ExpressionSet
Perform a grid search over all hexels for all latencies using EMEG data and a given transform.
This function processes EMEG data to compute the correlation between sensor or source signals and a specified transform across multiple latencies. The results include statistical significance testing and optional plotting.
Parameters:
-
emeg_values
NDArray
) –A 3D array of EMEG values with shape (channels, reps, time).
-
transform
Transform
) –The transform against which the EMEG data will be correlated. It should have a values attribute representing the transform's values and a sample_rate attribute indicating its sample rate.
-
channel_names
list
) –List of channel names corresponding to the EMEG data. For 'sensor' space, it is a flat list of sensor names. For 'source' space, it is a list containing two lists: left hemisphere and right hemisphere hexel names.
-
channel_space
str
) –The type of channel space used, either 'sensor' or 'source'.
-
start_latency
float
) –The starting latency for the grid search in milliseconds.
-
emeg_t_start
float
) –The starting time of the EMEG data in milliseconds.
-
stimulus_shift_correction
float
) –Correction factor for stimulus shift in seconds per second.
-
stimulus_delivery_latency
float
) –Correction offset for stimulus delivery in seconds.
-
plot_location
Optional[Path]
, default:None
) –Path to save the plot of the top five channels of the grid search. If None, plotting is skipped. Default is None.
-
emeg_sample_rate
float
) –The sample rate of the EMEG data in Hertz.
-
n_derangements
int
, default:1
) –Number of derangements (random permutations) used to create the null distribution. Default is 1.
-
seconds_per_split
float
, default:1
) –Duration of each split in seconds. Default is 1 second.
-
n_splits
int
, default:400
) –Number of splits used for analysis. Default is 400.
-
n_reps
int
, default:1
) –Number of repetitions for each split. Default is 1.
-
plot_top_five_channels
bool
, default:False
) –Plots the p-values and correlation values of the top five channels in the gridsearch. Default is False.
-
overwrite
bool
, default:True
) –Whether to overwrite existing plot files. Default is True.
Returns:
-
ExpressionSet
(ExpressionSet
) –An ExpressionSet object (either SensorExpressionSet or HexelExpressionSet) containing the log p-values for each channel/hexel and latency.
Notes
- The function down-samples the EMEG data to match the transform's sample rate.
- The EMEG data is reshaped into segments of the specified duration (seconds_per_split).
- Cross-correlations between the EMEG data and the transform are computed using FFT.
- Statistical significance is assessed using a vectorized Welch's t-test.
- If specified, the results are plotted and saved to the given location.
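The notes above mention Welch's t-test. As a reminder of what that statistic computes, here is a minimal scalar sketch in plain Python. It is not vectorized and is not kymata's implementation; welch_t and the sample values are illustrative assumptions.

```python
import math
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    var_a = variance(a) / len(a)  # squared standard error of mean(a)
    var_b = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


# Two made-up samples with unequal variances.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

Unlike Student's t-test, this form does not assume that the two samples share a variance, which is why the degrees of freedom are generally not an integer.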
kymata.io.nkg
Functions:
-
load_expression_set
–Loads an ExpressionSet from the specified path(s) or open file.
-
save_expression_set
–Save the given ExpressionSet to a specified path or an already open file.
load_expression_set
load_expression_set(from_path_or_file: PathType | FileType | list[PathType]) -> ExpressionSet
Loads an ExpressionSet from the specified path(s) or open file.
The function determines the type of ExpressionSet (HexelExpressionSet or SensorExpressionSet) based on the data loaded from the provided path or file. It then constructs and returns an instance of the appropriate ExpressionSet subclass.
Parameters:
-
from_path_or_file
PathType | FileType | list[PathType]
) –The path, file, or list of paths from which to load the data.
Returns:
-
ExpressionSet
(ExpressionSet
) –An instance of either HexelExpressionSet or SensorExpressionSet, depending on the type identifier in the data.
Raises:
-
KeyError
–If required keys are missing in the data dictionary.
-
ValueError
–If the type identifier is not recognized.
save_expression_set
save_expression_set(expression_set: ExpressionSet, to_path_or_file: PathType | FileType, compression=ZIP_LZMA, overwrite: bool = False)
Save the given ExpressionSet to a specified path or an already open file.
This function saves the ExpressionSet data into a compressed file format. If a file path is provided, it creates and writes to the file. If an open file is supplied, it should be opened in "wb" mode. The overwrite flag is ignored if an open file is supplied.
Parameters:
-
expression_set
ExpressionSet
) –The ExpressionSet object to be saved.
-
to_path_or_file
PathType | FileType
) –The path or open file where the ExpressionSet will be saved.
-
compression
The compression method to use (default is ZIP_LZMA).
-
overwrite
bool
, default:False
) –If True, allows overwriting an existing file (default is False).
Raises:
-
FileExistsError
–If the specified path already exists and overwrite is False.
-
TypeError
–If the provided path or file type is invalid.
Notes
- The compression parameter should be compatible with the ZipFile class.
- The function writes various metadata and data blocks in a structured format within the zip file.
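The overwrite behaviour and the use of a ZipFile-compatible compression constant can be sketched with the standard library alone. The function below is a hypothetical stand-in: save_blocks, the block name, and the archive contents are illustrative assumptions and do not reflect the real file layout.

```python
import tempfile
from pathlib import Path
from zipfile import ZIP_LZMA, ZipFile


def save_blocks(blocks: dict[str, bytes], to_path: Path, overwrite: bool = False) -> None:
    """Write named byte blocks into an LZMA-compressed zip archive.

    Hypothetical stand-in: block names and layout are not the real format.
    """
    if to_path.exists() and not overwrite:
        raise FileExistsError(to_path)
    with ZipFile(to_path, "w", compression=ZIP_LZMA) as zf:
        for name, payload in blocks.items():
            zf.writestr(name, payload)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.zip"
    save_blocks({"block_a.txt": b"demo"}, target)
    try:
        # Second write to the same path; overwrite defaults to False.
        save_blocks({"block_a.txt": b"demo"}, target)
        refused = False
    except FileExistsError:
        refused = True
    with ZipFile(target) as zf:
        names = zf.namelist()
```

Defaulting overwrite to False, as the documented signature does, means an existing results file is never replaced silently.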
kymata.io.config
Functions:
-
get_root_dir
–Get the root directory based on the configuration parameters.
-
load_config
–Load configuration parameters from a specified path or file.
-
modify_param_config
–Modify a specific configuration parameter in the given configuration file.
get_root_dir
get_root_dir(config: dict) -> str
Get the root directory based on the configuration parameters.
This function returns the appropriate root directory path based on the 'data_location' parameter in the provided configuration dictionary.
Parameters:
-
config
dict
) –The configuration dictionary containing the 'data_location' parameter.
Returns:
-
str
(str
) –The root directory path corresponding to the 'data_location' parameter.
Raises:
-
ValueError
–If the 'data_location' parameter is not 'local', 'cbu', or 'cbu-local'.
load_config
load_config(config_location: PathType | FileType)
Load configuration parameters from a specified path or file.
This function reads the configuration parameters from a YAML file located at the given path or open file.
Parameters:
-
config_location
PathType | FileType
) –The path to the configuration file or an open file object.
Returns:
-
dict
–The configuration parameters loaded from the file.
Raises:
-
FileNotFoundError
–If the specified path does not exist.
-
YAMLError
–If there is an error in parsing the YAML file.
modify_param_config
modify_param_config(config_location: str, key: str, value)
Modify a specific configuration parameter in the given configuration file.
This function updates the value of a specified key in the configuration file and saves the changes.
Parameters:
-
config_location
str
) –The path to the configuration file.
-
key
str
) –The key of the configuration parameter to be modified.
-
value
The new value to be assigned to the specified key.
Raises:
-
FileNotFoundError
–If the specified configuration file does not exist.
-
YAMLError
–If there is an error in parsing the YAML file.
kymata.ippm.plot
Functions:
-
plot_denoised_vs_noisy
–Utility function to plot the noisy and denoised versions. It runs the supplied clusterer and then copies the denoised spikes, which are fed into a stem plot.
-
plot_ippm
–Plots an acyclic, directed graph using the graph held in graph. Edges are generated using BSplines.
-
plot_k_dist_1D
–This could be optimised further but since we aren't using it, we can leave it as it is.
-
stem_plot
–Plots a stem plot using spikes.
plot_denoised_vs_noisy
Utility function to plot the noisy and denoised versions. It runs the supplied clusterer and then copies the denoised spikes, which are fed into a stem plot.
Parameters
spikes: spikes we want to denoise then plot
clusterer: A child class of DenoisingStrategy that implements .cluster
title: title of plot
Returns
Nothing but plots a graph.
plot_ippm
plot_ippm(graph: IPPMGraph, colors: dict[str, str], title: Optional[str] = None, scale_spikes: bool = False, figheight: int = 5, figwidth: int = 10, arrowhead_dims: tuple[float, float] = None, linewidth: float = 3, show_labels: bool = True)
Plots an acyclic, directed graph using the graph held in graph. Edges are generated using BSplines.
Parameters:
-
graph
NodeDict
) –Dictionary with keys as node names and values as IPPMNode objects. Contains nodes as keys and magnitude, position, and incoming edges in the IPPMNode object.
-
colors
dict[str, str]
) –Dictionary with keys as node names and values as colors in hexadecimal. Contains the color for each transform. The nodes and edges are colored accordingly.
-
title
str
, default:None
) –Title of the plot.
-
scale_spikes
bool
, default:False
) –Scales each node by its significance. Default is False.
-
figheight
int
, default:5
) –Height of the plot. Defaults to 5.
-
figwidth
int
, default:10
) –Width of the plot. Defaults to 10.
-
show_labels
bool
, default:True
) –Show transform names as labels on the graph. Defaults to True.
plot_k_dist_1D
This could be optimised further but since we aren't using it, we can leave it as it is.
A utility function to plot the k-dist graph for a set of timings. Essentially, the k dist graph plots the distance to the kth neighbour for each point. By inspecting the gradient of the graph, we can gain some intuition behind the density of points within the dataset, which can feed into selecting the optimal DBSCAN hyperparameters.
For more details refer to section 4.2 in https://www.dbs.ifi.lmu.de/Publikationen/Papers/KDD-96.final.frame.pdf
Parameters
timings: list of timings extracted from the spikes. It contains the timings for one transform and one hemisphere.
k: the k we use to find the kth neighbour. The paper above advises using k=4.
normalise: whether to normalise before plotting the k-dist. It is important because the k-dist then equally weights both dimensions.
Returns
Nothing but plots a graph.
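The quantity behind a k-dist graph can be sketched in a few lines of plain Python for one-dimensional timings. This is an illustrative assumption, not the function's implementation: k_dist is a hypothetical helper, no normalisation is applied, and the descending sort follows the sorted k-dist graph described in the paper.

```python
def k_dist(timings: list[float], k: int = 4) -> list[float]:
    """Distance from each timing to its k-th nearest neighbour, sorted descending."""
    if k < 1 or k >= len(timings):
        raise ValueError("k must be between 1 and len(timings) - 1")
    distances = []
    for i, t in enumerate(timings):
        # Distances from this timing to every other timing, nearest first.
        others = sorted(abs(t - u) for j, u in enumerate(timings) if j != i)
        distances.append(others[k - 1])
    return sorted(distances, reverse=True)


# Three tightly packed timings and one outlier (made-up values).
curve = k_dist([0.0, 1.0, 2.0, 10.0], k=2)
```

The outlier produces the single large value at the start of the curve. A sharp drop like this is the kind of feature one inspects when choosing a DBSCAN eps value.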
stem_plot
stem_plot(spikes: SpikeDict, title: Optional[str] = None, timepoints: int = 201, y_limit: float = pow(10, -100), number_of_spikes: int = 200000, figheight: int = 7, figwidth: int = 12)
Plots a stem plot using spikes.
Params
spikes : Contains transform spikes in the form of a spike object. All timings are found there.
title : Title of plot.
kymata.ippm.build
Graphing functions used to construct a dictionary that contains the nodes and all relevant information, with node names as keys and Node objects (see namedtuple) as values.
Classes:
-
IPPMBuilder
-
IPPMNode
–A node to be drawn in an IPPM graph.
-
YOrdinateStyle
–Enumeration for Y-ordinate plotting styles.
IPPMBuilder
IPPMBuilder(spikes: SpikeDict, inputs: list[str], hierarchy: TransformHierarchy, hemisphere: str, y_ordinate: str = progressive, serial_sequence: list[list[str]] = None, avoid_collinearity: bool = True)
serial_sequence: list of serial steps, each of which is a list of functions that run in parallel. E.g. the first entry is the list of all inputs, the second entry is the list of all functions immediately downstream of the inputs, etc.
avoid_collinearity: if True, vertically nudges each successive serial step to prevent steps from overlapping.
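The shape of a serial sequence can be illustrated by deriving one from a hierarchy. This is a sketch under stated assumptions: it treats the hierarchy as a plain dict mapping each transform name to the names of its upstream parents, and serial_sequence_sketch and the transform names are hypothetical, not part of kymata.

```python
def serial_sequence_sketch(
    hierarchy: dict[str, list[str]], inputs: list[str]
) -> list[list[str]]:
    """Group transforms into serial steps; each step depends only on earlier steps."""
    steps = [list(inputs)]
    placed = set(inputs)
    remaining = [t for t in hierarchy if t not in placed]
    while remaining:
        # A transform is ready once all of its parents have been placed.
        step = [t for t in remaining if all(p in placed for p in hierarchy[t])]
        if not step:
            raise ValueError("hierarchy has a cycle or an unknown parent")
        steps.append(step)
        placed.update(step)
        remaining = [t for t in remaining if t not in placed]
    return steps


# Assumed format: transform name -> list of upstream parent names.
hierarchy = {"input": [], "a": ["input"], "b": ["input"], "c": ["a", "b"]}
sequence = serial_sequence_sketch(hierarchy, inputs=["input"])
```

Here the first entry holds the inputs, the second holds the transforms immediately downstream of them, and so on, matching the structure described for serial_sequence.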
IPPMNode
Bases: NamedTuple
A node to be drawn in an IPPM graph.
YOrdinateStyle
Bases: StrEnum
Enumeration for Y-ordinate plotting styles.
Attributes:
-
progressive
–Points are plotted with increasing y ordinates.
-
centered
–Points are vertically centered.
kymata.entities.expression
Classes and functions for storing expression information.
Classes:
-
ExpressionSet
–Brain data associated with expression of a single transform.
-
HexelExpressionSet
–Brain data associated with expression of a single transform in hexel space.
-
SensorExpressionSet
–Brain data associated with the expression of a single transform in sensor space.
Functions:
-
combine
–Combines a sequence of ExpressionSets into a single ExpressionSet.
ExpressionSet
ExpressionSet(transforms: str | Sequence[str], latencies: Sequence[Latency], data_blocks: dict[str, _InputDataArray | Sequence[_InputDataArray]], channel_coord_name: str, channel_coord_dtype, channel_coord_values: dict[str, Sequence])
Bases: ABC
Brain data associated with expression of a single transform. Data is log10 p-values.
Initializes the ExpressionSet with the provided data.
Parameters:
-
transforms
str | Sequence[str]
) –Transform name, or sequence of names.
-
latencies
Sequence[Latency]
) –Latency values.
-
data_blocks
dict[str, _InputDataArray | Sequence[_InputDataArray]]
) –Mapping of block names to data arrays (log10 p-values).
In general there are two possible formats for this argument.
In the first (safer, more explicit and flexible) format, data_blocks contains a dict mapping block names to data arrays. E.g., in the case there are three transforms in a hexel setting:
    {
        #          each array is (channel, latency)-shaped
        #          and there's one for each transform
        #          ↓
        "left":  [array(...), array(...), array(...)],
        "right": [array(...), array(...), array(...)],
    }
The same format is used in a sensor setting. Here array(...) can be a numpy array or a sparse array. In this format, all data arrays should be the same size.
In the second (more performant) format, data_blocks contains a single data array whose transform dimensions can be concatenated to achieve the desired resultant data block.
-
channel_coord_name
str
) –Name of the channel coordinate.
-
channel_coord_dtype
Data type of the channel coordinate.
-
channel_coord_values
dict[str, Sequence]
) –Dictionary mapping block names to channel coordinate values.
Raises:
-
ValueError
–when arguments are invalid
Methods:
-
best_transforms
–Note that channels for which the best p-value is 1 will be omitted.
-
crop
–Returns a copy of the ExpressionSet with latencies cropped between the two endpoints (inclusive).
-
rename
–Renames the transforms and channels within an ExpressionSet.
Attributes:
-
latencies
(NDArray[LatencyDType]
) –Latencies, in seconds.
-
transforms
(list[TransformNameDType]
) –Transform names.
best_transforms
abstractmethod
Note that channels for which the best p-value is 1 will be omitted.
crop
abstractmethod
crop(latency_start: float | None, latency_stop: float | None) -> Self
Returns a copy of the ExpressionSet with latencies cropped between the two endpoints (inclusive).
Parameters:
-
latency_start
float | None
) –Latency in seconds to start the cropped window. Use None for no cropping at the start (e.g. half-open crop).
-
latency_stop
float | None
) –Latency in seconds to stop the cropped window. Use None for no cropping at the end (e.g. half-open crop).
Returns:
-
Self
(Self
) –A copy of the ExpressionSet with the latencies cropped between the specified start and stop.
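The inclusive endpoints and the None convention for half-open crops can be shown on a plain list of latencies. This is an illustrative sketch only; crop_latencies is a hypothetical helper that filters latency values, whereas the real method returns a cropped copy of the whole ExpressionSet.

```python
from typing import Optional


def crop_latencies(
    latencies: list[float],
    latency_start: Optional[float],
    latency_stop: Optional[float],
) -> list[float]:
    """Keep latencies within [latency_start, latency_stop]; None leaves that side open."""
    return [
        t for t in latencies
        if (latency_start is None or t >= latency_start)
        and (latency_stop is None or t <= latency_stop)
    ]


latencies = [-0.1, 0.0, 0.1, 0.2, 0.3]            # seconds
both = crop_latencies(latencies, 0.0, 0.2)        # both endpoints are kept
half_open = crop_latencies(latencies, 0.1, None)  # no cropping at the end
```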
rename
Renames the transforms and channels within an ExpressionSet.
Supply a dictionary mapping old values to new values.
Raises KeyError if one of the keys in the renaming dictionary is not a transform name in the expression set.
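The renaming contract for transforms, including the KeyError on unknown names, can be sketched on a plain list of names. rename_transforms and the transform names are hypothetical; the real method operates on the ExpressionSet itself.

```python
def rename_transforms(transforms: list[str], renaming: dict[str, str]) -> list[str]:
    """Return transform names with old names replaced; unknown old names are an error."""
    unknown = [old for old in renaming if old not in transforms]
    if unknown:
        raise KeyError(f"not a transform in the expression set: {unknown}")
    return [renaming.get(name, name) for name in transforms]


renamed = rename_transforms(["f0", "IL"], {"IL": "loudness"})

try:
    rename_transforms(["f0", "IL"], {"missing": "anything"})
    raised = False
except KeyError:
    raised = True
```

Checking every key before renaming anything means a typo in the dictionary fails loudly instead of leaving a transform silently unrenamed.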
HexelExpressionSet
HexelExpressionSet(transforms: str | Sequence[str], hexels_lh: Sequence[Hexel], hexels_rh: Sequence[Hexel], latencies: Sequence[Latency], data_lh: _InputDataArray | Sequence[_InputDataArray], data_rh: _InputDataArray | Sequence[_InputDataArray])
Bases: ExpressionSet
Brain data associated with expression of a single transform in hexel space. Includes lh, rh, flipped, non-flipped. Data is log10 p-values
Methods:
-
best_transforms
–Return a pair of DataFrames (left, right), containing:
-
rename
–Renames the transforms and channels within an ExpressionSet.
Attributes:
-
hexels_left
(NDArray[HexelDType]
) –Hexels, canonical ID.
-
hexels_right
(NDArray[HexelDType]
) –Hexels, canonical ID.
-
latencies
(NDArray[LatencyDType]
) –Latencies, in seconds.
-
left
(DataArray
) –Left-hemisphere data.
-
right
(DataArray
) –Right-hemisphere data.
-
transforms
(list[TransformNameDType]
) –Transform names.
best_transforms
Return a pair of DataFrames (left, right), containing: for each hexel, the best transform and latency for that hexel, and the associated log p-value.
Note that channels for which the best p-value is 1 will be omitted.
rename
Renames the transforms and channels within an ExpressionSet.
Supply a dictionary mapping old values to new values.
Raises KeyError if one of the keys in the renaming dictionary is not a transform name in the expression set.
SensorExpressionSet
SensorExpressionSet(transforms: str | Sequence[str], sensors: Sequence[Sensor], latencies: Sequence[Latency], data: _InputDataArray | Sequence[_InputDataArray])
Bases: ExpressionSet
Brain data associated with the expression of a single transform in sensor space. Includes left hemisphere (lh), right hemisphere (rh), flipped, and non-flipped data. Data is represented as log10 p-values.
Initialize the SensorExpressionSet with transform names, sensor metadata, latency information, and log p-value data.
Parameters:
-
transforms
str | Sequence[str]
) –The names of the transforms being evaluated.
-
sensors
Sequence[Sensor]
) –Metadata about the sensors used in the study.
-
latencies
Sequence[Latency]
) –Latency information corresponding to the data.
-
data
_InputDataArray | Sequence[_InputDataArray]
) –Log p-values representing the data.
Methods:
-
best_transforms
–Return a DataFrame containing:
-
rename
–Renames the transforms and channels within an ExpressionSet.
Attributes:
-
latencies
(NDArray[LatencyDType]
) –Latencies, in seconds.
-
scalp
(DataArray
) –Get the scalp (sensor-space) data.
-
sensors
(NDArray[SensorDType]
) –Get the sensor metadata.
-
transforms
(list[TransformNameDType]
) –Transform names.
sensors
property
Get the sensor metadata.
Returns:
-
NDArray[SensorDType]
–NDArray[SensorDType]: Array of sensor metadata.
best_transforms
Return a DataFrame containing: for each sensor, the best transform and latency for that sensor, and the associated log p-value.
Note that channels for which the best p-value is 1 will be omitted.
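The selection rule described above can be sketched with nested dictionaries in place of the real data arrays. This is an assumption-laden illustration: best_transforms_sketch, the sensor names, and the transform names are hypothetical, and the result is a plain dict rather than a DataFrame.

```python
def best_transforms_sketch(
    data: dict[str, dict[str, list[float]]], latencies: list[float]
) -> dict[str, tuple[str, float, float]]:
    """For each channel, pick the (transform, latency, log10 p) with the smallest p-value.

    data maps channel -> transform -> log10 p-values, one per latency.
    Channels whose best p-value is 1 (log10 p == 0) are omitted.
    """
    best = {}
    for channel, per_transform in data.items():
        candidates = [
            (logp, transform, latency)
            for transform, logps in per_transform.items()
            for logp, latency in zip(logps, latencies)
        ]
        logp, transform, latency = min(candidates)
        if logp < 0:
            best[channel] = (transform, latency, logp)
    return best


latencies = [0.0, 0.1, 0.2]
data = {
    "MEG001": {"f0": [0.0, -12.0, -3.0], "loudness": [-5.0, -8.0, 0.0]},
    "MEG002": {"f0": [0.0, 0.0, 0.0], "loudness": [0.0, 0.0, 0.0]},
}
best = best_transforms_sketch(data, latencies)
```

MEG002 is dropped because its best log10 p-value is 0, i.e. a p-value of 1.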
rename
Renames the transforms and channels within an ExpressionSet.
Supply a dictionary mapping old values to new values.
Raises KeyError if one of the keys in the renaming dictionary is not a transform name in the expression set.