macro_eeg_model.data_prep#

This package contains the Julich brain data preparation scripts for the simulation.

Submodules#

Attributes#

labels_julich

Classes#

`AreasTerminologyParser`	A class to parse the Julich hierarchical parcellation terminology into a dictionary.
`DataPreparator`	A class to prepare and process data from directories containing CSV files with connectivity data across subjects.

Functions#

populate_labels_julich()

Reads the raw labels from labels_raw.txt located in the Julich data path and populates a dictionary mapping each label to its index.

Package Contents#

class macro_eeg_model.data_prep.AreasTerminologyParser[source]#

A class to parse the Julich hierarchical parcellation terminology into a dictionary.

static parse_into_dict()[source]#

Parses the areas_terminology.json file located in the Julich data path (see src.utils.paths.Paths) into a nested dictionary.

The method reads a JSON file containing the hierarchical structure of brain areas, processes the data, and returns it in a clean dictionary format.

Returns:

A nested dictionary where each key represents a brain area and its corresponding children areas are stored as values.

Return type:

dict
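The layout of areas_terminology.json is not described here, so the following is only a minimal sketch of turning a hierarchical JSON tree into a nested dictionary. It assumes each node has a "name" key and an optional "children" list. Those keys, the helper to_nested_dict, and the sample area names are illustrative assumptions, not the package's actual schema or code.

```python
import json


def to_nested_dict(node):
    """Map each child's name to a dict of its own children (empty for leaves).

    Assumes hypothetical "name" and "children" keys on every node.
    """
    children = node.get("children", [])
    return {child["name"]: to_nested_dict(child) for child in children}


# Made-up sample data standing in for the real terminology file.
raw = json.loads("""
{"name": "root", "children": [
  {"name": "frontal lobe", "children": [{"name": "Area 44"}, {"name": "Area 45"}]},
  {"name": "occipital lobe", "children": [{"name": "hOc1"}]}
]}
""")

nested = {raw["name"]: to_nested_dict(raw)}
print(nested)
```

Leaves map to empty dictionaries in this sketch, so every level of the hierarchy can be traversed in the same way.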

class macro_eeg_model.data_prep.DataPreparator[source]#

A class to prepare and process data from directories containing CSV files with connectivity data across subjects. The processed data is saved as a NumPy array after averaging across multiple subjects.

prep_and_save(directory_name, included_word, delimiter, name)[source]#

Handles the prerequisites for preparing and saving data from the specified directory within the Julich data path (see src.utils.paths.Paths), then delegates the actual preparation and saving to _prep_and_save_data().

This method filters the files in the directory based on an included word in their filenames, processes them into NumPy arrays, calculates an average array, and saves it to a specified path.

Parameters:

directory_name (str) – The name of the directory containing the subject folders.
included_word (str) – The word that should be included in the CSV filenames to be processed.
delimiter (str) – The delimiter used in the CSV files.
name (str) – The name to use when saving the final averaged array.

_prep_and_save_data(directory, subjects, included_word, delimiter, name)[source]#

Extracts the relevant CSV files based on the included word using _extract_csv_files(), converts them to NumPy arrays using _get_arrays_from_files(), computes an average array using _calculate_avg_array(), and saves the result as a .npy file.

Parameters:

directory (str or pathlib.Path) – The path to the directory containing the subject folders.
subjects (list) – The list of subject folder names.
included_word (str) – The word that should be included in the CSV filenames to be processed.
delimiter (str) – The delimiter used in the CSV files.
name (str) – The name to use when saving the final averaged array.

static _extract_csv_files(directory, subjects, included_word)[source]#

Extracts the names of CSV files that include a specific word in their filenames. Searches through the directory of each subject for CSV files that contain the specified word in their name.

Parameters:

directory (str or pathlib.Path) – The path to the directory containing the subject folders.
subjects (list) – The list of subject folder names.
included_word (str) – The word that must be included in the filenames.

Returns:

A list of filenames that match the criteria.

Return type:

list
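The filename filter described above can be sketched with pathlib. This is an illustration rather than the package's implementation: the function name extract_csv_files, the deduplication of filenames shared across subjects, and the sorted return order are all assumptions.

```python
import tempfile
from pathlib import Path


def extract_csv_files(directory, subjects, included_word):
    """Collect names of CSV files whose filename contains included_word.

    Assumption: subjects share filenames, so duplicates are collapsed.
    """
    matches = set()
    for subject in subjects:
        for path in (Path(directory) / subject).glob("*.csv"):
            if included_word in path.name:
                matches.add(path.name)
    return sorted(matches)


# Build a throwaway directory tree with two hypothetical subject folders.
with tempfile.TemporaryDirectory() as tmp:
    for subject in ("000", "001"):
        folder = Path(tmp) / subject
        folder.mkdir()
        (folder / "FC_mean.csv").write_text("1,2\n3,4\n")
        (folder / "notes.csv").write_text("0\n")
    found = extract_csv_files(tmp, ["000", "001"], "FC")

print(found)
```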

_get_arrays_from_files(directory, subjects, files, delimiter=',')[source]#

Retrieves and converts the relevant CSV files into NumPy arrays using _convert_csv_file_to_numpy_array().

For each subject in the directory, this method identifies the files to be processed, converts them into NumPy arrays, and collects them for further processing.

Parameters:

directory (str or pathlib.Path) – The path to the directory containing the subject folders.
subjects (list) – The list of subject folder names.
files (list) – The list of filenames to be processed.
delimiter (str, optional) – The delimiter used in the CSV files (default is ',').

Returns:

A list of NumPy arrays corresponding to the processed CSV files.

Return type:

list

static _convert_csv_file_to_numpy_array(file_path, delimiter)[source]#

Converts a CSV file into a NumPy array.

Parameters:

file_path (str or pathlib.Path) – The full path to the CSV file.
delimiter (str) – The delimiter used in the CSV file.

Returns:

A NumPy array representing the data from the CSV file.

Return type:

numpy.ndarray
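One common way to perform such a conversion is numpy.loadtxt, sketched below. Whether the package uses loadtxt, genfromtxt, or another reader is not stated here, so the helper csv_to_array is an assumption for illustration. An in-memory buffer stands in for a file on disk.

```python
import io

import numpy as np


def csv_to_array(file_path, delimiter):
    """Load a purely numeric, header-free CSV file into a NumPy array."""
    return np.loadtxt(file_path, delimiter=delimiter)


# numpy.loadtxt also accepts file-like objects, which keeps the example self-contained.
arr = csv_to_array(io.StringIO("0.0 0.5\n0.5 0.0\n"), delimiter=" ")
print(arr.shape)
```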

static _calculate_avg_array(numpy_arrays)[source]#

Computes the average of each element across multiple NumPy arrays, excluding the highest and lowest 20% of values (to reduce the impact of outliers), and returns the resulting array.

Parameters:

numpy_arrays (list) – A list of NumPy arrays to average.

Returns:

A NumPy array containing the average values.

Return type:

numpy.ndarray
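An element-wise trimmed mean of this kind can be sketched as follows. The sketch assumes that 20% of the values are dropped from each end, that all arrays share the same shape, and that the number of dropped values is rounded down. The name trimmed_mean and the trim_fraction parameter are illustrative, not part of the package.

```python
import numpy as np


def trimmed_mean(arrays, trim_fraction=0.2):
    """Average element-wise across arrays after dropping the extremes.

    Assumption: trim_fraction of the values is removed from EACH end.
    """
    # Stack to shape (n_arrays, ...) and sort each element's values across arrays.
    stacked = np.sort(np.stack(arrays), axis=0)
    n = stacked.shape[0]
    k = int(n * trim_fraction)  # number of values dropped at each end
    return stacked[k:n - k].mean(axis=0)


# Five hypothetical subjects; the last one is an outlier at every element.
subject_arrays = [np.full((2, 2), v, dtype=float) for v in (1, 2, 3, 4, 100)]
avg = trimmed_mean(subject_arrays)
print(avg)
```

With five arrays, one value is dropped at each end, so the outlier 100 does not affect the result. scipy.stats.trim_mean offers a similar computation along a chosen axis.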

macro_eeg_model.data_prep.populate_labels_julich()[source]#

Reads the raw labels from labels_raw.txt located in the Julich data path (see src.utils.paths.Paths) and populates a dictionary with the labels as keys and their corresponding indices (adjusted by -1) as values.

Returns:

A dictionary where the keys are labels (as strings) and the values are the corresponding indices (integers) adjusted by -1.

Return type:

dict
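The format of labels_raw.txt is not documented here. As a rough sketch of the label-to-index mapping, the example below assumes each non-empty line holds an integer index followed by whitespace and the label text. That line format, the helper parse_labels, and the sample labels are assumptions.

```python
def parse_labels(lines):
    """Build {label: index - 1} from lines shaped like '<index> <label>'.

    Assumption: the index comes first and the rest of the line is the label.
    """
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, label = line.split(maxsplit=1)
        labels[label] = int(index) - 1
    return labels


# Made-up lines standing in for the contents of the raw label file.
labels = parse_labels(["1 Area A", "2 Area B", "", "3 Area C"])
print(labels)
```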

macro_eeg_model.data_prep.labels_julich#