macro_eeg_model.data_prep

This package contains the Julich brain data preparation scripts for the simulation.

Submodules

Attributes

Classes

AreasTerminologyParser

A class to parse the Julich hierarchical parcellation terminology into a dictionary.

DataPreparator

A class to prepare and process data from directories containing CSV files with connectivity data across subjects.

Functions

populate_labels_julich()

Reads the raw labels from labels_raw.txt located in the Julich data path and populates a dictionary that maps each label to its index (adjusted by -1).

Package Contents

class macro_eeg_model.data_prep.AreasTerminologyParser

A class to parse the Julich hierarchical parcellation terminology into a dictionary.

static parse_into_dict()

Parses the areas_terminology.json file located in the Julich data path (see src.utils.paths.Paths) into a nested dictionary.

The method reads a JSON file containing the hierarchical structure of brain areas, processes the data, and returns it in a clean dictionary format.

Returns:

A nested dictionary where each key represents a brain area and its corresponding children areas are stored as values.

Return type:

dict
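As a rough illustration of the nested-dictionary shape described above, the sketch below converts a small hierarchical JSON document into a dictionary keyed by area name. The `name`/`children` schema, the area names, and the helper `build_area_tree` are all assumptions made for this example; the actual layout of areas_terminology.json and the real implementation of parse_into_dict() may differ.

```python
import json


def build_area_tree(node):
    """Map a node's name to a dict of its children (hypothetical schema)."""
    subtree = {}
    for child in node.get("children", []):
        # Each child contributes one {name: {...}} entry to the parent's dict.
        subtree.update(build_area_tree(child))
    return {node["name"]: subtree}


# Illustrative input only; not taken from the real terminology file.
raw = json.loads("""
{"name": "root", "children": [
    {"name": "lobe_a", "children": [{"name": "area_1"}, {"name": "area_2"}]},
    {"name": "lobe_b", "children": [{"name": "area_3"}]}
]}
""")

area_tree = build_area_tree(raw)
```

In this sketch, leaf areas map to empty dictionaries, so the depth of the hierarchy can be recovered by walking the keys.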

class macro_eeg_model.data_prep.DataPreparator

A class to prepare and process data from directories containing CSV files with connectivity data across subjects. The processed data is saved as a NumPy array after averaging across multiple subjects.

prep_and_save(directory_name, included_word, delimiter, name)

Sets up the prerequisites for processing a specified directory within the Julich data path (see src.utils.paths.Paths), then delegates the actual data preparation and saving to _prep_and_save_data().

This method filters the files in the directory based on an included word in their filenames, processes them into NumPy arrays, calculates an average array, and saves it to a specified path.

Parameters:
  • directory_name (str) – The name of the directory containing the subject folders.

  • included_word (str) – The word that should be included in the CSV filenames to be processed.

  • delimiter (str) – The delimiter used in the CSV files.

  • name (str) – The name to use when saving the final averaged array.

_prep_and_save_data(directory, subjects, included_word, delimiter, name)

Extracts the relevant CSV files based on the included word using _extract_csv_files(), converts them to NumPy arrays using _get_arrays_from_files(), computes an average array using _calculate_avg_array(), and saves the result as a .npy file.

Parameters:
  • directory (str or pathlib.Path) – The path to the directory containing the subject folders.

  • subjects (list) – The list of subject folder names.

  • included_word (str) – The word that should be included in the CSV filenames to be processed.

  • delimiter (str) – The delimiter used in the CSV files.

  • name (str) – The name to use when saving the final averaged array.

static _extract_csv_files(directory, subjects, included_word)

Extracts the names of CSV files that include a specific word in their filenames. Searches through the directory of each subject for CSV files that contain the specified word in their name.

Parameters:
  • directory (str or pathlib.Path) – The path to the directory containing the subject folders.

  • subjects (list) – The list of subject folder names.

  • included_word (str) – The word that must be included in the filenames.

Returns:

A list of filenames that match the criteria.

Return type:

list
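The filename filter described above could look roughly like the following sketch. The helper name `find_matching_csv_names`, the subject folder names, and the choice to return one entry per match (without deduplicating across subjects) are assumptions for illustration, not the package's confirmed behavior.

```python
import tempfile
from pathlib import Path


def find_matching_csv_names(directory, subjects, included_word):
    """Collect names of CSV files whose filename contains included_word."""
    directory = Path(directory)
    names = []
    for subject in subjects:
        # Look only at CSV files directly inside each subject folder.
        for path in sorted((directory / subject).glob("*.csv")):
            if included_word in path.name:
                names.append(path.name)
    return names


# Build a throwaway directory tree with hypothetical subject folders and files.
with tempfile.TemporaryDirectory() as tmp:
    layout = [
        ("sub-01", "conn_count.csv"),
        ("sub-01", "conn_length.csv"),
        ("sub-02", "conn_count.csv"),
    ]
    for subject, filename in layout:
        folder = Path(tmp) / subject
        folder.mkdir(exist_ok=True)
        (folder / filename).write_text("0,1\n1,0\n")

    matches = find_matching_csv_names(tmp, ["sub-01", "sub-02"], "count")
```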

_get_arrays_from_files(directory, subjects, files, delimiter=',')

Retrieves and converts the relevant CSV files into NumPy arrays using _convert_csv_file_to_numpy_array().

For each subject in the directory, this method identifies the files to be processed, converts them into NumPy arrays, and collects them for further processing.

Parameters:
  • directory (str or pathlib.Path) – The path to the directory containing the subject folders.

  • subjects (list) – The list of subject folder names.

  • files (list) – The list of filenames to be processed.

  • delimiter (str, optional) – The delimiter used in the CSV files (default is ',').

Returns:

A list of NumPy arrays corresponding to the processed CSV files.

Return type:

list

static _convert_csv_file_to_numpy_array(file_path, delimiter)

Converts a CSV file into a NumPy array.

Parameters:
  • file_path (str or pathlib.Path) – The full path to the CSV file.

  • delimiter (str) – The delimiter used in the CSV file.

Returns:

A NumPy array representing the data from the CSV file.

Return type:

numpy.ndarray
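One plausible way to perform this conversion is `numpy.loadtxt`, sketched below. This assumes purely numeric files without a header row; the helper name `csv_to_array` is hypothetical, and the package may rely on a different reader.

```python
import io

import numpy as np


def csv_to_array(file_path, delimiter):
    """Load a purely numeric, header-less CSV into a NumPy array."""
    # np.loadtxt accepts a path (str or pathlib.Path) or a file-like object.
    return np.loadtxt(file_path, delimiter=delimiter)


# An in-memory stand-in for a small connectivity matrix on disk.
matrix = csv_to_array(io.StringIO("0.0,1.5\n1.5,0.0\n"), delimiter=",")
```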

static _calculate_avg_array(numpy_arrays)

Computes the average of each element across multiple NumPy arrays, excluding the highest and lowest 20% of values (to reduce the impact of outliers), and returns the resulting array.

Parameters:

numpy_arrays (list) – A list of NumPy arrays to average.

Returns:

A NumPy array containing the average values.

Return type:

numpy.ndarray
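The trimming step can be read as an element-wise trimmed mean. The sketch below is one interpretation, assuming that 20% of the values are dropped at each end (rounded down to a whole number of arrays) and that all arrays share the same shape. The function name `trimmed_mean_across_arrays` and the `trim_fraction` parameter are hypothetical and not part of the documented API.

```python
import numpy as np


def trimmed_mean_across_arrays(numpy_arrays, trim_fraction=0.2):
    """Element-wise mean across arrays after dropping extremes at both ends."""
    # Stack to shape (n_arrays, ...) and sort each element position independently.
    stacked = np.sort(np.stack(numpy_arrays), axis=0)
    n = stacked.shape[0]
    k = int(n * trim_fraction)  # number of values dropped at each end
    return stacked[k:n - k].mean(axis=0)


# Five hypothetical subjects; the first element contains one clear outlier.
subject_arrays = [
    np.array([[1.0, 10.0]]),
    np.array([[2.0, 10.0]]),
    np.array([[3.0, 10.0]]),
    np.array([[4.0, 10.0]]),
    np.array([[100.0, 10.0]]),
]
average = trimmed_mean_across_arrays(subject_arrays)
```

With five arrays, one value is dropped at each end, so the first element averages 2, 3, and 4 and the outlier of 100 has no effect. With fewer than five arrays, this sketch trims nothing.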

macro_eeg_model.data_prep.populate_labels_julich()

Reads the raw labels from labels_raw.txt located in the Julich data path (see src.utils.paths.Paths) and populates a dictionary with the labels as keys and their corresponding indices (adjusted by -1) as values.

Returns:

A dictionary where the keys are labels (as strings) and the values are the corresponding indices (integers) adjusted by -1.

Return type:

dict
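The label-to-index mapping might be built along the following lines. The format of labels_raw.txt is not documented here, so this sketch assumes each non-empty line holds an integer index followed by the label text; the helper `parse_labels` and the sample labels are hypothetical.

```python
def parse_labels(lines):
    """Map each label to its index minus one (assumed '<index> <label>' lines)."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, label = line.split(maxsplit=1)
        labels[label] = int(index) - 1
    return labels


# Illustrative lines only; not taken from the real labels file.
example_labels = parse_labels(["1 region_a", "2 region_b", "", "3 region c"])
```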

macro_eeg_model.data_prep.labels_julich