macro_eeg_model.data_prep#
This package contains the Julich brain data preparation scripts for the simulation.
Submodules#
Attributes#
Classes#
AreasTerminologyParser | A class to parse the Julich hierarchical parcellation terminology into a dictionary.
DataPreparator | A class to prepare and process data from directories containing CSV files with connectivity data across subjects.
Functions#
populate_labels_julich() | Reads the raw labels from labels_raw.txt located in the Julich data path.
Package Contents#
- class macro_eeg_model.data_prep.AreasTerminologyParser[source]#
A class to parse the Julich hierarchical parcellation terminology into a dictionary.
- static parse_into_dict()[source]#
Parses the areas_terminology.json file located in the Julich data path (see src.utils.paths.Paths) into a nested dictionary. The method reads a JSON file containing the hierarchical structure of brain areas, processes the data, and returns it as a clean dictionary.
- Returns:
A nested dictionary where each key represents a brain area and its corresponding children areas are stored as values.
- Return type:
dict
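The nesting described above can be sketched as a small recursive function. This is only an illustration: the layout of areas_terminology.json is not documented here, so the "name" and "children" field names, the parse_terminology helper, and the sample area names are all assumptions.

```python
import json


def parse_terminology(node):
    """Recursively map each child area name to a dict of its own children."""
    # "name" and "children" are assumed field names, not taken from the real file.
    return {
        child["name"]: parse_terminology(child)
        for child in node.get("children", [])
    }


# Hypothetical stand-in for the contents of the terminology file.
raw = json.loads("""
{"name": "root", "children": [
  {"name": "lobe_a", "children": [
    {"name": "area_1", "children": []},
    {"name": "area_2", "children": []}]},
  {"name": "lobe_b", "children": [
    {"name": "area_3", "children": []}]}
]}
""")

areas = parse_terminology(raw)
```

Under these assumptions, leaf areas map to empty dictionaries, so the depth of the hierarchy can be recovered by walking the result.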
- class macro_eeg_model.data_prep.DataPreparator[source]#
A class to prepare and process data from directories containing CSV files with connectivity data across subjects. The processed data is saved as a NumPy array after averaging across multiple subjects.
- prep_and_save(directory_name, included_word, delimiter, name)[source]#
Handles the prerequisites for preparing and saving the data from a specified directory within the Julich data path (see src.utils.paths.Paths), then performs the actual preparation and saving using _prep_and_save_data(). This method filters the files in the directory by a word included in their filenames, converts them into NumPy arrays, calculates an average array, and saves it to a specified path.
- Parameters:
directory_name (str) – The name of the directory containing the subject folders.
included_word (str) – The word that should be included in the CSV filenames to be processed.
delimiter (str) – The delimiter used in the CSV files.
name (str) – The name to use when saving the final averaged array.
- _prep_and_save_data(directory, subjects, included_word, delimiter, name)[source]#
Extracts relevant CSV files based on the included word using _extract_csv_files(), converts them to NumPy arrays using _get_arrays_from_files(), computes an average array using _calculate_avg_array(), and saves it as a .npy file.
- Parameters:
directory (str or pathlib.Path) – The path to the directory containing the subject folders.
subjects (list) – The list of subject folder names.
included_word (str) – The word that should be included in the CSV filenames to be processed.
delimiter (str) – The delimiter used in the CSV files.
name (str) – The name to use when saving the final averaged array.
- static _extract_csv_files(directory, subjects, included_word)[source]#
Extracts the names of CSV files that include a specific word in their filenames. Searches through the directory of each subject for CSV files that contain the specified word in their name.
- Parameters:
directory (str or pathlib.Path) – The path to the directory containing the subject folders.
subjects (list) – The list of subject folder names.
included_word (str) – The word that must be included in the filenames.
- Returns:
A list of filenames that match the criteria.
- Return type:
list
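A minimal sketch of this filename filter, assuming one folder per subject directly under the given directory. The extract_csv_files helper, the subject names, and the file names are hypothetical. The sketch returns one entry per matching file per subject, and the documentation does not say whether the real method de-duplicates names.

```python
import os
import tempfile


def extract_csv_files(directory, subjects, included_word):
    """Return the names of CSV files whose filename contains included_word."""
    matches = []
    for subject in subjects:
        subject_dir = os.path.join(directory, subject)
        # Sorted so the result does not depend on filesystem ordering.
        for filename in sorted(os.listdir(subject_dir)):
            if filename.endswith(".csv") and included_word in filename:
                matches.append(filename)
    return matches


# Build a throwaway directory tree with two hypothetical subjects.
with tempfile.TemporaryDirectory() as root:
    for subject in ("sub-01", "sub-02"):
        os.makedirs(os.path.join(root, subject))
        for filename in ("counts.csv", "lengths.csv", "notes.txt"):
            open(os.path.join(root, subject, filename), "w").close()
    found = extract_csv_files(root, ["sub-01", "sub-02"], "counts")
```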
- _get_arrays_from_files(directory, subjects, files, delimiter=',')[source]#
Retrieves and converts the relevant CSV files into NumPy arrays using _convert_csv_file_to_numpy_array(). For each subject in the directory, this method identifies the files to be processed, converts them into NumPy arrays, and collects them for further processing.
- Parameters:
directory (str or pathlib.Path) – The path to the directory containing the subject folders.
subjects (list) – The list of subject folder names.
files (list) – The list of filenames to be processed.
delimiter (str, optional) – The delimiter used in the CSV files (default is ‘,’).
- Returns:
A list of NumPy arrays corresponding to the processed CSV files.
- Return type:
list
- static _convert_csv_file_to_numpy_array(file_path, delimiter)[source]#
Converts a CSV file into a NumPy array.
- Parameters:
file_path (str or pathlib.Path) – The full path to the CSV file.
delimiter (str) – The delimiter used in the CSV file.
- Returns:
A NumPy array representing the data from the CSV file.
- Return type:
numpy.ndarray
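One plausible way to perform this conversion is numpy.loadtxt, which accepts a path or a file-like object together with a delimiter. Whether the package uses loadtxt, genfromtxt, or another reader is not stated, so treat the csv_to_array helper below as an assumption. It also assumes the file holds only numeric values with no header row.

```python
import io

import numpy as np


def csv_to_array(file_or_path, delimiter):
    """Load a purely numeric, header-free CSV into a 2-D float array."""
    return np.loadtxt(file_or_path, delimiter=delimiter)


# In-memory stand-in for a small semicolon-delimited connectivity matrix.
csv_text = "0.0;1.5;2.0\n1.5;0.0;3.0\n2.0;3.0;0.0\n"
matrix = csv_to_array(io.StringIO(csv_text), delimiter=";")
```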
- static _calculate_avg_array(numpy_arrays)[source]#
Computes the average of each element across multiple NumPy arrays, excluding the highest and lowest 20% of values (to reduce the impact of outliers), and returns the resulting array.
- Parameters:
numpy_arrays (list) – A list of NumPy arrays to average.
- Returns:
A NumPy array containing the average values.
- Return type:
numpy.ndarray
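The trimming described above corresponds to an element-wise trimmed mean. The sketch below is one way to compute it, assuming all arrays share the same shape and that the number of values dropped from each end is the 20% share rounded down. The trimmed_average helper and its rounding rule are assumptions, not the package's confirmed behaviour.

```python
import numpy as np


def trimmed_average(arrays, trim_fraction=0.2):
    """Element-wise mean across arrays after dropping the extremes at each end."""
    # Sort the values of each element across subjects (axis 0).
    stacked = np.sort(np.stack(arrays, axis=0), axis=0)
    # Number of subjects to discard at each end (assumed to round down).
    k = int(len(arrays) * trim_fraction)
    kept = stacked[k:len(arrays) - k]
    return kept.mean(axis=0)


# Five hypothetical subjects; the last one is an outlier in every element.
arrays = [np.full((2, 2), v) for v in (1.0, 2.0, 3.0, 4.0, 100.0)]
avg = trimmed_average(arrays)
```

With five subjects, one value is dropped from each end, so the outlier 100.0 does not influence the result. scipy.stats.trim_mean offers comparable behaviour when SciPy is available.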
- macro_eeg_model.data_prep.populate_labels_julich()[source]#
Reads the raw labels from labels_raw.txt located in the Julich data path (see src.utils.paths.Paths) and populates a dictionary with the labels as keys and their corresponding indices (adjusted by -1) as values.
- Returns:
A dictionary where the keys are labels (as strings) and the values are the corresponding indices (integers) adjusted by -1.
- Return type:
dict
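The index adjustment can be illustrated with a short sketch. The format of labels_raw.txt is not documented here, so the sketch assumes each line holds a 1-based integer index followed by the label text. The populate_labels helper and the sample labels are hypothetical.

```python
def populate_labels(lines):
    """Map each label to its index shifted by -1 (1-based to 0-based)."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed line layout: "<index> <label text>".
        index, label = line.split(maxsplit=1)
        labels[label] = int(index) - 1
    return labels


# Hypothetical stand-in for the lines of the raw labels file.
raw_lines = ["1 region_a left", "2 region_a right", "", "3 region_b left"]
labels = populate_labels(raw_lines)
```

Shifting by -1 lets the stored values index directly into zero-based NumPy arrays such as an averaged connectivity matrix.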
- macro_eeg_model.data_prep.labels_julich#