ptyrad.io.hierarchy
Hierarchical file handling (load/save) for pt, mat, hdf5 formats
Functions
| collect_ND_datasets | Collect ND numpy arrays from a (possibly nested) dictionary that match desired dimensionalities. |
| get_nested | Get a value from a nested dictionary either safely (return default if not found) or strictly to fail early. |
| handle_hdf5_types | Convert data to native Python or NumPy types. |
| list_nested_keys | Recursively list all keys in an HDF5 file, HDF5 group, or dict, including hierarchical paths. |
| load_ND_with_key | Load exactly one ND dataset from (possibly nested) files like .mat and .hdf5. |
| load_hdf5 | Load dataset(s) from an HDF5 file, recursively if groups are encountered. |
| load_hdf5_ND_with_selection | Load exactly one ND HDF5 dataset, applying selection only after disambiguation. |
| load_mat | Load dataset(s) from a MATLAB .mat file, handling both default and v7.3 (HDF5) formats. |
| load_pt | Loads data from a PyTorch .pt file. |
| load_zarr | Loads an array from a Zarr store. |
| print_nested_dict | Recursively logs a nested dictionary with structured formatting. |
| write_hdf5 | Save an array as an HDF5 file. |
- ptyrad.io.hierarchy.load_zarr(file_path, key=None, ndims=None, selection=None, zarr_kwargs=None)
Loads an array from a Zarr store.
- Parameters:
file_path (str) – Path to the Zarr store.
key (str, optional) – Internal path to the array inside a Zarr group.
ndims (list, optional) – Desired dimensions when searching a group with no key.
selection (optional) – Optional load-time slicing/indexing.
zarr_kwargs (dict, optional) – Optional Zarr open settings passed to zarr.open. Use top-level selection for slicing and top-level key for the array path.
- Returns:
The loaded array data.
- Return type:
numpy.ndarray
- ptyrad.io.hierarchy.load_pt(file_path, weights_only=False)
Loads data from a PyTorch .pt file.
Warning
This function defaults to weights_only=False because PtyRAD .pt files often contain complex objects and dictionaries, not just state dictionaries. As of PyTorch 2.6, torch.load defaults to weights_only=True for security. Loading with weights_only=False can execute arbitrary code if the file contains malicious payloads. Only use this function to load trusted, legacy PtyRAD-generated files.
- Parameters:
file_path (str) – The path to the PyTorch .pt file.
weights_only (bool, optional) – If True, restricts the unpickler to load only tensors, primitive types, and dictionaries. Defaults to False.
- Returns:
The deserialized Python object(s) stored in the file.
- Return type:
Any
- Raises:
FileNotFoundError – If the specified file does not exist.
- ptyrad.io.hierarchy.load_mat(file_path, key=None, delimiter='.', squeeze_me=True, simplify_cells=True, selection=None)
Load dataset(s) from a MATLAB .mat file, handling both default and v7.3 (HDF5) formats. The file version determines whether scipy.io.loadmat or h5py is used.
- Parameters:
file_path (str) – Path to the .mat file.
key (str | list[str] | None) – Name(s) of the dataset(s) to load.
- If None, ‘’, or []: Load all datasets, preserving the original nested structure.
- If str: Load a single dataset or group. Supports hierarchical keys (e.g., ‘group1.dataset1’).
- If list[str]: Load multiple datasets. The returned dictionary will have a flattened structure.
delimiter (str) – Delimiter for hierarchical keys (default: “.”).
squeeze_me (bool) – Whether to squeeze unit matrix dimensions (scipy.io.loadmat parameter).
simplify_cells (bool) – Whether to simplify cell arrays (scipy.io.loadmat parameter).
selection – Optional NumPy-style indexing object applied to loaded dataset(s).
- Returns:
The loaded dataset(s) with the same structure as load_hdf5.
- Return type:
data (np.ndarray or dict)
- Raises:
FileNotFoundError – If the specified file does not exist.
KeyError – If provided key(s) are not found in the file.
TypeError – If the key is not None, a string, or a list of strings.
- ptyrad.io.hierarchy.load_hdf5(file_path, key=None, delimiter='.', selection=None)
Load dataset(s) from an HDF5 file, recursively if groups are encountered.
- Parameters:
file_path (str) – Path to the HDF5 file.
key (str | list[str] | None) – Name(s) of the dataset(s) to load.
- If None, ‘’, or []: Load all datasets recursively, preserving the original nested structure.
- If str: Load a single dataset or group. Supports hierarchical keys (e.g., ‘group1.dataset1’).
- If list[str]: Load multiple datasets. The returned dictionary will have a flattened structure with the hierarchical key strings as keys.
delimiter (str) – Delimiter for hierarchical keys (default: “.”).
selection – Optional NumPy/HDF5-style indexing object applied to loaded dataset(s).
- Returns:
The loaded dataset(s).
If key is a string, returns a single np.ndarray or a nested dictionary if the key points to a group.
If key is a list of strings, returns a dictionary with the hierarchical key strings as keys and the corresponding datasets as values.
If key is None, returns a nested dictionary preserving the original structure of the HDF5 file.
- Return type:
data (np.ndarray or dict)
- Raises:
FileNotFoundError – If the specified file does not exist.
KeyError – If provided key(s) are not found in the file.
TypeError – If the key is not None, a string, or a list of strings.
Notes
- Hierarchical Keys:
The function supports hierarchical keys (e.g., ‘group1.dataset1’) to directly access nested datasets or groups.
When a list of hierarchical keys is provided, the returned dictionary will have a flattened structure with the hierarchical key strings as keys.
- Preserving Original Structure:
If key=None, the function recursively loads all datasets and groups, preserving the original nested structure of the HDF5 file.
- Performance Considerations:
Providing an exact key (e.g., key=”group1/dataset1”) is significantly faster than recursively loading the entire file or traversing the hierarchy.
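The three key modes described above can be illustrated with a small sketch in which a plain nested dict stands in for an opened HDF5 file. The function name select_by_key and the sample data are hypothetical and are not part of ptyrad; the sketch only mirrors the documented return structures and error types, not the actual implementation.

```python
from typing import Any, Dict, List, Union


def select_by_key(tree: Dict[str, Any],
                  key: Union[str, List[str], None] = None,
                  delimiter: str = ".") -> Any:
    """Hypothetical stand-in mirroring the documented key handling."""

    def lookup(path: str) -> Any:
        # Walk one hierarchical key, e.g. "group1.dataset1".
        node = tree
        for part in path.split(delimiter):
            if not isinstance(node, dict) or part not in node:
                raise KeyError(f"Key '{path}' not found")
            node = node[part]
        return node

    # None, '' or []: return everything, keeping the nested structure.
    if key is None or key == "" or key == []:
        return tree
    # str: return a single dataset, or a nested dict if the key is a group.
    if isinstance(key, str):
        return lookup(key)
    # list[str]: return a flat dict keyed by the hierarchical key strings.
    if isinstance(key, list) and all(isinstance(k, str) for k in key):
        return {k: lookup(k) for k in key}
    raise TypeError("key must be None, a string, or a list of strings")


# Plain lists stand in for datasets in this illustration.
file_like = {"group1": {"dataset1": [1, 2, 3], "dataset2": [4, 5]}, "meta": "v1"}

single = select_by_key(file_like, "group1.dataset1")
flat = select_by_key(file_like, ["group1.dataset1", "meta"])
```

Here single is the one requested dataset, while flat is a one-level dictionary whose keys are the full key strings that were requested.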
- ptyrad.io.hierarchy.load_hdf5_ND_with_selection(file_path, ndims=None, selection=None)
Load exactly one ND HDF5 dataset, applying selection only after disambiguation.
- Parameters:
file_path (str)
ndims (List[int] | None)
- Return type:
ndarray
- ptyrad.io.hierarchy.write_hdf5(file_path, data, dataset_name='meas', **kwargs)
Save an array as an HDF5 file.
- ptyrad.io.hierarchy.load_ND_with_key(file_path, key=None, ndims=None, selection=None)
Load exactly one ND dataset from (possibly nested) files like .mat and .hdf5.
- Parameters:
file_path (str) – Path to the file.
key (str, optional) – Key to specify the dataset. If not provided, will search for all valid ND datasets.
ndims (list) – List of desired dimensions for filtering datasets.
selection – Optional NumPy/HDF5-style indexing object applied while loading.
- Returns:
The loaded dataset.
- Return type:
numpy.ndarray
- Raises:
ValueError – If the file type is unsupported, or the key is invalid, or multiple/zero valid datasets are found.
- ptyrad.io.hierarchy.collect_ND_datasets(data_dict, ndims=None, delimiter='.', _parent_key=None)
Collect ND numpy arrays from a (possibly nested) dictionary that match desired dimensionalities.
Automatically traverses nested dictionaries and flattens keys, joining the levels with delimiter.
- Parameters:
data_dict (dict) – Dictionary of datasets (flat or nested).
ndims (list of int) – Desired dimensionalities to match (e.g., [3, 4]).
delimiter (str) – String symbol used to separate different levels of the full path to the dataset.
_parent_key (str, optional) – Internal use only. Tracks nested keys during recursion. Do not set manually.
- Returns:
Matching datasets with flattened hierarchical keys.
- Return type:
dict[str, np.ndarray]
- Raises:
ValueError – If input is not a dict or no datasets match.
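The traversal described above can be sketched without NumPy by using a tiny stand-in object that only exposes shape and ndim. Both FakeArray and collect_nd_sketch are hypothetical names introduced for illustration, and the sketch assumes that flattened keys are built by joining the levels with delimiter; it is not the library's implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FakeArray:
    """Hypothetical stand-in for numpy.ndarray; only the shape matters here."""
    shape: Tuple[int, ...]

    @property
    def ndim(self) -> int:
        return len(self.shape)


def collect_nd_sketch(data: Dict[str, Any], ndims: List[int],
                      delimiter: str = ".") -> Dict[str, Any]:
    """Collect array-like values whose ndim is in ndims, with flattened keys."""
    if not isinstance(data, dict):
        raise ValueError("data must be a dict")

    found: Dict[str, Any] = {}

    def walk(node: Dict[str, Any], parent: Optional[str]) -> None:
        for name, value in node.items():
            path = f"{parent}{delimiter}{name}" if parent else str(name)
            if isinstance(value, dict):
                walk(value, path)  # descend into nested groups
            elif getattr(value, "ndim", None) in ndims:
                found[path] = value  # keep only matching dimensionalities

    walk(data, None)
    if not found:
        raise ValueError(f"No datasets with ndim in {ndims} were found")
    return found


datasets = {
    "meas": FakeArray((64, 64, 128, 128)),
    "meta": {
        "probe": FakeArray((2, 128, 128)),
        "pos": FakeArray((4096, 2)),
        "scale": 1.5,
    },
}
matches = collect_nd_sketch(datasets, ndims=[3, 4])
```

With ndims=[3, 4], matches contains the keys "meas" and "meta.probe"; the 2D positions and the scalar are skipped.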
- ptyrad.io.hierarchy.handle_hdf5_types(x)
Convert data to native Python or NumPy types, especially data loaded by h5py.
Handles special cases like MATLAB v7.3 complex128 data types and ensures that data is converted to a format compatible with native Python or NumPy.
Also handles the sentinel string “__NONE__”, which is used as a substitute for None in HDF5.
- Parameters:
x – The input data to be converted.
- Returns:
The data converted to native Python or NumPy types.
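The sentinel convention can be sketched as a simple round trip. The helper names encode_none and decode_value are hypothetical, and the bytes-decoding step is an assumption based on the fact that h5py can return strings as bytes; the conversions performed by handle_hdf5_types itself may differ.

```python
from typing import Any

NONE_SENTINEL = "__NONE__"  # sentinel string named in the documentation above


def encode_none(value: Any) -> Any:
    """Replace None with the sentinel before writing (hypothetical helper)."""
    return NONE_SENTINEL if value is None else value


def decode_value(x: Any) -> Any:
    """Map the sentinel back to None after reading (hypothetical helper)."""
    # Assumption: strings may come back as bytes, so decode them first.
    if isinstance(x, bytes):
        x = x.decode("utf-8")
    if isinstance(x, str) and x == NONE_SENTINEL:
        return None
    return x


stored = encode_none(None)       # what would be written to the file
restored = decode_value(stored)  # what the caller gets back after loading
```

Values other than the sentinel pass through unchanged, so ordinary strings and numbers are unaffected by the round trip.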
- ptyrad.io.hierarchy.get_nested(d, key, delimiter='.', safe=False, default=None)
Get a value from a nested dictionary, either safely (returning default if not found) or strictly (failing early).
- Parameters:
d (dict) – The dictionary to traverse.
key (str, or list or tuple of str) – A sequence of keys to access nested values.
delimiter (str) – The string used to separate different parts of the displayed key path.
safe (bool) – Flag that switches between safe and strict lookup in the nested dict.
default – The value to return if any key is missing or an intermediate value is None.
- Returns:
The nested value if found; otherwise default in safe mode, or an error in strict mode.
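The difference between safe and strict lookup can be sketched as follows. The name get_nested_sketch is hypothetical, and two details are assumptions rather than documented behaviour: that a string key is split on delimiter into path parts, and that strict mode raises KeyError.

```python
from typing import Any, Sequence, Union


def get_nested_sketch(d: dict, key: Union[str, Sequence[str]],
                      delimiter: str = ".", safe: bool = False,
                      default: Any = None) -> Any:
    """Hypothetical stand-in illustrating safe vs. strict nested lookup."""
    # Assumption: a string key such as "a.b.c" is split on the delimiter.
    parts = key.split(delimiter) if isinstance(key, str) else list(key)
    current: Any = d
    for i, part in enumerate(parts):
        # A missing key or a non-dict intermediate value (e.g. None) ends the walk.
        if not isinstance(current, dict) or part not in current:
            if safe:
                return default
            # Assumption: strict mode reports the failing path via KeyError.
            raise KeyError(delimiter.join(parts[: i + 1]))
        current = current[part]
    return current


params = {"model": {"probe": {"size": 128}}, "recon": None}

size = get_nested_sketch(params, "model.probe.size")
niter = get_nested_sketch(params, ("recon", "niter"), safe=True, default=10)
```

Safe mode suits optional settings that have a sensible fallback, while strict mode surfaces a misspelled or missing required key immediately.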
- ptyrad.io.hierarchy.list_nested_keys(hobj, delimiter='.', prefix='')
Recursively list all keys in an HDF5 file, HDF5 group, or dict, including hierarchical paths.
- Parameters:
hobj (h5py.File, h5py.Group, or dict) – The hierarchical object to traverse.
delimiter (str) – The string used to separate different parts of the displayed key path.
prefix (str) – The current hierarchical path (used for recursion).
- Returns:
A list of all keys with their hierarchical paths.
- Return type:
list[str]
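For the dict case, the recursion can be sketched in a few lines. The name list_nested_keys_sketch is hypothetical, and the sketch assumes that intermediate (group-level) keys are listed alongside leaf keys; the real function also accepts h5py.File and h5py.Group objects, which are not covered here.

```python
from typing import Any, Dict, List


def list_nested_keys_sketch(obj: Dict[str, Any], delimiter: str = ".",
                            prefix: str = "") -> List[str]:
    """Hypothetical dict-only stand-in listing keys with hierarchical paths."""
    keys: List[str] = []
    for name, value in obj.items():
        # Build the full path of this key from the path of its parent.
        path = f"{prefix}{delimiter}{name}" if prefix else str(name)
        keys.append(path)
        if isinstance(value, dict):
            # Assumption: nested keys follow their parent in the output.
            keys.extend(list_nested_keys_sketch(value, delimiter, path))
    return keys


tree = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
paths = list_nested_keys_sketch(tree)
```

For this input, paths is ["a", "a.b", "a.c", "a.c.d", "e"], which is the kind of listing that helps when choosing a key for a loader.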
- ptyrad.io.hierarchy.print_nested_dict(d, indent=0, leaf_inline_threshold=6)
Recursively logs a nested dictionary with structured formatting.
To improve log readability and save vertical space, small “leaf” dictionaries (dictionaries containing no further nested dicts or lists) are printed inline on a single line, provided their length does not exceed leaf_inline_threshold. Flat lists are also printed inline.
- Parameters:
d (dict) – The dictionary to log.
indent (int, optional) – The current indentation level (number of tabs). Defaults to 0.
leaf_inline_threshold (int, optional) – The maximum number of key-value pairs a flat leaf dictionary can have to be formatted inline. Defaults to 6.
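The inline rule for leaf dictionaries can be sketched with a formatter that returns lines instead of logging them. The name format_nested_dict and the exact line layout ("key: value", one tab per level) are assumptions for illustration; the real output format may differ.

```python
from typing import Any, Dict, List


def format_nested_dict(d: Dict[str, Any], indent: int = 0,
                       leaf_inline_threshold: int = 6) -> List[str]:
    """Hypothetical stand-in that returns formatted lines for a nested dict."""
    lines: List[str] = []
    pad = "\t" * indent
    for key, value in d.items():
        if isinstance(value, dict):
            # A leaf dict contains no further dicts or lists.
            is_leaf = not any(isinstance(v, (dict, list)) for v in value.values())
            if is_leaf and len(value) <= leaf_inline_threshold:
                lines.append(f"{pad}{key}: {value}")  # small leaf dict: inline
            else:
                lines.append(f"{pad}{key}:")  # otherwise expand one level down
                lines.extend(
                    format_nested_dict(value, indent + 1, leaf_inline_threshold)
                )
        else:
            # Scalars and lists are written on a single line in this sketch.
            lines.append(f"{pad}{key}: {value}")
    return lines


config = {
    "probe": {"size": 128, "modes": 2},
    "recon": {"niter": 10, "optim": {"lr": 0.001}},
    "gpus": [0, 1],
}
formatted = format_nested_dict(config)
```

In this sketch, "probe" stays on one line because it is a small leaf dictionary, while "recon" is expanded because it contains a further dictionary.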