ptyrad.io.hierarchy

Hierarchical file handling (load/save) for pt, mat, hdf5 formats

Functions

`collect_ND_datasets`(data_dict[, ndims, ...])	Collect ND numpy arrays from a (possibly nested) dictionary that match desired dimensionalities.
`get_nested`(d, key[, delimiter, safe, default])	Get a value from a nested dictionary, either safely (return a default if not found) or strictly (fail early).
`handle_hdf5_types`(x)	Convert data to native Python or NumPy types.
`list_nested_keys`(hobj[, delimiter, prefix])	Recursively list all keys in an HDF5 file, HDF5 group, or dict, including hierarchical paths.
`load_ND_with_key`(file_path[, key, ndims, ...])	Load exactly one ND dataset from (possibly nested) files like .mat and .hdf5.
`load_hdf5`(file_path[, key, delimiter, selection])	Load dataset(s) from an HDF5 file, recursively if groups are encountered.
`load_hdf5_ND_with_selection`(file_path[, ...])	Load exactly one ND HDF5 dataset, applying selection only after disambiguation.
`load_mat`(file_path[, key, delimiter, ...])	Load dataset(s) from a MATLAB .mat file, handling both default and v7.3 (HDF5) formats.
`load_pt`(file_path[, weights_only])	Loads data from a PyTorch .pt file.
`load_zarr`(file_path[, key, ndims, ...])	Loads an array from a Zarr store.
`print_nested_dict`(d[, indent, ...])	Recursively logs a nested dictionary with structured formatting.
`write_hdf5`(file_path, data[, dataset_name])	Save an array as an HDF5 file.

ptyrad.io.hierarchy.load_zarr(file_path, key=None, ndims=None, selection=None, zarr_kwargs=None)

Loads an array from a Zarr store.

Parameters:

file_path (str) – Path to the Zarr store.
key (str, optional) – Internal path to the array inside a Zarr group.
ndims (list, optional) – Desired dimensions when searching a group with no key.
selection (optional) – Optional load-time slicing/indexing.
zarr_kwargs (dict, optional) – Optional Zarr open settings passed to zarr.open. Use top-level selection for slicing and top-level key for the array path.

Returns:

The loaded array data.

Return type:

numpy.ndarray

ptyrad.io.hierarchy.load_pt(file_path, weights_only=False)

Loads data from a PyTorch .pt file.

Warning

This function defaults to weights_only=False because PtyRAD .pt files often contain complex objects and dictionaries, not just state dictionaries. As of PyTorch 2.6, torch.load defaults to weights_only=True for security. Loading with weights_only=False can execute arbitrary code if the file contains malicious payloads. Only use this function to load trusted, legacy PtyRAD-generated files.

Parameters:

file_path (str) – The path to the PyTorch .pt file.
weights_only (bool, optional) – If True, restricts the unpickler to load only tensors, primitive types, and dictionaries. Defaults to False.

Returns:

The deserialized Python object(s) stored in the file.

Return type:

Any

Raises:

FileNotFoundError – If the specified file does not exist.

ptyrad.io.hierarchy.load_mat(file_path, key=None, delimiter='.', squeeze_me=True, simplify_cells=True, selection=None)

Load dataset(s) from a MATLAB .mat file, handling both default and v7.3 (HDF5) formats. The file version determines whether scipy.io.loadmat or h5py is used.

Parameters:

file_path (str) – Path to the .mat file.
key (str | list[str] | None) – Name(s) of the dataset(s) to load.
- If None, '', or []: Load all datasets, preserving the original nested structure.
- If str: Load a single dataset or group. Supports hierarchical keys (e.g., 'group1.dataset1').
- If list[str]: Load multiple datasets. The returned dictionary will have a flattened structure.
delimiter (str) – Delimiter for hierarchical keys (default: “.”).
squeeze_me (bool) – Whether to squeeze unit matrix dimensions (scipy.io.loadmat parameter).
simplify_cells (bool) – Whether to simplify cell arrays (scipy.io.loadmat parameter).
selection – Optional NumPy-style indexing object applied to loaded dataset(s).

Returns:

The loaded dataset(s) with the same structure as load_hdf5.

Return type:

data (np.ndarray or dict)

Raises:

FileNotFoundError – If the specified file does not exist.
KeyError – If provided key(s) are not found in the file.
TypeError – If the key is not None, a string, or a list of strings.

ptyrad.io.hierarchy.load_hdf5(file_path, key=None, delimiter='.', selection=None)

Load dataset(s) from an HDF5 file, recursively if groups are encountered.

Parameters:

file_path (str) – Path to the HDF5 file.
key (str | list[str] | None) – Name(s) of the dataset(s) to load.
- If None, '', or []: Load all datasets recursively, preserving the original nested structure.
- If str: Load a single dataset or group. Supports hierarchical keys (e.g., 'group1.dataset1').
- If list[str]: Load multiple datasets. The returned dictionary will have a flattened structure with the hierarchical key strings as keys.
delimiter (str) – Delimiter for hierarchical keys (default: “.”).
selection – Optional NumPy/HDF5-style indexing object applied to loaded dataset(s).

Returns:

The loaded dataset(s).

If key is a string, returns a single np.ndarray or a nested dictionary if the key points to a group.
If key is a list of strings, returns a dictionary with the hierarchical key strings as keys and the corresponding datasets as values.
If key is None, returns a nested dictionary preserving the original structure of the HDF5 file.

Return type:

data (np.ndarray or dict)

Raises:

FileNotFoundError – If the specified file does not exist.
KeyError – If provided key(s) are not found in the file.
TypeError – If the key is not None, a string, or a list of strings.

Notes

Hierarchical Keys:
- The function supports hierarchical keys (e.g., ‘group1.dataset1’) to directly access nested datasets or groups.
- When a list of hierarchical keys is provided, the returned dictionary will have a flattened structure with the hierarchical key strings as keys.
Preserving Original Structure:
- If key=None, the function recursively loads all datasets and groups, preserving the original nested structure of the HDF5 file.
Performance Considerations:
- Providing an exact key (e.g., key="group1/dataset1") is significantly faster than recursively loading the entire file or traversing the hierarchy.
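
The key rules above can be illustrated on a plain nested dict standing in for an HDF5 file. This is a minimal sketch, not PtyRAD's implementation: select_by_key_sketch is a hypothetical helper, whereas the real load_hdf5 reads from disk and also accepts a selection argument.

```python
def select_by_key_sketch(tree, key=None, delimiter="."):
    """Mimic the documented key handling on an in-memory nested dict."""

    def lookup(path):
        # Walk one hierarchical key, level by level.
        node = tree
        for part in path.split(delimiter):
            if not isinstance(node, dict) or part not in node:
                raise KeyError(f"Key '{path}' not found")
            node = node[part]
        return node

    if key is None or key == "" or key == []:
        return tree  # everything, original nesting preserved
    if isinstance(key, str):
        return lookup(key)  # a single dataset or a whole group
    if isinstance(key, list) and all(isinstance(k, str) for k in key):
        return {k: lookup(k) for k in key}  # flattened: full paths as keys
    raise TypeError("key must be None, a string, or a list of strings")


tree = {"group1": {"dataset1": [1, 2, 3], "dataset2": [4]}, "meas": [0]}
print(select_by_key_sketch(tree, "group1.dataset1"))
print(select_by_key_sketch(tree, ["group1.dataset1", "meas"]))
```

A string key returns the value itself, while a list of keys returns a flat dictionary keyed by the full path strings.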

ptyrad.io.hierarchy.load_hdf5_ND_with_selection(file_path, ndims=None, selection=None)

Load exactly one ND HDF5 dataset, applying selection only after disambiguation.

Parameters:

file_path (str)
ndims (List[int] | None)

Return type:

ndarray

ptyrad.io.hierarchy.write_hdf5(file_path, data, dataset_name='meas', **kwargs)

Save an array as an HDF5 file.

ptyrad.io.hierarchy.load_ND_with_key(file_path, key=None, ndims=None, selection=None)

Load exactly one ND dataset from (possibly nested) files like .mat and .hdf5.

Parameters:

file_path (str) – Path to the file.
key (str, optional) – Key to specify the dataset. If not provided, will search for all valid ND datasets.
ndims (list) – List of desired dimensions for filtering datasets.
selection – Optional NumPy/HDF5-style indexing object applied while loading.

Returns:

The loaded dataset.

Return type:

numpy.ndarray

Raises:

ValueError – If the file type is unsupported, or the key is invalid, or multiple/zero valid datasets are found.

ptyrad.io.hierarchy.collect_ND_datasets(data_dict, ndims=None, delimiter='.', _parent_key=None)

Collect ND numpy arrays from a (possibly nested) dictionary that match desired dimensionalities.

Nested dictionaries are traversed automatically, and their keys are flattened into full paths joined by delimiter.

Parameters:

data_dict (dict) – Dictionary of datasets (flat or nested).
ndims (list of int) – Desired dimensionalities to match (e.g., [3, 4]).
delimiter (str) – String used to separate the levels of the full path to the dataset.
_parent_key (str, optional) – Internal use only. Tracks nested keys during recursion. Do not set manually.

Returns:

Matching datasets with flattened hierarchical keys.

Return type:

dict[str, np.ndarray]

Raises:

ValueError – If input is not a dict or no datasets match.
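
The traversal and key flattening described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: collect_nd_sketch and _walk are hypothetical names, and treating ndims=None as "accept any dimensionality" is an assumption.

```python
import numpy as np


def _walk(node, ndims, delimiter, parent):
    """Recursively gather matching arrays, building flattened key paths."""
    found = {}
    for name, value in node.items():
        path = name if parent is None else f"{parent}{delimiter}{name}"
        if isinstance(value, dict):
            found.update(_walk(value, ndims, delimiter, path))
        elif isinstance(value, np.ndarray) and (ndims is None or value.ndim in ndims):
            found[path] = value
    return found


def collect_nd_sketch(data_dict, ndims=None, delimiter="."):
    if not isinstance(data_dict, dict):
        raise ValueError("data_dict must be a dict")
    found = _walk(data_dict, ndims, delimiter, None)
    if not found:
        raise ValueError(f"No datasets with ndim in {ndims} were found")
    return found


data = {
    "meas": np.zeros((4, 8, 8)),                      # 3D, matches
    "meta": {
        "scan": np.zeros((4, 2)),                     # 2D, skipped
        "cube": np.zeros((2, 2, 4, 4)),               # 4D, matches
    },
    "label": "not an array",                          # skipped
}
print(list(collect_nd_sketch(data, ndims=[3, 4])))    # ['meas', 'meta.cube']
```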

ptyrad.io.hierarchy.handle_hdf5_types(x)

Convert data to native Python or NumPy types, especially data loaded by h5py.

Handles special cases like MATLAB v7.3 complex128 data types and ensures that data is converted to a format compatible with native Python or NumPy.

Also handles sentinel string “__NONE__” as a substitute for None in HDF5.

Parameters:

x – The input data to be converted.

Returns:

The data converted to native Python or NumPy types.
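
The kinds of conversion described above might look like the sketch below. It is a guess at the general approach rather than PtyRAD's actual logic: to_native_sketch is a hypothetical name, and the exact set of cases the real function covers is not documented here. The sketch relies on the fact that MATLAB v7.3 files store complex values as a compound type with 'real' and 'imag' fields, which h5py returns as a structured array.

```python
import numpy as np


def to_native_sketch(x):
    """Illustrative conversions only; the real function may handle more cases."""
    if isinstance(x, bytes):
        x = x.decode("utf-8")  # HDF5 strings often arrive as bytes
    if isinstance(x, str):
        # "__NONE__" is the documented sentinel standing in for None.
        return None if x == "__NONE__" else x
    if isinstance(x, np.ndarray):
        names = x.dtype.names
        if names is not None and set(names) == {"real", "imag"}:
            # Structured (real, imag) pairs -> ordinary complex array.
            return x["real"] + 1j * x["imag"]
        return x
    if isinstance(x, np.generic):
        return x.item()  # NumPy scalar -> Python scalar
    return x


raw = np.array([(1.0, 2.0), (3.0, -1.0)], dtype=[("real", "f8"), ("imag", "f8")])
print(to_native_sketch(raw).dtype)      # complex128
print(to_native_sketch(b"__NONE__"))    # None
```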

ptyrad.io.hierarchy.get_nested(d, key, delimiter='.', safe=False, default=None)

Get a value from a nested dictionary, either safely (return a default if not found) or strictly (fail early).

Parameters:

d (dict) – The dictionary to traverse.
key (str, or list or tuple of str) – A sequence of keys to access nested values.
delimiter (str) – The string used to separate the parts of the displayed key path.
safe (bool) – Flag that switches between safe and strict mode when getting values from a nested dict.
default – The value to return if any key is missing or an intermediate value is None.

Returns:

The nested value if found; otherwise default in safe mode, or an error in strict mode.
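
The difference between safe and strict lookup can be sketched in a few lines. The helper below is a hypothetical stand-in named get_nested_sketch; raising KeyError in strict mode and splitting string keys on delimiter are assumptions, since the documentation does not state the error type or the exact key parsing.

```python
def get_nested_sketch(d, key, delimiter=".", safe=False, default=None):
    """Illustrative stand-in for get_nested; not PtyRAD's implementation."""
    parts = key.split(delimiter) if isinstance(key, str) else list(key)
    current = d
    for depth, part in enumerate(parts):
        # A missing key, or a non-dict (e.g. None) where a dict is needed.
        if not isinstance(current, dict) or part not in current:
            if safe:
                return default
            path = delimiter.join(parts[: depth + 1])
            raise KeyError(f"Key path '{path}' not found")
        current = current[part]
    return current


params = {"probe": {"kv": 80, "aperture": None}}
print(get_nested_sketch(params, "probe.kv"))                                    # 80
print(get_nested_sketch(params, ["probe", "defocus"], safe=True, default=0.0))  # 0.0
```

Strict mode suits required settings, where a typo in the key path should stop the run immediately; safe mode suits optional settings that have a sensible fallback.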

ptyrad.io.hierarchy.list_nested_keys(hobj, delimiter='.', prefix='')

Recursively list all keys in an HDF5 file, HDF5 group, or dict, including hierarchical paths.

Parameters:

hobj (h5py.File, h5py.Group, or dict) – The hierarchical object to traverse.
delimiter (str) – The string used to separate the parts of the displayed key path.
prefix (str) – The current hierarchical path (used for recursion).

Returns:

A list of all keys with their hierarchical paths.

Return type:

list[str]
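
For the dict case, the recursion can be sketched as below. list_keys_sketch is a hypothetical name, it handles plain dicts only, and listing intermediate (group-level) keys alongside leaf keys is an assumption about the real function's output.

```python
def list_keys_sketch(obj, delimiter=".", prefix=""):
    """List every key of a nested dict as a full hierarchical path."""
    keys = []
    for name, value in obj.items():
        path = f"{prefix}{delimiter}{name}" if prefix else name
        keys.append(path)
        if isinstance(value, dict):
            # Recurse with the current path as the new prefix.
            keys.extend(list_keys_sketch(value, delimiter, path))
    return keys


tree = {"meas": [1, 2], "probe": {"kv": 80, "modes": {"n": 2}}}
print(list_keys_sketch(tree))
# ['meas', 'probe', 'probe.kv', 'probe.modes', 'probe.modes.n']
```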

ptyrad.io.hierarchy.print_nested_dict(d, indent=0, leaf_inline_threshold=6)

Recursively logs a nested dictionary with structured formatting.

To improve log readability and save vertical space, small “leaf” dictionaries (dictionaries containing no further nested dicts or lists) are printed inline on a single line, provided their length does not exceed leaf_inline_threshold. Flat lists are also printed inline.

Parameters:

d (dict) – The dictionary to log.
indent (int, optional) – The current indentation level (number of tabs). Defaults to 0.
leaf_inline_threshold (int, optional) – The maximum number of key-value pairs a flat leaf dictionary can have to be formatted inline. Defaults to 6.
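
The inline-leaf rule can be illustrated with a simplified sketch. format_nested_sketch is a hypothetical helper that returns the formatted lines instead of logging them, prints every list inline (the real function is only documented to do so for flat lists), and does not reproduce the library's exact output format.

```python
def format_nested_sketch(d, indent=0, leaf_inline_threshold=6):
    """Return formatted lines; small leaf dicts stay on a single line."""
    pad = "\t" * indent
    lines = []
    for key, value in d.items():
        if isinstance(value, dict):
            # A leaf dict contains no further dicts or lists.
            is_leaf = not any(isinstance(v, (dict, list)) for v in value.values())
            if is_leaf and len(value) <= leaf_inline_threshold:
                lines.append(f"{pad}{key}: {value}")
            else:
                lines.append(f"{pad}{key}:")
                lines.extend(
                    format_nested_sketch(value, indent + 1, leaf_inline_threshold)
                )
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


cfg = {
    "probe": {"kv": 80, "conv_angle": 24.9},          # small leaf: inline
    "recon": {"niter": 50, "optim": {"lr": 0.001}},   # nested: expanded
}
print("\n".join(format_nested_sketch(cfg)))
```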
