ptyrad.io.hierarchy

Hierarchical file handling (load/save) for pt, mat, hdf5 formats

Functions

collect_ND_datasets(data_dict[, ndims, ...])

Collect ND numpy arrays from a (possibly nested) dictionary that match desired dimensionalities.

get_nested(d, key[, delimiter, safe, default])

Get a value from a nested dictionary either safely (return default if not found) or strictly (fail early).

handle_hdf5_types(x)

Convert data to native Python or NumPy types.

list_nested_keys(hobj[, delimiter, prefix])

Recursively list all keys in an HDF5 file, HDF5 group, or dict, including hierarchical paths.

load_ND_with_key(file_path[, key, ndims, ...])

Load exactly one ND dataset from (possibly nested) files like .mat and .hdf5.

load_hdf5(file_path[, key, delimiter, selection])

Load dataset(s) from an HDF5 file, recursively if groups are encountered.

load_hdf5_ND_with_selection(file_path[, ...])

Load exactly one ND HDF5 dataset, applying selection only after disambiguation.

load_mat(file_path[, key, delimiter, ...])

Load dataset(s) from a MATLAB .mat file, handling both default and v7.3 (HDF5) formats.

load_pt(file_path[, weights_only])

Loads data from a PyTorch .pt file.

load_zarr(file_path[, key, ndims, ...])

Loads an array from a Zarr store.

print_nested_dict(d[, indent, ...])

Recursively logs a nested dictionary with structured formatting.

write_hdf5(file_path, data[, dataset_name])

Save an array as an HDF5 file.

ptyrad.io.hierarchy.load_zarr(file_path, key=None, ndims=None, selection=None, zarr_kwargs=None)

Loads an array from a Zarr store.

Parameters:
  • file_path (str) – Path to the Zarr store.

  • key (str, optional) – Internal path to the array inside a Zarr group.

  • ndims (list, optional) – Desired dimensions when searching a group with no key.

  • selection (optional) – Optional load-time slicing/indexing.

  • zarr_kwargs (dict, optional) – Optional Zarr open settings passed to zarr.open. Use top-level selection for slicing and top-level key for the array path.

Returns:

The loaded array data.

Return type:

numpy.ndarray

ptyrad.io.hierarchy.load_pt(file_path, weights_only=False)

Loads data from a PyTorch .pt file.

Warning

This function defaults to weights_only=False because PtyRAD .pt files often contain complex objects and dictionaries, not just state dictionaries. As of PyTorch 2.6, torch.load defaults to weights_only=True for security. Loading with weights_only=False can execute arbitrary code if the file contains malicious payloads. Only use this function to load trusted, legacy PtyRAD-generated files.

Parameters:
  • file_path (str) – The path to the PyTorch .pt file.

  • weights_only (bool, optional) – If True, restricts the unpickler to load only tensors, primitive types, and dictionaries. Defaults to False.

Returns:

The deserialized Python object(s) stored in the file.

Return type:

Any

Raises:

FileNotFoundError – If the specified file does not exist.
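The risk described in the warning comes from Python's pickle format, which torch.load relies on when weights_only=False. The standard-library sketch below does not use PyTorch or PtyRAD, and its Payload class is hypothetical. It only illustrates that unpickling can call a function chosen by whoever wrote the file; here that function is a harmless eval of an arithmetic expression.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls eval(...) and returns the result.
result = pickle.loads(blob)
print(result)  # 42
```

A malicious file could substitute any callable for eval, which is why weights_only=False should be reserved for files from a trusted source.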

ptyrad.io.hierarchy.load_mat(file_path, key=None, delimiter='.', squeeze_me=True, simplify_cells=True, selection=None)

Load dataset(s) from a MATLAB .mat file, handling both default and v7.3 (HDF5) formats. The MAT-file version determines whether scipy.io.loadmat or h5py is used to read the file.

Parameters:
  • file_path (str) – Path to the .mat file.

  • key (str | list[str] | None) – Name(s) of the dataset(s) to load.
    • If None, '', or []: Load all datasets, preserving the original nested structure.

    • If str: Load a single dataset or group. Supports hierarchical keys (e.g., 'group1.dataset1').

    • If list[str]: Load multiple datasets. The returned dictionary will have a flattened structure.

  • delimiter (str) – Delimiter for hierarchical keys (default: “.”).

  • squeeze_me (bool) – Whether to squeeze unit matrix dimensions (scipy.io.loadmat parameter).

  • simplify_cells (bool) – Whether to simplify cell arrays (scipy.io.loadmat parameter).

  • selection – Optional NumPy-style indexing object applied to loaded dataset(s).

Returns:

The loaded dataset(s) with the same structure as load_hdf5.

Return type:

np.ndarray or dict

Raises:
  • FileNotFoundError – If the specified file does not exist.

  • KeyError – If provided key(s) are not found in the file.

  • TypeError – If the key is not None, a string, or a list of strings.
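One way to tell the two MAT flavors apart is to read the descriptive text at the start of the file: v7.3 files begin with "MATLAB 7.3 MAT-file", while the older v5-family files that scipy.io.loadmat reads begin with "MATLAB 5.0 MAT-file". The sketch below is an assumption about how such a dispatch could work, not PtyRAD's actual check, and the helper name mat_flavor is hypothetical. It writes two dummy headers to a temporary directory so that it runs without any real .mat file.

```python
import os
import tempfile


def mat_flavor(file_path):
    """Guess the MAT flavor from the descriptive text at the start of the file."""
    with open(file_path, "rb") as f:
        header = f.read(19)
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3 (HDF5 container, read with h5py)"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "v5 family (read with scipy.io.loadmat)"
    # For example, v4 files carry no text header at all.
    return "unknown"


# Dummy headers only; these are not valid .mat files.
samples = [
    ("old.mat", b"MATLAB 5.0 MAT-file, Platform: demo"),
    ("new.mat", b"MATLAB 7.3 MAT-file, Platform: demo"),
]

flavors = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, text in samples:
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(text.ljust(128, b" "))
        flavors[name] = mat_flavor(path)

print(flavors)
```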

ptyrad.io.hierarchy.load_hdf5(file_path, key=None, delimiter='.', selection=None)

Load dataset(s) from an HDF5 file, recursively if groups are encountered.

Parameters:
  • file_path (str) – Path to the HDF5 file.

  • key (str | list[str] | None) – Name(s) of the dataset(s) to load.
    • If None, '', or []: Load all datasets recursively, preserving the original nested structure.

    • If str: Load a single dataset or group. Supports hierarchical keys (e.g., 'group1.dataset1').

    • If list[str]: Load multiple datasets. The returned dictionary will have a flattened structure with the hierarchical key strings as keys.

  • delimiter (str) – Delimiter for hierarchical keys (default: “.”).

  • selection – Optional NumPy/HDF5-style indexing object applied to loaded dataset(s).

Returns:

The loaded dataset(s).
  • If key is a string, returns a single np.ndarray or a nested dictionary if the key points to a group.

  • If key is a list of strings, returns a dictionary with the hierarchical key strings as keys and the corresponding datasets as values.

  • If key is None, returns a nested dictionary preserving the original structure of the HDF5 file.

Return type:

np.ndarray or dict

Raises:
  • FileNotFoundError – If the specified file does not exist.

  • KeyError – If provided key(s) are not found in the file.

  • TypeError – If the key is not None, a string, or a list of strings.

Notes

  • Hierarchical Keys:
    • The function supports hierarchical keys (e.g., ‘group1.dataset1’) to directly access nested datasets or groups.

    • When a list of hierarchical keys is provided, the returned dictionary will have a flattened structure with the hierarchical key strings as keys.

  • Preserving Original Structure:
    • If key=None, the function recursively loads all datasets and groups, preserving the original nested structure of the HDF5 file.

  • Performance Considerations:
    • Providing an exact key (e.g., key="group1/dataset1") is significantly faster than recursively loading the entire file or traversing the hierarchy.
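The three key modes differ mainly in the shape of what comes back. The sketch below mimics them on a plain nested dict standing in for an HDF5 file. The select helper is hypothetical and uses no h5py, so it illustrates only the documented return structure under the default "." delimiter, not the loader itself.

```python
def select(tree, key=None, delimiter="."):
    """Mimic the documented key semantics on a plain nested dict."""

    def walk(path):
        node = tree
        for part in path.split(delimiter):
            node = node[part]  # raises KeyError if a level is missing
        return node

    if key is None or key == "" or key == []:
        return tree  # everything, original nesting preserved
    if isinstance(key, str):
        return walk(key)  # one dataset, or a whole group
    if isinstance(key, list) and all(isinstance(k, str) for k in key):
        return {k: walk(k) for k in key}  # flattened, keyed by full path
    raise TypeError("key must be None, a string, or a list of strings")


tree = {"group1": {"dataset1": [1, 2, 3], "dataset2": [4, 5]}, "meta": "demo"}

single = select(tree, "group1.dataset1")
group = select(tree, "group1")
several = select(tree, ["group1.dataset1", "meta"])
everything = select(tree)

print(single)   # [1, 2, 3]
print(several)  # {'group1.dataset1': [1, 2, 3], 'meta': 'demo'}
```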

ptyrad.io.hierarchy.load_hdf5_ND_with_selection(file_path, ndims=None, selection=None)

Load exactly one ND HDF5 dataset, applying selection only after disambiguation.

Parameters:
  • file_path (str)

  • ndims (List[int] | None)

Return type:

ndarray

ptyrad.io.hierarchy.write_hdf5(file_path, data, dataset_name='meas', **kwargs)

Save an array as an HDF5 file.

ptyrad.io.hierarchy.load_ND_with_key(file_path, key=None, ndims=None, selection=None)

Load exactly one ND dataset from (possibly nested) files like .mat and .hdf5.

Parameters:
  • file_path (str) – Path to the file.

  • key (str, optional) – Key to specify the dataset. If not provided, will search for all valid ND datasets.

  • ndims (list) – List of desired dimensions for filtering datasets.

  • selection – Optional NumPy/HDF5-style indexing object applied while loading.

Returns:

The loaded dataset.

Return type:

numpy.ndarray

Raises:

ValueError – If the file type is unsupported, or the key is invalid, or multiple/zero valid datasets are found.

ptyrad.io.hierarchy.collect_ND_datasets(data_dict, ndims=None, delimiter='.', _parent_key=None)

Collect ND numpy arrays from a (possibly nested) dictionary that match desired dimensionalities.

Automatically traverses nested dictionaries and flattens their keys into full paths joined by delimiter.

Parameters:
  • data_dict (dict) – Dictionary of datasets (flat or nested).

  • ndims (list of int) – Desired dimensionalities to match (e.g., [3, 4]).

  • delimiter (str) – String used to separate the levels of the full path to a dataset.

  • _parent_key (str, optional) – Internal use only. Tracks nested keys during recursion. Do not set manually.

Returns:

Matching datasets with flattened hierarchical keys.

Return type:

dict[str, np.ndarray]

Raises:

ValueError – If input is not a dict or no datasets match.
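The traversal can be sketched as a short recursive function. To stay dependency-free, the example uses types.SimpleNamespace objects with an ndim attribute as stand-ins for numpy arrays. The name collect_by_ndim is hypothetical, and unlike the documented function this sketch returns an empty dict instead of raising ValueError when nothing matches.

```python
from types import SimpleNamespace


def collect_by_ndim(data, ndims, delimiter=".", parent=None):
    """Return {full_path: value} for values whose .ndim is in ndims."""
    matches = {}
    for name, value in data.items():
        path = name if parent is None else f"{parent}{delimiter}{name}"
        if isinstance(value, dict):
            # Descend into nested groups, carrying the path so far.
            matches.update(collect_by_ndim(value, ndims, delimiter, path))
        elif getattr(value, "ndim", None) in ndims:
            matches[path] = value
    return matches


# Stand-ins for arrays; real inputs would be numpy.ndarray objects.
data = {
    "meas": SimpleNamespace(ndim=4),
    "probe": {"modes": SimpleNamespace(ndim=3), "scale": 1.5},
    "pos": SimpleNamespace(ndim=2),
}

found = collect_by_ndim(data, ndims=[3, 4])
print(sorted(found))  # ['meas', 'probe.modes']
```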

ptyrad.io.hierarchy.handle_hdf5_types(x)

Convert data to native Python or NumPy types, especially data loaded by h5py.

Handles special cases like MATLAB v7.3 complex128 data types and ensures that data is converted to a format compatible with native Python or NumPy.

Also handles the sentinel string "__NONE__", which is used as a substitute for None in HDF5.

Parameters:

x – The input data to be converted.

Returns:

The data converted to native Python or NumPy types.
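The sentinel convention can be sketched as a small round trip. HDF5 has no representation for None, so a placeholder string is written in its place and mapped back on load. The helper names store_value and restore_value are hypothetical, and decoding bytes to str is an assumption about what a converter of this kind might do; the complex-number handling is not covered here.

```python
NONE_SENTINEL = "__NONE__"


def store_value(x):
    """Replace None with the sentinel before writing."""
    return NONE_SENTINEL if x is None else x


def restore_value(x):
    """Map the sentinel back to None after reading; pass other values through."""
    if isinstance(x, bytes):
        # Assumption: string data may come back from the file as bytes.
        x = x.decode("utf-8")
    if isinstance(x, str) and x == NONE_SENTINEL:
        return None
    return x


print(restore_value(store_value(None)))  # None
print(restore_value(b"probe"))           # probe
print(restore_value(3))                  # 3
```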

ptyrad.io.hierarchy.get_nested(d, key, delimiter='.', safe=False, default=None)

Get a value from a nested dictionary either safely (return default if not found) or strictly (fail early).

Parameters:
  • d (dict) – The dictionary to traverse.

  • key (str | list[str] | tuple[str]) – A sequence of keys to access nested values.

  • delimiter (str) – The string used to separate different parts of the displayed key path.

  • safe (bool) – Switches between safe and strict lookup in the nested dict.

  • default – The value to return if any key is missing or an intermediate value is None.

Returns:

The nested value if found; otherwise default in safe mode, or an error in strict mode.
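The difference between the two modes can be sketched in a few lines. The function get_nested_sketch is a hypothetical stand-in written for illustration. It assumes that a string key is split on delimiter and that strict mode raises KeyError, neither of which is stated above.

```python
def get_nested_sketch(d, key, delimiter=".", safe=False, default=None):
    """Walk a nested dict; return default (safe) or raise KeyError (strict)."""
    parts = key.split(delimiter) if isinstance(key, str) else list(key)
    node = d
    for i, part in enumerate(parts):
        # A missing key or a non-dict intermediate (e.g. None) ends the walk.
        if not isinstance(node, dict) or part not in node:
            if safe:
                return default
            raise KeyError(delimiter.join(parts[: i + 1]))
        node = node[part]
    return node


cfg = {"model": {"probe": {"pmode_max": 6}, "obj": None}}

found = get_nested_sketch(cfg, "model.probe.pmode_max")
fallback = get_nested_sketch(cfg, ("model", "obj", "slices"), safe=True, default=1)

print(found)     # 6
print(fallback)  # 1
```

Strict mode suits required settings, where a missing key should stop the run immediately; safe mode suits optional settings that have a sensible fallback.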

ptyrad.io.hierarchy.list_nested_keys(hobj, delimiter='.', prefix='')

Recursively list all keys in an HDF5 file, HDF5 group, or dict, including hierarchical paths.

Parameters:
  • hobj (h5py.File, h5py.Group, or dict) – The hierarchical object to traverse.

  • delimiter (str) – The string used to separate different parts of the displayed key path.

  • prefix (str) – The current hierarchical path (used for recursion).

Returns:

A list of all keys with their hierarchical paths.

Return type:

list[str]
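For plain dicts, the recursion can be sketched as follows. The helper list_keys_sketch is hypothetical and assumes that group-level keys are listed alongside leaf keys. It does not cover h5py.File or h5py.Group inputs.

```python
def list_keys_sketch(obj, delimiter=".", prefix=""):
    """List every key in a nested dict as a full hierarchical path."""
    keys = []
    for name, value in obj.items():
        path = f"{prefix}{delimiter}{name}" if prefix else name
        keys.append(path)
        if isinstance(value, dict):
            # Recurse with the current path as the new prefix.
            keys.extend(list_keys_sketch(value, delimiter, path))
    return keys


tree = {"group1": {"dataset1": 1, "sub": {"x": 2}}, "meta": 3}

paths = list_keys_sketch(tree)
print(paths)
# ['group1', 'group1.dataset1', 'group1.sub', 'group1.sub.x', 'meta']
```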

ptyrad.io.hierarchy.print_nested_dict(d, indent=0, leaf_inline_threshold=6)

Recursively logs a nested dictionary with structured formatting.

To improve log readability and save vertical space, small “leaf” dictionaries (dictionaries containing no further nested dicts or lists) are printed inline on a single line, provided their length does not exceed leaf_inline_threshold. Flat lists are also printed inline.

Parameters:
  • d (dict) – The dictionary to log.

  • indent (int, optional) – The current indentation level (number of tabs). Defaults to 0.

  • leaf_inline_threshold (int, optional) – The maximum number of key-value pairs a flat leaf dictionary can have to be formatted inline. Defaults to 6.
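The inline rule can be sketched with a function that returns the formatted lines instead of logging them. The name format_nested, the four-space indentation, and the exact "key: value" layout are assumptions made for illustration; only the leaf and threshold logic follows the description above.

```python
def format_nested(d, indent=0, leaf_inline_threshold=6):
    """Return formatted lines; small leaf dicts and lists stay on one line."""
    pad = "    " * indent
    lines = []
    for key, value in d.items():
        if isinstance(value, dict):
            # A leaf dict holds no further dicts or lists.
            is_leaf = not any(isinstance(v, (dict, list)) for v in value.values())
            if is_leaf and len(value) <= leaf_inline_threshold:
                lines.append(f"{pad}{key}: {value}")
            else:
                lines.append(f"{pad}{key}:")
                lines.extend(format_nested(value, indent + 1, leaf_inline_threshold))
        else:
            # Scalars and lists are printed inline.
            lines.append(f"{pad}{key}: {value}")
    return lines


cfg = {
    "name": "demo",
    "probe": {"pmode_max": 2, "dx": 0.1},
    "recon": {"iters": 10, "loss": {"weight": 1.0}, "batches": [8, 16]},
}

lines = format_nested(cfg)
print("\n".join(lines))
```

Lowering leaf_inline_threshold to 1 in this sketch expands "probe" over several lines, because it holds two key-value pairs.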