# recon_params

`recon_params` determines the overall reconstruction behavior, including iterations, grouping/batching, and saving configurations, for both reconstruction and hypertune modes.

The PtyRAD results are organized into a folder structure with 2 (reconstruction) or 3 (hypertune) levels, not counting the top-level 'output/'.
The main (1st-level) output directory is specified by 'output_dir'. It is usually separated by material system or project, and presumably holds multiple reconstructions for that material system or project.
Each PtyRAD reconstruction is saved into a "reconstruction folder" that PtyRAD generates automatically if 'SAVE_ITERS' is not null.
For reconstruction mode, the folder structure might look like 'output/<MATERIALS>/<RECONSTRUCTION>'.
Note that 'recon_dir_affixes', 'prefix_time', 'prefix', and 'postfix' all operate on the reconstruction folder and have no effect if 'SAVE_ITERS' is null, because the reconstruction folder is not generated in the first place.
The date, prefix, and postfix are automatically joined by '_', and the '/' after 'output_dir' is also added automatically.

For hypertune mode, a hypertune folder is automatically inserted as the 2nd level between 'output_dir' and the (optional) reconstruction folders, for example 'output/<MATERIALS>/<HYPERTUNE>/<RECONSTRUCTION>'. This way the hypertune folders are organized under the <MATERIALS> folder just like other reconstruction folders. Note that in hypertune mode, 'prefix_time', 'prefix', and 'postfix' are applied to (i.e., hijacked by) this hypertune folder and have no effect on the reconstruction folder name. In other words, 'prefix_time', 'prefix', and 'postfix' are always applied to the folders directly under 'output_dir', for both reconstruction and hypertune modes.
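
The folder layout described above can be sketched in Python as follows. This is only an illustration of the naming rules, not PtyRAD's actual code: `join_nonempty`, `build_output_path`, and all of their arguments are hypothetical names, and the folder names `hypertune`, `trial_0`, and `full_N16384` are placeholders.

```python
from pathlib import PurePosixPath


def join_nonempty(*parts):
    """Join the non-empty name parts with '_'."""
    return "_".join(p for p in parts if p)


def build_output_path(output_dir, mode, recon_name,
                      time_str="", prefix="", postfix="",
                      hypertune_name="hypertune"):
    """Hypothetical helper: compose the result path for either mode."""
    base = PurePosixPath(output_dir)
    if mode == "reconstruction":
        # Time string, prefix, and postfix decorate the reconstruction folder.
        return base / join_nonempty(time_str, prefix, recon_name, postfix)
    if mode == "hypertune":
        # The same three settings decorate the hypertune folder instead;
        # the reconstruction folder name is left untouched.
        hyper = join_nonempty(time_str, prefix, hypertune_name, postfix)
        return base / hyper / recon_name
    raise ValueError(f"unknown mode: {mode!r}")


recon = build_output_path("output/tBL_WSe2/", "reconstruction",
                          "full_N16384", time_str="20250607")
hyper = build_output_path("output/tBL_WSe2/", "hypertune",
                          "trial_0", time_str="20250607", prefix="test")
print(recon)  # output/tBL_WSe2/20250607_full_N16384
print(hyper)  # output/tBL_WSe2/20250607_test_hypertune/trial_0
```

In both calls the time string and prefix end up on the folder directly under 'output_dir', which is the behavior the paragraph above describes.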

```yaml
recon_params: {
    'NITER': 200, # type: int. Total number of reconstruction iterations. 1 iteration means a full pass over all selected diffraction patterns. Usually 20-50 iterations get 90% of the work done with a proper learning rate between 1e-4 and 1e-3. For faster trials in hypertune mode, set 'NITER' to a smaller number than your typical reconstruction to save time. Usually 10-20 iterations are enough for the hypertune parameters to show their relative performance.
    'INDICES_MODE': {'mode': 'full', 'subscan_slow': null, 'subscan_fast': null}, # type: dict. Indices mode determines how the diffraction patterns at each probe position are used for the reconstruction. Each probe position (or each diffraction pattern) has a unique index. By selecting a subset of the indices, we can conveniently change the effective reconstruction area ('center') or the effective scan step size, i.e. the real space overlap ('sub'). You may choose 'full' to reconstruct with all probe positions, 'sub' to subsample by selecting only every few probe positions across the full FOV (this increases the effective scan step size and is a good way to test whether we can reduce the real space overlap), or 'center' to use only the central rectangular region in real space, with an effectively reduced FOV but a full-sized object canvas. 'subscan_slow' and 'subscan_fast' determine the number of scan positions chosen for 'sub' and 'center', and have no effect on 'full'. If 'subscan_slow' and 'subscan_fast' are not provided (or null), they'll be set to half of 'N_scan_slow' and half of 'N_scan_fast' by default. Typically we can start from 'INDICES_MODE': {'mode': 'sub', 'subscan_slow': null, 'subscan_fast': null} to get a quick idea of the entire object by reconstructing the entire object FOV while using only every other diffraction pattern along the fast and slow scan directions (so only 1/4 of the diffraction patterns are used, hence a 4x speedup in iteration time). Similarly, we can use 'center' to keep the effective scan step size but reconstruct a smaller FOV. Once 'sub' or 'center' shows reasonable results, we can switch to 'full' to refine further without starting from scratch, because there's no object dimension mismatch between the 'INDICES_MODE' options.
    'BATCH_SIZE': {'size': 32, 'grad_accumulation': 1}, # type: dict. Batch size is the number of diffraction patterns processed simultaneously to get the gradient update. 'size' is the number of diffraction patterns in a sub-batch, and 'grad_accumulation' is how many sub-batches' gradients are accumulated before applying the update. The effective batch size (for 1 update) is 'size' * 'grad_accumulation'. Gradient accumulation is an ML technique that allows a large effective batch size by trading longer iteration time for a lower memory requirement, so if you can fit the entire batch in memory, you should always set 'grad_accumulation': 1 for performance. "Batch size" is the common term in the machine learning community, while it's called "grouping" in PtychoShelves. Batch size affects both convergence speed and final quality. Usually a smaller batch size leads to better final quality for iterative gradient descent, but it also leads to longer computation time per iteration because the GPU is less utilized than with large batch sizes (due to less GPU parallelism). On the other hand, a large batch size is known to be more robust (noise-resilient) but converges more slowly. Generally a batch size of 32 to 128 is used, although certain algorithms (like ePIE) would prefer a large batch size equal to the dataset size for robustness. For an extremely large object (or one with a lot of object modes), you'll need to reduce the batch size to save GPU memory, or use 'grad_accumulation' to split a batch into multiple sub-batches for 1 update.
    'GROUP_MODE': 'random',  # type: str. Group mode determines the spatial distribution of the selected probe positions within a batch (group). This is similar to 'MLs' (for 'sparse') and 'MLc' (for 'compact') in PtychoShelves. Available options are 'random', 'sparse', and 'compact'. Usually 'random' is good enough with small batch sizes and is the suggested option for most cases. 'compact' is believed to provide the best final quality, although it converges much more slowly. 'sparse' gives the most uniform coverage of the object, so it converges the fastest, and is also preferred for reconstructions with few scan positions to prevent any locally biased update. However, 'sparse' for a 256x256 scan could take more than 10 mins on CPU just to compute the grouping, hence PtychoShelves automatically switches to 'random' for Nscans > 1e3. The grouping in PtyRAD is fixed during optimization, but the order of the groups is shuffled every iteration.
    'SAVE_ITERS': 10, # type: null or int. Number of completed iterations before saving the current reconstruction results (model, probe, object) and summary figures. If 'SAVE_ITERS' is 50, it'll create an output reconstruction folder and save the results and figures into it every 50 iterations. If null, the output reconstruction folder is not created and no reconstruction results or summary figures are saved. If 'SAVE_ITERS' > 'NITER', it'll create the output reconstruction folder but no results / figs are saved. Typically we set 'SAVE_ITERS' to 50 for reconstruction mode with 'NITER' around 200 to 500. For hypertune mode, it's suggested to set 'SAVE_ITERS' to null and set 'collate_results' to true to save disk space, while still providing a convenient way to check the hypertune performance through the collated results.
    'output_dir': 'output/tBL_WSe2/', # type: str. Path and name of the main output directory. Ideally the 'output_dir' keeps a series of reconstructions of the same material system or project. The PtyRAD results and figs will be saved into a reconstruction-specific folder under 'output_dir'. The 'output_dir' folder will be automatically created if it doesn't exist.
    'recon_dir_affixes': ['default'], # type: list of strings. This list specifies the optional affixes to the reconstruction folder name for file management. The order of the strings has NO effect on the output folder name. PtyRAD provides high-level presets including 'minimal', 'default', and 'all', each of which corresponds to a subset of all available options. There are currently 16 available options: 'indices', 'meas', 'batch', 'pmode', 'omode', 'nlayer', 'lr', 'optimizer', 'start_iter', 'model', 'constraint', 'loss', 'illumination', 'dx', 'tilt', and 'affine'. Each option corresponds to specific fields in the params file. These individual tags can be combined with the presets, e.g. ['minimal', 'tilt']. A typical output folder name for 'default' looks like: 'ptyrad\demo\output\tBL_WSe2\20250607_full_N16384_dp128_flipT100_random32_p6_1obj_6slice_dz2_plr1e-4_oalr5e-4_oplr5e-4_slr1e-4_orblur0.5_ozblur1_mamp0.03_4_oathr0.98_oposc_sng1.0_spr0.1'. Note that certain trivial values might not be shown even if they're specified, e.g. a tilt of [0,0] mrad, the slice thickness for single-slice ptychography, or start_iter = 1. It's recommended to use 'default' for 'reconstruction' mode and adjust if needed. For 'hypertune' mode, you can set it to [] (empty list) or ['minimal'] if you're considering saving intermediate results, because a unique identifier (trial number) is appended and the detailed information is fully stored in the sqlite file, and the hypertuned params are appended to the collated result anyway if you have 'collate_results': true.
    'prefix_time': 'date', # type: boolean, preset string, or time format string. Set to true to prepend a date str like '20240903_' to the reconstruction folder name, so that reconstructions with the same 'recon_dir_affixes' setting won't get incorrectly saved to the same output folder. Available options are None, True, False, 'date', 'time', 'datetime', and a time format string like '%Y%m%d_%H%M%S'. Suggested value is 'date' for both 'reconstruction' and 'hypertune' modes. In hypertune mode, the date string is applied to the hypertune folder instead of the reconstruction folder. Also note that if you're using hypertune mode on multiple GPUs, you should set prefix_time to 'date' or False and handle any additional identifier using 'prefix', otherwise different workers launched at different times would each generate their own output folder with different time strings despite using the same sqlite database file.
    'prefix': '', # type: str. Prefix this string to the reconstruction folder name. Note that "_" will be automatically generated, and the attached str comes after the time str if 'prefix_time' is true. In hypertune mode, the prefix string is applied to the hypertune folder instead of the reconstruction folder.
    'postfix': '', # type: str. Postfix this string to the reconstruction folder name. Note that "_" will be automatically generated. In hypertune mode, the postfix string is applied to the hypertune folder instead of the reconstruction folder.
    'save_result': ['model', 'objp'], # type: list of strings. This list specifies which results to save every SAVE_ITERS, so that the intermediate progress is kept. Available options are 'model', 'obja', 'objp', 'probe', 'probe_prop', and 'optim_state'. 'model' is a nested dict that is later stored as an hdf5 file. 'model' contains the optimizable tensors and metadata, so you can always refine from it and load whichever optimizable tensors (object, probe, positions, tilts) you need to continue the reconstruction. It's similar to the NiterXXX.mat from PtychoShelves. The object ('obja', 'objp') and 'probe' options output the reconstructed object and probe as '.tif'. If you don't want to save anything, set 'SAVE_ITERS' to null. Suggested setting is to save everything (i.e., ['model', 'obja', 'objp', 'probe']). For hypertune mode, you can set 'collate_results' to true and set 'SAVE_ITERS' to null to disable result saving.
    'result_modes': {'obj_dim': [2, 3, 4], 'FOV': ['crop'], 'bit': ['8']}, # type: dict. This dict specifies which object output is saved by its final dimension ('obj_dim'), whether to save the full or cropped FOV of the object ('FOV'), and whether to save the raw or normalized bit depth version of the object and probe ('bit'). A comprehensive (but probably redundant) saving option looks like {'obj_dim': [2,3,4], 'FOV': ['full', 'crop'], 'bit': ['raw', '32', '16', '8']}. 'obj_dim' takes a list of ints from 2 to 4, corresponding to 2D to 4D object output. Set 'obj_dim': [2] if you only want the zsum from multislice ptychography. Suggested value is [2,3,4] to save all possible output. 'FOV' takes a list of strings, and the available strings are 'full' and 'crop'. Suggested value is 'crop' so that the laterally padded region of the object is not saved. 'bit' takes a list of strings, and the available strings are 'raw', '32', '16', and '8'. 'raw' is the original value range, while '32' normalizes the values from 0 to 1. '16' and '8' normalize the values from 0 to 65535 and from 0 to 255, respectively. Default is '8' to save only the normalized 8-bit result for quick visualization. You can set it to ['raw', '8'] if you want to keep the original float32 results together with the normalized 8-bit results. These postprocessing options append the corresponding labels to the result file names.
    'selected_figs': ['loss', 'forward', 'probe_r_amp', 'pos'], # type: list of strings. This list specifies the selected figures that will be plotted/saved. The available strings are 'loss', 'forward', 'probe_r_amp', 'probe_k_amp', 'probe_k_phase', 'pos', 'tilt', and 'all'. The suggested value is ['loss', 'forward', 'probe_r_amp', 'probe_k_amp', 'probe_k_phase', 'pos'].
    'copy_params': true, # type: boolean. Set to true if you want to copy the .yml params file to the hypertune folder (hypertune mode) or the individual reconstruction folders (reconstruction mode). Suggested value is true for better record keeping, although most information is saved in model.pt and can be loaded by ckpt = torch.load('model.pt'), params = ckpt['params'].
    'if_quiet': false, # type: boolean. Set to true if you want to reduce the amount of printed information during PtyRAD reconstruction. Suggested value is false for more information, but if you're running hypertune mode you should consider setting it to true.
}
```
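
As a rough illustration of the 'INDICES_MODE' options above, the sketch below selects probe-position indices on a raster scan grid. It is not PtyRAD's implementation: `select_indices` is a hypothetical function, and it assumes that indices are numbered row by row (slow axis first) and that 'sub' uses an integer stride.

```python
def select_indices(n_slow, n_fast, mode="full",
                   subscan_slow=None, subscan_fast=None):
    """Hypothetical sketch: pick probe-position indices on a raster grid."""
    # Default to half of the scan positions along each direction.
    if subscan_slow is None:
        subscan_slow = n_slow // 2
    if subscan_fast is None:
        subscan_fast = n_fast // 2

    if mode == "full":
        rows, cols = range(n_slow), range(n_fast)
    elif mode == "sub":
        # Keep every k-th position, so the FOV stays full
        # while the effective scan step size grows.
        step_slow = max(1, n_slow // subscan_slow)
        step_fast = max(1, n_fast // subscan_fast)
        rows = range(0, n_slow, step_slow)[:subscan_slow]
        cols = range(0, n_fast, step_fast)[:subscan_fast]
    elif mode == "center":
        # Keep a central rectangle, so the scan step size stays
        # the same while the FOV shrinks.
        s0 = (n_slow - subscan_slow) // 2
        f0 = (n_fast - subscan_fast) // 2
        rows = range(s0, s0 + subscan_slow)
        cols = range(f0, f0 + subscan_fast)
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    # Assumed row-major numbering: index = slow * n_fast + fast.
    return [s * n_fast + f for s in rows for f in cols]


print(select_indices(4, 4, "sub"))     # [0, 2, 8, 10]
print(select_indices(4, 4, "center"))  # [5, 6, 9, 10]
print(len(select_indices(128, 128, "sub")))  # 4096
```

With the default half-size subscan on a 128x128 scan, both 'sub' and 'center' keep 4096 of the 16384 patterns, which matches the 1/4 figure quoted in the 'INDICES_MODE' comment.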
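
The arithmetic behind 'BATCH_SIZE' can be checked with a few lines of Python. This is a back-of-the-envelope sketch only: `batch_summary` is a hypothetical helper, and it assumes that a final partial sub-batch is kept rather than dropped.

```python
import math


def batch_summary(n_patterns, size, grad_accumulation=1):
    """Hypothetical helper: effective batch size and updates per iteration."""
    effective = size * grad_accumulation
    # Assumption: the last, possibly smaller, sub-batch is still processed.
    n_sub_batches = math.ceil(n_patterns / size)
    n_updates = math.ceil(n_sub_batches / grad_accumulation)
    return effective, n_sub_batches, n_updates


# 128 x 128 scan = 16384 diffraction patterns.
print(batch_summary(16384, size=32, grad_accumulation=1))  # (32, 512, 512)
print(batch_summary(16384, size=32, grad_accumulation=4))  # (128, 512, 128)
```

In the second case each forward/backward pass still only holds 32 patterns in memory, but the update is applied once per 128 patterns, which is the memory-for-time trade described in the 'BATCH_SIZE' comment.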
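
The 'bit' options in 'result_modes' map the raw values onto fixed output ranges. The sketch below shows one plausible way to do this with min-max scaling; that scaling rule is an assumption, since the text above only states the target ranges, and `normalize` is a hypothetical function rather than part of PtyRAD.

```python
def normalize(values, bit):
    """Hypothetical sketch: rescale values according to a 'bit' option."""
    if bit == "raw":
        return list(values)  # original value range, untouched
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid dividing by zero for flat input
    unit = [(v - vmin) / span for v in values]  # assumed min-max scaling
    if bit == "32":
        return unit  # floats from 0 to 1
    max_int = {"16": 65535, "8": 255}[bit]
    return [round(u * max_int) for u in unit]


phase = [-1.0, 0.0, 3.0]
print(normalize(phase, "32"))  # [0.0, 0.25, 1.0]
print(normalize(phase, "16"))  # [0, 16384, 65535]
print(normalize(phase, "8"))   # [0, 64, 255]
```

Because every normalized option discards the original offset and scale, keeping 'raw' alongside '8' preserves the quantitative values while still giving a quick-look image.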