Reconstructors
Module cdtools.tools.reconstructors contains the Reconstructor class and its subclasses, which run ptychography reconstructions on a given model and dataset.
The reconstructors are designed to resemble the so-called ‘Trainer’ classes that (in the language of the AI/ML folks) handle the ‘training’ of a model given some dataset and optimizer.
- class cdtools.reconstructors.Reconstructor(model: CDIModel, dataset: CDataset, optimizer: t.optim.Optimizer, subset: int | List[int] = None)
Bases:
object
Reconstructor handles the optimization (‘reconstruction’) of ptychographic models given a CDIModel (or subclass) and corresponding CDataset.
This is the base class that defines the functions all Reconstructor subclasses must implement.
- Parameters:
- optimizer
Must be defined when initializing the Reconstructor subclass.
- Type:
torch.optim.Optimizer
- scheduler
May be defined during the optimize method.
- Type:
torch.optim.lr_scheduler, optional
- data_loader
Defined by calling the setup_dataloader method.
- Type:
torch.utils.data.DataLoader
- __init__(model: CDIModel, dataset: CDataset, optimizer: t.optim.Optimizer, subset: int | List[int] = None)
- setup_dataloader(batch_size: int | None = None, shuffle: bool = True)
Sets up or re-initializes the dataloader.
- Parameters:
batch_size (int) – Optional, the size of the minibatches to use
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
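For instance, re-initializing the dataloader with a different minibatch size might look like the following sketch (here reconstructor is an already-constructed Reconstructor subclass; the values are illustrative):

```python
# Rebuild the internal DataLoader with a larger minibatch size
reconstructor.setup_dataloader(batch_size=32)

# For diagnostics only: iterate the patterns in their stored order
reconstructor.setup_dataloader(batch_size=32, shuffle=False)
```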
- adjust_optimizer(**kwargs)
Change hyperparameters for the utilized optimizer.
For each optimizer, the keyword arguments should be manually defined as parameters.
- run_epoch(stop_event: Event | None = None, regularization_factor: float | List[float] | None = None, calculation_width: int = 10)
Runs one full epoch of the reconstruction. Intended to be called by Reconstructor.optimize.
- Parameters:
stop_event (threading.Event) – Default None, causes the reconstruction to stop when an exception occurs in Reconstructor.optimize.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. This does not affect the result, but may affect the calculation speed.
- Returns:
loss – The summed loss over the latest epoch, divided by the total diffraction pattern intensity
- Return type:
float
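For illustration, a hand-rolled reconstruction loop built directly on run_epoch might look like the sketch below. This assumes an existing CDIModel / Ptycho2DDataset pair named model and dataset; Reconstructor.optimize wraps this pattern for you.

```python
from cdtools.reconstructors import AdamReconstructor

# The optimizer is defined by the subclass constructor; the dataloader must
# be set up explicitly before run_epoch is called
reconstructor = AdamReconstructor(model, dataset)
reconstructor.setup_dataloader(batch_size=15)

for epoch in range(20):
    # run_epoch returns the summed loss over the epoch, normalized by the
    # total diffraction pattern intensity
    loss = reconstructor.run_epoch(calculation_width=10)
    print(f'epoch {epoch}: normalized loss = {loss:.3e}')
```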
- optimize(iterations: int, batch_size: int = 1, custom_data_loader: torch.utils.data.DataLoader = None, regularization_factor: float | List[float] = None, thread: bool = True, calculation_width: int = 10, shuffle=True)
Runs a round of reconstruction using the provided optimizer
Formerly CDIModel.AD_optimize
This is the basic automatic differentiation reconstruction tool that all of the other, algorithm-specific tools build on. It is a generator which yields the average loss each epoch, ending after the specified number of iterations.
By default, the computation is run in a separate thread. This is done to enable live plotting with matplotlib during a reconstruction: if the computation were run in the main thread, it would freeze the plots. This behavior can be turned off by setting the keyword argument ‘thread’ to False.
The batch_size parameter sets the batch size for the default dataloader. If a custom data loader is desired, it can be passed in via the custom_data_loader argument, which will override the batch_size and shuffle parameters.
Please see AdamReconstructor.optimize() for an example of how to override this function when designing a subclass. A usage sketch of the generator interface follows the parameter list below.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
batch_size (int) – Optional, the batch size to use. Default is 1. This is typically overridden by subclasses with an appropriate default for the specific optimizer.
custom_data_loader (torch.utils.data.DataLoader) – Optional, a custom DataLoader to use. Will override batch_size if set.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. This does not affect the result, but may affect the calculation speed.
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
- Yields:
loss (float) – The summed loss over the latest epoch, divided by the total diffraction pattern intensity.
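Because optimize is a generator, a reconstruction loop typically just iterates over it, one epoch per loop body. A minimal sketch, assuming an existing Reconstructor subclass instance and the model.report() / model.inspect() progress helpers used throughout the cdtools examples:

```python
# Each pass through the loop body corresponds to one completed epoch
for loss in reconstructor.optimize(50, batch_size=15):
    # report() and inspect() are the usual CDIModel progress/plotting helpers
    print(model.report())
    model.inspect(dataset)
```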
- class cdtools.reconstructors.AdamReconstructor(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
Bases:
Reconstructor
The AdamReconstructor subclass handles the optimization (‘reconstruction’) of ptychographic models and datasets using the Adam optimizer.
- Parameters:
model (CDIModel) – Model for CDI/ptychography reconstruction.
dataset (Ptycho2DDataset) – The dataset to reconstruct against.
subset (list(int) or int) – Optional, a pattern index or list of pattern indices to use.
schedule (bool) – Optional, create a learning rate scheduler (torch.optim.lr_scheduler._LRScheduler).
Important attributes:
- model -- Always points to the core model used.
- optimizer -- This class by default uses torch.optim.Adam to perform optimizations.
- scheduler -- A torch.optim.lr_scheduler that is defined during the optimize method.
- data_loader -- A torch.utils.data.DataLoader that is defined by calling the setup_dataloader method.
- __init__(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
- adjust_optimizer(lr: float = 0.005, betas: Tuple[float] = (0.9, 0.999), amsgrad: bool = False)
Change hyperparameters for the utilized optimizer.
- Parameters:
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
betas (tuple) – Optional, the beta_1 and beta_2 to use. Default is (0.9, 0.999).
amsgrad (bool) – Optional, whether to use the AMSGrad variant of this algorithm.
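As an illustration, a hand-rolled loop might use adjust_optimizer to tighten the step size partway through a run. This is a sketch; the epoch counts and learning rates are arbitrary, and reconstructor is an existing AdamReconstructor:

```python
reconstructor.setup_dataloader(batch_size=15)

for epoch in range(40):
    if epoch == 20:
        # Drop the learning rate for the second half of the run
        reconstructor.adjust_optimizer(lr=0.001)
    loss = reconstructor.run_epoch()
    print(f'epoch {epoch}: {loss:.3e}')
```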
- optimize(iterations: int, batch_size: int = 15, lr: float = 0.005, betas: Tuple[float] = (0.9, 0.999), custom_data_loader: DataLoader | None = None, schedule: bool = False, amsgrad: bool = False, regularization_factor: float | List[float] | None = None, thread: bool = True, calculation_width: int = 10, shuffle: bool = True)
Runs a round of reconstruction using the Adam optimizer
Formerly CDIModel.Adam_optimize
This calls the Reconstructor.optimize superclass method (formerly CDIModel.AD_optimize) to run a round of reconstruction once the dataloader and optimizer hyperparameters have been set up.
The batch_size parameter sets the batch size for the default dataloader. If a custom data loader is desired, it can be passed in via the custom_data_loader argument, which will override the batch_size and shuffle parameters. A complete usage sketch follows the parameter list below.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
batch_size (int) – Optional, the size of the minibatches to use.
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
betas (tuple) – Optional, the beta_1 and beta_2 to use. Default is (0.9, 0.999).
schedule (bool) – Optional, create a learning rate scheduler (torch.optim.lr_scheduler._LRScheduler).
custom_data_loader (t.utils.data.DataLoader) – Optional, a custom DataLoader to use. If set, will override batch_size.
amsgrad (bool) – Optional, whether to use the AMSGrad variant of this algorithm.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. Does not affect the result, only the calculation speed.
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
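Putting it together, a typical Adam reconstruction might look like the following sketch, written in the style of the cdtools examples. The file name is a placeholder, and Ptycho2DDataset.from_cxi / FancyPtycho.from_dataset are assumed from the rest of the cdtools API:

```python
import cdtools
from cdtools.reconstructors import AdamReconstructor

# Load a dataset and build a standard ptychography model from it
dataset = cdtools.datasets.Ptycho2DDataset.from_cxi('example_data.cxi')
model = cdtools.models.FancyPtycho.from_dataset(dataset)

# Optionally move the calculation to a GPU:
# model.to(device='cuda')
# dataset.get_as(device='cuda')

reconstructor = AdamReconstructor(model, dataset)

# 100 epochs of Adam with the documented defaults
for loss in reconstructor.optimize(100, batch_size=15, lr=0.005):
    print(model.report())
```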
- class cdtools.reconstructors.LBFGSReconstructor(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
Bases:
Reconstructor
The LBFGSReconstructor subclass handles the optimization (‘reconstruction’) of ptychographic models and datasets using the LBFGS optimizer.
- Parameters:
model (CDIModel) – Model for CDI/ptychography reconstruction.
dataset (Ptycho2DDataset) – The dataset to reconstruct against.
subset (list(int) or int) – Optional, a pattern index or list of pattern indices to use.
schedule (bool) – Optional, create a learning rate scheduler (torch.optim.lr_scheduler._LRScheduler).
Important attributes:
- model -- Always points to the core model used.
- optimizer -- This class by default uses torch.optim.LBFGS to perform optimizations.
- scheduler -- A torch.optim.lr_scheduler that is defined during the optimize method.
- data_loader -- A torch.utils.data.DataLoader that is defined by calling the setup_dataloader method.
- __init__(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
- adjust_optimizer(lr: float = 0.005, history_size: int = 2, line_search_fn: str | None = None)
Change hyperparameters for the utilized optimizer.
- Parameters:
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
history_size (int) – Optional, the length of the history to use.
line_search_fn (str) – Optional, either 'strong_wolfe' or None.
- optimize(iterations: int, lr: float = 0.1, history_size: int = 2, regularization_factor: float | List[float] | None = None, thread: bool = True, calculation_width: int = 10, line_search_fn: str | None = None)
Runs a round of reconstruction using the LBFGS optimizer
Formerly CDIModel.LBFGS_optimize
This algorithm is often less stable than Adam, however in certain situations or geometries it can be shockingly efficient. Like all the other optimization routines, it is defined as a generator function which yields the average loss each epoch. A usage sketch follows the parameter list below.
NOTE: There is no batch_size parameter, because it is usually a bad idea to use LBFGS on anything but all the data at once.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.1.
history_size (int) – Optional, the length of the history to use.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. Does not affect the result, only the calculation speed.
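A common pattern, sketched below, is to rough in the solution with Adam and then refine it with LBFGS on the full dataset. This assumes an existing model / dataset pair; the epoch counts are illustrative:

```python
from cdtools.reconstructors import AdamReconstructor, LBFGSReconstructor

# First, a rough reconstruction with Adam
adam = AdamReconstructor(model, dataset)
for loss in adam.optimize(100, batch_size=15, lr=0.005):
    print(model.report())

# Then refine with LBFGS; note that there is no batch_size argument,
# since LBFGS runs on the full dataset each epoch
lbfgs = LBFGSReconstructor(model, dataset)
for loss in lbfgs.optimize(50, lr=0.1, history_size=2):
    print(model.report())
```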
- class cdtools.reconstructors.SGDReconstructor(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
Bases:
Reconstructor
The SGDReconstructor subclass handles the optimization (‘reconstruction’) of ptychographic models and datasets using the SGD optimizer.
- Parameters:
model (CDIModel) – Model for CDI/ptychography reconstruction.
dataset (Ptycho2DDataset) – The dataset to reconstruct against.
subset (list(int) or int) – Optional, a pattern index or list of pattern indices to use.
Important attributes:
- model -- Always points to the core model used.
- optimizer -- This class by default uses torch.optim.SGD to perform optimizations.
- scheduler -- A torch.optim.lr_scheduler that is defined during the optimize method.
- data_loader -- A torch.utils.data.DataLoader that is defined by calling the setup_dataloader method.
- __init__(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
- adjust_optimizer(lr: float = 0.005, momentum: float = 0, dampening: float = 0, weight_decay: float = 0, nesterov: bool = False)
Change hyperparameters for the utilized optimizer.
- Parameters:
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
momentum (float) – Optional, the momentum factor to use. Default is 0.
dampening (float) – Optional, dampening for the momentum.
weight_decay (float) – Optional, weight decay (L2 penalty).
nesterov (bool) – Optional, enables Nesterov momentum. Only applicable when momentum is non-zero.
- optimize(iterations: int, batch_size: int = 15, lr: float = 2e-07, momentum: float = 0, dampening: float = 0, weight_decay: float = 0, nesterov: bool = False, regularization_factor: float | List[float] | None = None, thread: bool = True, calculation_width: int = 10, shuffle: bool = True)
Runs a round of reconstruction using the SGD optimizer
Formerly CDIModel.SGD_optimize
This calls the Reconstructor.optimize superclass method (formerly CDIModel.AD_optimize) to run a round of reconstruction once the dataloader and optimizer hyperparameters have been set up. A usage sketch follows the parameter list below.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
batch_size (int) – Optional, the size of the minibatches to use.
lr (float) – Optional, the learning rate to use. The default is 2e-7.
momentum (float) – Optional, the momentum factor to use. Default is 0.
dampening (float) – Optional, dampening for the momentum.
weight_decay (float) – Optional, weight decay (L2 penalty).
nesterov (bool) – Optional, enables Nesterov momentum. Only applicable when momentum is non-zero.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. Does not affect the result, only the calculation speed.
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
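For completeness, a sketch of an SGD run with Nesterov momentum, assuming an existing model / dataset pair; note the much smaller learning rate than Adam typically needs:

```python
from cdtools.reconstructors import SGDReconstructor

reconstructor = SGDReconstructor(model, dataset)

# Plain SGD with Nesterov momentum and the documented default learning rate
for loss in reconstructor.optimize(
        100,
        batch_size=15,
        lr=2e-7,
        momentum=0.9,
        nesterov=True,
):
    print(model.report())
```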