Reconstructors
Module cdtools.tools.reconstructors contains the Reconstructor class and its subclasses, which run ptychography reconstructions on a given model and dataset.
The reconstructors are designed to resemble the so-called ‘Trainer’ classes that (in the language of the AI/ML folks) handle the ‘training’ of a model given some dataset and optimizer.
- class cdtools.reconstructors.Reconstructor(model: CDIModel, dataset: CDataset, optimizer: t.optim.Optimizer, subset: int | List[int] = None)
Bases:
object
Reconstructor handles the optimization (‘reconstruction’) of ptychographic models given a CDIModel (or subclass) and corresponding CDataset.
This is the base class that defines the functions all Reconstructor subclasses must implement.
- Parameters:
- optimizer
Must be defined when initializing the Reconstructor subclass.
- Type:
torch.optim.Optimizer
- scheduler
May be defined during the optimize method.
- Type:
torch.optim.lr_scheduler, optional
- data_loader
Defined by calling the setup_dataloader method.
- Type:
torch.utils.data.DataLoader
- __init__(model: CDIModel, dataset: CDataset, optimizer: t.optim.Optimizer, subset: int | List[int] = None)
- setup_dataloader(batch_size: int | None = None, shuffle: bool = True)
Sets up or re-initializes the dataloader.
- Parameters:
batch_size (int) – Optional, the size of the minibatches to use
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
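For instance, re-initializing the dataloader with a different minibatch size might look like the following sketch (here reconstructor is an already-constructed Reconstructor subclass; the values are illustrative):

```python
# Rebuild the internal DataLoader with a larger minibatch size
reconstructor.setup_dataloader(batch_size=32)

# For diagnostics only: iterate the patterns in their stored order
reconstructor.setup_dataloader(batch_size=32, shuffle=False)
```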
- adjust_optimizer(**kwargs)
Change hyperparameters for the utilized optimizer.
For each optimizer, the keyword arguments should be manually defined as parameters.
- run_epoch(stop_event: Event | None = None, regularization_factor: float | List[float] | None = None, calculation_width: int = 10)
Runs one full epoch of the reconstruction. Intended to be called by Reconstructor.optimize.
- Parameters:
stop_event (threading.Event) – Default None, causes the reconstruction to stop when an exception occurs in Reconstructor.optimize.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. This does not affect the result, but may affect the calculation speed.
- Returns:
loss – The summed loss over the latest epoch, divided by the total diffraction pattern intensity
- Return type:
float
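For illustration, a hand-rolled reconstruction loop built directly on run_epoch might look like the sketch below. This assumes an existing CDIModel / Ptycho2DDataset pair named model and dataset; Reconstructor.optimize wraps this pattern for you.

```python
from cdtools.reconstructors import AdamReconstructor

# The optimizer is defined by the subclass constructor; the dataloader must
# be set up explicitly before run_epoch is called
reconstructor = AdamReconstructor(model, dataset)
reconstructor.setup_dataloader(batch_size=15)

for epoch in range(20):
    # run_epoch returns the summed loss over the epoch, normalized by the
    # total diffraction pattern intensity
    loss = reconstructor.run_epoch(calculation_width=10)
    print(f'epoch {epoch}: normalized loss = {loss:.3e}')
```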
- optimize(iterations: int, batch_size: int = 1, custom_data_loader: torch.utils.data.DataLoader = None, regularization_factor: float | List[float] = None, thread: bool = True, calculation_width: int = 10, shuffle=True)
Runs a round of reconstruction using the provided optimizer
Formerly CDIModel.AD_optimize
This is the basic automatic differentiation reconstruction tool that all of the other, algorithm-specific tools build on. It is a generator which yields the average loss each epoch, ending after the specified number of iterations.
By default, the computation is run in a separate thread. This is done to enable live plotting with matplotlib during a reconstruction: if the computation were run in the main thread, it would freeze the plots. This behavior can be turned off by setting the keyword argument ‘thread’ to False.
The batch_size parameter sets the batch size for the default dataloader. If a custom data loader is desired, it can be passed in via the custom_data_loader argument, which will override the batch_size and shuffle parameters.
Please see AdamReconstructor.optimize() for an example of how to override this function when designing a subclass. A usage sketch of the generator interface follows the parameter list below.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
batch_size (int) – Optional, the batch size to use. Default is 1. This is typically overridden by subclasses with an appropriate default for the specific optimizer.
custom_data_loader (torch.utils.data.DataLoader) – Optional, a custom DataLoader to use. Will override batch_size if set.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. This does not affect the result, but may affect the calculation speed.
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
- Yields:
loss (float) – The summed loss over the latest epoch, divided by the total diffraction pattern intensity.
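Because optimize is a generator, a reconstruction loop typically just iterates over it, one epoch per loop body. A minimal sketch, assuming an existing Reconstructor subclass instance and the model.report() / model.inspect() progress helpers used throughout the cdtools examples:

```python
# Each pass through the loop body corresponds to one completed epoch
for loss in reconstructor.optimize(50, batch_size=15):
    # report() and inspect() are the usual CDIModel progress/plotting helpers
    print(model.report())
    model.inspect(dataset)
```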
- class cdtools.reconstructors.AdamReconstructor(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
Bases:
Reconstructor
The AdamReconstructor subclass handles the optimization (‘reconstruction’) of ptychographic models and datasets using the Adam optimizer.
- Parameters:
model (CDIModel) – Model for CDI/ptychography reconstruction.
dataset (Ptycho2DDataset) – The dataset to reconstruct against.
subset (list(int) or int) – Optional, a pattern index or list of pattern indices to use.
schedule (bool) – Optional, create a learning rate scheduler (torch.optim.lr_scheduler._LRScheduler).
Important attributes:
- model -- Always points to the core model used.
- optimizer -- This class by default uses torch.optim.Adam to perform optimizations.
- scheduler -- A torch.optim.lr_scheduler that is defined during the optimize method.
- data_loader -- A torch.utils.data.DataLoader that is defined by calling the setup_dataloader method.
- __init__(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
- adjust_optimizer(lr: float = 0.005, betas: Tuple[float] = (0.9, 0.999), amsgrad: bool = False)
Change hyperparameters for the utilized optimizer.
- Parameters:
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
betas (tuple) – Optional, the beta_1 and beta_2 to use. Default is (0.9, 0.999).
amsgrad (bool) – Optional, whether to use the AMSGrad variant of this algorithm.
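As an illustration, a hand-rolled loop might use adjust_optimizer to tighten the step size partway through a run. This is a sketch; the epoch counts and learning rates are arbitrary, and reconstructor is an existing AdamReconstructor:

```python
reconstructor.setup_dataloader(batch_size=15)

for epoch in range(40):
    if epoch == 20:
        # Drop the learning rate for the second half of the run
        reconstructor.adjust_optimizer(lr=0.001)
    loss = reconstructor.run_epoch()
    print(f'epoch {epoch}: {loss:.3e}')
```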
- optimize(iterations: int, batch_size: int = 15, lr: float = 0.005, betas: Tuple[float] = (0.9, 0.999), custom_data_loader: DataLoader | None = None, schedule: bool = False, amsgrad: bool = False, regularization_factor: float | List[float] | None = None, thread: bool = True, calculation_width: int = 10, shuffle: bool = True)
Runs a round of reconstruction using the Adam optimizer
Formerly CDIModel.Adam_optimize
This calls the Reconstructor.optimize superclass method (formerly CDIModel.AD_optimize) to run a round of reconstruction once the dataloader and optimizer hyperparameters have been set up.
The batch_size parameter sets the batch size for the default dataloader. If a custom data loader is desired, it can be passed in via the custom_data_loader argument, which will override the batch_size and shuffle parameters. A complete usage sketch follows the parameter list below.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
batch_size (int) – Optional, the size of the minibatches to use.
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
betas (tuple) – Optional, the beta_1 and beta_2 to use. Default is (0.9, 0.999).
schedule (bool) – Optional, create a learning rate scheduler (torch.optim.lr_scheduler._LRScheduler).
custom_data_loader (t.utils.data.DataLoader) – Optional, a custom DataLoader to use. If set, will override batch_size.
amsgrad (bool) – Optional, whether to use the AMSGrad variant of this algorithm.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. Does not affect the result, only the calculation speed.
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
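Putting it together, a typical Adam reconstruction might look like the following sketch, written in the style of the cdtools examples. The file name is a placeholder, and Ptycho2DDataset.from_cxi / FancyPtycho.from_dataset are assumed from the rest of the cdtools API:

```python
import cdtools
from cdtools.reconstructors import AdamReconstructor

# Load a dataset and build a standard ptychography model from it
dataset = cdtools.datasets.Ptycho2DDataset.from_cxi('example_data.cxi')
model = cdtools.models.FancyPtycho.from_dataset(dataset)

# Optionally move the calculation to a GPU:
# model.to(device='cuda')
# dataset.get_as(device='cuda')

reconstructor = AdamReconstructor(model, dataset)

# 100 epochs of Adam with the documented defaults
for loss in reconstructor.optimize(100, batch_size=15, lr=0.005):
    print(model.report())
```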
- class cdtools.reconstructors.LBFGSReconstructor(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
Bases:
Reconstructor
The LBFGSReconstructor subclass handles the optimization (‘reconstruction’) of ptychographic models and datasets using the LBFGS optimizer.
- Parameters:
model (CDIModel) – Model for CDI/ptychography reconstruction.
dataset (Ptycho2DDataset) – The dataset to reconstruct against.
subset (list(int) or int) – Optional, a pattern index or list of pattern indices to use.
schedule (bool) – Optional, create a learning rate scheduler (torch.optim.lr_scheduler._LRScheduler).
Important attributes:
- model -- Always points to the core model used.
- optimizer -- This class by default uses torch.optim.LBFGS to perform optimizations.
- scheduler -- A torch.optim.lr_scheduler that is defined during the optimize method.
- data_loader -- A torch.utils.data.DataLoader that is defined by calling the setup_dataloader method.
- __init__(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
- adjust_optimizer(lr: float = 0.005, history_size: int = 2, line_search_fn: str | None = None)
Change hyperparameters for the utilized optimizer.
- Parameters:
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
history_size (int) – Optional, the length of the history to use.
line_search_fn (str) – Optional, either 'strong_wolfe' or None.
- optimize(iterations: int, lr: float = 0.1, history_size: int = 2, regularization_factor: float | List[float] | None = None, thread: bool = True, calculation_width: int = 10, line_search_fn: str | None = None)
Runs a round of reconstruction using the LBFGS optimizer
Formerly CDIModel.LBFGS_optimize
This algorithm is often less stable than Adam, however in certain situations or geometries it can be shockingly efficient. Like all the other optimization routines, it is defined as a generator function which yields the average loss each epoch. A usage sketch follows the parameter list below.
NOTE: There is no batch_size parameter, because it is usually a bad idea to use LBFGS on anything but all the data at once.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.1.
history_size (int) – Optional, the length of the history to use.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. Does not affect the result, only the calculation speed.
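A common pattern, sketched below, is to rough in the solution with Adam and then refine it with LBFGS on the full dataset. This assumes an existing model / dataset pair; the epoch counts are illustrative:

```python
from cdtools.reconstructors import AdamReconstructor, LBFGSReconstructor

# First, a rough reconstruction with Adam
adam = AdamReconstructor(model, dataset)
for loss in adam.optimize(100, batch_size=15, lr=0.005):
    print(model.report())

# Then refine with LBFGS; note that there is no batch_size argument,
# since LBFGS runs on the full dataset each epoch
lbfgs = LBFGSReconstructor(model, dataset)
for loss in lbfgs.optimize(50, lr=0.1, history_size=2):
    print(model.report())
```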
- class cdtools.reconstructors.SGDReconstructor(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
Bases:
Reconstructor
The SGDReconstructor subclass handles the optimization (‘reconstruction’) of ptychographic models and datasets using the SGD optimizer.
- Parameters:
model (CDIModel) – Model for CDI/ptychography reconstruction.
dataset (Ptycho2DDataset) – The dataset to reconstruct against.
subset (list(int) or int) – Optional, a pattern index or list of pattern indices to use.
Important attributes:
- model -- Always points to the core model used.
- optimizer -- This class by default uses torch.optim.SGD to perform optimizations.
- scheduler -- A torch.optim.lr_scheduler that is defined during the optimize method.
- data_loader -- A torch.utils.data.DataLoader that is defined by calling the setup_dataloader method.
- __init__(model: CDIModel, dataset: Ptycho2DDataset, subset: List[int] = None)
- adjust_optimizer(lr: float = 0.005, momentum: float = 0, dampening: float = 0, weight_decay: float = 0, nesterov: bool = False)
Change hyperparameters for the utilized optimizer.
- Parameters:
lr (float) – Optional, the learning rate (alpha) to use. Default is 0.005. 0.05 is typically the highest possible value with any chance of being stable.
momentum (float) – Optional, the momentum factor to use. Default is 0.
dampening (float) – Optional, dampening for the momentum.
weight_decay (float) – Optional, weight decay (L2 penalty).
nesterov (bool) – Optional, enables Nesterov momentum. Only applicable when momentum is non-zero.
- optimize(iterations: int, batch_size: int = 15, lr: float = 2e-07, momentum: float = 0, dampening: float = 0, weight_decay: float = 0, nesterov: bool = False, regularization_factor: float | List[float] | None = None, thread: bool = True, calculation_width: int = 10, shuffle: bool = True)
Runs a round of reconstruction using the SGD optimizer
Formerly CDIModel.SGD_optimize
This calls the Reconstructor.optimize superclass method (formerly CDIModel.AD_optimize) to run a round of reconstruction once the dataloader and optimizer hyperparameters have been set up. A usage sketch follows the parameter list below.
- Parameters:
iterations (int) – How many epochs of the algorithm to run.
batch_size (int) – Optional, the size of the minibatches to use.
lr (float) – Optional, the learning rate to use. The default is 2e-7.
momentum (float) – Optional, the momentum factor to use. Default is 0.
dampening (float) – Optional, dampening for the momentum.
weight_decay (float) – Optional, weight decay (L2 penalty).
nesterov (bool) – Optional, enables Nesterov momentum. Only applicable when momentum is non-zero.
regularization_factor (float or list(float)) – Optional, if the model has a regularizer defined, the set of parameters to pass to the regularizer method.
thread (bool) – Default True, whether to run the computation in a separate thread to allow interaction with plots during computation.
calculation_width (int) – Default 10, how many translations to pass through at once for each round of gradient accumulation. Does not affect the result, only the calculation speed.
shuffle (bool) – Optional, enable/disable shuffling of the dataset. Disabling shuffling is intended only for diagnostic purposes; this should normally be left as True.
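For completeness, a sketch of an SGD run with Nesterov momentum, assuming an existing model / dataset pair; note the much smaller learning rate than Adam typically needs:

```python
from cdtools.reconstructors import SGDReconstructor

reconstructor = SGDReconstructor(model, dataset)

# Plain SGD with Nesterov momentum and the documented default learning rate
for loss in reconstructor.optimize(
        100,
        batch_size=15,
        lr=2e-7,
        momentum=0.9,
        nesterov=True,
):
    print(model.report())
```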