Analysis

Contains basic functions for analyzing the results of reconstructions

The functions in this module are designed to work with either pytorch tensors or numpy arrays, so they can be used directly on the attributes of the models after a reconstruction, or after the fact once the data has been stored in numpy arrays.

cdtools.tools.analysis.product_svd(A, B)

Computes the SVD of A @ B

This function uses QR decompositions of A and B to calculate the reduced SVD of the product without explicitly forming the full matrix. The output is defined such that A @ B = U S Vh, with the factors forming a reduced SVD.

Parameters:
  • A (array) – An nxr matrix

  • B (array) – An rxm matrix

Returns:

  • U (array) – An nxr matrix of left singular vectors

  • S (array) – A length-r array; t.diag(S) is the diagonal matrix of singular values

  • Vh (array) – An rxm matrix of the conjugate-transposed right singular vectors
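As a sketch of the idea (not necessarily the library's exact implementation), the QR-based approach looks roughly like this in numpy:

    import numpy as np

    def product_svd_sketch(A, B):
        # QR-reduce each factor so the SVD only runs on a small r x r core
        Qa, Ra = np.linalg.qr(A)           # A = Qa @ Ra, with Qa of shape n x r
        Qb, Rb = np.linalg.qr(B.conj().T)  # B = Rb^H @ Qb^H, with Qb of shape m x r
        U_core, S, Vh_core = np.linalg.svd(Ra @ Rb.conj().T)
        # Rotate the core singular vectors back into the full-size spaces
        return Qa @ U_core, S, Vh_core @ Qb.conj().T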

cdtools.tools.analysis.orthogonalize_probes(probes, weight_matrix=None, n_probe_dims=2, return_reexpressed_weights=False)

Orthogonalizes a set of incoherently mixing probes

This function takes any set of probe modes for mixed-mode ptychography, which are considered to define a mutual coherence function, and returns an orthogonalized set of probe modes which refers to the same mutual coherence function. The orthogonalized modes are extracted via a singular value decomposition and are unique up to a global per-mode phase factor.

If a weight matrix is explicitly given, the function will instead orthogonalize the light field defined by weight_matrix @ probes. It accomplishes this via a method which avoids explicitly constructing this potentially large matrix. This can be useful for Orthogonal Probe Relaxation ptychography. In this case, one may have a large stacked matrix of shot-to-shot weights but a small basis set of probes.

In addition to returning the orthogonalized probe modes, this function also returns a re-expression of the original weight matrix in the basis of the orthogonalized probe modes, such that:

reexpressed_weight_matrix @ orthogonalized_probes = weight_matrix @ probes.

This re-expressed weight matrix is guaranteed to have orthonormalized rows, such that:

reexpressed_weight_matrix^dagger @ reexpressed_weight_matrix = I.

There is usually no reason to use the re-expressed weight matrix. However, it can be useful in situations where the individual rows in the weight matrix have a specific meaning, such as an exposure number, which should be preserved.

Warning! When a weight matrix is used, the shape of the output orthogonalized_probes may not equal the shape of the input probes. If the input weight matrix has m < l rows, where l is the number of probe modes, then the output orthogonalized probes will have length m, not length l.

Parameters:
  • probes (array) – An l x (<n_probe_pix>) array representing a stack of l probes

  • weight_matrix (array) – Optional, an m x l weight matrix further elaborating on the state

  • n_probe_dims (int) – Default is 2, the number of trailing dimensions for each probe state

Returns:

  • orthogonalized_probes (array) – A min(m,l) x (<n_probe_dims>) array representing a stack of probes

  • reexpressed_weight_matrix (array) – The original weight matrix, re-expressed to work with the new probes
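A hypothetical usage sketch with toy shapes and random data, illustrating the identity above:

    import numpy as np
    from cdtools.tools import analysis

    # Toy data: l = 3 probe modes of 32x32 pixels, m = 5 shot-to-shot weights
    rng = np.random.default_rng(0)
    probes = rng.standard_normal((3, 32, 32)) + 1j * rng.standard_normal((3, 32, 32))
    weights = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

    ortho_probes, new_weights = analysis.orthogonalize_probes(
        probes, weight_matrix=weights, return_reexpressed_weights=True)

    # Both expressions define the same light field
    field_a = np.tensordot(weights, probes, axes=1)
    field_b = np.tensordot(new_weights, ortho_probes, axes=1)
    assert np.allclose(field_a, field_b)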

cdtools.tools.analysis.standardize(probe, obj, obj_slice=None, correct_ramp=False)

Standardizes a probe and object to prepare them for comparison

There are a number of ambiguities in the definition of a ptychographic reconstruction. This function makes an explicit choice for each ambiguity to allow comparisons between independent reconstructions without confusing these ambiguities for real differences between the reconstructions.

The ambiguities and standardizations are:

    1. The probe and object can be scaled inversely to one another, so we set the probe intensity to an average per-pixel value of 1

    2. The probe and object can acquire equal and opposite phase ramps, so we set the centroid of the FFT of the probe to zero frequency

    3. The probe and object can each acquire an arbitrary overall phase, so we set the phase of the sum of all values of both the probe and object to 0

When dealing with the properties of the object, a slice is used by default as the edges of the object often are dominated by unphysical noise. The default slice is from 3/8 to 5/8 of the way across. If the probe is actually a stack of incoherently mixing probes, then the dominant probe mode (assumed to be the first in the list) is used, but all the probes are updated with the same factors.

Parameters:
  • probe (array) – A complex array storing a retrieved probe or stack of incoherently mixed probes

  • obj (array) – A complex array storing a retrieved object

  • obj_slice (slice) – Optional, a slice to take from the object for calculating normalizations

  • correct_ramp (bool) – Default False, whether to correct for the relative phase ramps

Returns:

  • standardized_probe (array) – The standardized probe

  • standardized_obj (array) – The standardized object
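For example, to put two independent reconstructions on an equal footing before comparing them (probe_1, obj_1, probe_2, and obj_2 are assumed to come from two prior reconstructions):

    from cdtools.tools import analysis

    # After this, differences between the reconstructions reflect real
    # disagreements rather than scaling, phase ramp, or global phase choices
    probe_1, obj_1 = analysis.standardize(probe_1, obj_1, correct_ramp=True)
    probe_2, obj_2 = analysis.standardize(probe_2, obj_2, correct_ramp=True)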

cdtools.tools.analysis.synthesize_reconstructions(probes, objects, use_probe=False, obj_slice=None, correct_ramp=False)

Takes a collection of reconstructions and outputs a single synthesized probe and object

The function first standardizes the sets of probes and objects using the standardize function, passing through the relevant options. It then calculates the closest overlap of subsequent frames to subpixel precision and uses a sinc interpolation to shift all the probes and objects to a common frame. Finally, the images are summed.

Parameters:
  • probes (list(array)) – A list of probes or stacks of probe modes

  • objects (list(array)) – A list of objects

  • use_probe (bool) – Default False, whether to use the probe or object for alignment

  • obj_slice (slice) – Optional, A slice of the object to use for alignment and normalization

  • correct_ramp (bool) – Default False, whether to correct for a relative phase ramp in the probe and object

Returns:

  • synth_probe (array) – The synthesized probe

  • synth_obj (array) – The synthesized object

  • obj_stack (list(array)) – A list of standardized objects, for further processing

cdtools.tools.analysis.calc_consistency_prtf(synth_obj, objects, basis, obj_slice=None, nbins=None)

Calculates a PRTF between each of the individual objects and a synthesized one

The consistency PRTF at any given spatial frequency is defined as the ratio between the intensity of any given reconstruction and the intensity of a synthesized or averaged reconstruction at that spatial frequency. Typically, the PRTF is averaged over spatial frequencies with the same magnitude.

Parameters:
  • synth_obj (array) – The synthesized object in the numerator of the PRTF

  • objects (list(array)) – A list of objects or diffraction patterns for the denominator of the PRTF

  • basis (array) – The basis for the reconstruction array to allow output in physical units

  • obj_slice (slice) – Optional, a slice of the objects to use for calculating the PRTF

  • nbins (int) – Optional, number of bins to use in the histogram. Defaults to a sensible value

Returns:

  • freqs (array) – The frequencies for the PRTF

  • PRTF (array) – The values of the PRTF
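A usage sketch combining this with synthesize_reconstructions above (probes, objects, and basis are assumed to come from a set of repeat reconstructions):

    from cdtools.tools import analysis

    # Synthesize a single probe and object from several repeat reconstructions
    synth_probe, synth_obj, obj_stack = analysis.synthesize_reconstructions(
        probes, objects, correct_ramp=True)

    # The standardized object stack feeds directly into the consistency PRTF
    freqs, prtf = analysis.calc_consistency_prtf(synth_obj, obj_stack, basis)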

cdtools.tools.analysis.calc_deconvolved_cross_correlation(im1, im2, im_slice=None)

Calculates a cross-correlation between two images with their autocorrelations deconvolved.

This is formally defined as the inverse Fourier transform of the normalized product of the Fourier transforms of the two images. The result is a kernel sitting on top of a random background; its characteristic size reflects how precisely the two images can be aligned.

Parameters:
  • im1 (array) – The first image, as a complex or real valued array

  • im2 (array) – The second image, as a complex or real valued array

  • im_slice (slice) – Default is from 3/8 to 5/8 across the image, a slice to use in the processing.

Returns:

corr – The complex-valued deconvolved cross-correlation, in real space

Return type:

array
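The operation described above amounts to a phase correlation; a minimal numpy sketch (not necessarily the library's exact implementation):

    import numpy as np

    def deconvolved_cross_correlation_sketch(im1, im2):
        # Cross-spectrum of the two images
        cross = np.fft.fft2(im1) * np.conj(np.fft.fft2(im2))
        # Normalizing to unit magnitude deconvolves both autocorrelations,
        # leaving only the relative phase information; the small epsilon
        # guards against division by zero
        return np.fft.ifft2(cross / (np.abs(cross) + 1e-12))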

cdtools.tools.analysis.calc_frc(im1, im2, basis, im_slice=None, nbins=None, snr=1.0, limit='side')

Calculates a Fourier ring correlation between two images

This function requires an input of a basis to allow for FRC calculations to be related to physical units.

Like other analysis functions, this can take input in numpy or pytorch, and will return output in the respective format.

Parameters:
  • im1 (array) – The first image, a complex or real valued array

  • im2 (array) – The second image, a complex or real valued array

  • basis (array) – The basis for the images, defined as is standard for datasets

  • im_slice (slice) – Default is the full image

  • nbins (int) – Number of bins to break the FRC up into

  • snr (float) – The signal to noise ratio (for the combined information in both images) to return a threshold curve for.

  • limit (str) – Default is ‘side’, the highest frequency to calculate the FRC out to. If ‘side’, the FRC is calculated out to the edge of the Fourier transform; if ‘corner’, it goes fully to the corner.

Returns:

  • freqs (array) – The frequencies associated with each FRC value

  • FRC (array) – The FRC values

  • threshold (array) – The threshold curve for comparison
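A typical usage sketch, with obj_1, obj_2, and basis assumed to come from two half-dataset reconstructions:

    import numpy as np
    from cdtools.tools import analysis

    freqs, frc, threshold = analysis.calc_frc(obj_1, obj_2, basis, snr=2.0)

    # A common resolution estimate is the first frequency at which the
    # FRC drops below the threshold curve
    cutoff_freq = freqs[np.argmax(frc < threshold)]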

cdtools.tools.analysis.calc_vn_entropy(matrix)

Calculates the Von Neumann entropy of a density matrix

Will accept either a single matrix or a stack of matrices. Matrices are assumed to be Hermitian and positive definite, i.e. well-formed density matrices.

Parameters:

matrix (np.array) – The nxn matrix or lxnxn stack of matrices to calculate the entropy of

Returns:

entropy – The entropy or entropies of the arrays

Return type:

float or np.array
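For reference, the Von Neumann entropy of a density matrix rho is -tr(rho log rho). A minimal numpy sketch (the unit-trace normalization here is an assumption, not necessarily the library's convention):

    import numpy as np

    def vn_entropy_sketch(dm):
        # Eigenvalues of a Hermitian density matrix are the mode occupancies
        occupancies = np.linalg.eigvalsh(dm)
        occupancies = occupancies / occupancies.sum()  # assumed unit trace
        occupancies = occupancies[occupancies > 0]     # treat 0 * log(0) as 0
        return -np.sum(occupancies * np.log(occupancies))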

cdtools.tools.analysis.calc_mode_power_fractions(probes, weight_matrix=None, n_probe_dims=2, assume_preorthogonalized=False)

Calculates the fraction of total power in each orthogonalized mode

This code first orthogonalizes the probe modes, so the results of this function are independent of the particular way that the multi-mode breakdown is expressed.

Parameters:
  • probes (array) – An l x (<n_probe_pix>) array representing a stack of l probes

  • weight_matrix (array) – Optional, an m x l weight matrix further elaborating on the state

  • n_probe_dims (int) – Default is 2, the number of trailing dimensions for each probe state

  • assume_preorthogonalized (bool) – Default is False. If True, will not orthogonalize the probes

Returns:

power_fractions – The fraction of the total power in each mode

Return type:

array
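For a plain stack of probes (no weight matrix), the calculation is conceptually equivalent to this sketch:

    import numpy as np
    from cdtools.tools import analysis

    def mode_power_fractions_sketch(probes):
        # Orthogonalize first, so the answer doesn't depend on how the
        # multi-mode decomposition happens to be expressed
        ortho = analysis.orthogonalize_probes(probes)
        powers = np.sum(np.abs(ortho)**2, axis=(-2, -1))
        return powers / powers.sum()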

cdtools.tools.analysis.calc_rms_error(field_1, field_2, align_phases=True, normalize=False, dims=2)

Calculates the root-mean-squared error between two complex wavefields

The formal definition of this function is:

output = norm * sqrt(mean(abs(field_1 - gamma * field_2)**2))

Where norm is an optional normalization factor, and gamma is an optional phase factor which is appropriate when the wavefields suffer from a global phase degeneracy as is often the case in diffractive imaging.

The normalization divides by the square root of the mean intensity contained in field_1, which is appropriate when field_1 represents a known ground truth:

norm = 1 / sqrt(mean(abs(field_1)**2))

The phase offset is an analytic expression for the phase offset which will minimize the RMS error between the two wavefields:

gamma = exp(1j * angle(sum(field_1 * conj(field_2))))

This implementation is stable even in cases where field_1 and field_2 are completely orthogonal.

In the definitions above, the field_n are n-dimensional wavefields. The dimensionality of the wavefields can be altered via the dims argument, but the default is 2 for a 2D wavefield.

Parameters:
  • field_1 (array) – The first complex-valued field

  • field_2 (array) – The second complex-valued field

  • align_phases (bool) – Default is True, whether to account for a global phase offset

  • normalize (bool) – Default is False, whether to normalize to the intensity of field_1

  • dims ((int or tuple of python:ints)) – Default is 2, the number of final dimensions to reduce over.

Returns:

rms_error – The RMS error, or tensor of RMS errors, depending on the dim argument

Return type:

float or t.Tensor
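A direct numpy transcription of the definitions above, for a single pair of 2D wavefields:

    import numpy as np

    def rms_error_sketch(field_1, field_2, align_phases=True, normalize=False):
        gamma = 1
        if align_phases:
            # The phase factor that minimizes the RMS error, per the
            # analytic expression above
            gamma = np.exp(1j * np.angle(np.sum(field_1 * np.conj(field_2))))
        result = np.sqrt(np.mean(np.abs(field_1 - gamma * field_2)**2))
        if normalize:
            # Normalize to the root mean intensity of field_1
            result = result / np.sqrt(np.mean(np.abs(field_1)**2))
        return result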

cdtools.tools.analysis.calc_fidelity(fields_1, fields_2, dims=2)

Calculates the fidelity between two density matrices

The fidelity is a comparison metric between two density matrices (i.e. mutual coherence functions) that extends the idea of the overlap to incoherent light. As a reminder, the overlap between two fields is:

overlap = abs(sum(field_1 * conj(field_2)))**2

Whereas the fidelity is defined as:

fidelity = trace(sqrt(sqrt(dm_1) <dot> dm_2 <dot> sqrt(dm_1)))**2

where dm_n refers to the density matrix encoded by fields_n such that dm_n = fields_n <dot> fields_n.conjtranspose(), sqrt refers to the matrix square root, and <dot> is the matrix product.

This is not a practical implementation, however, as it is not feasible to explicitly construct the matrices dm_1 and dm_2 in memory. Therefore, we take advantage of the alternate definition based directly on the fields_n parameters:

fidelity = sum(svdvals(fields_1 <dot> fields_2.conjtranspose()))**2

In the definitions above, the fields_n are regarded as collections of wavefields, where each wavefield is by default 2-dimensional. The dimensionality of the wavefields can be altered via the dims argument, but the fields_n arguments must always have at least one more dimension than the dims argument. Any additional dimensions are treated as batch dimensions.

Parameters:
  • fields_1 (array) – The first set of complex-valued field modes

  • fields_2 (array) – The second set of complex-valued field modes

  • dims (int) – Default is 2, the number of final dimensions to reduce over.

Returns:

fidelity – The fidelity, or tensor of fidelities, depending on the dim argument

Return type:

float or t.Tensor
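A numpy sketch of the singular-value formulation, for a single pair of mode stacks with no batch dimensions:

    import numpy as np

    def fidelity_sketch(fields_1, fields_2):
        # Flatten each mode, turning each stack into a (modes x pixels) matrix
        f1 = fields_1.reshape(fields_1.shape[0], -1)
        f2 = fields_2.reshape(fields_2.shape[0], -1)
        # Fidelity from the singular values of the cross-overlap matrix
        svdvals = np.linalg.svd(f1 @ f2.conj().T, compute_uv=False)
        return np.sum(svdvals)**2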

cdtools.tools.analysis.calc_generalized_rms_error(fields_1, fields_2, normalize=False, dims=2)

Calculates a generalization of the root-mean-squared error between two complex wavefields

This function calculates a generalization of the RMS error which uses the concept of fidelity to extend it to capture the error between incoherent wavefields, defined as a mode decomposition. The extension has several nice properties, in particular:

  1. For coherent wavefields, it precisely matches the RMS error including a correction for the global phase degeneracy (align_phases=True)

  2. All mode decompositions of either field that correspond to the same density matrix / mutual coherence function will produce the same output

  3. The error will only be zero when comparing mode decompositions that correspond to the same density matrix.

  4. Due to (2), one need not worry about the ordering of the modes or about properly orthogonalizing them, and it is even possible to compare mode decompositions with different numbers of modes.

The formal definition of this function is:

output = norm * sqrt(mean(abs(fields_1)**2)
                     + mean(abs(fields_2)**2)
                     - 2 * sqrt(fidelity(fields_1, fields_2)))

Where norm is an optional normalization factor, and the fidelity is defined based on the mean, rather than the sum, to match the convention for the root mean squared error.

The normalization divides by the square root of the mean intensity contained in fields_1, which is appropriate when fields_1 represents a known ground truth:

norm = 1 / sqrt(mean(abs(fields_1)**2))

In the definitions above, the fields_n are regarded as collections of wavefields, where each wavefield is by default 2-dimensional. The dimensionality of the wavefields can be altered via the dims argument, but the fields_n arguments must always have at least one more dimension than the dims argument. Any additional dimensions are treated as batch dimensions.

Parameters:
  • fields_1 (array) – The first set of complex-valued field modes

  • fields_2 (array) – The second set of complex-valued field modes

  • normalize (bool) – Default is False, whether to normalize to the intensity of fields_1

  • dims ((int or tuple of python:ints)) – Default is 2, the number of final dimensions to reduce over.

Returns:

rms_error – The generalized RMS error, or tensor of generalized RMS errors, depending on the dim argument

Return type:

float or t.Tensor
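A sketch following the definitions above, for a single pair of mode stacks. It assumes the means run over pixels while the mode dimension is summed (matching the trace of the density matrix); the library's exact conventions may differ:

    import numpy as np

    def generalized_rms_error_sketch(fields_1, fields_2):
        f1 = fields_1.reshape(fields_1.shape[0], -1)
        f2 = fields_2.reshape(fields_2.shape[0], -1)
        npix = f1.shape[1]
        # Fidelity defined with the mean rather than the sum, as noted above
        svdvals = np.linalg.svd(f1 @ f2.conj().T, compute_uv=False)
        fidelity = (np.sum(svdvals) / npix)**2
        # Per-pixel power in each field, summed over modes
        power_1 = np.sum(np.abs(f1)**2) / npix
        power_2 = np.sum(np.abs(f2)**2) / npix
        # Clip tiny negative values arising from floating-point error
        return np.sqrt(np.maximum(power_1 + power_2 - 2 * np.sqrt(fidelity), 0))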

cdtools.tools.analysis.standardize_reconstruction_set(half_1, half_2, full, correct_phase_offset=True, correct_phase_ramp=True, correct_amplitude_exponent=False, window=(slice(None, None, None), slice(None, None, None)), nbins=50, frc_limit='side')

Standardizes and analyses a set of 50/50/100% reconstructions

It’s very common to split a ptychography dataset into two sub-datasets, each with 50% of the exposures, so that the difference between the two sub-datasets can be used to estimate the quality and resolution of the final, full reconstruction. But to do that analysis, first the reconstructions need to be aligned with respect to each other and normalized in a few ways.

This function takes the results (as output by model.save_results) of a set of 50/50/100% reconstructions and:

  • Aligns the object reconstructions with one another

  • Corrects for the global phase offset (by default)

  • Sets a sensible value for the object/probe phase ramp (by default)

  • Sets a sensible value for the object/probe exponential decay (off by default)

  • Calculates the FRC and derived SSNR.

Then, these results are packaged into an output dictionary. The output does not retain all the information from the inputs, so if full traceability is desired, do not delete the files containing the individual reconstructions.

Parameters:
  • half_1 (dict) – The result of the first half dataset, as returned by model.save_results

  • half_2 (dict) – The result of the second half dataset, as returned by model.save_results

  • full (dict) – The result of the full dataset, as returned by model.save_results

Returns:

results – A dictionary containing the synthesized results

Return type:

dict
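A workflow sketch, where model_half_1, model_half_2, and model_full are assumed to be already-reconstructed models:

    from cdtools.tools import analysis

    # Results dictionaries, as produced by model.save_results
    half_1 = model_half_1.save_results()
    half_2 = model_half_2.save_results()
    full = model_full.save_results()

    results = analysis.standardize_reconstruction_set(
        half_1, half_2, full, nbins=100)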

cdtools.tools.analysis.standardize_reconstruction_pair(half_1, half_2, correct_phase_offset=True, correct_phase_ramp=True, correct_amplitude_exponent=False, window=(slice(None, None, None), slice(None, None, None)), nbins=50, probe_nbins=50, frc_limit='side')

Standardizes and analyses a pair of repeat reconstructions

It’s very common to run two subsequent ptycho reconstructions, so that the effect of sample damage during the first reconstruction can be included in the estimate of the final quality. The difference between the two datasets can be used to estimate the quality and resolution of each one. But to do that analysis, first the reconstructions need to be aligned with respect to each other and normalized in a few ways.

This function takes the results (as output by model.save_results) of a pair of reconstructions and:

  • Aligns the object reconstructions with one another

  • Corrects for the global phase offset (by default)

  • Sets a sensible value for the object/probe phase ramp (by default)

  • Sets a sensible value for the object/probe exponential decay (off by default)

  • Calculates the FRC and derived SSNR.

Then, these results are packaged into an output dictionary. The output does not retain all the information from the inputs, so if full traceability is desired, do not delete the files containing the individual reconstructions.

Parameters:
  • half_1 (dict) – The result of the first half dataset, as returned by model.save_results

  • half_2 (dict) – The result of the second half dataset, as returned by model.save_results

Returns:

results – A dictionary containing the synthesized results

Return type:

dict

cdtools.tools.analysis.calc_spectral_info(dataset, nbins=50)

Makes a properly normalized sum diffraction pattern

This returns a scaled version of sum of all the diffraction patterns within the dataset. The scaling is defined so that the total intensity in the final image is equal to the intensity arising from a region of the scan pattern whose area matches one detector conjugate field of view.

Parameters:
  • dataset (Ptycho2DDataset) – A ptychography dataset to use

  • nbins (int) – The number of bins to use for the SNR curve

Returns:

  • spectrum (t.tensor) – An image of the spectral signal rate

  • freqs (t.tensor) – The frequencies at which the SSNR is estimated

  • SSNR (t.tensor) – The estimated SSNR
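A usage sketch (the .cxi filename is a placeholder, and loading via Ptycho2DDataset.from_cxi is an assumption about the dataset API):

    from cdtools.datasets import Ptycho2DDataset
    from cdtools.tools import analysis

    dataset = Ptycho2DDataset.from_cxi('my_scan.cxi')  # hypothetical file
    spectrum, freqs, SSNR = analysis.calc_spectral_info(dataset, nbins=50)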