piqa.ssim

Structural Similarity (SSIM) and Multi-Scale Structural Similarity (MS-SSIM)

This module implements SSIM and MS-SSIM in PyTorch.

Original

https://ece.uwaterloo.ca/~z70wang/research/ssim/

Wikipedia

https://wikipedia.org/wiki/Structural_similarity

References

Image quality assessment: From error visibility to structural similarity (Wang et al., 2004)
Multiscale structural similarity for image quality assessment (Wang et al., 2003)

Functions

ms_ssim

Returns the MS-SSIM between \(x\) and \(y\).

ssim

Returns the SSIM and Contrast Sensitivity (CS) between \(x\) and \(y\).

Classes

MS_SSIM

Measures the MS-SSIM between an input and a target.

SSIM

Measures the SSIM between an input and a target.

Descriptions

piqa.ssim.ssim(x, y, kernel, channel_avg=True, padding=False, value_range=1.0, k1=0.01, k2=0.03)

Returns the SSIM and Contrast Sensitivity (CS) between \(x\) and \(y\).

\[\begin{split}\text{SSIM}(x, y) & = \frac{2 \mu_x \mu_y + C_1}{\mu^2_x + \mu^2_y + C_1} \text{CS}(x, y) \\ \text{CS}(x, y) & = \frac{2 \sigma_{xy} + C_2}{\sigma^2_x + \sigma^2_y + C_2}\end{split}\]

where \(\mu_x\), \(\mu_y\), \(\sigma^2_x\), \(\sigma^2_y\) and \(\sigma_{xy}\) are the results of a smoothing convolution over \(x\), \(y\), \((x - \mu_x)^2\), \((y - \mu_y)^2\) and \((x - \mu_x)(y - \mu_y)\), respectively.

In practice, SSIM and CS are averaged over the spatial dimensions. If channel_avg is True, they are also averaged over the channels.
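
The formulas above can be sketched in plain PyTorch for 2-d images. This is an illustrative sketch, not the piqa implementation: the names gaussian_window and ssim_sketch are hypothetical, the window is built as a 2-d outer product rather than applied as a 1-d kernel, the local (co)variances use the identity E[ab] - E[a]E[b], and \(C_1 = (k_1 L)^2\), \(C_2 = (k_2 L)^2\) are assumed as in Wang et al. (2004).

```python
import torch
import torch.nn.functional as F


def gaussian_window(size=11, sigma=1.5):
    """Normalized 1-d Gaussian window of length `size` (sums to 1)."""
    t = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    w = torch.exp(-t ** 2 / (2 * sigma ** 2))
    return w / w.sum()


def ssim_sketch(x, y, size=11, sigma=1.5, value_range=1.0, k1=0.01, k2=0.03):
    """SSIM and CS of (N, C, H, W) tensors, averaged over space and channels."""
    c = x.shape[1]
    w = gaussian_window(size, sigma)
    # 2-d window as an outer product, one copy per channel (depthwise conv).
    kernel = (w[:, None] * w[None, :]).expand(c, 1, size, size).contiguous()

    def smooth(t):
        # Smoothing convolution without padding, each channel on its own.
        return F.conv2d(t, kernel, groups=c)

    c1, c2 = (k1 * value_range) ** 2, (k2 * value_range) ** 2
    mu_x, mu_y = smooth(x), smooth(y)
    # Local variances and covariance through E[ab] - E[a]E[b].
    var_x = smooth(x * x) - mu_x ** 2
    var_y = smooth(y * y) - mu_y ** 2
    cov_xy = smooth(x * y) - mu_x * mu_y

    cs_map = (2 * cov_xy + c2) / (var_x + var_y + c2)
    ss_map = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1) * cs_map
    # Average over channels and spatial dimensions, giving shape (N,).
    return ss_map.mean(dim=(1, 2, 3)), cs_map.mean(dim=(1, 2, 3))


x = torch.rand(2, 3, 32, 32)
y = torch.rand(2, 3, 32, 32)
ss_same, cs_same = ssim_sketch(x, x)  # identical images
ss_diff, cs_diff = ssim_sketch(x, y)  # unrelated random images
```

Comparing an image with itself gives scores of about 1, while two unrelated random images score much lower.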

Tip

ssim and SSIM can be applied to images with 1, 2 or even 3 spatial dimensions.

Parameters:
  • x (Tensor) – An input tensor, \((N, C, H, *)\).

  • y (Tensor) – A target tensor, \((N, C, H, *)\).

  • kernel (Tensor) – A smoothing kernel, \((C, 1, K)\).

  • channel_avg (bool) – Whether to average over the channels or not.

  • padding (bool) – Whether to pad the spatial dimensions with \(\frac{K}{2}\) zeros or not.

  • value_range (float) – The value range \(L\) of the inputs (usually 1 or 255).
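
As a shape-only illustration of the padding flag, the sketch below compares an unpadded smoothing convolution with a zero-padded one. It assumes that padding=True adds K // 2 zeros on each side of every spatial dimension (the integer reading of \(\frac{K}{2}\) for odd \(K\)); the uniform window is a stand-in chosen for brevity and is not what piqa uses.

```python
import torch
import torch.nn.functional as F

k = 7
x = torch.rand(1, 3, 64, 64)
# Uniform 7x7 window per channel; only the output shapes matter here.
kernel = torch.full((3, 1, k, k), 1 / k ** 2)

# Without padding, each spatial dimension shrinks by K - 1.
valid = F.conv2d(x, kernel, groups=3)
# With K // 2 zeros on each side (assumed), the spatial size is preserved.
padded = F.conv2d(x, kernel, groups=3, padding=k // 2)
```

Under this assumption the unpadded maps are 58 x 58 and the padded maps keep the 64 x 64 input size, so padding changes which pixels contribute to the spatial average.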

Note

For the remaining arguments, refer to Wang et al. (2004).
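
For orientation, Wang et al. (2004) define the stabilizing constants as \(C_1 = (k_1 L)^2\) and \(C_2 = (k_2 L)^2\). The sketch below assumes ssim follows that convention and only illustrates the arithmetic; the helper name ssim_constants is hypothetical and not part of piqa.

```python
def ssim_constants(value_range=1.0, k1=0.01, k2=0.03):
    """Stabilizing constants of Wang et al. (2004): C = (k * L) ** 2."""
    c1 = (k1 * value_range) ** 2
    c2 = (k2 * value_range) ** 2
    return c1, c2


# Inputs in [0, 1] versus 8-bit inputs in [0, 255].
c1_unit, c2_unit = ssim_constants(1.0)
c1_byte, c2_byte = ssim_constants(255.0)
```

With the default k1 and k2, the constants scale with the square of the value range, which is why value_range has to match the actual range of the inputs.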

Returns:

The SSIM and CS tensors, both \((N, C)\) or \((N,)\) depending on channel_avg.

Return type:

Tuple[Tensor, Tensor]

Example

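>>> import torch
>>> from piqa.ssim import ssim
>>> from piqa.utils.functional import gaussian_kernel
>>> x = torch.rand(5, 3, 64, 64, 64)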
>>> y = torch.rand(5, 3, 64, 64, 64)
>>> kernel = gaussian_kernel(7).repeat(3, 1, 1)
>>> ss, cs = ssim(x, y, kernel)
>>> ss.shape, cs.shape
(torch.Size([5]), torch.Size([5]))

piqa.ssim.ms_ssim(x, y, kernel, weights, padding=False, value_range=1.0, k1=0.01, k2=0.03)

Returns the MS-SSIM between \(x\) and \(y\).

\[\text{MS-SSIM}(x, y) = \text{SSIM}(x^M, y^M)^{\gamma_M} \prod^{M - 1}_{i = 1} \text{CS}(x^i, y^i)^{\gamma_i}\]

where \(x^i\) and \(y^i\) are obtained by downsampling the initial tensors by a factor \(2^{i - 1}\).
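
The product above can be sketched in plain PyTorch as follows. This is a sketch under stated assumptions, not the piqa implementation: ssim_cs and ms_ssim_sketch are hypothetical names, inputs are assumed to lie in [0, 1], each scale is obtained by 2 x 2 average pooling, and negative factors are clamped to zero so that fractional powers stay real.

```python
import torch
import torch.nn.functional as F


def ssim_cs(x, y, kernel, c1=0.01 ** 2, c2=0.03 ** 2):
    """Per-image SSIM and CS, (N,), for (N, C, H, W) inputs in [0, 1]."""
    def smooth(t):
        return F.conv2d(t, kernel, groups=x.shape[1])

    mu_x, mu_y = smooth(x), smooth(y)
    var_x = smooth(x * x) - mu_x ** 2
    var_y = smooth(y * y) - mu_y ** 2
    cov_xy = smooth(x * y) - mu_x * mu_y
    cs = (2 * cov_xy + c2) / (var_x + var_y + c2)
    ss = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1) * cs
    return ss.mean(dim=(1, 2, 3)), cs.mean(dim=(1, 2, 3))


def ms_ssim_sketch(x, y, kernel, weights):
    """Weighted product of CS at scales 1..M-1 and SSIM at scale M."""
    m = len(weights)
    factors = []
    for i in range(m):
        ss, cs = ssim_cs(x, y, kernel)
        if i < m - 1:
            factors.append(cs)
            # Halve the resolution for the next scale (assumed: 2x2 average pooling).
            x, y = F.avg_pool2d(x, 2), F.avg_pool2d(y, 2)
        else:
            factors.append(ss)
    # Clamp at zero so that fractional powers stay real (assumption of this sketch).
    stacked = torch.relu(torch.stack(factors, dim=-1))  # (N, M)
    return (stacked ** weights).prod(dim=-1)  # (N,)


# 7x7 Gaussian window (sigma = 1.5), replicated over 3 channels.
t = torch.arange(7, dtype=torch.float32) - 3
g = torch.exp(-t ** 2 / (2 * 1.5 ** 2))
g = g / g.sum()
kernel = (g[:, None] * g[None, :]).expand(3, 1, 7, 7).contiguous()

weights = torch.tensor([0.0448, 0.2856, 0.3001, 0.2363, 0.1333])
x = torch.rand(2, 3, 128, 128)
y = torch.rand(2, 3, 128, 128)
same = ms_ssim_sketch(x, x, kernel, weights)  # identical images
diff = ms_ssim_sketch(x, y, kernel, weights)  # unrelated random images
```

With five scales the coarsest maps here are 8 x 8, which still fits the 7 x 7 window; smaller inputs would fail without padding.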

Parameters:
  • x (Tensor) – An input tensor, \((N, C, H, W)\).

  • y (Tensor) – A target tensor, \((N, C, H, W)\).

  • kernel (Tensor) – A smoothing kernel, \((C, 1, K)\).

  • weights (Tensor) – The weights \(\gamma_i\) of the scales, \((M,)\).

  • padding (bool) – Whether to pad the spatial dimensions with \(\frac{K}{2}\) zeros or not.

  • value_range (float) – The value range \(L\) of the inputs (usually 1 or 255).

Note

For the remaining arguments, refer to Wang et al. (2003).

Returns:

The MS-SSIM vector, \((N,)\).

Return type:

Tensor

Example

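>>> import torch
>>> from piqa.ssim import ms_ssim
>>> from piqa.utils.functional import gaussian_kernel
>>> x = torch.rand(5, 3, 256, 256)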
>>> y = torch.rand(5, 3, 256, 256)
>>> kernel = gaussian_kernel(7).repeat(3, 1, 1)
>>> weights = torch.rand(5)
>>> l = ms_ssim(x, y, kernel, weights)
>>> l.shape
torch.Size([5])

class piqa.ssim.SSIM(window_size=11, sigma=1.5, n_channels=3, reduction='mean', **kwargs)

Measures the SSIM between an input and a target.

Parameters:
  • window_size (int) – The size of the window.

  • sigma (float) – The standard deviation of the window.

  • n_channels (int) – The number of channels \(C\).

  • reduction (str) – Specifies the reduction to apply to the output: 'none', 'mean' or 'sum'.

  • kwargs – Keyword arguments passed to ssim.
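
The three reduction modes can be pictured with the sketch below, which assumes they act on the per-image score vector \((N,)\) in the usual PyTorch sense. The helper reduce_sketch and the scores are made up for illustration and are not part of piqa.

```python
import torch


def reduce_sketch(values, reduction="mean"):
    """Reduce a per-image score vector (N,) as 'none', 'mean' or 'sum'."""
    if reduction == "none":
        return values
    if reduction == "mean":
        return values.mean()
    if reduction == "sum":
        return values.sum()
    raise ValueError(f"unknown reduction: {reduction!r}")


scores = torch.tensor([0.9, 0.8, 0.7, 0.6, 0.5])  # made-up per-image scores
per_image = reduce_sketch(scores, "none")  # shape (5,)
average = reduce_sketch(scores, "mean")    # scalar tensor
total = reduce_sketch(scores, "sum")       # scalar tensor
```

Only 'mean' and 'sum' yield a scalar that can be passed to backward() directly, as in the example that follows.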

Example

>>> import torch
>>> from piqa.ssim import SSIM
>>> criterion = SSIM()
>>> x = torch.rand(5, 3, 256, 256, requires_grad=True)
>>> y = torch.rand(5, 3, 256, 256)
>>> l = 1 - criterion(x, y)
>>> l.shape
torch.Size([])
>>> l.backward()

forward(x, y)
Parameters:
  • x (Tensor) – An input tensor, \((N, C, H, W)\).

  • y (Tensor) – A target tensor, \((N, C, H, W)\).

Returns:

The SSIM vector, \((N,)\) or \(()\) depending on reduction.

Return type:

Tensor

class piqa.ssim.MS_SSIM(window_size=11, sigma=1.5, n_channels=3, weights=None, reduction='mean', **kwargs)

Measures the MS-SSIM between an input and a target.

Parameters:
  • window_size (int) – The size of the window.

  • sigma (float) – The standard deviation of the window.

  • n_channels (int) – The number of channels \(C\).

  • weights (Tensor) – The weights of the scales, \((M,)\). If None, use MS_SSIM.WEIGHTS instead.

  • reduction (str) – Specifies the reduction to apply to the output: 'none', 'mean' or 'sum'.

  • kwargs – Keyword arguments passed to ms_ssim.

Example

>>> import torch
>>> from piqa.ssim import MS_SSIM
>>> criterion = MS_SSIM()
>>> x = torch.rand(5, 3, 256, 256, requires_grad=True)
>>> y = torch.rand(5, 3, 256, 256)
>>> l = 1 - criterion(x, y)
>>> l.shape
torch.Size([])
>>> l.backward()

WEIGHTS: Tensor = tensor([0.0448, 0.2856, 0.3001, 0.2363, 0.1333])

Scale weights of Wang et al. (2003).
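
Two properties of these defaults are worth checking by hand: the five weights sum to roughly 1, and five scales constrain the input size. The sketch below assumes no padding and an exact halving of the resolution between consecutive scales; the helper min_side is hypothetical and not part of piqa.

```python
WEIGHTS = [0.0448, 0.2856, 0.3001, 0.2363, 0.1333]


def min_side(window_size, n_scales):
    """Smallest side length that still fits one window at the coarsest scale,
    assuming no padding and an exact halving between consecutive scales."""
    return window_size * 2 ** (n_scales - 1)


total = sum(WEIGHTS)                   # close to 1
smallest = min_side(11, len(WEIGHTS))  # default window_size=11 and M=5
```

Under these assumptions, the default window_size=11 with five scales needs inputs of at least 176 pixels per side, which the 256 x 256 example above satisfies.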

forward(x, y)
Parameters:
  • x (Tensor) – An input tensor, \((N, C, H, W)\).

  • y (Tensor) – A target tensor, \((N, C, H, W)\).

Returns:

The MS-SSIM vector, \((N,)\) or \(()\) depending on reduction.

Return type:

Tensor