piqa.vsi#

Visual Saliency-based Index (VSI)

This module implements the VSI in PyTorch.

Original

https://www.putianjian.net/linzhang/IQA/VSI/VSI.html

Wikipedia

https://wikipedia.org/wiki/Salience_(neuroscience)#Visual_saliency_modeling

References

VSI: A Visual Saliency-Induced Index for Perceptual Image Quality Assessment (Zhang et al., 2014)
SDSP: A novel saliency detection method by combining simple priors (Zhang et al., 2013)

Functions#

sdsp

Detects salient regions from \(x\).

sdsp_filter

Returns the log-Gabor filter for sdsp.

vsi

Returns the VSI between \(x\) and \(y\), without color space conversion or downsampling.

Classes#

VSI

Measures the VSI between an input and a target.

Descriptions#

piqa.vsi.vsi(x, y, vs_x, vs_y, kernel, value_range=1.0, c1=1.27, c2=0.005936178392925798, c3=0.0019992310649750095, alpha=0.4, beta=0.02)#

Returns the VSI between \(x\) and \(y\), without color space conversion or downsampling.

Parameters:
  • x (Tensor) – An input tensor, \((N, 3 \text{ or } 1, H, W)\).

  • y (Tensor) – A target tensor, \((N, 3 \text{ or } 1, H, W)\).

  • vs_x (Tensor) – The input visual saliency, \((N, H, W)\).

  • vs_y (Tensor) – The target visual saliency, \((N, H, W)\).

  • kernel (Tensor) – A gradient kernel, \((2, 1, K, K)\).

  • value_range (float) – The value range \(L\) of the inputs (usually 1 or 255).

Note

For the remaining arguments, refer to Zhang et al. (2014).
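
For orientation, a non-authoritative paraphrase of Zhang et al. (2014): \(c_1\), \(c_2\) and \(c_3\) stabilize the visual saliency, gradient and chrominance similarity maps, which are combined with the exponents \(\alpha\) and \(\beta\) and pooled with the element-wise maximum saliency as weight,

\[S_{VS} = \frac{2 \, VS_x VS_y + c_1}{VS_x^2 + VS_y^2 + c_1}, \quad S_G = \frac{2 \, G_x G_y + c_2}{G_x^2 + G_y^2 + c_2}, \quad S_C = \frac{2 \, M_x M_y + c_3}{M_x^2 + M_y^2 + c_3} \cdot \frac{2 \, N_x N_y + c_3}{N_x^2 + N_y^2 + c_3}\]

\[\text{VSI}(x, y) = \frac{\sum_p S_{VS}(p) \, S_G(p)^\alpha \, S_C(p)^\beta \, \max(VS_x(p), VS_y(p))}{\sum_p \max(VS_x(p), VS_y(p))}\]

where \(G\) denotes gradient magnitudes and \(M\), \(N\) the chromatic channels.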

Returns:

The VSI vector, \((N,)\).

Return type:

Tensor

Example

>>> x = torch.rand(5, 3, 256, 256)
>>> y = torch.rand(5, 3, 256, 256)
>>> filtr = sdsp_filter(x)
>>> vs_x, vs_y = sdsp(x, filtr), sdsp(y, filtr)
>>> kernel = gradient_kernel(scharr_kernel())
>>> l = vsi(x, y, vs_x, vs_y, kernel)
>>> l.shape
torch.Size([5])
piqa.vsi.sdsp_filter(x, omega_0=0.021, sigma_f=1.34)#

Returns the log-Gabor filter for sdsp.

Parameters:

x (Tensor) – An input tensor, \((*, H, W)\).

Note

For the remaining arguments, refer to Zhang et al. (2013).
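
As a rough reminder of Zhang et al. (2013), the filter is an isotropic band-pass log-Gabor defined in the frequency domain,

\[g(\omega) = \exp\!\left(-\frac{\left(\log(\omega / \omega_0)\right)^2}{2 \, \sigma_f^2}\right)\]

where \(\omega\) is the radial frequency; \(\omega_0\) sets the center frequency and \(\sigma_f\) the bandwidth.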

Returns:

The filter tensor, \((H, W)\).

Return type:

Tensor
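
Example

A usage sketch mirroring the other entries in this module; the expected shape follows from the documented return, \((H, W)\).

>>> x = torch.rand(5, 3, 256, 256)
>>> filtr = sdsp_filter(x)
>>> filtr.shape
torch.Size([256, 256])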

piqa.vsi.sdsp(x, filtr, value_range=1.0, sigma_c=0.001, sigma_d=145.0)#

Detects salient regions from \(x\).

Parameters:
  • x (Tensor) – An input tensor, \((N, 3, H, W)\).

  • filtr (Tensor) – The frequency domain filter, \((H, W)\).

  • value_range (float) – The value range \(L\) of the input (usually 1 or 255).

Note

For the remaining arguments, refer to Zhang et al. (2013).
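
In a loose paraphrase of Zhang et al. (2013), \(\sigma_c\) and \(\sigma_d\) parameterize the color and location priors, which are multiplied with the frequency (log-Gabor) prior to obtain the saliency map,

\[VS(p) = S_F(p) \, S_C(p) \, S_D(p), \quad S_C(p) = 1 - \exp\!\left(-\frac{m_a(p)^2 + m_b(p)^2}{\sigma_c^2}\right), \quad S_D(p) = \exp\!\left(-\frac{\|p - c\|^2}{\sigma_d^2}\right)\]

where \(m_a\) and \(m_b\) are the normalized chromatic channels and \(c\) is the image center.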

Returns:

The visual saliency tensor, \((N, H, W)\).

Return type:

Tensor

Example

>>> x = torch.rand(5, 3, 256, 256)
>>> filtr = sdsp_filter(x)
>>> vs = sdsp(x, filtr)
>>> vs.shape
torch.Size([5, 256, 256])
class piqa.vsi.VSI(chromatic=True, downsample=True, kernel=None, reduction='mean', **kwargs)#

Measures the VSI between an input and a target.

Before applying vsi, the input and target are converted from RGB to the L(MN) color space and downsampled to a 256-ish resolution.

The visual saliency maps of the input and target are determined by sdsp.
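
For illustration only (not the module's internal helper), the RGB to LMN conversion of Zhang et al. (2014) is a fixed linear map; a minimal sketch with plain torch:

>>> import torch
>>> # LMN matrix from Zhang et al. (2014); rows map (R, G, B) to (L, M, N).
>>> RGB_TO_LMN = torch.tensor([
...     [0.06, 0.63, 0.27],
...     [0.30, 0.04, -0.35],
...     [0.34, -0.60, 0.17],
... ])
>>> x = torch.rand(5, 3, 256, 256)
>>> lmn = torch.einsum('ij,njhw->nihw', RGB_TO_LMN, x)
>>> lmn.shape
torch.Size([5, 3, 256, 256])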

Parameters:
  • chromatic (bool) – Whether to use the chromatic channels (MN) or not.

  • downsample (bool) – Whether downsampling is enabled or not.

  • kernel (Tensor) – A gradient kernel, \((2, 1, K, K)\). If None, the Scharr kernel is used instead.

  • reduction (str) – Specifies the reduction to apply to the output: 'none', 'mean' or 'sum'.

  • kwargs – Keyword arguments passed to vsi.

Example

>>> criterion = VSI()
>>> x = torch.rand(5, 3, 256, 256, requires_grad=True)
>>> y = torch.rand(5, 3, 256, 256)
>>> l = 1 - criterion(x, y)
>>> l.shape
torch.Size([])
>>> l.backward()
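
With reduction='none', the per-image scores are kept instead (a sketch assuming the same x and y as above):

>>> criterion = VSI(reduction='none')
>>> criterion(x, y).shape
torch.Size([5])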
forward(x, y)#
Parameters:
  • x (Tensor) – An input tensor, \((N, 3, H, W)\).

  • y (Tensor) – A target tensor, \((N, 3, H, W)\).

Returns:

The VSI vector, \((N,)\) or \(()\) depending on reduction.

Return type:

Tensor