piqa.vsi#

Visual Saliency-based Index (VSI)

This module implements the VSI in PyTorch.

Original

https://www.putianjian.net/linzhang/IQA/VSI/VSI.html

Wikipedia

https://wikipedia.org/wiki/Salience_(neuroscience)#Visual_saliency_modeling

References

VSI: A Visual Saliency-Induced Index for Perceptual Image Quality Assessment (Zhang et al., 2014)
SDSP: A novel saliency detection method by combining simple priors (Zhang et al., 2013)

Functions#

sdsp

Detects salient regions from \(x\).

sdsp_filter

Returns the log-Gabor filter for sdsp.

vsi

Returns the VSI between \(x\) and \(y\), without color space conversion or downsampling.

Classes#

VSI

Measures the VSI between an input and a target.

Descriptions#

piqa.vsi.vsi(x, y, vs_x, vs_y, kernel, value_range=1.0, c1=1.27, c2=0.005936178392925798, c3=0.0019992310649750095, alpha=0.4, beta=0.02)#

Returns the VSI between \(x\) and \(y\), without color space conversion or downsampling.

Parameters:
  • x (Tensor) – An input tensor, \((N, 3 \text{ or } 1, H, W)\).

  • y (Tensor) – A target tensor, \((N, 3 \text{ or } 1, H, W)\).

  • vs_x (Tensor) – The input visual saliency, \((N, H, W)\).

  • vs_y (Tensor) – The target visual saliency, \((N, H, W)\).

  • kernel (Tensor) – A gradient kernel, \((2, 1, K, K)\).

  • value_range (float) – The value range \(L\) of the inputs (usually 1 or 255).

Note

For the remaining arguments, refer to Zhang et al. (2014).
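
For orientation, a non-authoritative paraphrase of Zhang et al. (2014): \(c_1\), \(c_2\) and \(c_3\) stabilize the visual saliency, gradient and chrominance similarity maps, which are combined with the exponents \(\alpha\) and \(\beta\) and pooled with the element-wise maximum saliency as weight,

\[S_{VS} = \frac{2 \, VS_x VS_y + c_1}{VS_x^2 + VS_y^2 + c_1}, \quad S_G = \frac{2 \, G_x G_y + c_2}{G_x^2 + G_y^2 + c_2}, \quad S_C = \frac{2 \, M_x M_y + c_3}{M_x^2 + M_y^2 + c_3} \cdot \frac{2 \, N_x N_y + c_3}{N_x^2 + N_y^2 + c_3}\]

\[\text{VSI}(x, y) = \frac{\sum_p S_{VS}(p) \, S_G(p)^\alpha \, S_C(p)^\beta \, \max(VS_x(p), VS_y(p))}{\sum_p \max(VS_x(p), VS_y(p))}\]

where \(G\) denotes gradient magnitudes and \(M\), \(N\) the chromatic channels.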

Returns:

The VSI vector, \((N,)\).

Return type:

Tensor

Example

>>> x = torch.rand(5, 3, 256, 256)
>>> y = torch.rand(5, 3, 256, 256)
>>> filtr = sdsp_filter(x)
>>> vs_x, vs_y = sdsp(x, filtr), sdsp(y, filtr)
>>> kernel = gradient_kernel(scharr_kernel())
>>> l = vsi(x, y, vs_x, vs_y, kernel)
>>> l.shape
torch.Size([5])
piqa.vsi.sdsp_filter(x, omega_0=0.021, sigma_f=1.34)#

Returns the log-Gabor filter for sdsp.

Parameters:

x (Tensor) – An input tensor, \((*, H, W)\).

Note

For the remaining arguments, refer to Zhang et al. (2013).
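
As a rough reminder of Zhang et al. (2013), the filter is an isotropic band-pass log-Gabor defined in the frequency domain,

\[g(\omega) = \exp\!\left(-\frac{\left(\log(\omega / \omega_0)\right)^2}{2 \, \sigma_f^2}\right)\]

where \(\omega\) is the radial frequency; \(\omega_0\) sets the center frequency and \(\sigma_f\) the bandwidth.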

Returns:

The filter tensor, \((H, W)\).

Return type:

Tensor
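
Example

A usage sketch mirroring the other entries in this module; the expected shape follows from the documented return, \((H, W)\).

>>> x = torch.rand(5, 3, 256, 256)
>>> filtr = sdsp_filter(x)
>>> filtr.shape
torch.Size([256, 256])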

piqa.vsi.sdsp(x, filtr, value_range=1.0, sigma_c=0.001, sigma_d=145.0)#

Detects salient regions from \(x\).

Parameters:
  • x (Tensor) – An input tensor, \((N, 3, H, W)\).

  • filtr (Tensor) – The frequency domain filter, \((H, W)\).

  • value_range (float) – The value range \(L\) of the input (usually 1 or 255).

Note

For the remaining arguments, refer to Zhang et al. (2013).
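
In a loose paraphrase of Zhang et al. (2013), \(\sigma_c\) and \(\sigma_d\) parameterize the color and location priors, which are multiplied with the frequency (log-Gabor) prior to obtain the saliency map,

\[VS(p) = S_F(p) \, S_C(p) \, S_D(p), \quad S_C(p) = 1 - \exp\!\left(-\frac{m_a(p)^2 + m_b(p)^2}{\sigma_c^2}\right), \quad S_D(p) = \exp\!\left(-\frac{\|p - c\|^2}{\sigma_d^2}\right)\]

where \(m_a\) and \(m_b\) are the normalized chromatic channels and \(c\) is the image center.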

Returns:

The visual saliency tensor, \((N, H, W)\).

Return type:

Tensor

Example

>>> x = torch.rand(5, 3, 256, 256)
>>> filtr = sdsp_filter(x)
>>> vs = sdsp(x, filtr)
>>> vs.shape
torch.Size([5, 256, 256])
class piqa.vsi.VSI(chromatic=True, downsample=True, kernel=None, reduction='mean', **kwargs)#

Measures the VSI between an input and a target.

Before applying vsi, the input and target are converted from RGB to the L(MN) color space and downsampled to a 256-ish resolution.

The visual saliency maps of the input and target are determined by sdsp.
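
For illustration only (not the module's internal helper), the RGB to LMN conversion of Zhang et al. (2014) is a fixed linear map; a minimal sketch with plain torch:

>>> import torch
>>> # LMN matrix from Zhang et al. (2014); rows map (R, G, B) to (L, M, N).
>>> RGB_TO_LMN = torch.tensor([
...     [0.06, 0.63, 0.27],
...     [0.30, 0.04, -0.35],
...     [0.34, -0.60, 0.17],
... ])
>>> x = torch.rand(5, 3, 256, 256)
>>> lmn = torch.einsum('ij,njhw->nihw', RGB_TO_LMN, x)
>>> lmn.shape
torch.Size([5, 3, 256, 256])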

Parameters:
  • chromatic (bool) – Whether to use the chromatic channels (MN) or not.

  • downsample (bool) – Whether downsampling is enabled or not.

  • kernel (Tensor) – A gradient kernel, \((2, 1, K, K)\). If None, the Scharr kernel is used instead.

  • reduction (str) – Specifies the reduction to apply to the output: 'none', 'mean' or 'sum'.

  • kwargs – Keyword arguments passed to vsi.

Example

>>> criterion = VSI()
>>> x = torch.rand(5, 3, 256, 256, requires_grad=True)
>>> y = torch.rand(5, 3, 256, 256)
>>> l = 1 - criterion(x, y)
>>> l.shape
torch.Size([])
>>> l.backward()
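
With reduction='none', the per-image scores are kept instead (a sketch assuming the same x and y as above):

>>> criterion = VSI(reduction='none')
>>> criterion(x, y).shape
torch.Size([5])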
forward(x, y)#
Parameters:
  • x (Tensor) – An input tensor, \((N, 3, H, W)\).

  • y (Tensor) – A target tensor, \((N, 3, H, W)\).

Returns:

The VSI vector, \((N,)\) or \(()\) depending on reduction.

Return type:

Tensor