Some tasks require first resizing an image, say from 100x100 to 255x255, processing it, and then resizing it back to 100x100. Such tasks motivate this article. The question we ask is this: if we resize an image from x0-by-y0 to x1-by-y1 and then resize it back to the original size, will the resulting image be the same as the original? If it is, we call the resize() function reversible; if not, we call it not reversible. Reversibility means the round trip adds no noise to the image, which is generally a desirable property.
First, let us see what the skimage resize() function offers. Note that the manual uses the word "interpolation" for every change of image size, whether the image becomes larger or smaller. The function provides six interpolation methods, selected by a parameter named "order". According to the manual:
The order of interpolation. The order has to be in the range 0-5:
- 0: Nearest-neighbor
- 1: Bi-linear (default)
- 2: Bi-quadratic
- 3: Bi-cubic
- 4: Bi-quartic
- 5: Bi-quintic
Among these methods, nearest-neighbor, bi-linear and bi-cubic are probably the best known, in order of increasing complexity. For each pixel in the new image, the nearest-neighbor method finds the closest pixel in the old image and copies its value to the new pixel, so it needs only one old pixel per new pixel. By comparison, the bi-linear method needs four pixels from the old image. Bi-linear interpolation can be explained by the figure below, borrowed from Wikipedia. To generate the new pixel P, bi-linear interpolation uses four points in the original image (Q11, Q12, Q21, Q22). Linear interpolation is performed first in the x dimension and then in the y dimension.
Bi-cubic interpolation is more complicated. It fits a polynomial that is cubic in both x and y and therefore has 16 coefficients. To determine these coefficients, it needs 16 points from the original image as inputs. The bi-cubic equation is as follows:
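The two-step procedure can be sketched in a few lines of Python. This is only an illustration, not the skimage implementation. The function name `bilinear` and the corner layout Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), Q22 = (x2, y2) are assumptions made here to match the usual labeling of the figure.

```python
def bilinear(x, y, x1, y1, x2, y2, q11, q12, q21, q22):
    """Interpolate the value at P = (x, y) from four corner values.

    Assumed corner layout: q11 at (x1, y1), q12 at (x1, y2),
    q21 at (x2, y1), q22 at (x2, y2).
    """
    # Step 1: linear interpolation along x, once for each of the two rows.
    t = (x - x1) / (x2 - x1)
    r1 = q11 + t * (q21 - q11)  # value at (x, y1)
    r2 = q12 + t * (q22 - q12)  # value at (x, y2)
    # Step 2: linear interpolation along y between the two row results.
    u = (y - y1) / (y2 - y1)
    return r1 + u * (r2 - r1)


# Corner values taken from the plane f(x, y) = x + y on the unit square.
print(bilinear(0.5, 0.5, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 2.0))  # 1.0
```

At a corner the function returns the corner value itself, which is why a pixel that lands exactly on an old pixel keeps its value.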
\[f(x,y)=\begin{bmatrix} x^{3}&x^{2}&x&1\\ \end{bmatrix}\begin{bmatrix} a_{3,3}&a_{3,2}&a_{3,1}&a_{3,0}\\ a_{2,3}&a_{2,2}&a_{2,1}&a_{2,0}\\a_{1,3}&a_{1,2}&a_{1,1}&a_{1,0}\\ a_{0,3}&a_{0,2}&a_{0,1}&a_{0,0}\\ \end{bmatrix}\begin{bmatrix} y^{3}\\ y^{2}\\y\\1 \end{bmatrix}\]
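Written out, the matrix product is the double sum of a_{i,j} x^i y^j over i, j = 0..3. The sketch below evaluates the polynomial both ways to show that they agree. It only evaluates the polynomial for given coefficients and does not fit them from 16 image points. The function names and the nested-list storage `a[i][j]` are hypothetical choices for this illustration.

```python
def bicubic_poly(x, y, a):
    """Evaluate f(x, y) = sum of a[i][j] * x**i * y**j for i, j in 0..3.

    a[i][j] holds the coefficient a_{i,j} (hypothetical storage layout).
    """
    return sum(a[i][j] * x**i * y**j for i in range(4) for j in range(4))


def bicubic_matrix_form(x, y, a):
    """Same value, computed as row vector * matrix * column vector."""
    xs = [x**3, x**2, x, 1]
    ys = [y**3, y**2, y, 1]
    # Row r, column c of the matrix in the equation is a_{3-r, 3-c}.
    m = [[a[3 - r][3 - c] for c in range(4)] for r in range(4)]
    return sum(xs[r] * m[r][c] * ys[c] for r in range(4) for c in range(4))


# Example: only a_{1,1} = 2 is non-zero, so f(x, y) = 2 * x * y.
coeffs = [[0.0] * 4 for _ in range(4)]
coeffs[1][1] = 2.0
print(bicubic_poly(3.0, 4.0, coeffs))         # 24.0
print(bicubic_matrix_form(3.0, 4.0, coeffs))  # 24.0
```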
In our experiments, we take two pictures: an image and a mask. An image paired with a mask is a common setup for image segmentation tasks. Both original pictures are of size 101x101. We resize them to 128x128 and then resize them back to the original size of 101x101. The evaluation metric is sum(abs(original image - output image from two resize operations)). When this sum equals zero, we call the image resize operation reversible. We evaluate three interpolation approaches: nearest-neighbor, bi-linear and bi-cubic.
Original: the left side is the original image and the right side is the mask. The mask is binary: each pixel is either 0 or 1.
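The metric itself is simple enough to write down directly. Here is a minimal sketch that treats a picture as a nested list of rows; the function names `sum_abs_delta` and `is_reversible` are hypothetical and used only for this illustration.

```python
def sum_abs_delta(original, restored):
    """Return sum(abs(original - restored)) over all pixels of two equal-size pictures."""
    return sum(
        abs(o - r)
        for row_o, row_r in zip(original, restored)
        for o, r in zip(row_o, row_r)
    )


def is_reversible(original, restored):
    """A round trip counts as reversible only when the summed delta is exactly zero."""
    return sum_abs_delta(original, restored) == 0


original = [[0.0, 1.0], [1.0, 0.0]]
print(is_reversible(original, [[0.0, 1.0], [1.0, 0.0]]))  # True
print(is_reversible(original, [[0.0, 0.5], [1.0, 0.0]]))  # False
```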
Nearest-neighbor: what is shown here is the delta between the original and the output of the two resize operations using nearest-neighbor interpolation. For the nearest-neighbor method, the delta is zero for both image and mask. The reason is that the nearest-neighbor mapping is bi-directional in this experiment: when the image is enlarged, every old pixel is copied to at least one new pixel, and when the image is resized back, each restored pixel picks a new pixel that was itself copied from the same old pixel. That means the enlarge-then-shrink round trip using the nearest-neighbor method is reversible. The same does not hold if the image is shrunk first, because shrinking discards pixels that cannot be recovered.
Bi-linear: bi-linear interpolation is not reversible. As the diagram below shows, the delta between the original and the resize output is not all zero. Since the mask is binary, the delta for the mask on the right is more pronounced than for the image on the left. For the picture on the left, non-zero points appear at the image boundary and are also scattered sporadically over the whole image. For the picture on the right, non-zero points appear at the boundary and along the border between the all-zeros area and the all-ones area of the original mask.
Bi-cubic: bi-cubic has the same issue as bi-linear. However, measured by the sum of squared errors, bi-cubic produces a smaller delta than bi-linear. For the image on the left, the sum of squared errors is 3.14 for bi-linear versus 0.32 for bi-cubic; for the mask on the right, it is 13.14 for bi-linear versus 4.78 for bi-cubic.
What this study tells us is that, from the perspective of reducing the noise added by resizing forward and backward, the best choice is nearest-neighbor interpolation, since it introduces no additional noise. Otherwise, bi-cubic is also a good choice, although its computational complexity is higher. Note that bi-linear is the default when calling resize(). The sample script used in this study can be found here.
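The index argument can be checked along one axis without any image data. The sketch below assumes the pixel-center convention, where output pixel j samples the input at position (j + 0.5) * old_size / new_size - 0.5. That convention is an assumption of this illustration rather than something stated in the manual excerpt above, and the function names are hypothetical.

```python
import math


def nearest_src_index(dst_index, src_size, dst_size):
    """Index of the source pixel nearest to a destination pixel center (assumed convention)."""
    pos = (dst_index + 0.5) * src_size / dst_size - 0.5
    # Round to the nearest integer and clamp to the valid index range.
    return min(max(math.floor(pos + 0.5), 0), src_size - 1)


def round_trip_index(i, size, mid_size):
    """Original pixel that ends up at position i after size -> mid_size -> size."""
    j = nearest_src_index(i, mid_size, size)     # restored pixel i reads intermediate pixel j
    return nearest_src_index(j, size, mid_size)  # intermediate pixel j was copied from this pixel


# Enlarge first (101 -> 128 -> 101): every pixel maps back to itself.
print(all(round_trip_index(i, 101, 128) == i for i in range(101)))  # True
# Shrink first (101 -> 64 -> 101): some pixels are lost.
print(all(round_trip_index(i, 101, 64) == i for i in range(101)))   # False
```

Under this assumption, the second check fails already at pixel 1, which is restored from pixel 0.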
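A one-dimensional toy version makes the effect along the mask border easy to see. The sketch below is not the skimage implementation. It is a hand-written linear resize of a single row, assuming the pixel-center convention and clamping at the edges, and the function name `resize_linear_1d` is hypothetical.

```python
import math


def resize_linear_1d(signal, new_size):
    """Linearly resample a row of at least two values to new_size samples."""
    old_size = len(signal)
    out = []
    for j in range(new_size):
        # Position of output pixel j in input coordinates (pixel centers aligned).
        pos = (j + 0.5) * old_size / new_size - 0.5
        pos = min(max(pos, 0.0), old_size - 1.0)  # clamp at the edges
        left = min(int(math.floor(pos)), old_size - 2)
        t = pos - left
        out.append(signal[left] + t * (signal[left + 1] - signal[left]))
    return out


mask_row = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # a binary step, like one row of a mask
restored = resize_linear_1d(resize_linear_1d(mask_row, 8), 6)
delta = [abs(o - r) for o, r in zip(mask_row, restored)]
print(restored)
print(delta)
```

In this toy case the flat parts of the row come back unchanged and the delta is non-zero only at the two pixels next to the 0-to-1 step, which matches the pattern described for the mask.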
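The original script is not reproduced here. The following is a minimal sketch of how such an experiment might be set up with NumPy and skimage.transform.resize. The random image and the square mask are stand-ins for the real pictures, so the printed values will not match the numbers reported above. The helper name `round_trip_delta` is hypothetical, and anti_aliasing=False is passed as an assumption so that no smoothing is applied before shrinking.

```python
import numpy as np
from skimage.transform import resize


def round_trip_delta(picture, order, mid_shape=(128, 128)):
    """Resize to mid_shape and back, then return sum(abs(original - restored))."""
    # anti_aliasing=False keeps any Gaussian pre-filter out of the comparison (assumption).
    enlarged = resize(picture, mid_shape, order=order, anti_aliasing=False)
    restored = resize(enlarged, picture.shape, order=order, anti_aliasing=False)
    return float(np.abs(picture - restored).sum())


rng = np.random.default_rng(0)
image = rng.random((101, 101))   # stand-in for the real image, values in [0, 1)
mask = np.zeros((101, 101))      # stand-in binary mask with a square of ones
mask[30:70, 30:70] = 1.0

for name, order in [("nearest-neighbor", 0), ("bi-linear", 1), ("bi-cubic", 3)]:
    print(name, round_trip_delta(image, order), round_trip_delta(mask, order))
```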