Thursday, September 6, 2018

Is Python Image resize() Function Reversible?

During image processing tasks, we frequently need to change the image size. For instance, the original image may be 100x100 while a certain neural network requires an input of at least 255x255. That requires us to interpolate, which increases the image size. On the other hand, we might get high-definition images much larger than 255x255; for these, we perform image decimation to reduce the size. For both image interpolation and decimation, we can use the resize() function defined in the skimage.transform Python module.

Some tasks require first changing the image size from 100x100 to 255x255, processing it, and then changing it back to 100x100. These kinds of tasks motivate this article. The question we ask is this: if we change an image of size x0-by-y0 to size x1-by-y1 and then change it back to the original size, will the newly generated image be the same as the original image? If they are the same, we call the resize() function reversible; if not, we call it not reversible. If it is reversible, the whole conversion process adds no additional noise, which in general is a good thing to have.

First let us see what the skimage resize() function offers. Note that the manual uses the word "interpolation" for any image size change, regardless of whether the image becomes larger or smaller. Six interpolation methods are provided, controlled by a parameter named "order". According to the manual:


The order of interpolation. The order has to be in the range 0-5:
  • 0: Nearest-neighbor
  • 1: Bi-linear (default)
  • 2: Bi-quadratic
  • 3: Bi-cubic
  • 4: Bi-quartic
  • 5: Bi-quintic
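
To make the role of the "order" parameter concrete, here is a minimal sketch of calling resize() with different orders; the 100x100 random array simply stands in for a real image, and the mode setting is illustrative.

import numpy as np
from skimage.transform import resize

image = np.random.rand(100, 100)                       # stand-in for a 100x100 image
# upsample to 255x255 with the default bi-linear interpolation (order=1)
up = resize(image, (255, 255), order=1, mode='reflect')
# downsample back to 100x100 with nearest-neighbor interpolation (order=0)
down = resize(up, (100, 100), order=0, mode='reflect')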

Among these methods, nearest-neighbor, bi-linear and bi-cubic are probably the most well-known, in order of increasing complexity. For each pixel in the new image, the nearest-neighbor method identifies the pixel in the old image closest to the new pixel and copies its value. Nearest-neighbor therefore needs only one pixel from the old image per output pixel. By comparison, bi-linear interpolation needs four pixels from the old image. It can be explained by the figure below, borrowed from Wikipedia: to generate the new pixel P, bi-linear interpolation uses four points in the original image (Q11, Q12, Q21, Q22), performing linear interpolation first in the x dimension and then in the y dimension.
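
As a rough illustration of the bi-linear step described above, the helper below interpolates a single point from its four neighbors; the function name and argument layout are my own and are not part of skimage.

def bilinear(x, y, x1, x2, y1, y2, q11, q12, q21, q22):
    # q11 = f(x1, y1), q12 = f(x1, y2), q21 = f(x2, y1), q22 = f(x2, y2)
    # linear interpolation in the x direction at y = y1 and y = y2
    fxy1 = (x2 - x) / (x2 - x1) * q11 + (x - x1) / (x2 - x1) * q21
    fxy2 = (x2 - x) / (x2 - x1) * q12 + (x - x1) / (x2 - x1) * q22
    # linear interpolation in the y direction between the two intermediate values
    return (y2 - y) / (y2 - y1) * fxy1 + (y - y1) / (y2 - y1) * fxy2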


Bi-cubic interpolation is more complicated. It interpolates using a polynomial with 16 coefficients; to determine these coefficients, it needs 16 points from the original image as inputs. The bi-cubic equation is as follows:
\[f(x,y)=\begin{bmatrix} x^{3} & x^{2} & x & 1 \end{bmatrix}\begin{bmatrix} a_{3,3} & a_{3,2} & a_{3,1} & a_{3,0}\\ a_{2,3} & a_{2,2} & a_{2,1} & a_{2,0}\\ a_{1,3} & a_{1,2} & a_{1,1} & a_{1,0}\\ a_{0,3} & a_{0,2} & a_{0,1} & a_{0,0} \end{bmatrix}\begin{bmatrix} y^{3}\\ y^{2}\\ y\\ 1 \end{bmatrix}\]
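
As a small illustration, once the 16 coefficients a_{i,j} are known, evaluating the polynomial is just the matrix product above; the function below is a hypothetical helper for that evaluation, not part of skimage.

import numpy as np

def bicubic_eval(a, x, y):
    # a is the 4x4 coefficient matrix, ordered as in the equation above
    xv = np.array([x**3, x**2, x, 1.0])
    yv = np.array([y**3, y**2, y, 1.0])
    return xv @ a @ yv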

In our experiments, we take two pictures, one image and one mask (image plus mask is a common setting for image segmentation tasks). Both original pictures are of size 101x101. We resize them to 128x128 and then convert them back to the original size of 101x101. The evaluation metric is sum(abs(original image - output image after the two resize operations)). When this summation equals zero, we call the image resize operation reversible. We evaluate three interpolation approaches: nearest-neighbor, bi-linear, and bi-cubic.
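
Below is a sketch of this round-trip experiment. The file names are placeholders for the actual image and mask, and the mode/preserve_range settings may differ from the exact script used in this study.

import numpy as np
from skimage.io import imread
from skimage.transform import resize

image = imread('image.png', as_gray=True)   # 101x101 image (placeholder path)
mask = imread('mask.png', as_gray=True)     # 101x101 binary mask (placeholder path)

for order, name in [(0, 'nearest-neighbor'), (1, 'bi-linear'), (3, 'bi-cubic')]:
    for label, pic in [('image', image), ('mask', mask)]:
        up = resize(pic, (128, 128), order=order, mode='reflect', preserve_range=True)
        back = resize(up, (101, 101), order=order, mode='reflect', preserve_range=True)
        print(name, label, 'sum of absolute delta:', np.abs(pic - back).sum())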

Original: the left side is the original image and the right side is the mask. The mask is binary; each of its pixels is either 0 or 1.

Nearest-neighbor: what is shown here is the delta between the original and the output of the two resize operations using nearest-neighbor interpolation. For the nearest-neighbor method, the delta is zero for both image and mask. The reason the delta is zero is that the nearest-neighbor mapping is bi-directional: when a new pixel takes its value from the nearest old pixel, that old pixel is in turn the nearest neighbor of the new pixel when resizing back. That means the resize operation using the nearest-neighbor method is reversible.

Bi-linear: bi-linear interpolation is not reversible. As the diagram below shows, the delta between the original and the resize output is not all zero. Since the mask is binary, the delta for the mask on the right is more significant than for the image on the left. For the picture on the left, non-zero points show up at the image boundary and also scatter sporadically over the whole image. For the picture on the right, non-zero points show up at the boundary and also along the border between the all-zeros area and the all-ones area of the original mask image.

Bi-cubic: bi-cubic has the same issue as bi-linear. However, based on the sum-of-squared-error metric, bi-cubic generates a smaller delta than bi-linear. For the image on the left, the sum of squared errors is 3.14 for bi-linear versus 0.32 for bi-cubic; for the mask on the right, it is 13.14 for bi-linear versus 4.78 for bi-cubic.

What this study tells us is that, from the perspective of avoiding the additional noise caused by resizing forward and backward, the best choice is nearest-neighbor interpolation, since it does not introduce any noise at all. Beyond that, bi-cubic is also a good choice, although its computational complexity is higher. Note that when calling the resize() function, bi-linear is the default. The sample script used in this study can be found here.

Wednesday, September 5, 2018

My $1000 Deep Learning Box

For machine learning enthusiasts, it is always good to have their own box for deep learning number crunching. The easiest way is probably to purchase a desktop from Dell, HP or another vendor and then add a GPU on top. I considered this option but eventually gave it up. One reason for abandoning this idea is that it is hard to customize a commercially available desktop. For instance, if you want a 750W power supply to ensure the GPU won't be short of power, or need a motherboard that can support two GPUs, it is not easy to find such a desktop on the market at an acceptable price. There are other reasons as well: I want a Linux computer dedicated to machine learning, but most computers on the market come with Windows. Therefore, after some contemplation, I decided to assemble my own computer, something I had never done before.

Hardware

After searching the Internet, I found lots of useful information. In particular, I took the computer component list provided in this blog as a good reference, but I also made some modifications on top of Yanda's list. Here is the list of components I am using:

1. EVGA GeForce GTX 1060 SC GAMING, ACX 2.0 (Single Fan), 6GB GDDR5, DX12 OSD Support (PXOC), 06G-P4-6163-KR   (A GPU not as fancy as a GTX 1080, but it is a fair choice for personal use and can be upgraded later. $278)

2. ASUS ROG Strix Z370-G Gaming LGA1151 (Intel 8th Gen) DDR4 DP HDMI M.2 Z370 Micro ATX Motherboard with onboard 802.11ac WiFi, Gigabit LAN and USB 3.1 (Motherboard recommended by Yanda. This board can support up to 2 GPUs, which means there is room to expand. $189)

3. WD Blue 3D NAND 500GB PC SSD - SATA III 6 Gb/s M.2 2280 Solid State Drive - WDS500G2B0B (this 500GB SSD is also recommended by Yanda. I like it because this M.2 drive plugs directly into the Z370-G motherboard. $95)

4. Corsair Vengeance LPX 16GB (2x8GB) DDR4 DRAM 3200MHz C16 Desktop Memory Kit - Black (CMK16GX4M2B3200C16) (16GB of DRAM, $170)

5. Intel 8th Gen Core i5-8400 Processor (For budget reasons, I did not purchase an i7 but instead settled for an 8th gen i5 processor. $180)

6. EVGA Supernova 750 G3, 80 Plus Gold 750W, Fully Modular, Eco Mode with New HDB Fan, 10 Year Warranty, Includes Power ON Self Tester, Compact 150mm Size, Power Supply 220-G3-0750-X1 (this 750W power supply was recommended by Amazon. It seems to be a quite popular choice and has worked well for me so far. $97)

7. Thermaltake Versa H15 SPCC Micro ATX Mini Tower Computer Chassis CA-1D4-00S1NN-00 (This case was also recommended by Amazon. I think any case that has good reviews and supports the microATX standard will do the job. $41)

All seven components add up to $1050. If one chooses a better CPU (like an i7), a better GPU (like a GTX 1080), or more than one GPU, the budget for this box will increase. But that also shows the benefit of DIY, since it is fairly easy to tune the configuration according to your own budget and needs.

I assembled everything myself. Even though it was my first time, the whole process was not that difficult; my computer is already up and running, and it seems I have not blown anything up. However, you will want to read the manuals, especially the motherboard manual, carefully before starting. YouTube tutorial videos can also be quite beneficial.

Software

I installed Ubuntu Linux on my computer. The Ubuntu website provides a good tutorial on how to create a bootable USB stick for Ubuntu. I followed that guide to install Ubuntu 18.04 and everything works fine. During the installation process, I did not allow installing third-party software, out of concern that it could interfere with the Nvidia GPU setup done later.

The trickiest part of the whole box assembly was installing CUDA. CUDA support for Linux is far from perfect. As of the time I wrote this blog, the CUDA website only supports Ubuntu 17 and 16, not Ubuntu 18. An Internet search returns lots of results on how to install CUDA on Ubuntu, yet many of them are confusing, especially for an Ubuntu newbie like me. For example, I followed a recommendation to turn off the X server and ended up with an endless black screen. What eventually saved me is this blog. The author lays out a relatively straightforward path to install CUDA on Ubuntu 18.04 and, most importantly, it works! The only thing missing from that blog is how to install Nvidia driver version 390. For the driver installation, this discussion explains how to do it; it shows two ways, and below is what I did. After the driver installation is done, remember to run nvidia-smi to verify it.
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
$ sudo apt install nvidia-390
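$ nvidia-smi    # verify the driver installation; it should list the GPU (the GTX 1060 in my case)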
After installing CUDA, cuDNN and TensorFlow, you are still a few steps away from running your first deep learning code on the box. For example, you may want to install Keras and other Python packages. However, these steps are straightforward. Finally, you should see a Keras or TensorFlow sample running on the GPU, and that makes a good end to your day.
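
As a final sanity check, one simple way to confirm that TensorFlow actually sees the GPU is to list the local devices; this uses the TensorFlow 1.x API that was current when this post was written.

import tensorflow as tf
from tensorflow.python.client import device_lib

print(device_lib.list_local_devices())   # should include a /device:GPU:0 entry
print(tf.test.gpu_device_name())         # prints the GPU device name if one is found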