Monday, December 30, 2019

An Android App for Photo Brightening

I have published an Android app for photo brightening named Brighter Photo. Given a low-light photo as input, the app post-processes it to make it look brighter. It is a post-processing tool only, not a camera. The app is available on the Google Play store and the link is here

The figure below shows what the app can do. The left half is a photo taken at night and the right half is the same photo after being processed by the Brighter Photo app.

The GUI has three buttons: Load, Save, and Setting. The "Load" button loads a photo from the Android gallery. After loading, post processing starts immediately; it takes 10-30 seconds depending on the image size. The app runs slower than some similar apps but produces better results. The "Save" button saves the result. Note that to enable saving, the storage permission for this app needs to be enabled in the Android settings.

This YouTube video shows how to use the app.

I hope this introduction is useful and that you enjoy using the app. Any feedback is welcome.

Saturday, November 16, 2019

A Simple Example of Image Processing Using Java

In this post, we share a simple example of image processing using Java. This example draws on Java tutorials available on the Internet.

This example has three steps: 1) read the input image; 2) process the image; 3) display the processed image.

In step 1, an image is read into an image buffer.

    // Requires: java.io.File, java.io.IOException,
    // javax.imageio.ImageIO, java.awt.image.BufferedImage
    File file = new File("Lenna.png");
    BufferedImage image = null;

    // Read the image from disk
    try
    {
        image = ImageIO.read(file);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
    System.out.println("done");


Step 2 performs a simple image processing operation: 2D convolution filtering of the image using the filt3x3 function of the ConvolutionMatrix class.


    // 2D convolution of the image with a 3x3 Sobel-style kernel
    double[][] config = {{1,2,1}, {0,0,0}, {-1,-2,-1}};
    ConvolutionMatrix imageConv = new ConvolutionMatrix(3);
    imageConv.applyConfig(config);
    BufferedImage image2 = imageConv.filt3x3(image, imageConv);

Step 3 displays the modified image.


    // Display the modified image in a Swing window
    // Requires: javax.swing.ImageIcon, javax.swing.JFrame, javax.swing.JLabel, java.awt.FlowLayout
    ImageIcon icon = new ImageIcon(image2);
    JFrame frame = new JFrame();
    frame.setLayout(new FlowLayout());
    frame.setSize(image2.getWidth(), image2.getHeight()); // Window.setSize(int width, int height)
    JLabel lbl = new JLabel();
    lbl.setIcon(icon);
    frame.add(lbl);
    frame.setVisible(true);
    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

Java-based image processing is used in many places. Hopefully some people can benefit from this tutorial. The code can be found here

Saturday, November 9, 2019

LibRaw Post Processing Pipeline Translated Into Python

LibRaw is a widely used open source library for raw image conversion. In many people's opinion, the most valuable part of the LibRaw library is that it can handle various types of raw image formats, a capability not commonly available in other libraries. But people are also interested in its post processing pipeline, which converts an interleaved RGB mosaic into a colorful picture through procedures such as white balance adjustment, demosaicing, and gamma mapping. People want to understand LibRaw's post processing pipeline, but the code can be hard to decipher. That is why I wrote a Python script which matches LibRaw's post processing pipeline quite well. The hope is that by reading this Python script, it becomes easier for folks to understand what actually happens in LibRaw.

I should admit that some corners have been cut in this Python script. For example, the output of the script is only half the size of the input, which means there was no need for me to write a full-fledged demosaicing step that matches the size of the input. However, what I have found over time is that, quite often, the most puzzling part of post processing for people (including me) is how inputs from different color channels are scaled and mixed together. The scaling and mixing parts are covered in this Python script.

Here is the approach taken in this script. The input is a raw .dng image taken with a Samsung Galaxy S9 phone. The decoding before post processing is done using the Python rawpy library, which is LibRaw wrapped in Python. After decoding, post processing is done along two parallel paths. The first path uses the postprocess() function provided by rawpy; under the hood, it calls dcraw_process(), which follows the dcraw processing written by Dave Coffin. This first path is used as the reference. The second path is my own pipeline written in Python. It takes the rawpy decoding outputs as inputs, including the raw image and some parameters. At the end of the script, we compare the reference image with the image reconstructed by my code and find them to be almost the same.
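To make the two paths concrete, the decoding and the reference path look roughly like the sketch below. The file name is a placeholder, and the postprocess() options shown are assumptions rather than the exact flags used in the script.

import rawpy

# Decode the .dng file; rawpy is a thin wrapper around LibRaw
raw_py = rawpy.imread('galaxy_s9_sample.dng')   # placeholder file name

# Path 1: reference image produced by LibRaw's own post processing
reference = raw_py.postprocess(use_camera_wb=True, half_size=True)

# Inputs for path 2 (my own Python pipeline)
raw_mosaic   = raw_py.raw_image            # Bayer mosaic straight from the sensor
wb_scale     = raw_py.camera_whitebalance  # per-channel white balance gains
color_matrix = raw_py.color_matrix         # camera-to-sRGB color matrix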

Now let me introduce the post processing code I wrote. It includes three main steps: 1) scale the input by the white balance gains; 2) mix R/G/B using the color matrix; 3) apply gamma mapping.

In step 1, the raw image containing RGB information is scaled by the white balance gains (wb_scale). Since a Bayer pattern is used for the raw input, there are two green channels (color4[:,:,1] and color4[:,:,3]). The white balance gains come from the camera_whitebalance parameter in the rawpy decoding output. One could instead use the daylight_whitebalance parameter, but it makes the final output image yellowish. Since the next step of the pipeline only needs three inputs, there is a minor step called green mixing which combines the two green channels as color4[:,:,1] = (color4[:,:,1] + color4[:,:,3])/2, as sketched after the code below.


color4[:,:,0] = raw_py.raw_image[0::2,1::2]*wb_scale[0]
color4[:,:,1] = raw_py.raw_image[0::2,0::2]*wb_scale[1]
color4[:,:,2] = raw_py.raw_image[1::2,0::2]*wb_scale[2]
color4[:,:,3] = raw_py.raw_image[1::2,1::2]*wb_scale[3]
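For completeness, here is a sketch of how the color4 buffer could be set up and the two green channels mixed; the half-resolution allocation follows the description above, and the variable names match the snippet.

import numpy as np

# Four Bayer channels at half resolution; channels 1 and 3 are the two greens
h, w = raw_py.raw_image.shape
color4 = np.zeros((h // 2, w // 2, 4))

# ... per-channel extraction and white-balance scaling as shown above ...

# Mix green: average the two green channels into channel 1
color4[:, :, 1] = (color4[:, :, 1] + color4[:, :, 3]) / 2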

In step 2, the color_matrix is applied. I believe its purpose is to convert the camera sensor's color space into a standard RGB image. color_matrix is also among the rawpy decoding outputs. While the white balance changes from photo to photo, the color_matrix is an inherent property of the camera and stays the same for the same camera settings.

pic_temp[:,:,0] = color_matrix[0,0]*color4[:,:,0] + color_matrix[0,1] * color4[:,:,1] + color_matrix[0,2] * color4[:,:,2]
pic_temp[:,:,1] = color_matrix[1,0]*color4[:,:,0] + color_matrix[1,1] * color4[:,:,1] + color_matrix[1,2] * color4[:,:,2]
pic_temp[:,:,2] = color_matrix[2,0]*color4[:,:,0] + color_matrix[2,1] * color4[:,:,1] + color_matrix[2,2] * color4[:,:,2]


Step 3 is gamma mapping. Gamma mapping is a legacy of the CRT era, introduced to compensate for the nonlinear response of CRT monitors, but it has outlived the CRT. The gamma curve used here is Gamma(0.45, 4.5), i.e., a power of 0.45 with a linear toe of slope 4.5.


pic_temp = gamma_curve[pic_temp]
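For reference, here is a minimal sketch of how such a lookup table could be built. It follows the BT.709-style curve with power 0.45 and toe slope 4.5; the breakpoint value below is the standard BT.709 one, while LibRaw derives its own breakpoint from the two parameters, so this is an approximation rather than LibRaw's exact table.

import numpy as np

def build_gamma_curve(power=0.45, slope=4.5, size=65536, out_max=255):
    x = np.linspace(0.0, 1.0, size)          # linear input mapped to [0, 1]
    knee = 0.018                             # standard BT.709 breakpoint (assumed)
    y = np.where(x < knee,
                 slope * x,                  # linear toe for small values
                 1.099 * np.power(x, power) - 0.099)
    return np.clip(y * out_max, 0, out_max).astype(np.uint8)

gamma_curve = build_gamma_curve()
# pic_temp holds integer linear values, so it can index the lookup table directly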

Finally, we compare the reference image produced by rawpy's postprocess() function with the image reconstructed by our code. It turns out that the maximum delta between them is 2 out of 255, so we claim that these two images are almost the same. The code/image can be found here.
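The final check itself is a one-liner; in the sketch below, reference and reconstructed are placeholders for the two 8-bit output arrays.

import numpy as np

# Maximum per-pixel difference between the reference and the reconstruction
max_delta = np.max(np.abs(reference.astype(np.int32) - reconstructed.astype(np.int32)))
print(max_delta)   # about 2 (out of 255) for this test image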

Monday, October 28, 2019

Image White Balancing with Python

Below is a photo I took on the viewing deck of Taipei 101. Taipei 101 has a glass wall with a green tint, so all photos taken inside the building through the wall have a green cast, including this one.

To remove the green cast, we use the technique of image white balancing. First, we need a reference object with a known color. Fortunately this photo has clouds (pointed to by the arrow). Clouds are good references since they are white. Averaging the cloud pixels (mean(img_array[400,500:550,:])) gives mean R/G/B values of [201.5, 254.9, 253.9]. Since a white color has roughly equal R/G/B channels, to make the clouds white, 52 needs to be added to the R channel of the whole image. Like DC cancellation in communications, white balancing removes the bias of the signal.

The code below shows how to do the white balancing:


# Assumes: from keras.preprocessing import image; import numpy as np;
# import matplotlib.pyplot as plt
filename = 'IMG_9254.JPG'
img = image.load_img(filename)
img_array = np.array(image.img_to_array(img), dtype=np.uint8)
img_shape = img_array.shape
plt.figure()
plt.imshow(img_array)

img_array2 = img_array
img_array2 = np.array(img_array2, dtype=np.uint16)  # widen to avoid overflow
img_array2[:,:,0] = img_array2[:,:,0] + 52  # white balancing: lift the R channel
img_array2 = np.clip(img_array2, 0, 255)
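For reference, the offset of 52 used above can be derived directly from the cloud region quoted earlier; this is a sketch, with the row/column range taken from the text.

# Mean R/G/B over the cloud region (row 400, columns 500-549)
cloud_mean = np.mean(img_array[400, 500:550, :].astype(np.float64), axis=0)
# roughly [201.5, 254.9, 253.9]

# Clouds should be white (R ~ G ~ B), so lift R by the gap to the G/B level
r_offset = np.mean(cloud_mean[1:]) - cloud_mean[0]   # roughly 52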

The newly generated image after white balancing, img_array2, is shown below. The color does look more natural. In case you want to reproduce this, the code/image can be found here.


Wednesday, October 2, 2019

Note on Implementation of “Fast Noise Variance Estimation” by J. Immerkær

Noise variance estimation is a fundamental task in image processing. Among the various approaches to noise variance estimation, the fast method proposed by J. Immerkær in [1] stands out because of its good balance between complexity and accuracy.

Based on J. Immerkær's method, the standard deviation \(\sigma\) of the image noise can be estimated using the equations below:

\(F=
\begin{bmatrix}
1 & -2 & 1\\
-2 & 4 & -2\\
1 & -2 & 1\\
\end{bmatrix}
\)

\(\sigma=\frac{\sqrt{\pi/2}}{6(W-2)(H-2)}\sum_{I}|I(x,y)*F|\)

\(F\) is a high-pass filter. To estimate the noise variance, the filter \(F\) is applied to the whole image \(I(x,y)\) by convolution. The sum of the absolute values of the convolution results is normalized by the width and height of the image, i.e., by \((W-2)(H-2)\); the "-2" accounts for the boundary effect. The summation is also scaled by \(\frac{\sqrt{\pi/2}}{6}\), which is explained next.

Assume a 3x3 block of \(I(x,y)\) with each element \(x_{i}\) being a Gaussian random variable with mean \(\mu\) and standard deviation \(\sigma\):
\(I=
\begin{bmatrix}
x_{1} & x_{2} & x_{3}\\
x_{4} & x_{5} & x_{6}\\
x_{7} & x_{8} & x_{9}\\
\end{bmatrix}
\)
Then \(|I*F|=|x_{1}-2*x_{2}+x_{3}-2*x_{4}+4*x_{5}-2*x_{6}+x_{7}-2*x_{8}+x_{9}|\)
Because the kernel weights sum to zero, \(I*F\) has zero mean and \(E(|I*F|^2)=36\sigma^2\), so \(I*F\) is distributed as \(6s\), where \(s\) is a Gaussian random variable with mean 0 and standard deviation \(\sigma\). The distribution of \(y=|s|\) is \(\frac{2}{\sqrt{2\pi}\sigma}e^{-\frac{y^2}{2\sigma^2}}\) for \(y\geq 0\) and 0 for \(y<0\).

Since \(E(|s|)=\sqrt{\frac{2}{\pi}}\sigma\), we have \(E(|I*F|)=6\sqrt{2/\pi}\,\sigma\), which explains why the summation is scaled by \(\frac{\sqrt{\pi/2}}{6}\).
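Putting the pieces together, the estimator fits in a few lines of Python; this is a sketch using scipy's 2-D convolution, with img assumed to be a grayscale image.

import numpy as np
from scipy.signal import convolve2d

def estimate_noise_sigma(img):
    """Immerkaer's fast noise estimate; returns the noise standard deviation."""
    F = np.array([[ 1, -2,  1],
                  [-2,  4, -2],
                  [ 1, -2,  1]], dtype=np.float64)
    H, W = img.shape
    # 'valid' keeps only fully overlapping positions, hence the (W-2)(H-2) factor
    conv = convolve2d(img.astype(np.float64), F, mode='valid')
    return np.sqrt(np.pi / 2.0) / (6.0 * (W - 2) * (H - 2)) * np.abs(conv).sum()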


[1] J. Immerkær, “Fast Noise Variance Estimation”, Computer Vision and Image Understanding, Vol. 64, No. 2, pp. 300-302, Sep. 1996

Sunday, September 22, 2019

Google Cloud Service GPU Profiling: K80/T4 vs GTX1060

Recently I tried training a neural network using the Google cloud service. The user experience is in general good, and the $300 free credit also helps. To support deep learning tasks, one can create virtual machines in the Google cloud service which contain both CPU and GPU. There are a few GPU options, ranging from K80 to V100, depending on how heavy the computation task is and on the user's budget. I compared the performance of two cloud GPUs, K80 and T4, with my local machine. Here are the results.

The benchmark used is a Keras example based on the MNIST data set. The metric for profiling is the time taken for one epoch of training. The table below shows the profiling results:


GPU                      CPU                         Time
GTX1060, 6GB GDDR5       Intel 6-core i5, 16GB RAM   35s
Tesla K80, 12GB GDDR5    4 vCPU, 26GB RAM            8s
Tesla T4, 16GB GDDR6     4 vCPU, 26GB RAM            4s


GTX1060 is the GPU in my local machine. The profiling results show that for this MNIST benchmark, the time used by the K80 is about one fourth, and the time used by the T4 about one ninth, of that of my local machine. Note that when the CUDA/TensorFlow libraries are not set up correctly, the computation may silently fall back to the CPU and the computation time increases drastically. For example, the time to train one epoch increases from 8s to 84s when the computation is done on the CPU instead of the GPU. Thus, when the computation time is unexpectedly long on the Google cloud service, confirm that the computation is indeed being performed on the GPU, as shown below.
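One quick way to check this is to list the devices TensorFlow can see; the sketch below is for the TensorFlow 1.x releases current at the time (the API differs in TensorFlow 2.x).

import tensorflow as tf
from tensorflow.python.client import device_lib

# True only if TensorFlow can actually use a GPU
print(tf.test.is_gpu_available())

# A working setup should list a /device:GPU:0 entry here
print([d.name for d in device_lib.list_local_devices()])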

Thursday, August 29, 2019

Signal Processing Magic (6) -- Half Band Filter

A half-band filter is an FIR filter often used in decimation-by-2 filtering. Its benefit is that half of the filter coefficients are 0, which reduces the complexity of the filtering operation. The name "half-band" comes from the fact that the filter's cut-off frequency is roughly fs/4, so the filter's pass band is about half of the total band.

To design a half-band filter, start from a sinc function: \[s[n]=\frac{\sin(\pi n/2)}{\pi n/2},\ n=-N,-(N-1),...,N\]
We know that when N approaches infinity, the frequency response of this sinc function is a rectangle spanning -fs/4 to fs/4. But in the real world N is always finite, and when that is the case the filter does not look great: ripples appear in the pass band, and the rejection in the stop band is often not good enough either.

How can this be improved? The windowing method comes to the rescue. As introduced before, windowing enhances the pass band (main lobe) and suppresses the stop band (side lobes). Assuming the window is w[n], the new filter after windowing becomes
\[h[n]=s[n]\,w[n]\]
Since half of \(s[n]\)'s coefficients are 0's, half of \(h[n]\)'s coefficients are 0's too.

A Matlab script to generate a 21-tap half-band filter with a Hamming window is shown below:


N = 10;
x = [-10 : 10]/2;
s = sinc(x)
w = hamming(2*N+1);
h = s.*w.';

figure;
subplot(2,1,1);stem(s);title('Sinc');
subplot(2,1,2);stem(h);title('Half-band');

[H, F] = freqz(s, 1, 'whole', 2^15);
[Hf, Ff] = freqz(h, 1, 'whole', 2^15);

figure; hold on;
plot(F(1:end/2)/pi/2, 20*log10(abs(H(1:end/2))));
plot(Ff(1:end/2)/pi/2, 20*log10(abs(Hf(1:end/2))));
grid;
ylabel('Magnitude (dB)'); xlabel('Freq (fs)');
legend('Sinc','Half-band');


Filter coefficients for the sinc and half-band filters

Frequency response of the sinc and half-band filters

Tuesday, August 20, 2019

TensorFlow Android App Debugging (3) -- Keras to TensorFlow Model Conversion

Deep learning models can be programmed using different libraries. Among them, Keras is one of the easiest to use. However, a network model generated with Keras can't be directly used in an Android app; it first needs to be converted to TensorFlow format. A conversion tool is provided here to automatically convert a Keras model to TensorFlow. The tool is straightforward to use: it converts a .h5 file (Keras model) to a .pb file (TensorFlow model).

    python keras_to_tensorflow.py 
        --input_model="path/to/keras/model.h5" 
        --output_model="path/to/save/model.pb"


For verification, we can use a tool named TensorBoard to visualize the generated model. As the first step, the model is read into a folder by the script read_pb_model.py. The default folder name is tf_summary.

python read_pb_model.py model.pb

Then TensorBoard converts the folder's content into a graph and displays it on a webpage such as http://nan-System-Product-Name:6006. Below is the TensorBoard command:

tensorboard --logdir ./tf_summary

An example of the generated graph is shown below. The graph displays the network topology and also shows the name of each node. It is important to know these names: in the Android app, when one needs to access a certain part of the network model, the name is used as the reference. The node names can also be listed directly from the .pb file, as sketched below.
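If TensorBoard is not handy, a quick way to print the node names is to parse the frozen graph directly; this is a sketch for TensorFlow 1.x, with model.pb as a placeholder path.

import tensorflow as tf

graph_def = tf.GraphDef()
with open('model.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Print every node name in the frozen graph
for node in graph_def.node:
    print(node.name)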

Thursday, August 8, 2019

For CV2 Color to Grayscale Image Conversion

The task is simple: we want to verify the formula used by cv2 (the OpenCV Python module) for converting a color image to grayscale. The formula to convert from RGB to gray can be found here: Y = 0.299*R + 0.587*G + 0.114*B

We use the following code to verify the formula. First we read a color image both in color (img_color) and in grayscale (img_gray). Then we use the equation above to convert the color image to gray (img_color2gray). As the final step, we compare img_gray and img_color2gray; this comparison lets us verify the equation. The output of the equation is a floating point number, so we use rounding (np.round) to convert float to integer, e.g. np.round(1.499) = 1 and np.round(1.5) = 2. We tried other quantization methods such as floor and ceil, but they do not match as well.


# Assumes: import cv2; import numpy as np
img_name = 'lena_color_512.tif'
img_color = cv2.imread(img_name, cv2.IMREAD_COLOR)       # BGR channel order
img_gray = cv2.imread(img_name, cv2.IMREAD_GRAYSCALE)

shape = img_gray.shape
img_color2gray = np.zeros(shape, dtype=np.int32)
# OpenCV loads images as BGR, so index 2 is R, 1 is G, 0 is B
img_color2gray[:,:] = np.round(0.299*img_color[:,:,2] + 0.587*img_color[:,:,1] + 0.114*img_color[:,:,0])
img_color2gray = np.clip(img_color2gray, 0, 255)
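The comparison itself can then be done along the lines of the sketch below; the variable names follow the snippet above.

# Count pixels where the rounded formula disagrees with OpenCV's own conversion
mismatch = np.count_nonzero(img_gray.astype(np.int32) != img_color2gray)
total = img_gray.size
print(mismatch, total, 100.0 * mismatch / total)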


The comparison result is this: img_gray and img_color2gray are very similar but not identical. Out of 512*512 = 262144 pixels in total, 352 pixels don't match (0.13% of the total). The dots in the figure below are the unmatched pixels. This leaves open the question of how exactly this conversion is done in the OpenCV software. My source code can be found here.

Saturday, August 3, 2019

Signal Processing Magic (5) -- Windowing

In discrete-time signal processing, if the received signal is a tone, its spectrum looks like a sinc function (see this). The reason is that cutting a digital signal to finite length is equivalent to imposing a rectangular window in the time domain, and a rectangular window in the time domain corresponds to a sinc function in the frequency domain.

But some engineers thought the side lobes of the sinc function were too big. To make them smaller, they invented the windowing method: impose a window on the time-domain signal to suppress the side lobes in the frequency domain. Since then, people have invented different types of windows (Bartlett, Hanning, Hamming, Blackman, etc.; see Sec. 7.2 of Discrete-Time Signal Processing by Oppenheim & Schafer). The Hamming and Hanning windows are shown below. Assuming the digital signal has M samples, to apply windowing one generates a window of M samples and then multiplies the window with the signal sample by sample. Each window is well defined mathematically. For example, the Hamming window over M samples is defined as:
\[w[n]=0.54-0.46cos(2\pi n/M)\]

Next we compare two signals, one with windowing and one without. The signal is a 100KHz tone with a sampling rate of 1MHz. The plot below shows that by adding a Hamming window, the side lobes are suppressed by around 20dB but the main lobe becomes wider. The windowing function works like a magical rolling pin: it drives the signal energy from the side lobes into the main lobe.
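A minimal Python sketch of this comparison is shown below; the tone frequency and sampling rate come from the description above, while the number of samples M is an assumption.

import numpy as np

fs = 1e6      # sampling rate: 1 MHz
f0 = 100e3    # tone frequency: 100 kHz
M = 1024      # number of samples (assumed)

n = np.arange(M)
x = np.sin(2 * np.pi * f0 * n / fs)
w = 0.54 - 0.46 * np.cos(2 * np.pi * n / M)   # Hamming window as defined above

X_rect = 20 * np.log10(np.abs(np.fft.fft(x)) + 1e-12)
X_hamm = 20 * np.log10(np.abs(np.fft.fft(x * w)) + 1e-12)
# X_hamm shows side lobes roughly 20 dB lower than X_rect, with a wider main lobe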

In the next plot, instead of one tone, we show the spectrum of two tones, one at 100KHz and another at 300KHz. After applying windowing, these two tones become more distinctive.

Thursday, June 6, 2019

Python Implementation for BM3D Denoising of Color Image

BM3D denoising is a popular method for removing image noise. The original BM3D paper [1] proposed the algorithm for grayscale images. In [2], the method was then extended to color images.

A Python script for grayscale-image BM3D by liuhuang31 can be found here, but a Python script for color-image BM3D could not be found by searching the Internet. Note that this Python implementation supports BM3D for color images, but its kernel part is written in C by Marc Lebrun. By modifying liuhuang31's script, we provide a Python implementation of color BM3D here. In this implementation, all code, including the algorithm kernel, is written in Python.

Below is an example showing the noisy image and the images after step 1 and step 2 of BM3D denoising. The noisy image is generated by adding random noise to a noise-free reference image. PSNR is the peak SNR calculated between each individual image and the noise-free reference, as sketched below.
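For reference, the PSNR values quoted with the images are computed in the usual way for 8-bit images; the sketch below uses a function name of my own choosing.

import numpy as np

def psnr(img, ref):
    """Peak SNR in dB between an 8-bit image and the noise-free reference."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)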

Noisy image (PSNR = 22.41dB)


After step 1 of removing noise (PSNR = 29.36dB)

After step 2 of removing noise (PSNR = 30.13dB)

While our Python script is an extension of liuhuang31's work, we have made a few corrections:

Correction 1:
In the original code, the delta of two images is calculated as below. But since img1 and img2 are both of uint8 type, the calculation result is wrong (the subtraction wraps around).

D = numpy.array(img1 - img2, dtype=numpy.int64)

In the modified code, img1 and img2 are first cast to int64 and then the delta between them is computed.


D = numpy.array(img2, dtype=numpy.int64) - numpy.array(img1, dtype=numpy.int64)

Correction 2:
In the original code, the boundary is found with the following code, but the last line has a typo: shape[0] should be shape[1].

    if LX < 0:   LX = 0
    elif RX > _noisyImg.shape[0]:   LX = _noisyImg.shape[0]-_WindowSize
    if LY < 0:   LY = 0
    elif RY > _noisyImg.shape[0]:   LY = _noisyImg.shape[0]-_WindowSize

The modified code is:


    if LX < 0:   LX = 0
    elif RX > _noisyImg.shape[0]:   LX = _noisyImg.shape[0]-_WindowSize
    if LY < 0:   LY = 0
    elif RY > _noisyImg.shape[1]:   LY = _noisyImg.shape[1]-_WindowSize



Correction 3:
The last one is probably the trickiest. To calculate the Wiener filter in step 2, the noise variance is sigma^2. To be consistent with the noise variance, the signal power should be normalized by the count of similar blocks. This normalization is missing from liuhuang31's original code; other BM3D implementations, such as Marc Lebrun's code, do use this normalization.


                Norm_2 = numpy.float64(tem_Vct_Trans.T * tem_Vct_Trans)
                m_weight = Norm_2/Count/(Norm_2/Count + sigma_color[ch]**2)



[1] K. Dabov et al., "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Trans. Image Processing, Vol. 16, No. 8, Aug. 2007
[2] K. Dabov et al., "Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space," Proc. ICIP 2007

Saturday, January 19, 2019

TensorFlow Android App Debugging (2) -- Dump Out Image/Text Files

During Android app debugging, dumping data from Android to a PC is crucial for the subsequent analysis. In the previous post, we said that the Android logging function is a useful tool, but sometimes logging alone is not sufficient. For example, saving the contents of an image via logging is time-consuming, prone to data loss, and unproductive. A better way is to save the image to internal storage. In this post, we provide some examples of how to save images and text files in Android.

To save images (modified from the code found here): the bitmap is saved to a .JPEG image file at the maximum quality setting (100). Note that JPEG compression is still lossy even at quality 100.

    // Save bitmap
    String extStorageDirectory = Environment.getExternalStorageDirectory().toString();
    OutputStream outStream = null;
    File file = new File(extStorageDirectory, "/DCIM/bitmap"+fileNo+".JPEG");
    try {
      outStream = new FileOutputStream(file);
      bitmap.compress(Bitmap.CompressFormat.JPEG, 100, outStream);
      outStream.flush();
      outStream.close();
    } catch(Exception e) {

    }



To save text files (modified from a function found here):


  private void writeToFile(String content, final String filename) {
    try {
      File file = new File(Environment.getExternalStorageDirectory() + "/DCIM/" + filename);
      //File file = new File("/DCIM/test.txt");

      Log.i(TAG, Environment.getExternalStorageDirectory() + "/DCIM/" + filename);
      if (!file.exists()) {
        file.createNewFile();
      }
      FileWriter writer = new FileWriter(file);
      writer.append(content);
      writer.flush();
      writer.close();
    } catch (IOException e) {
    }
  }

Thursday, January 17, 2019

TensorFlow Android App Debugging (1) -- Check Intermediate Results

To implement deep learning on a smart phone, one can refer to the TensorFlow Android app examples here. These examples include a classifier, a detector, voice recognition, etc. Based on these frameworks, one can develop one's own app with a customized deep learning model. But during development, on-target debugging is often needed to ensure that the Android implementation generates results identical to the offline processing. In this blog, we provide a few tips for this debugging process.

One important step of debugging is knowing what happens on the device. For this purpose, Android provides a logging facility. Taking the file TensorFlowImageClassifier.java as an example, it reads in image data as:

    for (int i = 0; i < intValues.length; ++i) {
      final int val = intValues[i];
      floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
      floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
      floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
    }

To see their values, we can add a log statement inside the loop:

Log.i(TAG, "floatValues " + floatValues[i * 3 + 0] + " , " + floatValues[i * 3 + 1]+ " , " + floatValues[i * 3 + 2]);

Then, with the device connected to the PC by a USB cable, each time we open the app, logcat will show this log information. This logging approach can be extended to whatever variables we are interested in. Android Studio provides a logcat window. Another way to collect the log is to use the adb command from a command window. On my PC, it is:
C:\Users\Nan\AppData\Local\Android\Sdk\platform-tools>adb logcat > logcat.txt


To implement the deep learning neural network, TensorFlowImageClassifier.java loads a TensorFlow model used for image classification. To test our own model, we can swap the default TensorFlow model with our own. But sometimes the model may not work as expected. To debug this, not only the final output of the model is needed; sometimes we also want to dump out intermediate results. The way to do this is to use the fetch() function of the TensorFlowInferenceInterface class. In the instruction below, batch_normalization_1/FusedBatchNorm_1 is an intermediate point in our customized TensorFlow model. The fetch() function copies the output of this intermediate point into the variable outputs_intermediate, which has been initialized earlier in the code. To identify the labels of intermediate points of a TensorFlow model, such as batch_normalization_1/FusedBatchNorm_1, the TensorBoard tool can be used.

inferenceInterface.fetch("batch_normalization_1/FusedBatchNorm_1", outputs_intermediate);

To support this Android debugging process, there is often an offline processing flow which does the same things in parallel and serves as the reference for Android debugging. Below is a snippet of Python code used for TensorFlow-based offline processing. Note that g_in is a variable for the kernel weights; it is converted to kernel_conv2d_1, which is a numpy array. As a good first step for debugging, constant inputs can be injected to compare outputs between Android and offline processing. Then we can inject the same image into both.

from tensorflow.python.platform import gfile
with tf.Session() as sess:
    # load model from pb file
    with gfile.FastGFile(wkdir+'/'+pb_filename,'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        sess.graph.as_default()
        g_in = tf.import_graph_def(graph_def, return_elements=['conv2d_1/kernel/read:0'])
    # write to tensorboard (check tensorboard for each op names)
    writer = tf.summary.FileWriter(wkdir+'/log/')
    writer.add_graph(sess.graph)
    writer.flush()
    writer.close()

    # inference by the model (op name must comes with :0 to specify the index of its output)
    tensor_output = sess.graph.get_tensor_by_name('import/dense_2/Softmax:0')
    tensor_input = sess.graph.get_tensor_by_name('import/input_1:0')
    tensor_intermediate = sess.graph.get_tensor_by_name('import/batch_normalization_1/FusedBatchNorm_1:0')

    kernel_conv2d_1=g_in[0].eval(session=sess)

    predictions = sess.run(tensor_output, {tensor_input: image_test})
    predictions_0 = sess.run(tensor_intermediate, {tensor_input: image_test})