**Why is the deconvolutional layer so important?**

## 1. What is Image Segmentation?

**Image segmentation** is the process of dividing an image into multiple segments (each segment is called a super-pixel), where each super-pixel may represent one common entity, such as the super-pixel for the dog’s head in the figure. Segmentation creates a representation of the image which is easier to understand and analyze, as shown in the example. Segmentation is a computationally very expensive process because we need to classify each pixel of the image.
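To make the per-pixel classification idea concrete, here is a minimal NumPy sketch (toy sizes and random scores, purely illustrative): a segmentation network would output one score per class for every pixel, and the predicted mask is the arg-max over those scores.

```python
import numpy as np

# Segmentation as per-pixel classification: one score per class for
# every pixel, and the predicted label is the arg-max over classes.
h, w, num_classes = 2, 3, 4                        # toy sizes, illustrative only
rng = np.random.default_rng(0)
scores = rng.standard_normal((h, w, num_classes))  # per-pixel class scores
mask = scores.argmax(axis=-1)                      # (h, w) label map
print(mask.shape)                                  # one class label per pixel
```

This is why segmentation is so expensive: the classification runs once per pixel, not once per image.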

Convolutional neural networks are the most effective way to understand images. But there is a problem with using them for image segmentation: convolution and pooling progressively shrink the spatial resolution of the feature maps, while segmentation needs a prediction for every pixel of the original image.

**So, how do we use convolutional neural networks for image segmentation?**

**We use an upsampling convolutional layer, called a deconvolutional layer or fractionally strided convolutional layer.**

## 2. What is Fractionally Strided convolution or deconvolution?

#### 2.1 Detailed understanding of fractionally strided convolution/deconvolution:

###### FIGURE 2: Depiction of usual convolution process with 1-D input

**Now, let’s reverse these operations.**

**So each output x here depends on only two consecutive inputs y, whereas in the previous convolution operation each output depended on 4 inputs.**

**Now the big question is: can this operation be cast as a convolution, wherein a kernel is slid across y (vertically in our case, as y is arranged vertically) to get the output x?**

**The problem with carrying out the operation in this way is that it is very inefficient.** We would first have to use one kernel to produce the outputs at odd-numbered pixels, then another kernel for the even-numbered pixels, and finally stitch these two sets of outputs together by arranging them alternately to get the final output.

**We will now see a trick which can make this process efficient.**

So a deconvolution operation can be performed in the same way as a normal convolution: we just insert zeros between the consecutive inputs, define a kernel of an appropriate size, and slide it with stride 1 to get the output. This ensures an output with a resolution higher than the resolution of its inputs.
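The zero-insertion trick can be sketched in a few lines of NumPy for the 1-D case (the function name and the bilinear kernel `[0.5, 1, 0.5]` are my own choices for illustration): inserting a zero between consecutive inputs and then running an ordinary stride-1 convolution doubles the resolution.

```python
import numpy as np

def fractionally_strided_conv1d(y, kernel, stride=2):
    """Upsample y by inserting (stride - 1) zeros between samples,
    then sliding the kernel over it with stride 1 (a plain convolution)."""
    # Insert zeros between consecutive inputs.
    dilated = np.zeros(len(y) * stride - (stride - 1))
    dilated[::stride] = y
    # Ordinary stride-1 sliding of the kernel, with zero padding at the edges.
    pad = len(kernel) // 2
    padded = np.pad(dilated, pad)
    return np.array([np.dot(padded[i:i + len(kernel)], kernel)
                     for i in range(len(dilated))])

up = fractionally_strided_conv1d(np.array([1.0, 2.0, 3.0]),
                                 np.array([0.5, 1.0, 0.5]))
print(up)  # [1.  1.5 2.  2.5 3. ] — linear interpolation between the inputs
```

With this particular kernel, the zeros are filled in with the average of their neighbours, which is exactly 1-D linear interpolation; the 2-D version of this idea is what the bilinear initialization below implements.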

#### 3.1 Image Upsampling

###### FIGURE 7

#### 3.2 How to use this for deconvolutional layer initialization?

###### FIGURE 8

###### FIGURE 9

###### FIGURE 10

```python
import numpy as np
import tensorflow as tf

def get_bilinear_filter(filter_shape, upscale_factor):
    ## filter_shape is [width, height, num_in_channels, num_out_channels]
    kernel_size = filter_shape[1]
    ### Centre location of the filter for which value is calculated
    if kernel_size % 2 == 1:
        centre_location = upscale_factor - 1
    else:
        centre_location = upscale_factor - 0.5

    bilinear = np.zeros([filter_shape[0], filter_shape[1]])
    for x in range(filter_shape[0]):
        for y in range(filter_shape[1]):
            ## Interpolation calculation
            value = (1 - abs((x - centre_location) / upscale_factor)) * \
                    (1 - abs((y - centre_location) / upscale_factor))
            bilinear[x, y] = value
    weights = np.zeros(filter_shape)
    for i in range(filter_shape[2]):
        weights[:, :, i, i] = bilinear
    init = tf.constant_initializer(value=weights, dtype=tf.float32)

    bilinear_weights = tf.get_variable(name="decon_bilinear_filter",
                                       initializer=init,
                                       shape=weights.shape)
    return bilinear_weights
```
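As a quick sanity check on the interpolation formula above, the same kernel can be built in plain NumPy (the helper name `bilinear_kernel` is mine, not part of the post's code). A bilinear upsampling kernel should be symmetric, and with this formula its entries sum to `upscale_factor**2`, which is what keeps the overall image intensity unchanged after upsampling.

```python
import numpy as np

def bilinear_kernel(kernel_size, upscale_factor):
    # Same centre and interpolation formula as get_bilinear_filter,
    # written with NumPy broadcasting instead of explicit loops.
    if kernel_size % 2 == 1:
        centre = upscale_factor - 1
    else:
        centre = upscale_factor - 0.5
    grid = np.arange(kernel_size)
    one_d = 1 - np.abs(grid - centre) / upscale_factor
    return np.outer(one_d, one_d)   # 2-D kernel = outer product of 1-D ones

k = bilinear_kernel(4, 2)           # kernel_size = 2*2 - 2%2 = 4 for factor 2
print(k.sum())                      # 4.0, i.e. upscale_factor**2
```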

```python
def upsample_layer(bottom, n_channels, name, upscale_factor):
    kernel_size = 2 * upscale_factor - upscale_factor % 2
    stride = upscale_factor
    strides = [1, stride, stride, 1]
    with tf.variable_scope(name):
        # Shape of the bottom tensor
        in_shape = tf.shape(bottom)

        h = ((in_shape[1] - 1) * stride) + 1
        w = ((in_shape[2] - 1) * stride) + 1
        new_shape = [in_shape[0], h, w, n_channels]
        output_shape = tf.stack(new_shape)

        filter_shape = [kernel_size, kernel_size, n_channels, n_channels]

        weights = get_bilinear_filter(filter_shape, upscale_factor)
        deconv = tf.nn.conv2d_transpose(bottom, weights, output_shape,
                                        strides=strides, padding='SAME')
    return deconv
```
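The shape arithmetic inside `upsample_layer` can be checked in plain Python (the helper `deconv_shapes` is mine, for illustration only). `(in - 1) * stride + 1` is the size of the zero-inserted input from section 2, and it is one of the output sizes that `tf.nn.conv2d_transpose` accepts for `'SAME'` padding at this stride:

```python
def deconv_shapes(in_size, upscale_factor):
    """Kernel size and output size as computed by upsample_layer above."""
    kernel_size = 2 * upscale_factor - upscale_factor % 2
    stride = upscale_factor
    out_size = (in_size - 1) * stride + 1   # zero-inserted resolution
    return kernel_size, out_size

print(deconv_shapes(8, 2))   # (4, 15)
print(deconv_shapes(8, 3))   # (5, 22)
```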

So, we have covered the most important part of implementing segmentation in TensorFlow. In the follow-up post, we shall implement the complete algorithm for image segmentation and see some results.