class Num::NN::ConvolutionalLayer(T)
inherits Num::NN::Layer #

In a CNN, the input is a Tensor with a shape:

(# of inputs) x (input h) x (input w) x (input channels)

After passing through a convolutional layer, the image becomes abstracted to a feature map, also called an activation map, with shape:

(# of inputs) x (feature map h) x (feature map w) x (feature map channels)
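The feature map's spatial size follows the usual convolution arithmetic. The sketch below is illustrative only and is not part of the layer's API; it assumes the standard floor-division formula for output dimensions:

```crystal
# Standard convolution output-size arithmetic (assumed, not library API):
#   out = (in + 2 * pad - kernel) // stride + 1
def conv_out_dim(in_dim : Int32, kernel : Int32, pad : Int32 = 0, stride : Int32 = 1) : Int32
  (in_dim + 2 * pad - kernel) // stride + 1
end

puts conv_out_dim(28, 5)       # => 24 (no padding, stride 1)
puts conv_out_dim(28, 5, 2, 1) # => 28 ("same" padding for a 5x5 kernel)
```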

Convolutional layers convolve the input and pass the result to the next layer. This is similar to the response of a neuron in the visual cortex to a specific stimulus: each convolutional neuron processes data only for its receptive field.

Although fully connected feedforward networks can also learn features and classify data, that architecture is generally impractical for larger inputs such as high-resolution images, where every pixel is a relevant input feature. It would require a very large number of neurons, even in a shallow architecture. For instance, a fully connected layer for a (small) 100 x 100 image has 10,000 weights for each neuron in the next layer. Convolution instead reduces the number of free parameters, allowing the network to be deeper: regardless of image size, a 5 x 5 tiling region with shared weights requires only 25 learnable parameters. Using regularized weights over fewer parameters also helps avoid the vanishing and exploding gradient problems seen during backpropagation in traditional neural networks.

Furthermore, convolutional neural networks are well suited to data with a grid-like topology (such as images), because spatial relations between separate features are taken into account during convolution and/or pooling.

Constructors#

.new(context : Num::Grad::Context(T), in_shape : Array(Int), num_filters : Int, kernel_height : Int, kernel_width : Int, padding = {0, 0}, stride = {1, 1}) #

Creates a convolutional layer in a Network

Arguments#
  • context : Num::Grad::Context(T) - Context of the network. This argument is used only to determine the generic type of the layer
  • in_shape : Array(Int) - Shape of input to layer
  • num_filters : Int - Number of filters to apply in the convolution
  • kernel_height : Int - Height of kernel for convolution
  • kernel_width : Int - Width of kernel for convolution
  • padding : Tuple(Int, Int) - Padding applied along the height and width of the kernel (defaults to {0, 0})
  • stride : Tuple(Int, Int) - Stride of the kernel along the height and width (defaults to {1, 1})

Note

The stride argument is currently only supported by the im2colgemm_conv2d implementation, as it is not supported by NNPACK. Using this parameter is rarely worth the large performance difference if you are able to use NNPACK.

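As a rough usage sketch (the [channels, height, width] layout of in_shape, the tensor type parameter, and the zero-argument Context constructor are assumptions, not confirmed by this page), a layer might be constructed directly like this:

```crystal
require "num"

# Autograd context over Float32 CPU tensors; it also fixes the layer's generic type
ctx = Num::Grad::Context(Tensor(Float32, CPU(Float32))).new

# 20 filters with a 5x5 kernel over single-channel 28x28 inputs
# (the [channels, height, width] layout of in_shape is an assumption)
layer = Num::NN::ConvolutionalLayer(Tensor(Float32, CPU(Float32))).new(
  ctx, [1, 28, 28], 20, 5, 5
)
```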

Methods#

#forward(input : Num::Grad::Variable(T)) : Num::Grad::Variable(T) #

Performs a forward pass of a variable through a ConvolutionalLayer

Arguments#
  • input : Num::Grad::Variable(T) - Variable to convolve through the layer
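A minimal forward-pass sketch, continuing the hypothetical layer constructed above (the batch layout and the helper calls shown are assumptions, not taken from this page):

```crystal
# A batch of 32 single-channel 28x28 images (zeros here just to show shapes),
# wrapped as a graph variable so gradients can flow back through the layer
x = ctx.variable(Tensor(Float32, CPU(Float32)).zeros([32, 1, 28, 28]))

output = layer.forward(x)
puts output.value.shape # feature map shape, e.g. [32, 20, 24, 24] under the assumptions above
```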

#output_shape : Array(Int32) #

Returns the output shape of a ConvolutionalLayer. This method is primarily used to infer the input shape of following layers in a Network

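For the hypothetical layer above, this would report the per-example feature map shape, which is what a following layer would take as its in_shape (the exact layout is an assumption):

```crystal
layer.output_shape # => e.g. [20, 24, 24] for the 5x5-kernel example above
```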

#variables : Array(Num::Grad::Variable(T)) #

Returns all Num::Grad::Variables associated with the Layer. Used primarily to register variables with optimizers

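A small inspection sketch (that the returned variables are the kernel weights and bias, and the value accessor used here, are assumptions):

```crystal
# Each trainable parameter is a Num::Grad::Variable(T) tracked by the context;
# an optimizer would typically receive this array to update the parameters
layer.variables.each do |v|
  puts v.value.shape
end
```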