Learn Coding Neural Network in C#: Define layers and activations

This post is in continuation to the learning series “Learn Coding Neural Network in C#”. If you are not sure where to start, please go through this post first.

By the end of this post you will have the beginnings of a framework, with layers and activation functions defined. The code will be structured so that it is easy to extend with more layers and functions.

Some background:
A layer is the highest-level building block in deep learning. A layer is a container that usually receives weighted input, transforms it with a set of mostly non-linear functions and then passes these values as output to the next layer. If you have gone through these two posts it will be easy to make sense of what I am talking about.

An activation function (a non-linear function) is defined for every neuron; it fires the neuron once the input reaches a certain state. The non-linearity lets the network adapt to a wide variety of data and helps it discriminate between outputs.

Overall in this post, we will define one FullyConnected layer and two activation layers, ReLU and Sigmoid. Let's start with some more groundwork code before implementing the layers. Open the NeuroSimple project and add a new folder, “Layers”.

Add a class called “BaseLayer” and pen down the following code. The “BaseLayer” will be an abstract class with the following properties and functions:

  • Input: The input for this layer, stored in an NDArray (our tensor object).
  • Output: The output from the layer after applying the operation defined by the Forward function.
  • Parameters: The list of trainable parameters, such as the weights and bias of the neurons. It is initialized when the layer is created and then updated during training.
  • Forward: A function which performs the neuron operation and writes the result to the Output variable. An example for a linear layer is Y = A * X.

Here is the code of the BaseLayer class

    public abstract class BaseLayer : Operations
    {
        /// <summary>
        /// Name of the layer
        /// </summary>
        public string Name { get; set; }

        /// <summary>
        /// Input for the layer
        /// </summary>
        public NDArray Input { get; set; }

        /// <summary>
        /// Output after forwarding the input across the neurons
        /// </summary>
        public NDArray Output { get; set; }

        /// <summary>
        /// Trainable parameter list, e.g. weight, bias
        /// </summary>
        public Dictionary<string, NDArray> Parameters { get; set; }

        /// <summary>
        /// Base layer instance
        /// </summary>
        /// <param name="name">Name of the layer</param>
        public BaseLayer(string name)
        {
            Name = name;
            Parameters = new Dictionary<string, NDArray>();
        }

        /// <summary>
        /// Virtual forward method to perform the calculation and move the input to the next layer
        /// </summary>
        /// <param name="x">Input for the layer</param>
        public virtual void Forward(NDArray x)
        {
            Input = x;
        }
    }

Before implementing the core layer, I would like to finish with the activation functions, which will be used in core layers like FullyConnected. To keep the development in order, create another folder, “Activations”, within the Layers folder. Inside it, create a new class, “BaseActivation”, inheriting from “BaseLayer”. An activation follows the same pattern of taking input, applying an operation, and forwarding the result to Output, so to keep a common standard I decided to reuse the same base class.

Below is the code for BaseActivation. I have added a helper function, Get, which returns an activation class instance based on its name. It will be used in the core layer later in this post.

    public class BaseActivation : BaseLayer
    {
        public BaseActivation(string name) : base(name)
        {

        }

        public static BaseActivation Get(string name)
        {
            BaseActivation baseActivation = null;
            switch (name)
            {
                case "relu":
                    baseActivation = new ReLU();
                    break;
                case "sigmoid":
                    baseActivation = new Sigmoid();
                    break;
                default:
                    break;
            }

            return baseActivation;
        }
    }

The rectified linear activation function is a piecewise linear function that outputs the input directly if it is positive, and zero otherwise. It has become the default activation function for many types of neural networks because a model that uses it is easier to train and often achieves better performance. It is represented by the following math function: f(x) = max(0, x).

Create a new class under the Activations folder, “ReLU”, inherited from “BaseActivation”, and add the following code:

    public class ReLU : BaseActivation
    {
        public ReLU() : base("relu")
        {

        }

        public override void Forward(NDArray x)
        {
            base.Forward(x);

            // (x > 0) produces a 0/1 mask; multiplying by x zeroes out the negatives
            NDArray matches = x > 0;
            Output = matches * x;
        }
    }
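The same math can be checked outside the project with a minimal, self-contained sketch on plain double arrays (the ReluSketch class and its names are hypothetical, not part of the NeuroSimple code). It applies f(x) = max(0, x), which is exactly what the mask-and-multiply trick above computes:

```csharp
using System;

class ReluSketch
{
    // Elementwise ReLU on a plain array: f(x) = max(0, x).
    // Equivalent to the (x > 0) mask multiplied by x used in the layer above.
    public static double[] Relu(double[] x)
    {
        var y = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            y[i] = Math.Max(0.0, x[i]);
        return y;
    }

    static void Main()
    {
        double[] input = { -2.0, -0.5, 0.0, 1.5, 3.0 };
        // Negatives are zeroed, positives pass through unchanged.
        Console.WriteLine(string.Join(", ", Relu(input)));
    }
}
```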

A sigmoid function is a mathematical function with a characteristic “S”-shaped curve. A variety of sigmoid functions, including the logistic and hyperbolic tangent functions, have been used as the activation function of artificial neurons.

The logistic sigmoid is represented by the following equation: S(x) = 1 / (1 + e^(-x))

Let's implement it like we did for ReLU. Create another class file under Activations called “Sigmoid” and add the following code:

    public class Sigmoid : BaseActivation
    {
        public Sigmoid() : base("sigmoid")
        {

        }

        public override void Forward(NDArray x)
        {
            base.Forward(x);

            // e^x / (1 + e^x) is algebraically the same as 1 / (1 + e^-x)
            Output = Exp(x) / (1 + Exp(x));
        }
    }
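The Forward body uses Exp(x) / (1 + Exp(x)) rather than the textbook form 1 / (1 + e^-x); the two are algebraically identical. A small standalone sketch (hypothetical SigmoidSketch class, plain doubles instead of NDArray) can confirm they agree:

```csharp
using System;

class SigmoidSketch
{
    // The form used in the layer: e^x / (1 + e^x)
    public static double SigmoidA(double x) => Math.Exp(x) / (1.0 + Math.Exp(x));

    // The textbook form: 1 / (1 + e^-x)
    public static double SigmoidB(double x) => 1.0 / (1.0 + Math.Exp(-x));

    static void Main()
    {
        foreach (var x in new[] { -2.0, 0.0, 2.0 })
            Console.WriteLine($"{x}: {SigmoidA(x):F6} vs {SigmoidB(x):F6}");
        // Both forms give the same value; sigmoid(0) = 0.5 and outputs stay in (0, 1).
    }
}
```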

Fully connected layers connect every neuron in one layer to every neuron in another layer. Every neuron does a basic linear operation Y = W * X. Where Y is the Output, X the Input and W the weights of the connected neuron.

As a quick implementation note, the equation for a single neuron looks very similar to a dot product of two vectors. For a layer of neurons it is often convenient, for efficiency, to compute the output as a matrix multiply: y = σ(Wx), where W is a weight matrix in ℝ^(n×m) and the nonlinearity σ is applied componentwise. Let's jump into implementing the layer.
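To make the matrix form concrete, here is a minimal standalone sketch (hypothetical DenseSketch class, plain arrays rather than the project's NDArray) of y = σ(Wx), with one row of weights per neuron and sigmoid as the componentwise nonlinearity. The FullyConnected layer below stores the weights transposed and computes Dot(x, w) with x as a row vector, which yields the same result:

```csharp
using System;

class DenseSketch
{
    // y = sigma(W x): W is n x m (n neurons, m input features),
    // x has m entries, y has n entries.
    public static double[] Forward(double[,] w, double[] x)
    {
        int n = w.GetLength(0), m = w.GetLength(1);
        var y = new double[n];
        for (int i = 0; i < n; i++)
        {
            double sum = 0.0;
            for (int j = 0; j < m; j++)
                sum += w[i, j] * x[j];            // dot product of weight row i with x
            y[i] = 1.0 / (1.0 + Math.Exp(-sum));  // componentwise sigmoid
        }
        return y;
    }

    static void Main()
    {
        var w = new double[,] { { 1.0, 0.0, -1.0 }, { 0.5, 0.5, 0.5 } }; // 2 neurons, 3 inputs
        var x = new double[] { 1.0, 2.0, 3.0 };
        Console.WriteLine(string.Join(", ", Forward(w, x)));
    }
}
```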

Under the Layers folder, create a new class file “FullyConnected” inherited from “BaseLayer”.

New parameters for the FullyConnected layer:

Input Dim: The number of input features, which is usually the column count of the data matrix. So if we have CSV data to train on with 100 rows and 10 columns, the data has 10 features across 100 samples, and we would use 10 as the input dimension.

Output Neurons: The number of neurons for the layer which will perform the same operation defined in the Forward method.

Activation: An instance of the activation (non-linear function) which transforms the output data to fit better.

    /// <summary>
    /// Fully connected layer
    /// </summary>
    public class FullyConnected : BaseLayer
    {
        /// <summary>
        /// Number of incoming input features
        /// </summary>
        public int InputDim { get; set; }

        /// <summary>
        /// Number of neurons for this layer
        /// </summary>
        public int OutNeurons { get; set; }

        /// <summary>
        /// Non-linear activation function for this layer of neurons. All neurons will have the same function
        /// </summary>
        public BaseActivation Activation { get; set; }

        /// <summary>
        /// Constructor with in and out parameters
        /// </summary>
        /// <param name="input_dim">Number of incoming input features</param>
        /// <param name="output_neurons">Number of neurons for this layer</param>
        /// <param name="act">Name of the activation function, e.g. "relu" or "sigmoid"</param>
        public FullyConnected(int input_dim, int output_neurons, string act) : base("fc")
        {
            Parameters["w"] = RandomVariable(input_dim, output_neurons);
            InputDim = input_dim;
            OutNeurons = output_neurons;

            Activation = BaseActivation.Get(act);
        }

        /// <summary>
        /// Forward the input data by performing the calculation across all the neurons and store it in Output, to be accessible by the next layer.
        /// </summary>
        /// <param name="x">Input for the layer</param>
        public override void Forward(NDArray x)
        {
            base.Forward(x);
            Output = Dot(x, Parameters["w"]);

            if (Activation != null)
            {
                Activation.Forward(Output);
                Output = Activation.Output;
            }
        }
    }

Here is the final project structure:

Let's do some basic testing to check that the network performs a calculation on a given input. We will define two layers, one with 6 neurons and another with 1 neuron. Say the input has 3 features, so we will send one set of signals with 3 different values, and the final output should be a single value. Below is the visual representation:

Let's write the code in the Test project created last time. Implement the following code.

    //Load array to the tensor
    NDArray x = new NDArray(1, 3);
    x.Load(1, 2, 3);
    x.Print("Load array");

    //Create two layers, one with 6 neurons and another with 1
    FullyConnected fc1 = new FullyConnected(3, 6, "relu");
    FullyConnected fc2 = new FullyConnected(6, 1, "sigmoid");

    //Connect input by passing data from one layer to another
    fc1.Forward(x);
    x = fc1.Output;
    x.Print("FC1 Output");

    fc2.Forward(x);
    x = fc2.Output;
    x.Print("FC2 Output");

    Console.ReadLine();

After running, here is the result you will see. Because the weights are random, the values will not be the same every time.

Perfect, now we have more of the groundwork ready, with layers and activation functions in place. In the next post, we will implement loss/cost and metric functions, which will help us understand the performance of the neural network. Stay tuned 🙂

Source Code: https://github.com/deepakkumar1984/tech-quantum/tree/master/Learn-NN-CSharp/Post2

Originally published at https://www.tech-quantum.com on March 23, 2019.