Activation function

In a neural network, an activation function transforms a node's input into an output, which is then passed forward into the subsequent layer.

Why use an activation function?

It determines the output of a neural network node, for example yes or no, by mapping the resulting values into a bounded range such as 0 to 1 or -1 to 1. This non-linear mapping is what lets the network learn patterns that a purely linear model cannot.

1. Linear Activation Function

  • Backpropagation gains little from it: the derivative is a constant, so the gradient carries no information about the input
  • Stacking layers does not help either: any number of linear layers collapses into a single linear layer, so it is not useful in real-world scenarios (see the sketch below)
  • Equation : f(x) = x
  • Range : (-infinity to infinity)
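A minimal NumPy sketch (not from the original notes) of why this matters: with a purely linear activation, two stacked layers collapse into a single linear map. The matrix shapes and the `linear` helper are illustrative.

```python
import numpy as np

def linear(x):
    return x  # f(x) = x, range (-inf, inf)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # a small batch of 4 inputs with 3 features
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 2))

two_layers = linear(linear(x @ W1) @ W2)   # "deep" network with linear activation
one_layer = x @ (W1 @ W2)                  # equivalent single linear layer

print(np.allclose(two_layers, one_layer))  # True: the extra layer adds nothing
```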

2. Sigmoid function

  • Reason to use the sigmoid function : its output lies between 0 and 1, so it is especially used for models where we have to predict a probability as the output. Since the probability of anything exists only in the range 0 to 1, sigmoid is the right choice.
  • The function is differentiable. That means we can find the slope of the sigmoid curve at any point.
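As a rough illustration, here is a small NumPy sketch of the sigmoid and its slope, assuming the standard definition 1 / (1 + e^(-x)); the function names are just for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # slope of the curve, used during backpropagation

x = np.array([-5.0, 0.0, 5.0])
print(sigmoid(x))       # values squashed into (0, 1)
print(sigmoid_grad(x))  # largest slope at 0, tiny slopes in the tails
```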

3. Tanh or hyperbolic tangent Activation Function

  • tanh is similar to the logistic sigmoid but often works better. The range of the tanh function is (-1 to 1), and it is also sigmoidal (S-shaped).
  • The function is differentiable. That means we can find the slope of the tanh curve at any point.
  • The advantage is that negative inputs are mapped strongly negative and zero inputs are mapped near zero in the tanh graph, so the output is roughly zero-centred.
  • Both tanh and the logistic sigmoid activation functions are used in feed-forward nets.
  • Range : (-1 to 1)
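A short NumPy sketch comparing tanh with the sigmoid on the same inputs, using the built-in `np.tanh`; the sample values are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(np.tanh(x))   # zero-centred, range (-1, 1): negatives map strongly negative
print(sigmoid(x))   # range (0, 1): not zero-centred

# derivative of tanh, useful for backpropagation: 1 - tanh(x)^2
print(1.0 - np.tanh(x) ** 2)
```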

4. ReLU

  • It stands for Rectified Linear Unit : f(x) = max(0, x)
  • The most popular choice for hidden layers
  • Easy and fast to compute
  • Dying ReLU problem : for inputs at or below 0 the output and the gradient are both exactly 0, so a neuron that ends up in that region stops learning (see the sketch below)
  • Range : [0 to infinity)
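The following NumPy sketch illustrates the dying-ReLU issue described above: for inputs at or below zero both the output and the gradient are zero, so no update flows back. Function names are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)   # 0 for x <= 0, 1 for x > 0

x = np.array([-3.0, -0.1, 0.0, 0.1, 3.0])
print(relu(x))       # [0.  0.  0.  0.1 3. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.] -- no gradient flows for x <= 0
```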

5. Leaky ReLU

  • It solves the dying ReLU problem
  • It does so by introducing a small, non-zero slope for inputs below 0 (for example, f(x) = 0.01x when x < 0)

There is also Parametric ReLU (PReLU), which is the same as Leaky ReLU except that the slope in the negative part is learned by backpropagation; both are sketched below.
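A small NumPy sketch of Leaky ReLU and a PReLU-style variant; the slope value 0.01 is just a common default, and in a real PReLU the `alpha` parameter would be learned during training.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # same formula, but alpha is treated as a trainable parameter
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.1, 0.0, 0.1, 3.0])
print(leaky_relu(x))        # negative inputs keep a small, non-zero output
print(prelu(x, alpha=0.2))  # a learned alpha changes the negative-side slope
```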

6. Other Activation Functions

  • ELU : Exponential Linear Unit
  • Swish : from Google; mainly beneficial in very deep networks (around 40+ layers)
  • Maxout : outputs the maximum of its inputs; ReLU is a special case of maxout. Not a practical choice, since it doubles the number of parameters per neuron.
  • Softmax : used for multi-class classification, typically on the output layer (see the sketch below).
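A numerically stable softmax sketch for a multi-class output layer, with one-line versions of ELU and Swish under their usual definitions; all names and sample values here are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x):
    return x / (1.0 + np.exp(-x))            # x * sigmoid(x)

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))        # probabilities over the classes
print(softmax(logits).sum())  # sums to 1
```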

Choosing an Activation Function (Suggestions)

  • Multi-class classification : Softmax (on the output layer)
  • RNNs : tanh
  • For everything else
    • Start with ReLU
    • Next, try Leaky ReLU
    • As a last resort, try PReLU; Swish can be worthwhile for really deep networks (see the sketch below)
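A rough, hypothetical Keras sketch of these suggestions (assuming TensorFlow/Keras is available; layer sizes and input shapes are made up): ReLU in the hidden layers with a Leaky ReLU fallback, softmax on the multi-class output, and tanh inside an RNN.

```python
import tensorflow as tf

# Feed-forward classifier: start with ReLU, fall back to Leaky ReLU if
# neurons die, and finish with softmax for multi-class output.
classifier = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(64),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Recurrent model: tanh inside the RNN cell, softmax on the output.
rnn = tf.keras.Sequential([
    tf.keras.layers.SimpleRNN(32, activation="tanh", input_shape=(None, 8)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```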

Cheatsheet

Source : Link is here