Package smile.deep.activation
Class ReLU
java.lang.Object
smile.deep.activation.ReLU
All Implemented Interfaces:
Serializable, ActivationFunction
The rectifier activation function max(0, x). It was introduced with strong
biological motivations and mathematical justifications. The rectifier is the
most popular activation function for deep neural networks. A unit employing
the rectifier is called a rectified linear unit (ReLU).
ReLU neurons can sometimes be pushed into states in which they become
inactive for essentially all inputs. In this state, no gradients flow
backward through the neuron, so it becomes stuck in a perpetually
inactive state and "dies". This is a form of the vanishing gradient
problem. In some cases, large numbers of neurons in a network can become
stuck in dead states, effectively decreasing the model capacity. This
problem typically arises when the learning rate is set too high. It may
be mitigated by using leaky ReLUs instead, which assign a small positive
slope for x < 0; however, this comes at the cost of reduced performance.
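The following sketch only illustrates the formulas above and is not Smile's
implementation; the class name ReluSketch and the slope value 0.01 are
examples.

// Illustration only: the rectifier max(0, x) versus a leaky ReLU
// with a small slope alpha for x < 0.
public class ReluSketch {
    // Rectifier: the gradient is zero for x < 0, which is why units can "die".
    static double relu(double x) {
        return Math.max(0.0, x);
    }

    // Leaky variant: a small positive slope keeps gradients flowing for x < 0.
    static double leakyRelu(double x, double alpha) {
        return x >= 0 ? x : alpha * x;
    }

    public static void main(String[] args) {
        double[] inputs = {-2.0, -0.5, 0.0, 1.5};
        for (double x : inputs) {
            System.out.printf("x=%5.2f  relu=%4.2f  leaky=%6.3f%n",
                    x, relu(x), leakyRelu(x, 0.01));
        }
    }
}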
Constructor Details

ReLU
public ReLU()
Constructor.
Method Details

name
Description copied from interface: ActivationFunction
Returns the name of the activation function.
Specified by:
name in interface ActivationFunction
Returns:
the name of the activation function.
f
public void f(double[] x)
Description copied from interface: ActivationFunction
The output function.
Specified by:
f in interface ActivationFunction
Parameters:
x - the input vector.
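A brief usage sketch based on the signature above. The in-place update of x
is an assumption inferred from the void return type; the expected values
assume f computes max(0, x) element-wise.

import smile.deep.activation.ReLU;

public class ReluOutputExample {
    public static void main(String[] args) {
        ReLU relu = new ReLU();                 // the no-argument constructor above
        double[] x = {-1.0, 0.0, 0.5, 2.0};
        relu.f(x);                              // assumption: applies max(0, x) in place
        // Expected under that assumption: [0.0, 0.0, 0.5, 2.0]
        System.out.println(java.util.Arrays.toString(x));
    }
}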
g
public void g(double[] g, double[] y)
Description copied from interface: ActivationFunction
The gradient function.
Specified by:
g in interface ActivationFunction
Parameters:
g - the gradient vector. On input, it holds W'*g, where W and g are the weight matrix and gradient of the upper layer, respectively. On output, it is the gradient of this layer.
y - the output vector.
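A backpropagation sketch based on the contract described above. The expected
result assumes the usual ReLU derivative: the incoming gradient is passed
through where the output y is positive and zeroed elsewhere; how y == 0 is
treated is an implementation detail.

import smile.deep.activation.ReLU;

public class ReluGradientExample {
    public static void main(String[] args) {
        ReLU relu = new ReLU();
        // Outputs of this layer, i.e. the values produced by f.
        double[] y = {0.0, 0.5, 2.0};
        // On input, grad holds W'*g of the upper layer; it is updated in place.
        double[] grad = {0.3, -0.2, 0.1};
        relu.g(grad, y);
        // Expected under the ReLU-derivative assumption: [0.0, -0.2, 0.1]
        System.out.println(java.util.Arrays.toString(grad));
    }
}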