Class ReLU

java.lang.Object
    smile.deep.activation.ReLU
All Implemented Interfaces:
Serializable, ActivationFunction

public class ReLU extends Object implements ActivationFunction
The rectifier activation function max(0, x). It was introduced with strong biological motivations and mathematical justifications. The rectifier is the most popular activation function for deep neural networks. A unit employing the rectifier is called a rectified linear unit (ReLU).
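
As a quick usage sketch using only the constructor and f(double[] x) documented below, the snippet assumes (as the void signature and in-place gradient contract suggest) that f overwrites the input array with max(0, x) elementwise:

    import smile.deep.activation.ReLU;

    public class ReLUExample {
        public static void main(String[] args) {
            ReLU relu = new ReLU();
            double[] x = {-2.0, -0.5, 0.0, 1.5, 3.0};
            relu.f(x); // assumed in-place: x becomes {0.0, 0.0, 0.0, 1.5, 3.0}
            for (double v : x) System.out.println(v);
        }
    }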

ReLU neurons can sometimes be pushed into states in which they become inactive for essentially all inputs. In this state, no gradients flow backward through the neuron, so it becomes stuck in a perpetually inactive state and "dies". This is a form of the vanishing gradient problem. In some cases, large numbers of neurons in a network can become stuck in dead states, effectively decreasing the model capacity. This problem typically arises when the learning rate is set too high. It may be mitigated by using leaky ReLUs instead, which assign a small positive slope for x < 0; however, this may reduce performance. A sketch of the leaky variant follows.
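
For comparison, a minimal sketch of the leaky variant mentioned above; the slope 0.01 is a common but arbitrary choice, and this standalone method is illustrative rather than Smile's own implementation:

    public class LeakyReLUSketch {
        // A small positive slope alpha for x < 0 keeps a nonzero gradient,
        // mitigating the "dying ReLU" problem described above.
        static double leakyReLU(double x, double alpha) {
            return x > 0.0 ? x : alpha * x;
        }

        public static void main(String[] args) {
            System.out.println(leakyReLU(-2.0, 0.01)); // -0.02
            System.out.println(leakyReLU(3.0, 0.01));  // 3.0
        }
    }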

  • Constructor Summary

    Constructor   Description
    ReLU()        Constructor.
  • Method Summary

    Modifier and Type   Method                      Description
    void                f(double[] x)               The output function.
    void                g(double[] g, double[] y)   The gradient function.
    String              name()                      Returns the name of the activation function.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • ReLU

      public ReLU()
      Constructor.
  • Method Details

    • name

      public String name()
      Description copied from interface: ActivationFunction
      Returns the name of the activation function.
      Specified by:
      name in interface ActivationFunction
      Returns:
      the name of the activation function.
    • f

      public void f(double[] x)
      Description copied from interface: ActivationFunction
      The output function.
      Specified by:
      f in interface ActivationFunction
      Parameters:
      x - the input vector.
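
      Given the rectifier definition above, a plausible in-place sketch of this method (an assumption about the internals, not Smile's actual source):

          public void f(double[] x) {
              // Rectify in place: negative entries become 0, others unchanged.
              for (int i = 0; i < x.length; i++) {
                  x[i] = Math.max(0.0, x[i]);
              }
          }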
    • g

      public void g(double[] g, double[] y)
      Description copied from interface: ActivationFunction
      The gradient function.
      Specified by:
      g in interface ActivationFunction
      Parameters:
      g - the gradient vector. On input, it holds W'*g, where W and g are the weight matrix and gradient of upper layer, respectively. On output, it is the gradient of this layer.
      y - the output vector.
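
      Since the derivative of max(0, x) is 1 for x > 0 and 0 otherwise, and the output y = max(0, x) is positive exactly when x is, a plausible sketch of this method (an illustrative assumption, not Smile's actual source) zeroes the incoming gradient wherever the unit was inactive:

          public void g(double[] g, double[] y) {
              // Elementwise product with the ReLU derivative: pass the
              // upstream gradient W'*g through only where the output y > 0.
              for (int i = 0; i < g.length; i++) {
                  if (y[i] <= 0.0) g[i] = 0.0;
              }
          }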