
Leaky ReLU and maxout
A Leaky ReLU has a small slope α on the negative side, for example 0.01, so negative inputs still produce a small, non-zero gradient. The slope α can also be made a learnable parameter of each neuron, as in PReLU neurons (P stands for parametric). The drawback of these activation functions is that the benefit of such modifications is inconsistent across different problems.
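To make the difference concrete, here is a minimal NumPy sketch (the function names, the fixed `alpha=0.01`, and the per-neuron `alpha` vector are illustrative assumptions, not from the original text). The only difference between the two variants is where the negative-side slope comes from: a fixed constant versus a learned parameter.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Fixed small slope on the negative side
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # alpha is a learnable parameter, one value per neuron,
    # broadcast over the batch dimension
    return np.where(x > 0, x, alpha * x)

# Example: a batch of 2 samples with 3 neurons each
x = np.array([[-1.0,  0.5, -2.0],
              [ 3.0, -0.1,  0.0]])
print(leaky_relu(x))                                   # fixed slope 0.01
print(prelu(x, alpha=np.array([0.1, 0.2, 0.3])))       # per-neuron slopes
```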
Maxout is another attempt to solve the dead neuron problem in ReLU. It takes the form $\max(w_1^T x + b_1, w_2^T x + b_2)$. From this form, we can see that both ReLU and Leaky ReLU are just special cases of it; for instance, ReLU is recovered by setting $w_1 = 0$ and $b_1 = 0$. Although maxout benefits from being piecewise linear and never saturating, it doubles the number of parameters for every single neuron.
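The following NumPy sketch shows a maxout unit with k linear pieces (the helper `maxout` and the shapes used here are assumptions for illustration). Note how the weight tensor holds k copies of the usual weight matrix, which for k = 2 is exactly the parameter doubling mentioned above, and how ReLU falls out as a special case where one piece is the constant zero function.

```python
import numpy as np

def maxout(x, W, b):
    # W has shape (k, d_in, d_out), b has shape (k, d_out):
    # k affine pieces, and the activation is their element-wise maximum.
    z = np.einsum('nd,kdo->nko', x, W) + b   # shape (n, k, d_out)
    return z.max(axis=1)                     # shape (n, d_out)

# ReLU as a special case with k = 2: one piece is w^T x + b,
# the other is the constant zero function (w1 = 0, b1 = 0).
d_in, d_out = 3, 2
rng = np.random.default_rng(0)
W = np.stack([np.zeros((d_in, d_out)), rng.normal(size=(d_in, d_out))])
b = np.stack([np.zeros(d_out), rng.normal(size=d_out)])
x = rng.normal(size=(4, d_in))
out = maxout(x, W, b)   # equivalent to np.maximum(0, x @ W[1] + b[1])
```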