A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm
Martin Riedmiller, Heinrich Braun
Institut für Logik, Komplexität und Deduktionssysteme
University of Karlsruhe
W-7500 Karlsruhe, FRG
riedml@ira.uka.de
Abstract: A new learning algorithm for multilayer feedforward networks, RPROP, is proposed. To overcome the inherent disadvantages of pure gradient descent, RPROP performs a local adaptation of the weight updates according to the behaviour of the error function. In contrast to other adaptive techniques, the effect of the RPROP adaptation process is not blurred by the unforeseeable influence of the size of the derivative, but depends only on the temporal behaviour of its sign. This leads to an efficient and transparent adaptation process. The promising capabilities of RPROP are shown in comparison to other well-known adaptive techniques.
1. INTRODUCTION
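A. Backpropagation Learning
Backpropagation is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. The basic idea of the backpropagation learning algorithm [1] is the repeated application of the chain rule to compute the influence of each weight in the network with respect to an arbitrary error function E:
$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial s_i}\,\frac{\partial s_i}{\partial net_i}\,\frac{\partial net_i}{\partial w_{ij}}$$
where $w_{ij}$ is the weight from neuron $j$ to neuron $i$, $s_i$ is the output of neuron $i$, and $net_i$ is the weighted sum of its inputs. Once the partial derivative for each weight is known, the error function is minimized by simple gradient descent:
$$w_{ij}(t+1) = w_{ij}(t) - \epsilon\,\frac{\partial E}{\partial w_{ij}}(t)$$
Obviously, the choice of the learning rate $\epsilon$, which scales the derivative, has an important effect on the time needed until convergence is reached. If it is set too small, too many steps are needed to reach an acceptable solution; conversely, a large learning rate may lead to oscillation, preventing the error from falling below a certain value. An early approach proposed to address this problem is to introduce a momentum term:
$$\Delta w_{ij}(t) = -\epsilon\,\frac{\partial E}{\partial w_{ij}}(t) + \mu\,\Delta w_{ij}(t-1)$$
where the momentum parameter $\mu$ scales the influence of the previous step on the current one. The momentum term is believed to render the learning procedure more stable and to accelerate convergence in shallow regions of the error function. However, as...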
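As a rough illustration of the learning-rate and momentum effects discussed in this section, the following Python sketch runs gradient descent on a toy one-dimensional error function E(w) = w²/2. The error function, the function name `gradient_descent`, and the chosen parameter values are assumptions made for this example only; none of them come from the paper.

```python
def gradient_descent(grad, w0, epsilon, mu=0.0, steps=50):
    """Gradient descent with an optional momentum term.

    Each step applies
        delta_w(t) = -epsilon * dE/dw(t) + mu * delta_w(t - 1)
    and returns the weight trajectory [w(0), ..., w(steps)].
    """
    w, delta_w = w0, 0.0
    trajectory = [w]
    for _ in range(steps):
        delta_w = -epsilon * grad(w) + mu * delta_w
        w += delta_w
        trajectory.append(w)
    return trajectory


def grad(w):
    """Derivative of the toy error function E(w) = 0.5 * w**2 (an assumption)."""
    return w


# Learning rate too small: the weight creeps towards the minimum at w = 0.
slow = gradient_descent(grad, w0=1.0, epsilon=0.01)

# Learning rate too large: the weight overshoots and flips sign on every step.
oscillating = gradient_descent(grad, w0=1.0, epsilon=2.1)

# Same small learning rate, with a momentum term added.
with_momentum = gradient_descent(grad, w0=1.0, epsilon=0.01, mu=0.9)

for name, traj in [("small epsilon", slow),
                   ("large epsilon", oscillating),
                   ("small epsilon + momentum", with_momentum)]:
    print(f"{name:26s} |w| after {len(traj) - 1} steps: {abs(traj[-1]):.4g}")
```

On this toy problem the small learning rate leaves the weight far from the minimum after 50 steps, the large one makes it oscillate with growing amplitude, and the momentum term brings the weight considerably closer to the minimum with the same small learning rate. A quadratic in one dimension is of course far simpler than the error surface of a real network, so this shows the qualitative behaviour only.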