Python Neural Network Backpropagation
I'm learning about neural networks, specifically looking at MLPs with a back-propagation implementation. I'm trying to implement my own network in Python, and I thought I'd look at some other libraries before I started. After some searching I found Neil Schemenauer's Python implementation bpnn.py (http://arctrix.com/nas/python/bpnn.py).

Having worked through the code and read the first part of Christopher M. Bishop's book, 'Neural Networks for Pattern Recognition', I found an issue in the backPropagate function:
    # calculate error terms for output
    output_deltas = [0.0] * self.no
    for k in range(self.no):
        error = targets[k] - self.ao[k]
        output_deltas[k] = dsigmoid(self.ao[k]) * error
The line of code that calculates the error is different from the one in Bishop's book. On page 145, equation 4.41 defines the output unit error as:
d_k = y_k - t_k
where y_k is the output and t_k the target (I'm using _ to represent subscript). So the question is: should this line of code:
    error = targets[k] - self.ao[k]

in fact be:

    error = self.ao[k] - targets[k]
If I'm wrong, please clear up my confusion. Thanks.
It depends on which error measure you use. To give a few examples of error measures (for brevity, I'll use ys to mean the vector of n outputs and ts to mean the vector of n targets):
mean squared error (MSE):
    sum((y - t) ** 2 for (y, t) in zip(ys, ts)) / n

mean absolute error (MAE):
    sum(abs(y - t) for (y, t) in zip(ys, ts)) / n

mean logistic error (MLE):
    sum(-log(y) * t - log(1 - y) * (1 - t) for (y, t) in zip(ys, ts)) / n
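To make these concrete, here is a minimal runnable sketch of the three measures; the sample vectors ys and ts below are made up purely for illustration:

    from math import log

    def mse(ys, ts):
        # Mean squared error: average squared difference.
        return sum((y - t) ** 2 for y, t in zip(ys, ts)) / len(ys)

    def mae(ys, ts):
        # Mean absolute error: average absolute difference.
        return sum(abs(y - t) for y, t in zip(ys, ts)) / len(ys)

    def mle(ys, ts):
        # Mean logistic error (cross-entropy); requires 0 < y < 1.
        return sum(-log(y) * t - log(1 - y) * (1 - t)
                   for y, t in zip(ys, ts)) / len(ys)

    ys = [0.9, 0.2, 0.6]   # made-up sigmoid outputs
    ts = [1, 0, 1]         # made-up binary targets
    print(mse(ys, ts), mae(ys, ts), mle(ys, ts))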
Which one you use depends entirely on the context. MSE and MAE can be used when the target outputs can take any values, while MLE gives very good results when your target outputs are either 0 or 1 and y is in the open range (0, 1).
With that said, I haven't seen the errors y - t or t - y used before (I'm not very experienced in machine learning myself). As far as I can see, the source code you provided doesn't square the difference or use the absolute value; are you sure the book doesn't either? The way I see it, y - t or t - y can't be good error measures, and here's why:
    n = 2               # We have two output neurons
    ts = [0, 1]         # Our target outputs
    ys = [0.999, 0.001] # Our sigmoid outputs

    # Notice that the outputs are the exact opposite of what we want them to be.
    # Yet, if we use (y - t) or (t - y) to measure the error for each neuron and
    # sum up the total error of the network, we get 0.
    t_minus_y = (0 - 0.999) + (1 - 0.001)
    y_minus_t = (0.999 - 0) + (0.001 - 1)
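To make the cancellation explicit, here is a short continuation of the snippet above (my addition, reusing the same n, ts, and ys) showing that the signed sums vanish while MSE correctly reports a large error:

    mse = sum((y - t) ** 2 for y, t in zip(ys, ts)) / n
    print(t_minus_y)   # ~0.0: the cancellation hides the error entirely
    print(y_minus_t)   # ~0.0
    print(mse)         # ~0.998: MSE correctly flags the wrong outputs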
Edit: Per alfa's comment, in the book y - t is the derivative of MSE. In that case, t - y is incorrect. Note, however, that the actual derivative of MSE is 2 * (y - t) / n, not y - t.
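For reference, here is the calculation behind that claim (a standard derivation I've added, not part of the original answer). Since only the i = k term of the sum depends on y_k, the chain rule gives:

    E_{\mathrm{MSE}} = \frac{1}{n} \sum_{i=1}^{n} (y_i - t_i)^2
    \qquad\Longrightarrow\qquad
    \frac{\partial E_{\mathrm{MSE}}}{\partial y_k} = \frac{2 (y_k - t_k)}{n}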
If you don't divide by n (so you have the summed squared error (SSE), not the mean squared error), then the derivative is 2 * (y - t). Furthermore, if you use SSE / 2 as your error measure, then the 1 / 2 and the 2 in the derivative cancel out, and you are left with y - t.
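As a quick sanity check (my addition, with made-up outputs and targets), a finite-difference approximation confirms that the gradient of SSE / 2 with respect to each output is indeed y - t:

    def half_sse(ys, ts):
        # Summed squared error divided by two.
        return sum((y - t) ** 2 for y, t in zip(ys, ts)) / 2

    ys = [0.9, 0.2, 0.6]   # made-up outputs
    ts = [1, 0, 1]         # made-up targets
    eps = 1e-6

    for k in range(len(ys)):
        bumped = list(ys)
        bumped[k] += eps    # nudge one output and observe the change in error
        numeric = (half_sse(bumped, ts) - half_sse(ys, ts)) / eps
        analytic = ys[k] - ts[k]
        print(k, round(numeric, 4), round(analytic, 4))   # the columns agree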