opencv - Parameter selection in AdaBoost


After using OpenCV's boosting, I'm trying to implement my own version of the AdaBoost algorithm (check here, here, and the original paper for some references).

While reading the material I've come up with some questions regarding the implementation of the algorithm.

1) It is not clear to me how the weights a_t of each weak learner are assigned.

In all the sources I've pointed out, the choice is a_t = k * ln( (1-e_t) / e_t ), with k being a positive constant and e_t the error rate of the particular weak learner.

At page 7 of the source it says that this particular value minimizes a certain convex differentiable function, but I really don't understand the passage.

  • Can anyone please explain it to me?

2) I have some doubts about the procedure for updating the weights of the training samples.

Clearly it should be done in such a way as to guarantee that they remain a probability distribution. All the references adopt this choice:

D_{t+1}(i) = D_t(i) * e^(-a_t * y_i * h_t(x_i)) / Z_t (where Z_t is a normalization factor chosen so that D_{t+1} is a distribution).
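A minimal sketch of one such reweighting step, assuming labels y_i and weak learner outputs h_t(x_i) both take values in {-1, +1}. The function name `update_weights` and the toy data are made up for illustration and are not taken from OpenCV or the references.

```python
import math

def update_weights(d, y, h, alpha):
    """One AdaBoost-style reweighting step (illustrative helper).

    d: current sample weights (sum to 1), y: true labels in {-1, +1},
    h: weak learner predictions in {-1, +1}, alpha: learner weight a_t.
    """
    # Multiply each weight by exp(-alpha * y_i * h_i):
    # correctly classified samples shrink, misclassified ones grow.
    unnormalized = [d_i * math.exp(-alpha * y_i * h_i)
                    for d_i, y_i, h_i in zip(d, y, h)]
    z = sum(unnormalized)  # normalization factor Z_t
    return [u / z for u in unnormalized], z

d = [0.25, 0.25, 0.25, 0.25]   # toy uniform starting distribution
y = [1, 1, -1, -1]
h = [1, 1, -1, 1]              # the last sample is misclassified
new_d, z = update_weights(d, y, h, alpha=0.5)
```

Dividing by Z_t is what keeps the weights a probability distribution; the exponential factor only decides how the mass is shifted toward the misclassified samples.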

  • But why is this particular choice of weight update multiplicative, with an exponential of the error made by the particular weak learner?
  • Are there other updates possible? And if yes, is there a proof that this update guarantees some kind of optimality of the learning process?

I hope this is the right place to post this question; if not, please redirect me!
Thanks in advance for any help you can provide.

1) First question:

a_t = k * ln( (1-e_t) / e_t ) 

Since the error on the training data is bounded by the product of the Z_t(alpha), and each Z_t(alpha) is convex w.r.t. alpha, there is only one "global" optimal alpha that minimizes the upper bound of the error. This is the intuition behind how you find the magic "alpha".
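As a rough numerical check of that argument, assume the weak learner outputs values in {-1, +1}. Then the normalizer can be written as Z_t(alpha) = (1 - e_t) * exp(-alpha) + e_t * exp(alpha), and setting its derivative to zero gives alpha = 1/2 * ln((1 - e_t) / e_t), i.e. the k = 1/2 case. The function names and the error value 0.2 below are illustrative assumptions.

```python
import math

def z_of_alpha(alpha, e):
    # Z_t(alpha) for +/-1 outputs: the correctly classified mass (1 - e)
    # is scaled by exp(-alpha), the misclassified mass e by exp(alpha).
    return (1.0 - e) * math.exp(-alpha) + e * math.exp(alpha)

def closed_form_alpha(e):
    # Stationary point of Z_t, obtained from dZ/dalpha = 0 (k = 1/2).
    return 0.5 * math.log((1.0 - e) / e)

e = 0.2  # assumed weighted error of the weak learner
# Brute-force search over a grid of candidate alphas in [-3, 3].
grid = [i / 1000.0 for i in range(-3000, 3001)]
alpha_grid = min(grid, key=lambda a: z_of_alpha(a, e))
alpha_star = closed_form_alpha(e)
z_min = z_of_alpha(alpha_star, e)
```

Under these assumptions the grid minimum agrees with the closed form up to the grid spacing, and the minimal value equals 2 * sqrt(e_t * (1 - e_t)), which is below 1 whenever e_t differs from 1/2.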

2) 2nd question: Why is this particular choice of weight update multiplicative, with an exponential of the error made by the particular weak learner?

To cut it short: the intuitive purpose of finding the above alpha is indeed to improve the accuracy. This is not surprising: you are trusting more (by giving a larger alpha weight) the learners that work better than the others, and trusting less (by giving a smaller alpha) those that work worse. For learners bringing no new knowledge compared to the previous learners, you assign a weight alpha equal to 0.
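To make the "trust" reading concrete, here is a small sketch that evaluates the learner weight for a few error rates. It assumes k = 1/2 as in the standard formulation; `learner_weight` is a hypothetical helper name.

```python
import math

def learner_weight(e, k=0.5):
    # a_t = k * ln((1 - e_t) / e_t); k = 1/2 is assumed here.
    return k * math.log((1.0 - e) / e)

# Lower error -> larger weight; error 0.5 (coin flip) -> weight 0;
# error above 0.5 -> negative weight (the learner's vote is inverted).
for e in (0.1, 0.3, 0.5, 0.7):
    print(f"error={e:.1f} -> alpha={learner_weight(e):+.3f}")
```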

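It is possible to prove (see) that the final boosted hypothesis yields a training error bounded by

exp(-2 \sum_t (1/2 - epsilon_t)^2 )

where epsilon_t is the error rate of the t-th weak learner and the sum runs over all boosting rounds.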
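The bound can be checked empirically on a toy problem. The sketch below is an assumption-laden illustration, not OpenCV's implementation: it uses 1-D threshold stumps as weak learners, k = 1/2 for alpha, made-up data, and hypothetical helper names (`best_stump`, `adaboost`, `predict`).

```python
import math

def best_stump(xs, ys, d):
    """Return (weighted_error, threshold, sign) of the best 1-D stump."""
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            preds = [sign if x >= thr else -sign for x in xs]
            err = sum(w for w, p, y in zip(d, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    d = [1.0 / n] * n                      # start from the uniform distribution
    ensemble, errors = [], []
    for _ in range(rounds):
        err, thr, sign = best_stump(xs, ys, d)
        if err <= 0.0 or err >= 0.5:       # perfect or useless stump: stop
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        preds = [sign if x >= thr else -sign for x in xs]
        # Multiplicative exponential update followed by normalization.
        d = [w * math.exp(-alpha * y * p) for w, y, p in zip(d, ys, preds)]
        z = sum(d)
        d = [w / z for w in d]
        ensemble.append((alpha, thr, sign))
        errors.append(err)
    return ensemble, errors

def predict(ensemble, x):
    # Sign of the alpha-weighted vote of all stumps.
    score = sum(a * (s if x >= t else -s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6, 7, 8]              # toy data, not separable by one stump
ys = [1, 1, -1, -1, 1, 1, -1, -1]
ensemble, errors = adaboost(xs, ys, rounds=10)
train_error = sum(predict(ensemble, x) != y for x, y in zip(xs, ys)) / len(xs)
bound = math.exp(-2.0 * sum((0.5 - e) ** 2 for e in errors))
```

Comparing `train_error` with `bound` after the run shows the training error staying at or below the exponential bound, which shrinks with every round whose weighted error is away from 1/2.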

3) 3rd question: Are there other updates possible? And if yes, is there a proof that this update guarantees some kind of optimality of the learning process?

This is hard to say. Just remember that here the update improves the accuracy on the "training data" (at the risk of over-fitting), so it is hard to say anything about how well it generalizes.

