The randomized weighted majority algorithm (RWMA) is an algorithm in machine learning theory. It improves the mistake bound of the deterministic weighted majority algorithm.
Imagine that every morning before the stock market opens, we get a prediction from each of our "experts" about whether the stock market will go up or down. Our goal is to somehow combine this set of predictions into a single prediction that we then use to make a buy or sell decision for the day. The RWMA gives us a way to do this combination such that our prediction record will be nearly as good as that of the single best expert in hindsight.
Motivation
In machine learning, the weighted majority algorithm (WMA) is a meta-learning algorithm which "predicts from expert advice". It is not a randomized algorithm:
    initialize all experts to weight 1.
    for each round:
        poll all the experts and predict based on a weighted majority vote of their predictions.
        cut in half the weights of all experts that make a mistake.

Suppose there are $n$ experts and the best expert makes $m$ mistakes. Then the weighted majority algorithm (WMA) makes at most $2.4(\log_2 n + m)$ mistakes, which is not a very good bound; we can do better by introducing randomization.
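As a concrete illustration, here is a minimal Python sketch of the deterministic WMA described above. The interface (`expert_preds` as a per-round list of 0/1 predictions and `outcomes` as the true labels) is an assumption made for this example, not part of the original description:

```python
def weighted_majority(expert_preds, outcomes):
    """Deterministic weighted majority: vote by weight, halve wrong experts.

    expert_preds[t][i] is expert i's 0/1 prediction in round t (assumed
    interface for illustration); outcomes[t] is the true 0/1 label.
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):
        # Predict 1 iff at least half of the total weight votes for 1.
        weight_on_one = sum(w for w, p in zip(weights, preds) if p == 1)
        guess = 1 if weight_on_one >= sum(weights) / 2 else 0
        mistakes += (guess != y)
        # Cut in half the weights of all experts that made a mistake.
        weights = [w / 2 if p != y else w for w, p in zip(weights, preds)]
    return mistakes
```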
Randomized weighted majority algorithm (RWMA)
The nonrandomized weighted majority algorithm (WMA) only guarantees an upper bound of $2.4(\log_2 n + m)$, which is problematic for highly error-prone experts (e.g., when even the best expert is mistaken 20% of the time). Suppose we do $N = 100$ rounds using $n = 10$ experts. If the best expert makes $m = 20$ mistakes, we can only guarantee an upper bound of $2.4(\log_2 10 + 20) \approx 56$ on our number of mistakes.
As this is a known limitation of WMA, attempts to improve this shortcoming have been explored in order to improve the dependence on $m$. Instead of predicting by majority vote, the weights are used as probabilities, hence the name randomized weighted majority: if $w_i$ is the weight of expert $i$ and $W = \sum_i w_i$, we follow expert $i$ with probability $w_i / W$ in each round. This helps in the worst case: the deterministic algorithm's worst case occurred when the weights were split 50/50 between right and wrong answers, whereas now there is still a 50/50 chance of predicting correctly. In addition, to trade off the dependence on $m$ against the dependence on $\log_2 n$, we multiply the weights of mistaken experts by a parameter $\beta < 1$ instead of strictly halving them.
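A minimal Python sketch of the randomized version under the same assumed interface as above, with the generalized penalty $\beta$ (the function name and default are illustrative):

```python
import random

def randomized_weighted_majority(expert_preds, outcomes, beta=0.5):
    """Follow expert i with probability w_i / W each round, then multiply
    the weights of all mistaken experts by beta (0 <= beta < 1)."""
    n = len(expert_preds[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):
        # Sample a single expert with probability proportional to its weight.
        i = random.choices(range(n), weights=weights)[0]
        mistakes += (preds[i] != y)
        # Penalize every expert that predicted wrongly this round.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return mistakes
```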
Analysis
At the $t$-th round, define $F_t$ to be the fraction of weight on the wrong answers; thus $F_t$ is the probability that we make a mistake on the $t$-th round. Let $M$ denote the total number of mistakes we have made so far. By linearity of expectation, $E[M] = \sum_t F_t$. On the $t$-th round, the total weight $W$ becomes $W(1 - (1-\beta)F_t)$, since an $F_t$ fraction of the weight is multiplied by $\beta$ while the rest is unchanged. The initial total weight is $n$, so

$W_{\text{final}} = n \prod_t \big(1 - (1-\beta)F_t\big).$

Let's say that $m$ is the number of mistakes of the best expert so far. Since that expert alone contributes weight $\beta^m$, we can use the inequality $\beta^m \le W_{\text{final}}$. Taking the natural log of both sides:

$m \ln \beta \le \ln n + \sum_t \ln\big(1 - (1-\beta)F_t\big).$

Now, use the inequality $\ln(1-x) \le -x$ to simplify the right-hand side:

$m \ln \beta \le \ln n - (1-\beta)\sum_t F_t.$

Rearranging, and writing $-\ln \beta = \ln(1/\beta)$, gives

$E[M] = \sum_t F_t \le \frac{m \ln(1/\beta) + \ln n}{1 - \beta}.$

Let's see if we made any progress:

If $\beta = 1/2$, we get $E[M] \le 1.39\,m + 2\ln n$;

if $\beta = 3/4$, we get $E[M] \le 1.15\,m + 4\ln n$;

so we can see we made progress. Roughly, the bound is of the form $(1+\varepsilon)\,m + \varepsilon^{-1}\ln n$.
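The trade-off is easy to check numerically. The sketch below just evaluates the bound $(m \ln(1/\beta) + \ln n)/(1-\beta)$ from the analysis for the earlier example values ($n = 10$ experts, best expert making $m = 20$ mistakes); the function name is made up for illustration:

```python
from math import log

def expected_mistake_bound(n, m, beta):
    """The bound (m*ln(1/beta) + ln n) / (1 - beta) from the analysis."""
    return (m * log(1 / beta) + log(n)) / (1 - beta)

for beta in (0.5, 0.75, 0.9):
    print(f"beta={beta}: E[M] <= {expected_mistake_bound(10, 20, beta):.1f}")
```

For $\beta = 1/2$ this gives roughly 32 expected mistakes, versus the deterministic guarantee of about 56 computed above.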
Uses of the randomized weighted majority algorithm (RWMA)
The RWMA can be used to combine multiple algorithms, performing nearly as well as the best of them in hindsight.
We can also apply the randomized weighted majority algorithm in situations where the experts are making choices that cannot be combined (or cannot be combined easily), for instance repeated game playing or the online shortest path problem. In the online shortest path problem, each expert is telling you a different way to drive to work. You pick one route using the randomized weighted majority algorithm; later you find out how well you would have done, and penalize appropriately. To do this right, we want to generalize from a "loss" of 0 or 1 to losses in [0,1], with the goal that the expected loss be not much worse than the loss of the best expert. We generalize the update by multiplying the weight of an expert that suffers loss $\ell \in [0,1]$ by $\beta^{\ell}$; essentially the same analysis as before then bounds the expected total loss.
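A minimal sketch of this generalization, assuming losses arrive as a per-round list `loss_matrix[t][i]` of values in [0,1] (an illustrative interface, not part of the original text):

```python
import random

def rwma_with_losses(loss_matrix, beta=0.5):
    """Multiplicative-weights update for losses in [0, 1]: an expert
    suffering loss l has its weight multiplied by beta**l."""
    n = len(loss_matrix[0])
    weights = [1.0] * n
    total_loss = 0.0
    for losses in loss_matrix:
        # Follow a single expert chosen with probability w_i / W.
        i = random.choices(range(n), weights=weights)[0]
        total_loss += losses[i]
        # Penalize each expert in proportion to its loss this round.
        weights = [w * beta ** l for w, l in zip(weights, losses)]
    return total_loss
```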
Extensions
- "Bandit" problem
- Efficient algorithm for some cases with many experts.
- Sleeping experts/"specialists" setting.