AdaBoost
The AdaBoost (adaptive boosting) algorithm was proposed in 1995 by Yoav Freund and Robert Schapire as a general method for generating a strong classifier out of a set of weak classifiers. AdaBoost works even when the classifiers come from a continuum of potential classifiers (such as neural networks, linear discriminants, etc.).
Pros: Low generalization error, easy to code, works with most classifiers, no parameters to adjust
Cons: Sensitive to outliers
Works with: Numeric values, nominal values
General approach to AdaBoost
- Collect: Any method.
- Prepare: It depends on which type of weak learner you’re going to use. In this chapter, we’ll use decision stumps, which can take any type of data. You could use any classifier, so any of the classifiers from chapters 2–6 would work. Simple classifiers work better for a weak learner.
- Analyze: Any method.
- Train: The majority of the time will be spent here. The classifier will train the weak learner multiple times over the same dataset.
- Test: Calculate the error rate.
- Use: Like support vector machines, AdaBoost predicts one of two classes.
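Before building this from scratch, it may help to see the same end-to-end workflow in library form. The sketch below is for orientation only and assumes scikit-learn is available; its AdaBoostClassifier uses a depth-1 decision tree (a stump) as the default weak learner.

from sklearn.ensemble import AdaBoostClassifier

X = [[1., 2.1], [2., 1.1], [1.3, 1.], [1., 1.], [2., 1.]]  # same toy data used below
y = [1, 1, -1, -1, 1]
clf = AdaBoostClassifier(n_estimators=10).fit(X, y)
print(clf.predict([[5., 5.], [0., 0.]]))  # expect [ 1 -1]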
Creating a weak learner with a decision stump
A decision stump is the simplest possible decision tree. It's a tree with only one split, so it's a stump.
from numpy import *

def loadSimpData():
    datMat = matrix([[1. , 2.1],
                     [2. , 1.1],
                     [1.3, 1. ],
                     [1. , 1. ],
                     [2. , 1. ]])
    classLabels = [1.0, 1.0, -1.0, -1.0, 1.0]
    return datMat, classLabels
datMat,classLabels = loadSimpData()
def loadDataSet(fileName):  # general function to parse tab-delimited floats
    numFeat = len(open(fileName).readline().split('\t'))  # get number of fields
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat-1):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))
    return dataMat, labelMat
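To see the file format loadDataSet() expects, you can write a tiny tab-delimited file and read it back: every row holds the feature values followed by the class label (+1.0/-1.0) in the last column. The file name here is purely illustrative.

with open('tiny.txt', 'w') as f:  # illustrative file, not part of the dataset
    f.write('1.0\t2.1\t1.0\n1.3\t1.0\t-1.0\n')
dataMat, labelMat = loadDataSet('tiny.txt')
print(dataMat)   # [[1.0, 2.1], [1.3, 1.0]]
print(labelMat)  # [1.0, -1.0]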
stumpClassify() classifies the data by testing whether values are less than or greater than the threshold: everything on one side of the threshold gets class -1, everything on the other side gets class +1.
def stumpClassify(dataMatrix, dimen, threshVal, threshIneq):  # just classify the data
    retArray = ones((shape(dataMatrix)[0], 1))
    if threshIneq == 'lt':
        retArray[dataMatrix[:, dimen] <= threshVal] = -1.0
    else:
        retArray[dataMatrix[:, dimen] > threshVal] = -1.0
    return retArray
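A quick sanity check on the toy data: a stump on dimension 0 with threshold 1.5 and the 'lt' rule labels every point whose first feature is at most 1.5 as -1. Note that it gets the first point wrong (its true label is 1.0); this is exactly the kind of mistake boosting will correct.

stumpClassify(datMat, 0, 1.5, 'lt')
# array([[-1.],
#        [ 1.],
#        [-1.],
#        [-1.],
#        [ 1.]])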
buildStump() loops over a weighted version of the dataset and finds the stump that yields the lowest weighted error:
def buildStump(dataArr, classLabels, D):
    dataMatrix = mat(dataArr); labelMat = mat(classLabels).T
    m, n = shape(dataMatrix)
    numSteps = 10.0; bestStump = {}; bestClasEst = mat(zeros((m, 1)))
    minError = inf  # init min error to +infinity
    for i in range(n):  # loop over all dimensions
        rangeMin = dataMatrix[:, i].min(); rangeMax = dataMatrix[:, i].max()
        stepSize = (rangeMax - rangeMin) / numSteps
        for j in range(-1, int(numSteps) + 1):  # loop over all thresholds in the current dimension
            for inequal in ['lt', 'gt']:  # go over less than and greater than
                threshVal = rangeMin + float(j) * stepSize
                predictedVals = stumpClassify(dataMatrix, i, threshVal, inequal)
                errArr = mat(ones((m, 1)))
                errArr[predictedVals == labelMat] = 0
                # the one line where AdaBoost interacts with the classifier:
                weightedError = D.T * errArr  # total error weighted by D
                #print("split: dim %d, thresh %.2f, thresh inequal: %s, the weighted error is %.3f" % (i, threshVal, inequal, weightedError))
                if weightedError < minError:
                    minError = weightedError
                    bestClasEst = predictedVals.copy()
                    bestStump['dim'] = i
                    bestStump['thresh'] = threshVal
                    bestStump['ineq'] = inequal
    return bestStump, minError, bestClasEst
D = mat(ones((5,1))/5)
buildStump(datMat,classLabels,D)
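With uniform weights (0.2 per point), the best stump splits dimension 0 at threshold 1.3 with the 'lt' rule. It misclassifies only the first point, so its weighted error is 0.2:

# ({'dim': 0, 'thresh': 1.3, 'ineq': 'lt'},
#  matrix([[0.2]]),
#  array([[-1.], [ 1.], [-1.], [-1.], [ 1.]]))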
Implementing the full AdaBoost algorithm
For each iteration:
- Find the best stump using buildStump()
- Add the best stump to the stump array
- Calculate alpha
- Calculate the new weight vector, D
- Update the aggregate class estimate
- If the error rate == 0.0, break out of the for loop
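Two formulas drive the loop. Each stump's vote weight alpha is computed from its weighted error rate ε, and the data weights D are then shifted toward the points the stump got wrong (correctly classified points are scaled by e^(-α), misclassified ones by e^(α)):

$$\alpha = \frac{1}{2}\ln\left(\frac{1-\varepsilon}{\varepsilon}\right), \qquad D_i \leftarrow \frac{D_i\, e^{-\alpha\, y_i\, h(x_i)}}{\sum_j D_j\, e^{-\alpha\, y_j\, h(x_j)}}$$

where y_i is the true label of point i and h(x_i) is the stump's prediction. These are exactly the alpha, expon, and normalization lines in the code below.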
def adaBoostTrainDS(dataArr, classLabels, numIt=40):
    weakClassArr = []
    # m is the number of data points in the dataset
    m = shape(dataArr)[0]
    # D holds the weight of each piece of data
    D = mat(ones((m, 1)) / m)  # init D to all equal
    # aggregate estimate of the class for every data point
    aggClassEst = mat(zeros((m, 1)))
    for i in range(numIt):
        bestStump, error, classEst = buildStump(dataArr, classLabels, D)  # build stump
        print("D:", D.T)
        alpha = float(0.5 * log((1.0 - error) / max(error, 1e-16)))  # calc alpha; max(error, eps) guards against error=0
        bestStump['alpha'] = alpha
        weakClassArr.append(bestStump)  # store stump params in array
        print("classEst: ", classEst.T)
        expon = multiply(-1 * alpha * mat(classLabels).T, classEst)  # exponent for the D update
        D = multiply(D, exp(expon))  # calc new D for next iteration
        D = D / D.sum()
        # calc training error of the aggregate classifier; if it is 0, quit the for loop early
        aggClassEst += alpha * classEst
        print("aggClassEst: ", aggClassEst.T)
        aggErrors = multiply(sign(aggClassEst) != mat(classLabels).T, ones((m, 1)))
        errorRate = aggErrors.sum() / m
        print("total error: ", errorRate)
        if errorRate == 0.0: break
    return weakClassArr, aggClassEst
classifierArray, aggClassEst = adaBoostTrainDS(datMat, classLabels, 9)
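On the toy data, the three training iterations should print something like this (values rounded):

D: [[0.2 0.2 0.2 0.2 0.2]]
classEst:  [[-1.  1. -1. -1.  1.]]
aggClassEst:  [[-0.693  0.693 -0.693 -0.693  0.693]]
total error:  0.2
D: [[0.5   0.125 0.125 0.125 0.125]]
classEst:  [[ 1.  1. -1. -1. -1.]]
aggClassEst:  [[ 0.28   1.666 -1.666 -1.666 -0.28 ]]
total error:  0.2
D: [[0.286 0.071 0.071 0.071 0.5  ]]
classEst:  [[1. 1. 1. 1. 1.]]
aggClassEst:  [[ 1.176  2.562 -0.77  -0.77   0.616]]
total error:  0.0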
Remember, our class labels were [1.0, 1.0, -1.0, -1.0, 1.0]. In the first iteration all the D values were equal, and exactly one point, the first, was misclassified. So in the next iteration the D vector puts 0.5 of the weight on the first data point because it was misclassified previously. You can read off the aggregate class of each point from the sign of aggClassEst. After the second iteration the first data point is correctly classified, but now the last data point is wrong; its D value becomes 0.5, and the other values in the D vector are much smaller. Finally, in the third iteration the sign of every value in aggClassEst matches the class labels, the training error drops to 0, and the loop exits early.
classifierArray
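This displays the three stumps along with their vote weights:

# [{'dim': 0, 'thresh': 1.3, 'ineq': 'lt', 'alpha': 0.693...},
#  {'dim': 1, 'thresh': 1.0, 'ineq': 'lt', 'alpha': 0.973...},
#  {'dim': 0, 'thresh': 0.9, 'ineq': 'lt', 'alpha': 0.896...}]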
Test: Classifying with AdaBoost
All you need to do is take the array of weak classifiers from the training function and apply them to an instance. The result of each weak classifier is weighted by its alpha; the weighted results are summed, and the sign of the final weighted sum is your answer.
def adaClassify(datToClass, classifierArr):
    dataMatrix = mat(datToClass)  # same aggregation as for aggClassEst in adaBoostTrainDS
    m = shape(dataMatrix)[0]
    aggClassEst = mat(zeros((m, 1)))
    for i in range(len(classifierArr)):
        classEst = stumpClassify(dataMatrix, classifierArr[i]['dim'],
                                 classifierArr[i]['thresh'],
                                 classifierArr[i]['ineq'])  # call stump classify
        aggClassEst += classifierArr[i]['alpha'] * classEst
        print(aggClassEst)
    return sign(aggClassEst)
Note that adaBoostTrainDS() returns a tuple (weakClassArr, aggClassEst); unpack it as above so that only the list of weak classifiers is passed to adaClassify().
adaClassify([[5, 5],[0,0]],classifierArray)
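You should see aggClassEst printed once per stump, with the ensemble growing more confident about both points at each step, and the signs of the final sums as the answer:

[[ 0.69314718]
 [-0.69314718]]
[[ 1.66610226]
 [-1.66610226]]
[[ 2.56198199]
 [-2.56198199]]
matrix([[ 1.],
        [-1.]])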
AdaBoost on a difficult dataset
datArr,labelArr = loadDataSet('horseColicTraining2.txt')
classifierArray, aggClassEst = adaBoostTrainDS(datArr, labelArr, 10)
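To estimate generalization error, classify a held-out set the same way. The sketch below assumes a companion tab-delimited test file named horseColicTest2.txt in the same format as the training file:

testArr, testLabelArr = loadDataSet('horseColicTest2.txt')  # assumed companion test file
prediction = adaClassify(testArr, classifierArray)
m = shape(mat(testArr))[0]
errArr = mat(ones((m, 1)))
print("test error: ", errArr[prediction != mat(testLabelArr).T].sum() / m)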
The code as written handles only binary labels (+1/-1). For a dataset with three or more classes, the usual options are a multiclass boosting variant such as AdaBoost.M1 or SAMME, or reducing the problem to several binary problems, for example with the one-vs-rest scheme sketched below.
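As an illustration, here is a minimal one-vs-rest sketch built on the functions above. The helper names (scoreAda, trainOneVsRest, classifyOneVsRest) are hypothetical, not part of the original code:

def scoreAda(datToClass, classifierArr):
    # hypothetical helper: adaClassify() without the final sign(),
    # so real-valued confidences can be compared across classes
    dataMatrix = mat(datToClass)
    m = shape(dataMatrix)[0]
    aggClassEst = mat(zeros((m, 1)))
    for stump in classifierArr:
        classEst = stumpClassify(dataMatrix, stump['dim'],
                                 stump['thresh'], stump['ineq'])
        aggClassEst += stump['alpha'] * classEst
    return aggClassEst

def trainOneVsRest(dataArr, classLabels, numIt=40):
    # train one binary AdaBoost per class: that class is +1, all others are -1
    classifiers = {}
    for c in set(classLabels):
        binLabels = [1.0 if lbl == c else -1.0 for lbl in classLabels]
        classifiers[c], _ = adaBoostTrainDS(dataArr, binLabels, numIt)
    return classifiers

def classifyOneVsRest(datToClass, classifiers):
    # pick, for each instance, the class whose ensemble is most confident
    scores = {c: scoreAda(datToClass, clf) for c, clf in classifiers.items()}
    m = shape(mat(datToClass))[0]
    return [max(scores, key=lambda c: scores[c][i, 0]) for i in range(m)]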