SMO算法笔记及个人理解

大家好，又见面了，我是你们的朋友全栈君。

SMO算法介绍

SMO算法是一种启发式算法，其基本思路是：如果所有变量的解都满足此优化问题的KKT条件，那么这个最优化问题的解就得到了。(KKT条件是该最优化问题的充分必要条件)。否则，选择两个变量，固定其他变量针对这两个变量构建一个二次规划问题。

SMO算法笔记及个人理解

特点：

将原始的二次规划问题分解为只含有两个变量的二次规划子问题，对子问题不断求解，使得所有的变量满足KKT条件

包含两部分：

1、求解两个变量二次规划的解析方法

2、选择变量的启发式方法

（1）第1个变量的选择：确定在当前的分类器中，违反KKT条件的元组Xi；

SMO称第1个变量的选择称为外循环。外循环在训练样本中选取违反KKT条件最严重的样本点，将其作为第一个变量。遍历的时候首先遍历满足 SMO算法笔记及个人理解的样本点，也就是在间隔边界上的支持向量点，检验是否满足KKT条件；如果都满足，那么遍历整个训练集，检验是否满足KKT条件。

（2）第2个变量的选择：根据Xi，找到使得|Ei−Ej|最大的元组Xj；

SMO称第2个变量的选择称为内循环。在找到第一个变量的基础上，第二个变量的标准是希望能使有足够大的变化。由于是依赖于|E1−E2|，为了加快计算的速度，所以选择|E1−E2|最大时的 SMO算法笔记及个人理解。

当E1为正时，那么选择最小的Ei作为E2；如果E1为负，选择最大Ei作为E2。

为了节省时间，通常为每个样本的Ei保存在一个列表中，选择最大的|E1−E2|来近似最大化步长。

SMO算法步骤总结：

1.初始化α，一般情况下令初始的αi全部为0；
2.选取优化变量α1和α2，执行相关的优化计算，得到更新后的α1,α2；
3.开始新的一轮迭代，重复执行上面的第2步，直到全部的αi满足公式(2)的KKT条件以及公式(1)中的约束条件；
SMO算法笔记及个人理解

SMO算法笔记及个人理解

（借鉴其他博主的图解）SVM学习总结（三）SMO算法流程图及注释源码_u010484388的博客-CSDN博客_smo算法代码

SMO算法笔记及个人理解

代码细节

下面的伪代码描述了整个SMO算法:

target = desired output vector
point = training point matrix
procedure takeStep(i1,i2)
   if (i1 == i2) return 0
   alph1 = Lagrange multiplier for i1
   y1 = target[i1]
   E1 = SVM output on point[i1] – y1 (check in error cache)
   s = y1*y2
   Compute L, H via equations (13) and (14)
   if (L == H)
       return 0
   k11 = kernel(point[i1],point[i1])
   k12 = kernel(point[i1],point[i2])
   k22 = kernel(point[i2],point[i2])
   eta = k11+k22-2*k12
   if (eta > 0)
   {

       a2 = alph2 + y2*(E1-E2)/eta
       if (a2 < L) a2 = L
       else if (a2 > H) a2 = H
   }
   else
   {

       Lobj = objective function at a2=L
       Hobj = objective function at a2=H
       if (Lobj < Hobj-eps)
           a2 = L
       else if (Lobj > Hobj+eps)
           a2 = H
       else
           a2 = alph2
   }
   if (|a2-alph2| < eps*(a2+alph2+eps))
       return 0
   a1 = alph1+s*(alph2-a2)
   Update threshold to reflect change in Lagrange multipliers
   Update weight vector to reflect change in a1 & a2, if SVM is linear
   Update error cache using new Lagrange multipliers
   Store a1 in the alpha array
   Store a2 in the alpha array
   return 1
endprocedure

procedure examineExample(i2)
   y2 = target[i2]
   alph2 = Lagrange multiplier for i2
   E2 = SVM output on point[i2] – y2 (check in error cache)
   r2 = E2*y2
   if ((r2 < -tol && alph2 < C) || (r2 > tol && alph2 > 0))
   {

       if (number of non-zero & non-C alpha > 1)
       {

           i1 = result of second choice heuristic (section 2.2)
           if takeStep(i1,i2)
               return 1
       }
       loop over all non-zero and non-C alpha, starting at a random point
       {

           i1 = identity of current alpha
           if takeStep(i1,i2)
               return 1
       }
       loop over all possible i1, starting at a random point
       {

           i1 = loop variable
           if (takeStep(i1,i2)
               return 1
       }
   }
   return 0
endprocedure

main routine:
   numChanged = 0;
   examineAll = 1;
   while (numChanged > 0 | examineAll)
   {

       numChanged = 0;
       if (examineAll)
           loop I over all training examples
               numChanged += examineExample(I)
       else
           loop I over examples where alpha is not 0 & not C
               numChanged += examineExample(I)
       if (examineAll == 1)
           examineAll = 0
       else if (numChanged == 0)
           examineAll = 1
}