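# Logistic Regression

## 1. Introduction

Despite its name, logistic regression is `not a regression method; it is a classification algorithm`. A function maps the input to a value between 0 and 1: a mapped value `> 0.5` is usually called a `positive example`, otherwise a `negative example`. Learning with two such classes is called binary classification.

## 2. Mathematical Background

Define the mapping function: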

![img](https://img-blog.csdnimg.cn/20190218194052888.png)

Derivative property:

![img](https://img-blog.csdnimg.cn/20190218194201947.png)
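
As a quick numerical check of the two formulas above, here is a minimal sketch. It assumes the mapping function is the standard sigmoid g(z) = 1/(1+e^(-z)) and that the derivative property is g'(z) = g(z)(1-g(z)). The function names are illustrative and do not come from the original post.

```python
import math

def sigmoid(z):
    """Standard logistic function, assumed to be the mapping defined above."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative expressed through the function itself: g'(z) = g(z) * (1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)

# Compare the closed-form derivative with a central finite difference.
h = 1e-6
for z in (-2.0, 0.0, 3.0):
    numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
    print(z, sigmoid(z), sigmoid_grad(z), numeric)
```

The last two columns should agree to several decimal places, which is what makes the derivative so cheap to evaluate later on.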

## 3. Derivation

Let z denote the linear relationship: ![img](https://img-blog.csdnimg.cn/20190218194248190.png)

Here x is the given data, b is the bias, and ![img](https://img-blog.csdnimg.cn/20190218194304568.png) is the parameter to be learned. The mapping gives y∈(0,1). We take y as the probability of the positive class, so 1-y is the probability of the negative class.

![img](https://img-blog.csdnimg.cn/20190218194319827.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L2ppYW5nNDI1Nzc2MDI0,size_16,color_FFFFFF,t_70)

Then:

![img](https://img-blog.csdnimg.cn/20190218194400107.png)

Written as probabilities p1 and p0:

![img](https://img-blog.csdnimg.cn/20190218194453496.png)

Taking the logarithm of the ratio of the positive to the negative probability gives:

![img](https://img-blog.csdnimg.cn/20190218194517690.png)

The result is exactly z. In other words, `the larger z is, the larger the ratio of positive to negative probability`, and `the more likely the sample is positive`. A model of this form therefore `can represent a classification`.
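
The identity can be checked numerically. The sketch below assumes p1 = sigmoid(z) and p0 = 1 - p1, as in the derivation; `log_odds` is an illustrative name.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(z):
    """ln(p1 / p0) with p1 = sigmoid(z) and p0 = 1 - p1."""
    p1 = sigmoid(z)
    p0 = 1.0 - p1
    return math.log(p1 / p0)

for z in (-3.0, -0.5, 0.0, 2.0):
    # The second value matches z up to floating-point rounding.
    print(z, log_odds(z))
```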

## 4. Joint Probability

Let

![img](https://img-blog.csdnimg.cn/20190218194648735.png)

The probabilities of label 0 and label 1 can be written together as:

![img](https://img-blog.csdnimg.cn/20190218194742670.png)

Taking the logarithm:

![img](https://img-blog.csdnimg.cn/20190218194813549.png)

So the `log joint probability` is:

![img](https://img-blog.csdnimg.cn/20190218194826884.png)

## 5. Parameter Estimation by Maximum Likelihood

Estimate w and b by maximum likelihood. The joint model over `m samples` is:

![img](https://img-blog.csdnimg.cn/2019021819490538.png)

Maximizing

![img](https://img-blog.csdnimg.cn/20190218195018817.png)

is equivalent to minimizing:

![img](https://img-blog.csdnimg.cn/20190218195041140.png)

Finally, this becomes a minimization over the `m data samples`:

![img](https://img-blog.csdnimg.cn/20190218195111876.png)
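
To make the objective concrete, here is a small sketch of the loss. It assumes the usual negative log-likelihood, the sum over samples of -[y·ln(p) + (1-y)·ln(1-p)] with p = sigmoid(w·x + b), and no 1/m factor; the formula in the image may differ by such a constant. The toy data is made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neg_log_likelihood(w, b, X, y):
    """Sum over samples of -[y*ln(p) + (1-y)*ln(1-p)], with p = sigmoid(w.x + b)."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        p = sigmoid(z)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Toy data (hypothetical): one feature, label 1 for the larger x values.
X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [0, 0, 1, 1]

# With w = 0 and b = 0 every p is 0.5, so the loss is 4 * ln(2).
print(neg_log_likelihood([0.0], 0.0, X, y))
# A w that points in the right direction gives a smaller loss.
print(neg_log_likelihood([1.0], 0.0, X, y))
```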

## 6. Solving for the Parameters

![img](https://img-blog.csdnimg.cn/20190218195150447.png)

Once the parameters have been found, we can use:

![img](https://img-blog.csdnimg.cn/20190218195215786.png)

to make predictions. When `y>0.5`, `the positive class is more likely than the negative class`, so the sample is predicted as a `positive example`.
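
A minimal sketch of this prediction rule follows. The parameter values are hypothetical and chosen only for the demonstration; `predict_proba` and `predict` are illustrative helper names, not a library API.

```python
import math

def predict_proba(w, b, x):
    """Probability of the positive class for one sample x."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x, threshold=0.5):
    """Return 1 (positive) if the probability exceeds the threshold, else 0."""
    return 1 if predict_proba(w, b, x) > threshold else 0

# Hypothetical learned parameters.
w, b = [2.0, -1.0], 0.5
print(predict(w, b, [1.0, 0.0]))  # z = 2.5, probability above 0.5, predicted positive
print(predict(w, b, [0.0, 2.0]))  # z = -1.5, probability below 0.5, predicted negative
```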

## 7. Fitting the Parameters: Newton's Method, Quasi-Newton Methods, Gradient Descent

1) Newton's method: <https://blog.csdn.net/jiang425776024/article/details/87601854>

2) Quasi-Newton methods: <https://blog.csdn.net/jiang425776024/article/details/87602847>

3) Gradient descent: <https://blog.csdn.net/jiang425776024/article/details/87601506>

### 7.1 Newton's Method

For convenience, w and b are solved for together as a single parameter vector ![img](https://img-blog.csdnimg.cn/2019021819561028.png). Starting from the loss function in section 5:

![img](https://img-blog.csdnimg.cn/20190218195627578.png)

The first derivative is:

![img](https://img-blog.csdnimg.cn/20190218195709608.png)

From the earlier result:

![img](https://img-blog.csdnimg.cn/20190218195737165.png)

Because

![img](https://img-blog.csdnimg.cn/20190218195755388.png)

the second derivative is:

![img](https://img-blog.csdnimg.cn/20190218195815184.png)

Therefore, the Newton iteration for the parameters is:

![img](https://img-blog.csdnimg.cn/20190218195840739.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L2ppYW5nNDI1Nzc2MDI0,size_16,color_FFFFFF,t_70)

Newton's method can also be combined with a one-dimensional line search (<https://blog.csdn.net/jiang425776024/article/details/87600422>). That is, as in gradient descent below, `a learning rate a is applied` to the update step above.
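
Here is a minimal sketch of the Newton iteration under a few assumptions: the loss is the negative log-likelihood from section 5 (without a 1/m factor), there is a single feature so the parameter vector is (w, b) and the Hessian is 2×2, and the toy data is made up. `newton_step` is an illustrative name. With a=1 this is the plain Newton step; a smaller a gives the damped variant.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_step(w, b, xs, ys, a=1.0):
    """One (optionally damped) Newton update for a single feature plus bias."""
    gw = gb = 0.0          # gradient of the negative log-likelihood
    hww = hwb = hbb = 0.0  # entries of the symmetric 2x2 Hessian
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        gw += (p - y) * x
        gb += (p - y)
        s = p * (1.0 - p)
        hww += s * x * x
        hwb += s * x
        hbb += s
    det = hww * hbb - hwb * hwb
    # Solve H d = g in closed form for the 2x2 case, then move against d.
    dw = (hbb * gw - hwb * gb) / det
    db = (hww * gb - hwb * gw) / det
    return w - a * dw, b - a * db

# Toy data (hypothetical); the classes overlap, so a finite optimum exists.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]

w, b = 0.0, 0.0
for _ in range(10):
    w, b = newton_step(w, b, xs, ys)
print(w, b)
```

Note that on perfectly separable data the likelihood has no finite maximizer, so the unregularized iteration would not settle; the overlapping labels above avoid that case.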

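### 7.2 Gradient Descent

Only the first derivative is needed. a is the step size, usually chosen in (0,1]. Substituting the first derivative from above gives: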

![img](https://img-blog.csdnimg.cn/20190218195908107.png)

Note that Newton's method uses second-derivative information, so it typically converges in fewer iterations than gradient descent. Each Newton iteration is more expensive, however, because the second-derivative matrix has to be computed and inverted.
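
A minimal gradient-descent sketch follows, assuming the negative log-likelihood from section 5 (no 1/m factor), one feature plus a bias, and a made-up toy data set. The step size a=0.1 is an assumption that happens to be small enough for this data; the function names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, xs, ys):
    """Negative log-likelihood for one feature plus bias."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def gradient_step(w, b, xs, ys, a=0.1):
    """One gradient-descent update using only the first derivative."""
    gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys))
    gb = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys))
    return w - a * gw, b - a * gb

# Toy data (hypothetical) with overlapping classes.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]

w, b = 0.0, 0.0
losses = [loss(w, b, xs, ys)]
for _ in range(200):
    w, b = gradient_step(w, b, xs, ys)
    losses.append(loss(w, b, xs, ys))
print(losses[0], losses[-1], w, b)
```

Each step is cheap, but many more steps are needed than a second-order method would take on the same problem.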

## 8. Full Procedure

Input:

![img](https://img-blog.csdnimg.cn/2019021820001473.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L2ppYW5nNDI1Nzc2MDI0,size_16,color_FFFFFF,t_70)

Procedure: randomly initialize the parameters ![img](https://img-blog.csdnimg.cn/20190218200135682.png)

![img](https://img-blog.csdnimg.cn/20190218200043853.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L2ppYW5nNDI1Nzc2MDI0,size_16,color_FFFFFF,t_70)

while the total change in ![img](https://img-blog.csdnimg.cn/20190218200135682.png) is still at or above some threshold: update ![img](https://img-blog.csdnimg.cn/20190218200135682.png) with one of the iterations from section `7`. The loop stops once the change drops below the threshold.
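
The whole procedure can be sketched as a single training loop. This is only an illustration under assumptions: one feature plus a bias, the gradient-descent update from section 7.2, the sum of absolute parameter changes as the "total change", and a `max_iter` safety cap that the original procedure does not mention. The data and the names `fit`, `tol`, `max_iter` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, a=0.1, tol=1e-8, max_iter=10000, seed=0):
    """Random initialization, then iterate until the parameters stop changing."""
    rng = random.Random(seed)
    w, b = rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)  # random initialization
    n_iter = 0
    change = float("inf")
    while change >= tol and n_iter < max_iter:
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys))
        gb = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys))
        new_w, new_b = w - a * gw, b - a * gb
        change = abs(new_w - w) + abs(new_b - b)  # total change in the parameters
        w, b = new_w, new_b
        n_iter += 1
    return w, b, n_iter

# Toy data (hypothetical) with overlapping classes.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]

w, b, n_iter = fit(xs, ys)
print(w, b, n_iter)
```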

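## 9. Regularization

As with other algorithms, the parameters can be regularized (L1, L2, and so on). This usually means adding a penalty term to the loss function, for example the L2 penalty ![\frac{1}{2}\alpha||\theta||\_2^2](https://private.codecogs.com/gif.latex?%5Cfrac%7B1%7D%7B2%7D%5Calpha%7C%7C%5Ctheta%7C%7C_2%5E2), where α is the regularization strength and θ is the vector of model parameters. The parameter update must then also include the derivative of the penalty term.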
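
As a sketch of what "include the derivative of the penalty" means: the L2 term ½α‖θ‖² contributes α·θ to the gradient. The code below assumes one feature plus a bias with θ = (w, b), penalizes both components as the formula above suggests (in practice the bias is often left unpenalized), and uses made-up data. The function names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def regularized_loss(theta, xs, ys, alpha):
    """Negative log-likelihood plus 0.5 * alpha * ||theta||_2^2."""
    w, b = theta
    nll = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        nll -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return nll + 0.5 * alpha * (w * w + b * b)

def regularized_grad(theta, xs, ys, alpha):
    """Data-term gradient plus alpha * theta from the L2 penalty."""
    w, b = theta
    gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) + alpha * w
    gb = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) + alpha * b
    return gw, gb

# Toy data and parameter values (hypothetical).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]
theta, alpha = (0.7, -0.3), 0.5

print(regularized_loss(theta, xs, ys, alpha))
print(regularized_grad(theta, xs, ys, alpha))
```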

## 10. Multinomial Logistic Regression (a bit involved)

In binary classification we have: ![ln\frac{P(y=1|x,\theta )}{P(y=0|x,\theta)} = x\theta](https://private.codecogs.com/gif.latex?ln%5Cfrac%7BP%28y%3D1%7Cx%2C%5Ctheta%20%29%7D%7BP%28y%3D0%7Cx%2C%5Ctheta%29%7D%20%3D%20x%5Ctheta)

`With K>2 classes there are K-1 such equations` (many-vs-many), each comparing one class against class K:

![ln\frac{P(y=1|x,\theta )}{P(y=K|x,\theta)} = x\theta\_1](https://private.codecogs.com/gif.latex?ln%5Cfrac%7BP%28y%3D1%7Cx%2C%5Ctheta%20%29%7D%7BP%28y%3DK%7Cx%2C%5Ctheta%29%7D%20%3D%20x%5Ctheta_1), ![ln\frac{P(y=2|x,\theta )}{P(y=K|x,\theta)} = x\theta\_2](https://private.codecogs.com/gif.latex?ln%5Cfrac%7BP%28y%3D2%7Cx%2C%5Ctheta%20%29%7D%7BP%28y%3DK%7Cx%2C%5Ctheta%29%7D%20%3D%20x%5Ctheta_2), ..., ![ln\frac{P(y=K-1|x,\theta )}{P(y=K|x,\theta)} = x\theta\_{K-1}](https://private.codecogs.com/gif.latex?ln%5Cfrac%7BP%28y%3DK-1%7Cx%2C%5Ctheta%20%29%7D%7BP%28y%3DK%7Cx%2C%5Ctheta%29%7D%20%3D%20x%5Ctheta_%7BK-1%7D)

The probability distribution of K-class logistic regression is then:

![P(y=k|x,\theta ) = e^{x\theta\_k} \bigg/ 1+\sum\limits\_{t=1}^{K-1}e^{x\theta\_t}](https://private.codecogs.com/gif.latex?P%28y%3Dk%7Cx%2C%5Ctheta%20%29%20%3D%20e%5E%7Bx%5Ctheta_k%7D%20%5Cbigg/%201\&plus;%5Csum%5Climits_%7Bt%3D1%7D%5E%7BK-1%7De%5E%7Bx%5Ctheta_t%7D), k = 1, 2, ..., K-1

![P(y=K|x,\theta ) = 1 \bigg/ 1+\sum\limits\_{t=1}^{K-1}e^{x\theta\_t}](https://private.codecogs.com/gif.latex?P%28y%3DK%7Cx%2C%5Ctheta%20%29%20%3D%201%20%5Cbigg/%201\&plus;%5Csum%5Climits_%7Bt%3D1%7D%5E%7BK-1%7De%5E%7Bx%5Ctheta_t%7D)
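
These two formulas can be sketched in a few lines. The denominator is read here as the whole sum 1 + Σ e^(xθ_t), with class K as the reference class. The parameter values are hypothetical and `multinomial_proba` is an illustrative name.

```python
import math

def multinomial_proba(x, thetas):
    """Class probabilities for K classes given K-1 parameter vectors.

    thetas[k] belongs to class k+1; class K is the reference class and has
    no parameter vector of its own.
    """
    scores = [math.exp(sum(t_j * x_j for t_j, x_j in zip(theta, x)))
              for theta in thetas]
    denom = 1.0 + sum(scores)
    return [s / denom for s in scores] + [1.0 / denom]

# Hypothetical example with K = 3 classes and two features.
x = [1.0, 2.0]
thetas = [[1.0, 0.0], [0.0, -0.5]]
probs = multinomial_proba(x, thetas)
print(probs, sum(probs))

# The log-ratio against the reference class recovers x . theta_k.
print(math.log(probs[0] / probs[2]), math.log(probs[1] / probs[2]))
```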

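The loss function and the optimization for multinomial logistic regression are derived in the same way as in the binary case (too tedious to write out here, so omitted).

`A simpler, brute-force alternative is to build several binary classifiers` (one-vs-rest). For example, with four classes A, B, C, D, we build 4 binary classifiers: A versus (B, C, D), B versus (A, C, D), and so on. Multi-class classification then reduces to the binary case; the only difference is that prediction has to check the outputs of all the binary classifiers.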
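
A rough one-vs-rest sketch follows. The assumptions: each binary classifier is trained by plain gradient descent with a fixed step size and iteration count, the final class is the one whose classifier reports the highest positive probability (one common way to combine the checks), and the three-cluster data set is made up. All names are illustrative.

```python
import math

def sigmoid(z):
    # Numerically stable form, so large |z| cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_binary(X, y, a=0.05, iters=300):
    """Binary logistic regression fitted by gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(iters):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi) + b) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - a * g for wj, g in zip(w, gw)]
        b -= a * gb
    return w, b

def train_ovr(X, labels):
    """One binary classifier per class: that class versus all the others."""
    return {c: train_binary(X, [1 if l == c else 0 for l in labels])
            for c in sorted(set(labels))}

def predict_ovr(models, x):
    """Pick the class whose classifier gives the highest positive probability."""
    return max(models, key=lambda c: sigmoid(dot(models[c][0], x) + models[c][1]))

# Hypothetical data: three well-separated clusters in two dimensions.
X = [[0.0, 3.0], [0.5, 2.5], [-0.5, 3.5],
     [-3.0, -2.0], [-2.5, -2.5], [-3.5, -1.5],
     [3.0, -2.0], [2.5, -1.5], [3.5, -2.5]]
labels = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C']

models = train_ovr(X, labels)
print([predict_ovr(models, x) for x in X])
print(predict_ovr(models, [0.0, 4.0]))
```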

## 11. Logistic Regression in scikit-learn

There are three main entry points: [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression), [LogisticRegressionCV](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html#sklearn.linear_model.LogisticRegressionCV) and [logistic\_regression\_path](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.logistic_regression_path.html#sklearn.linear_model.logistic_regression_path).

The main difference between [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression) and [LogisticRegressionCV](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html#sklearn.linear_model.LogisticRegressionCV) is:

[LogisticRegressionCV](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html#sklearn.linear_model.LogisticRegressionCV) `uses cross-validation` to choose the regularization coefficient C. In scikit-learn, C is the inverse of the regularization strength α introduced in section 9, so a smaller C means stronger regularization.

[LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression) requires you to `specify a regularization coefficient yourself each time`.

[logistic\_regression\_path](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.logistic_regression_path.html#sklearn.linear_model.logistic_regression_path) is mainly used for model selection. It cannot make predictions directly; it can only be used to select suitable logistic regression coefficients and regularization coefficients.

Usage:

The API parameters are not covered in detail here; see the parameter documentation for [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression) and [LogisticRegressionCV](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html#sklearn.linear_model.LogisticRegressionCV),

or: <https://www.cnblogs.com/pinard/p/6035872.html>.

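```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LogisticRegressionCV


X, y = load_iris(return_X_y=True)

# solver='lbfgs' fits the parameters with a quasi-Newton method.
# random_state=0 fixes the random seed so runs are reproducible; the value itself is arbitrary.
'''
multi_class selects the multi-class scheme; it accepts ovr and multinomial.
The default was ovr in older scikit-learn releases. Recent releases deprecate
this parameter; there, omit it and lbfgs fits the multinomial model.
ovr is the one-vs-rest (OvR) scheme described earlier.
multinomial is the many-vs-many (MvM) scheme described earlier.
For binary logistic regression the two are identical; they differ only in the multi-class case.
'''
# C=2 sets the inverse regularization strength by hand.
clf = LogisticRegression(random_state=0, C=2, solver='lbfgs', multi_class='multinomial').fit(X, y)

# The CV class searches for C automatically.
clfcv = LogisticRegressionCV(random_state=0, solver='lbfgs', multi_class='multinomial').fit(X, y)

# Predicted classes for the first two samples (names avoid shadowing the usual pandas alias pd).
pred = clf.predict(X[:2, :])
pred_cv = clfcv.predict(X[:2, :])
print(X[:2, :], 'Predicted classes:', pred)
print(X[:2, :], 'Predicted classes:', pred_cv)

# Predicted class probabilities for the same samples.
proba = clf.predict_proba(X[:2, :])
proba_cv = clfcv.predict_proba(X[:2, :])
print(X[:2, :], 'Predicted class probabilities:', proba)
print(X[:2, :], 'Predicted class probabilities:', proba_cv)

print('Score:', clf.score(X, y))
print('Score:', clfcv.score(X, y))

'''
Output recorded from the original run. In that run the second prediction and
probability lines were computed with clf rather than clfcv, so they repeat the
clf values; with the calls above the clfcv probabilities will differ.

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]] Predicted classes: [0 0]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]] Predicted classes: [0 0]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]] Predicted class probabilities: [[9.89252266e-01 1.07477333e-02 2.22652613e-10]
 [9.82191282e-01 1.78087169e-02 6.46675701e-10]]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]] Predicted class probabilities: [[9.89252266e-01 1.07477333e-02 2.22652613e-10]
 [9.82191282e-01 1.78087169e-02 6.46675701e-10]]
Score: 0.9866666666666667
Score: 0.98
'''
```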

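In logistic regression, if 0.5 is chosen as the threshold separating positive from negative samples, the decision boundary is:

```
Logistic regression and SVM share the same default decision hyperplane: wx+b = 0.
In logistic regression, y = 1/(1+e^(-(wx+b))). With 0.5 as the cut-off, y=0.5 gives e^(-(wx+b))=1, i.e. wx+b=0.
```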

