# Least Squares Method

```python
import numpy as np
import scipy as sp
from scipy.optimize import leastsq
import matplotlib.pyplot as plt
```

*Note: `numpy.poly1d([1, 2, 3])` constructs the polynomial $$1x^2+2x^1+3x^0$$.*
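As a quick check of the coefficient ordering described in the note, the following minimal sketch evaluates such a polynomial. The variable names are illustrative only.

```python
import numpy as np

# Coefficients are listed from the highest power down to the constant term.
f = np.poly1d([1, 2, 3])          # represents x^2 + 2x + 3

value_at_2 = f(2)                 # 1*4 + 2*2 + 3 = 11
degree = f.order                  # degree of the polynomial
values = f(np.array([0.0, 1.0]))  # evaluates elementwise on arrays
```

This ordering is why `fit_func(p, x)` below treats `p[0]` as the coefficient of the highest power.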

```python
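# Target function
def real_func(x):
    return np.sin(2*np.pi*x)

# Polynomial with coefficients p (highest power first)
def fit_func(p, x):
    f = np.poly1d(p)
    return f(x)

# Residuals between the fitted values and the observations
def residuals_func(p, x, y):
    ret = fit_func(p, x) - y
    return ret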
```

Training

```python
# Ten sample points
x = np.linspace(0, 1, 10)
x_points = np.linspace(0, 1, 1000)
# Target function values with Gaussian noise added
y_ = real_func(x)
y = [np.random.normal(0, 0.1) + y1 for y1 in y_]


def fitting(M=0):
    """
    M is the degree of the polynomial.
    """
    # Randomly initialize the polynomial coefficients
    p_init = np.random.rand(M + 1)
    # Least-squares fit
    p_lsq = leastsq(residuals_func, p_init, args=(x, y))
    print('Fitting Parameters:', p_lsq[0])

    # Visualization
    plt.plot(x_points, real_func(x_points), label='real')
    plt.plot(x_points, fit_func(p_lsq[0], x_points), label='fitted curve')
    plt.plot(x, y, 'bo', label='noise')
    plt.legend()
    return p_lsq
```
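Because a polynomial is linear in its coefficients, the same kind of fit can also be computed directly, without a random initial guess. The sketch below is an alternative to the `leastsq` call above, not the method used on this page. The helper name `fit_poly_lstsq`, the `_demo` variables, and the fixed seed are assumptions made for illustration.

```python
import numpy as np

def fit_poly_lstsq(x, y, M):
    """Fit a degree-M polynomial by solving the linear least-squares problem."""
    # Columns are x^M, x^(M-1), ..., x^0, matching np.poly1d's coefficient order.
    A = np.vander(x, M + 1)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x_demo = np.linspace(0, 1, 10)
y_demo = np.sin(2 * np.pi * x_demo) + rng.normal(0, 0.1, size=x_demo.shape)

coeffs_3 = fit_poly_lstsq(x_demo, y_demo, M=3)
coeffs_polyfit = np.polyfit(x_demo, y_demo, 3)  # NumPy's built-in equivalent
```

The returned coefficients can be passed to `np.poly1d` or `np.polyval` in the same way as `p_lsq[0]`.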

```python
# M=0
p_lsq_0 = fitting(M=0)
#Fitting Parameters: [0.02515259]
```

![](/files/-Lr7Ru3I7ya0JkBsQZue)

```python
# M=1
p_lsq_1 = fitting(M=1)
#Fitting Parameters: [-1.50626624  0.77828571]
```

![](/files/-Lr7Ru3L1Spwy8wq75wY)

```python
# M=3
p_lsq_3 = fitting(M=3)
#Fitting Parameters: [ 2.21147559e+01 -3.34560175e+01  1.13639167e+01 -2.82318048e-02]
```

![](/files/-Lr7Ru3Ngco3LQc6AuQv)

```python
# M=9: the polynomial passes through every data point, but it overfits
p_lsq_9 = fitting(M=9)
# Fitting Parameters: [-1.70872086e+04  7.01364939e+04 -1.18382087e+05  1.06032494e+05
#  -5.43222991e+04  1.60701108e+04 -2.65984526e+03  2.12318870e+02
#  -7.15931412e-02  3.53804263e-02]
```

![](/files/-Lr7Ru3PKzSXWdxDvqc-)
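One way to put a number on what the plots show is to compare the error on the ten training points with the error against the noise-free target on a dense grid. The following is a sketch under assumptions: it uses `np.polyfit` instead of `leastsq` so the result does not depend on a random initial guess, and the seed and helper name `rmse` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility

def real_func(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 10)
y_train = real_func(x_train) + rng.normal(0, 0.1, size=x_train.shape)
x_test = np.linspace(0, 1, 1000)  # dense grid, compared against the noise-free target

def rmse(coeffs, x, y):
    """Root-mean-square error of the polynomial with these coefficients."""
    return np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2))

train_rmse, test_rmse = {}, {}
for M in (0, 1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, M)
    train_rmse[M] = rmse(coeffs, x_train, y_train)
    test_rmse[M] = rmse(coeffs, x_test, real_func(x_test))
    print(f"M={M}: train RMSE={train_rmse[M]:.4f}, test RMSE={test_rmse[M]:.4f}")
```

With ten points and ten coefficients, the M=9 polynomial interpolates the data, so its training error is essentially zero. That says nothing about how well it tracks the target between the points, which is what the test error measures.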

## Regularization

The M=9 result shows overfitting. Adding a regularization term (regularizer) to the objective reduces it.

$$Q(x)=\sum_{i=1}^n(h(x_i)-y_i)^2+\lambda\|w\|^2$$

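In regression problems the loss function is the squared loss, and the regularization term can be either the L2 norm or the L1 norm of the parameter vector.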

* L1: `regularization * abs(p)`
* L2: `0.5 * regularization * np.square(p)`
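Since `leastsq` minimizes the sum of squares of the residual vector it is given, a penalty can be included by appending the square root of each penalty term to the residuals. The sketch below checks that identity on small made-up numbers. The values of `lam`, `p`, and `r` are hypothetical.

```python
import numpy as np

lam = 1e-4                      # hypothetical regularization strength
p = np.array([2.0, -1.0, 0.5])  # hypothetical parameter vector
r = np.array([0.3, -0.2])       # hypothetical data residuals h(x_i) - y_i

# The solver squares every entry, so each penalty term enters as its square root.
augmented_l2 = np.append(r, np.sqrt(0.5 * lam * np.square(p)))
augmented_l1 = np.append(r, np.sqrt(lam * np.abs(p)))

objective_l2 = np.sum(augmented_l2 ** 2)
objective_l1 = np.sum(augmented_l1 ** 2)

# The same objectives written out directly: data loss plus penalty.
expected_l2 = np.sum(r ** 2) + 0.5 * lam * np.sum(p ** 2)
expected_l1 = np.sum(r ** 2) + lam * np.sum(np.abs(p))
```

Note that the L1 version is not differentiable at zero, so a gradient-based solver such as `leastsq` is not guaranteed to handle it well.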

```python
regularization = 0.0001


def residuals_func_regularization(p, x, y):
    ret = fit_func(p, x) - y
    ret = np.append(ret, np.sqrt(0.5 * regularization * np.square(p)))  # L2 norm as the regularization term
    return ret
```

```python
# Least squares with the regularization term added
p_init = np.random.rand(9 + 1)
p_lsq_regularization = leastsq(residuals_func_regularization, p_init, args=(x, y))
```

Plot the results

```python
plt.plot(x_points, real_func(x_points), label='real')
plt.plot(x_points, fit_func(p_lsq_9[0], x_points), label='fitted curve')
plt.plot(
    x_points,
    fit_func(p_lsq_regularization[0], x_points),
    label='regularization')
plt.plot(x, y, 'bo', label='noise')
plt.legend()
```

![](/files/-Lr7Ru3RFh4oJrEUJrWY)
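The L2-regularized polynomial fit is still a linear least-squares problem, so it can also be solved directly by stacking the penalty rows under the design matrix. This is a sketch of that alternative, not the `leastsq` approach used above. The helper name `fit_poly_ridge`, the `_demo` variables, and the seed are assumptions, and the `0.5 * lam` weighting follows the L2 term as defined in the code on this page.

```python
import numpy as np

def fit_poly_ridge(x, y, M, lam):
    """Minimize ||A p - y||^2 + 0.5 * lam * ||p||^2 for a degree-M polynomial."""
    A = np.vander(x, M + 1)  # columns x^M, ..., x^0
    # Appending sqrt(0.5 * lam) * I to A and zeros to y adds the penalty
    # to the sum of squared residuals.
    A_aug = np.vstack([A, np.sqrt(0.5 * lam) * np.eye(M + 1)])
    y_aug = np.concatenate([y, np.zeros(M + 1)])
    coeffs, *_ = np.linalg.lstsq(A_aug, y_aug, rcond=None)
    return coeffs

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x_demo = np.linspace(0, 1, 10)
y_demo = np.sin(2 * np.pi * x_demo) + rng.normal(0, 0.1, size=x_demo.shape)

lam = 1e-4
coeffs_plain = np.polyfit(x_demo, y_demo, 9)           # unregularized M=9 fit
coeffs_ridge = fit_poly_ridge(x_demo, y_demo, 9, lam)  # regularized M=9 fit

print("||p|| without regularization:", np.linalg.norm(coeffs_plain))
print("||p|| with regularization:   ", np.linalg.norm(coeffs_ridge))
```

The penalty shrinks the coefficient vector, which is what keeps the regularized curve from oscillating between the data points.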


