# Random Forest

## Random Forest

Random Forest (RF) is an extension of Bagging. On top of Bagging, it introduces additional randomness into how each decision tree is grown:

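* A conventional decision tree picks the best attribute among all attributes available at the current node. **RF instead first draws a random subset of k candidate attributes, then picks the best attribute within that subset (using the usual decision-tree splitting criteria)** as the split.
* Clearly, **with k = 1 every split uses a randomly chosen attribute, and with k = d the split is the same as in a conventional decision tree**. A commonly recommended value is k = log2(d), where d is the total number of attributes.
* It is worth noting that **Random Forest usually outperforms Bagging**. Both rely on randomness, but Bagging only perturbs the training samples, whereas Random Forest also perturbs the attributes, so its base learners are more diverse.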
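The attribute-selection rule above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a library implementation: the helper names (`gini`, `best_split`, `random_subspace_split`), the choice of Gini impurity as the criterion, and the toy data are all made up for this example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, features):
    """Return (weighted Gini, feature index, threshold) of the best split
    among the given candidate features, or None if no split is possible."""
    best = None
    for j in features:
        # The largest value of a feature cannot separate anything, so skip it.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best


def random_subspace_split(X, y, k, rng):
    """RF-style split: sample k candidate features, then pick the best among them."""
    d = len(X[0])
    candidates = rng.sample(range(d), k)
    return best_split(X, y, candidates)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
y = [0, 0, 1, 1]
rng = random.Random(0)

full = random_subspace_split(X, y, k=2, rng=rng)    # k = d: ordinary decision-tree split
single = random_subspace_split(X, y, k=1, rng=rng)  # k = 1: split on one random feature
```

With k = d the function always finds the clean split on feature 0, while with k = 1 the chosen feature depends entirely on the random draw.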

## 1 Basic Procedure

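> 1) As in Bagging, draw a bootstrap sample of m examples (sampling with replacement) from the training set.
>
> 2) Train a weak learner on the sampled data with a decision-tree algorithm, except that at every split the tree first picks k of the d currently available attributes (k <= d) and then chooses the best attribute among those k.
>
> 3) Repeat the sampling and training T times to obtain T weak learners.
>
> 4) As in Bagging, combine the learners by majority vote for classification and by simple averaging for regression.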
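The four steps above can be sketched as follows. This is a rough illustration assuming scikit-learn's `DecisionTreeClassifier` as the base learner, whose `max_features` parameter restricts each split to a random subset of attributes. The function names `fit_forest` and `predict_forest`, the values of T and k, and the synthetic data are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, T=25, k=2, seed=0):
    """Train T trees, each on a bootstrap sample, with k candidate features per split."""
    rng = np.random.default_rng(seed)
    m = len(X)
    trees = []
    for _ in range(T):                                   # step 3: repeat T times
        idx = rng.integers(0, m, size=m)                 # step 1: m draws with replacement
        tree = DecisionTreeClassifier(
            max_features=k,                              # step 2: k random candidates per split
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees


def predict_forest(trees, X):
    """Step 4 (classification): majority vote over the trees' predictions."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)  # shape (T, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])


# Synthetic data with non-negative integer labels (required by np.bincount above).
X, y = make_blobs(n_samples=300, n_features=6, centers=3, random_state=1)
trees = fit_forest(X, y)
pred = predict_forest(trees, X)
train_acc = (pred == y).mean()
```

For regression, the same structure would apply with a regression tree as the base learner and the vote replaced by the mean of the trees' predictions.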

## Using Random Forest

In scikit-learn, Random Forest is available in the `sklearn.ensemble` module:

```python
from sklearn.datasets import make_blobs
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

seed = 1
# Synthetic data: 1000 samples, 6 features, 50 clusters used as class labels
X, y = make_blobs(n_samples=1000, n_features=6, centers=50, random_state=seed)

# Single decision tree (baseline)
clf = DecisionTreeClassifier(max_depth=None, min_samples_split=2, random_state=seed)
scores = cross_val_score(clf, X, y)  # cross-validated accuracy per fold
print(scores.mean())

# Random forest
clf = RandomForestClassifier(n_estimators=10, max_depth=None, min_samples_split=2, random_state=seed)
scores = cross_val_score(clf, X, y)
print(scores.mean())

# Extra-Trees (extremely randomized trees), a variant of random forest
clf = ExtraTreesClassifier(n_estimators=10, max_depth=None, min_samples_split=2, random_state=seed)
scores = cross_val_score(clf, X, y)
print(scores.mean())
```
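To connect the parameter k from the earlier sections to this API: in `RandomForestClassifier`, the number of candidate attributes per split is controlled by `max_features`. The sketch below compares k = 1, k = log2(d), and k = d (`max_features=None`) on a smaller synthetic dataset. The dataset size and the set of values tried are arbitrary choices for illustration, and no particular ranking of the scores is implied.

```python
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_blobs(n_samples=300, n_features=6, centers=10, random_state=1)

results = {}
for k in (1, "log2", None):  # None means all d features are candidates at every split
    clf = RandomForestClassifier(n_estimators=10, max_features=k, random_state=1)
    results[k] = cross_val_score(clf, X, y, cv=5).mean()

for k, acc in results.items():
    print(f"max_features={k!r}: mean CV accuracy = {acc:.3f}")
```

With `max_features=None` and the default bootstrap sampling, the forest reduces to plain Bagging of decision trees, which makes it a convenient baseline for the comparison discussed earlier.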

