Update 1-12

a191d632 · loopyme · 0b9b632c · a191d632

Showing 53 additions and 78 deletions

docs/13.md docs/13.md +53 -78

--- a/docs/13.md
+++ b/docs/13.md
@@ -3,34 +3,21 @@
 Proofreaders:
         [@溪流-十四号](https://github.com/apachecn/scikit-learn-doc-zh)
         [@大魔王飞仙](https://github.com/apachecn/scikit-learn-doc-zh)
+        [@Loopy](https://github.com/loopyme)
 Translators:
         [@v](https://github.com/apachecn/scikit-learn-doc-zh)

-Warning
-
-All classifiers in scikit-learn do multiclass classification out-of-the-box. You don’t need to use the [`sklearn.multiclass`](classes.html#module-sklearn.multiclass "sklearn.multiclass") module unless you want to experiment with different multiclass strategies.
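+> **Warning** All classifiers in scikit-learn do multiclass classification out of the box. You don't need to use the [`sklearn.multiclass`](classes.html#module-sklearn.multiclass "sklearn.multiclass") module unless you want to experiment with different multiclass strategies.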

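 The [`sklearn.multiclass`](classes.html#module-sklearn.multiclass "sklearn.multiclass") module implements _meta-estimators_ that solve `multiclass` and `multilabel` classification problems by decomposing them into binary classification problems. The same applies to multi-target regression problems.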

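-*   **Multiclass classification** means a classification task with more than two classes, for example classifying a set of images of oranges,
-
-apples, or pears. Multiclass classification assumes that each sample has one and only one label: a fruit can be an apple or a pear, but not both at once.
-
-*   **Multilabel classification** assigns a set of labels to each sample. This can be thought of as predicting properties of a data point that are not
-
-mutually exclusive, such as the topics relevant to a document. A text can belong to any of the categories, for example politics, finance, and education at the same time, or to none of them.
-
-*   **Multioutput regression** assigns a set of target values to each sample. This can be thought of as predicting several properties of each sample,
-
-such as the wind direction and magnitude at a given location.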
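+*   **Multiclass classification** means a classification task with more than two classes, for example classifying a set of images of oranges, apples, or pears. Multiclass classification assumes that each sample has one and only one label: a fruit can be an apple or a pear, but not both at once.

-*   **Multioutput-multiclass classification** and **multi-task classification** [**](#id2)
+*   **Multilabel classification** assigns a set of labels to each sample. This can be thought of as predicting properties of a data point that are not mutually exclusive, such as the topics relevant to a document. A text can belong to any of the categories, for example politics, finance, and education at the same time, or to none of them.

-```py
-** mean that a single estimator has to handle several joint classification tasks. This generalizes both the multi-label classification task, which only considers binary classification,
-```
+*   **Multioutput regression** assigns a set of target values to each sample. This can be thought of as predicting several properties of each sample, such as the wind direction and magnitude at a given location.

-> and the multi-class classification task. _The output format for such problems is a 2d array or a sparse matrix._
+*   **Multioutput-multiclass classification** and **multi-task classification** mean that a single estimator has to handle several joint classification tasks. This generalizes both the multi-label classification task, which only considers binary classification, and the multi-class classification task. *The output format for such problems is a 2d array or a sparse matrix.*

 The set of labels can differ for each output variable. For example, a sample could be assigned “pear” for one output variable that takes values in a finite set of fruit species such as “pear” and “apple”, and “blue” or “green” for a second output variable that takes values in a finite set of colors such as “green”, “red”, and “blue”…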

@@ -90,9 +77,7 @@ All classifiers in scikit-learn do multiclass classification out-of-the-box. You
    *   [`sklearn.neighbors.RadiusNeighborsClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.RadiusNeighborsClassifier.html#sklearn.neighbors.RadiusNeighborsClassifier "sklearn.neighbors.RadiusNeighborsClassifier")
    *   [`sklearn.ensemble.RandomForestClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier "sklearn.ensemble.RandomForestClassifier")

-Warning
-
-At present, no metric in [`sklearn.metrics`](classes.html#module-sklearn.metrics "sklearn.metrics") supports the multioutput-multiclass classification task.
+>**Warning**: At present, no metric in [`sklearn.metrics`](classes.html#module-sklearn.metrics "sklearn.metrics") supports the multioutput-multiclass classification task.

 ## 1.12.1\. Multilabel classification format

@@ -105,10 +90,10 @@ At present, no metric in [`sklearn.metrics`](classes.html#module-sklearn.metrics
 >>> y = [[2, 3, 4], [2], [0, 1, 3], [0, 1, 2, 3, 4], [0, 1, 2]]
 >>> MultiLabelBinarizer().fit_transform(y)
 array([[0, 0, 1, 1, 1],
- [0, 0, 1, 0, 0],
- [1, 1, 0, 1, 0],
- [1, 1, 1, 1, 1],
- [1, 1, 1, 0, 0]])
+       [0, 0, 1, 0, 0],
+       [1, 1, 0, 1, 0],
+       [1, 1, 1, 1, 1],
+       [1, 1, 1, 0, 0]])

 ```

@@ -128,12 +113,12 @@ array([[0, 0, 1, 1, 1],
 >>> X, y = iris.data, iris.target
 >>> OneVsRestClassifier(LinearSVC(random_state=0)).fit(X, y).predict(X)
 array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
- 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1,
- 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
- 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2,
- 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
+       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+       0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+       1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1,
+       1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2,
+       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

 ```

@@ -143,9 +128,8 @@ array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

 [![http://sklearn.apachecn.org/cn/0.19.0/_images/sphx_glr_plot_multilabel_0011.png](img/2e06713c93719ff874fb9f4fab7a6fbf.jpg)](https://scikit-learn.org/stable/auto_examples/plot_multilabel.html)

-Examples:
-
-*   [Multilabel classification](https://scikit-learn.org/stable/auto_examples/plot_multilabel.html#sphx-glr-auto-examples-plot-multilabel-py)
+>Examples:
+>*   [Multilabel classification](https://scikit-learn.org/stable/auto_examples/plot_multilabel.html#sphx-glr-auto-examples-plot-multilabel-py)

 ## 1.12.3\. One-vs-one

@@ -155,7 +139,7 @@ array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

 ### 1.12.3.1\. Multiclass learning

-Below is an example of multiclass learning using OvO:
+Below is an example of multiclass learning using OvO:

 ```py
 >>> from sklearn import datasets
@@ -165,18 +149,17 @@ Below is an example of multiclass learning using OvO:
 >>> X, y = iris.data, iris.target
 >>> OneVsOneClassifier(LinearSVC(random_state=0)).fit(X, y).predict(X)
 array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
- 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1,
- 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
- 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
- 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
+       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+       0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+       1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1,
+       1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

 ```

-References:
-
-*   “Pattern Recognition and Machine Learning. Springer”, Christopher M. Bishop, page 183, (First Edition)
+>References:
+>*   “Pattern Recognition and Machine Learning. Springer”, Christopher M. Bishop, page 183, (First Edition)

 ## 1.12.4\. Error-correcting output codes

@@ -192,7 +175,7 @@ array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

 ### 1.12.4.1\. Multiclass learning

-Below is an example of multiclass learning using Output-Codes:
+Below is an example of multiclass learning using Output-Codes:

 ```py
 >>> from sklearn import datasets
@@ -213,13 +196,10 @@ array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

 ```

-References:
-
-*   “Solving multiclass learning problems via error-correcting output codes”, Dietterich T., Bakiri G., Journal of Artificial Intelligence Research 2, 1995.
-
-| [[3]](#id11) | “The error coding method and PICTs”, James G., Hastie T., Journal of Computational and Graphical statistics 7, 1998. |
-
-*   “The Elements of Statistical Learning”, Hastie T., Tibshirani R., Friedman J., page 606 (second-edition) 2008.
+>References:
+>*   “Solving multiclass learning problems via error-correcting output codes”, Dietterich T., Bakiri G., Journal of Artificial Intelligence Research 2, 1995.
+>*  [3]  “The error coding method and PICTs”, James G., Hastie T., Journal of Computational and Graphical statistics 7, 1998.
+>*   “The Elements of Statistical Learning”, Hastie T., Tibshirani R., Friedman J., page 606 (second-edition) 2008.

 ## 1.12.5\. Multioutput regression

@@ -234,15 +214,15 @@ array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 >>> X, y = make_regression(n_samples=10, n_targets=3, random_state=1)
 >>> MultiOutputRegressor(GradientBoostingRegressor(random_state=0)).fit(X, y).predict(X)
 array([[-154.75474165, -147.03498585,  -50.03812219],
- [   7.12165031,    5.12914884,  -81.46081961],
- [-187.8948621 , -100.44373091,   13.88978285],
- [-141.62745778,   95.02891072, -191.48204257],
- [  97.03260883,  165.34867495,  139.52003279],
- [ 123.92529176,   21.25719016,   -7.84253   ],
- [-122.25193977,  -85.16443186, -107.12274212],
- [ -30.170388  ,  -94.80956739,   12.16979946],
- [ 140.72667194,  176.50941682,  -17.50447799],
- [ 149.37967282,  -81.15699552,   -5.72850319]])
+       [   7.12165031,    5.12914884,  -81.46081961],
+       [-187.8948621 , -100.44373091,   13.88978285],
+       [-141.62745778,   95.02891072, -191.48204257],
+       [  97.03260883,  165.34867495,  139.52003279],
+       [ 123.92529176,   21.25719016,   -7.84253   ],
+       [-122.25193977,  -85.16443186, -107.12274212],
+       [ -30.170388  ,  -94.80956739,   12.16979946],
+       [ 140.72667194,  176.50941682,  -17.50447799],
+       [ 149.37967282,  -81.15699552,   -5.72850319]])

 ```

@@ -269,15 +249,15 @@ Multioutput classification 支持能够被添加到任何带有 `MultiOutputClas
 >>> multi_target_forest = MultiOutputClassifier(forest, n_jobs=-1)
 >>> multi_target_forest.fit(X, Y).predict(X)
 array([[2, 2, 0],
- [1, 2, 1],
- [2, 1, 0],
- [0, 0, 2],
- [0, 2, 1],
- [0, 0, 2],
- [1, 1, 0],
- [1, 1, 1],
- [0, 0, 2],
- [2, 0, 0]])
+       [1, 2, 1],
+       [2, 1, 0],
+       [0, 0, 2],
+       [0, 2, 1],
+       [0, 0, 2],
+       [1, 1, 0],
+       [1, 1, 1],
+       [0, 0, 2],
+       [2, 0, 0]])

 ```

@@ -291,10 +271,5 @@ Classifier chains (查看 `ClassifierChain`) 是一种集合多个二分类器

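 Clearly, the order of the chain matters. The first model in the chain has no information about the other labels, while the last model in the chain has information about all of the other labels. In general the optimal ordering of the models in the chain is not known, so many randomly ordered chains are typically fit and their predictions averaged.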

-References:
-
-```py
-Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank,
-```
-
-“Classifier Chains for Multi-label Classification”, 2009.
+>References:
+>* Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank,“Classifier Chains for Multi-label Classification”, 2009.
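
The closing context of the last hunk notes that chain order matters and that, because the best order is usually unknown, several randomly ordered chains are fit and their predictions averaged. A minimal sketch of that idea follows, assuming scikit-learn's `ClassifierChain` with `order="random"`. The synthetic dataset, the `LogisticRegression` base estimator, the choice of five chains, and the 0.5 threshold are illustrative assumptions, not taken from the document.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

# Synthetic multilabel data (assumption: any binary indicator matrix Y would do).
X, Y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=4, random_state=0
)

# Build several chains, each with its own random label order.
chains = [
    ClassifierChain(
        LogisticRegression(max_iter=1000), order="random", random_state=seed
    )
    for seed in range(5)
]
for chain in chains:
    chain.fit(X, Y)

# Average the per-label probabilities across chains, then threshold.
avg_proba = np.mean([chain.predict_proba(X) for chain in chains], axis=0)
Y_pred = (avg_proba >= 0.5).astype(int)

# Each fitted chain records the label order it used in `order_`.
orders = [list(chain.order_) for chain in chains]
```

Averaging probabilities before thresholding is only one way to combine the chains; a majority vote over the per-chain `predict` outputs would be a reasonable alternative.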