ch16.

5fd2f6d5 · wizardforcel · fd69eae1 · 5fd2f6d5

Showing 79 additions and 1 deletion


--- a/16.md
+++ b/16.md
@@ -531,6 +531,84 @@ Observed statistic: 9.266142572024918
 Empirical P-value: 0.0
 ```

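-The observed difference in the original sample is about 9.27 ounces, which is inconsistent with this distribution: the empirical P-value (经验性 P 值) is 0, which implies that the exact P-value is very small indeed. So the conclusion of the test is that, in the population, the distributions of birth weights for the babies of non-smokers and smokers are different.
+The observed difference in the original sample is about 9.27 ounces, which is inconsistent with this distribution: the empirical P-value (经验 P 值) is 0, which implies that the exact P-value is very small indeed. So the conclusion of the test is that, in the population, the distributions of birth weights for the babies of non-smokers and smokers are different.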


+## Bootstrap Confidence Interval for the Difference
+
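+Our A/B test concluded that the two distributions are different, but that is a little unsatisfying. How different are they? Which one has the larger mean? These natural questions cannot be answered by the test.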
+
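+Recall that we have had this discussion before: instead of asking only the yes-or-no question of whether the two distributions differ, we can learn more by making no hypotheses at all and simply estimating the difference between the means.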
+
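+The observed difference (nonsmokers minus smokers) was about 9.27 ounces; the positive sign says that non-smoking mothers had larger babies on average. But the sample could have come out differently because of randomness. To see how differently, we have to generate more samples; to generate more samples, we will use the bootstrap, as we have done before. The bootstrap makes no hypothesis about whether the two distributions are the same. It simply resamples from the original random sample and computes new values of the statistic.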
+
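+The function `bootstrap_ci_means` returns a bootstrap confidence interval for the difference between the means of the two groups in the population. In our example, the confidence interval estimates the difference between the average birth weights of babies of mothers who did not smoke and mothers who did, in the entire population. The function takes four arguments:
+
+*   the name of the table that contains the data in the original sample
+*   the label of the column that contains the numerical variable
+*   the label of the column that contains the names of the two samples
+*   the number of bootstrap repetitions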
+
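+The function uses the bootstrap percentile method and returns an approximate 95% confidence interval for the difference between the two means.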
+
+```py
+# Assumed notebook setup (not shown in this hunk): the datascience library,
+# NumPy as np, and matplotlib.pyplot as plots.
+from datascience import *
+import numpy as np
+import matplotlib.pyplot as plots
+
+def bootstrap_ci_means(table, variable, classes, repetitions):
+
+    """Bootstrap approximate 95% confidence interval
+    for the difference between the means of the two classes
+    in the population"""
+
+    t = table.select(variable, classes)
+
+    # Resample the rows with replacement and record the difference
+    # between the two group means for each bootstrap sample
+    mean_diffs = make_array()
+    for i in np.arange(repetitions):
+        bootstrap_sample = t.sample()
+        m_tbl = bootstrap_sample.group(classes, np.mean)
+        new_stat = m_tbl.column(1).item(0) - m_tbl.column(1).item(1)
+        mean_diffs = np.append(mean_diffs, new_stat)
+
+    # The middle 95% of the bootstrap differences gives the interval
+    left = percentile(2.5, mean_diffs)
+    right = percentile(97.5, mean_diffs)
+
+    # Find the observed test statistic
+    means_table = t.group(classes, np.mean)
+    obs_stat = means_table.column(1).item(0) - means_table.column(1).item(1)
+
+    # Plot the bootstrap distribution and mark the interval in yellow
+    Table().with_column('Difference Between Means', mean_diffs).hist(bins=20)
+    plots.plot(make_array(left, right), make_array(0, 0), color='yellow', lw=8)
+    print('Observed difference between means:', obs_stat)
+    print('Approximate 95% CI for the difference between means:')
+    print(left, 'to', right)
+
+bootstrap_ci_means(baby, 'Birth Weight', 'Maternal Smoker', 5000)
+Observed difference between means: 9.266142572024918
+Approximate 95% CI for the difference between means:
+7.23940878698 to 11.3907887554
+
+```
+
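+On average, the babies of non-smoking mothers are about 7.2 to 11.4 ounces heavier than the babies of mothers who smoke. This is a more useful conclusion than "the two distributions are different." Because the confidence interval does not contain 0, it also tells us that the two distributions are different. So the confidence interval gives us an estimate of the difference between the means and also lets us decide whether the two underlying distributions are the same.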
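For readers without the `datascience` library, the same percentile-bootstrap idea can be sketched with plain NumPy. This is an illustration under stated assumptions, not the function from the text: the name `bootstrap_ci_mean_diff`, its parameters, and the synthetic data are all made up for this example, and `np.percentile` interpolates between values, so its endpoints can differ slightly from those of the `percentile` function used above.

```python
import numpy as np

def bootstrap_ci_mean_diff(values, labels, repetitions=2000, level=95, seed=0):
    """Percentile-bootstrap interval for mean(first group) - mean(second group).

    Hypothetical helper: groups are ordered by sorted label, so with
    labels False/True the difference is mean(False) - mean(True).
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")

    rng = np.random.default_rng(seed)
    n = len(values)
    diffs = []
    while len(diffs) < repetitions:
        # Resample whole rows (value and label together) with replacement
        idx = rng.integers(0, n, size=n)
        vals, labs = values[idx], labels[idx]
        first = vals[labs == groups[0]]
        second = vals[labs == groups[1]]
        if len(first) == 0 or len(second) == 0:
            continue  # skip a resample that lost one group entirely
        diffs.append(first.mean() - second.mean())

    # Keep the middle `level` percent of the bootstrap differences
    tail = (100 - level) / 2
    left, right = np.percentile(diffs, [tail, 100 - tail])
    return float(left), float(right)

# Synthetic data for illustration only (not the `baby` table from the text)
data_rng = np.random.default_rng(1)
smoker = np.array([False] * 300 + [True] * 200)
weight = np.where(smoker,
                  data_rng.normal(114, 18, 500),
                  data_rng.normal(123, 17, 500))
left, right = bootstrap_ci_mean_diff(weight, smoker)
print(left, "to", right)
```

If the printed interval excludes 0, the reasoning in the text applies: the data are inconsistent with the two groups having the same mean.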
+
+The non-smoking mothers were, on average, a little older than the mothers who smoked.
+
+```py
+bootstrap_ci_means(baby, 'Maternal Age', 'Maternal Smoker', 5000)
+Observed difference between means: 0.8076725017901509
+Approximate 95% CI for the difference between means:
+0.154278698588 to 1.4701157656
+```
+
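+But, not surprisingly, the evidence does not indicate that the average height of the mothers who smoked differs from that of the mothers who did not. Zero is in the confidence interval for the difference between the means.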
+
+```py
+bootstrap_ci_means(baby, 'Maternal Height', 'Maternal Smoker', 5000)
+Observed difference between means: 0.09058914941267915
+Approximate 95% CI for the difference between means:
+-0.390841928035 to 0.204388297872
+```
+
+In summary:
+
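+If you want to tell whether two underlying distributions are the same, you can use a permutation test with an appropriate test statistic. We used the total variation distance when the distributions were categorical, and the absolute difference between the means when the distributions were numerical.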
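As a reminder of what the permutation test for numerical data involves, here is a minimal NumPy sketch that uses the absolute difference between the group means as the test statistic. The function name `permutation_test_mean_diff`, its parameters, and the toy data are assumptions made for this illustration; the chapter's own test was written with the `datascience` library.

```python
import numpy as np

def permutation_test_mean_diff(values, labels, repetitions=2000, seed=0):
    """Return (observed statistic, empirical P-value).

    Hypothetical helper. The statistic is |mean(group 1) - mean(group 2)|.
    Under the null hypothesis the labels are exchangeable, so shuffling
    them simulates the statistic when the two distributions are the same.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")

    def statistic(current_labels):
        first = values[current_labels == groups[0]]
        second = values[current_labels == groups[1]]
        return abs(first.mean() - second.mean())

    observed = statistic(labels)
    rng = np.random.default_rng(seed)
    simulated = np.array([statistic(rng.permutation(labels))
                          for _ in range(repetitions)])

    # Proportion of shuffles at least as extreme as the observed statistic
    p_value = np.count_nonzero(simulated >= observed) / repetitions
    return float(observed), float(p_value)

# Toy data for illustration only: two clearly separated groups
toy_values = list(range(10)) + list(range(100, 110))
toy_labels = ["A"] * 10 + ["B"] * 10
observed, p_value = permutation_test_mean_diff(toy_values, toy_labels)
print("Observed statistic:", observed)
print("Empirical P-value:", p_value)
```

Shuffling preserves the two group sizes, so no shuffled sample can lose a group. A total variation distance statistic for categorical data would follow the same shuffle-and-compare pattern with a different `statistic` function.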
+
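+To compare two numerical distributions, it is often more informative to replace hypothesis testing with estimation. Just estimate a difference, such as the difference between the means of the two groups. This can be done by constructing a bootstrap confidence interval. If zero is not in the interval, you can conclude that the two distributions are different, and you also have an estimate of how different the means are.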
+