diff --git a/7.md b/7.md index 2b5c54052f44c9c1a26d474e03c2c056e958a638..96ee9138ad7a71a0564d63d5221a980461e486e0 100644 --- a/7.md +++ b/7.md @@ -496,13 +496,13 @@ cones.group('Flavor', sum) | chocolate | 16.55 | | strawberry | 8.8 | -为了创建这个新表格,`group`已经计算了对应于每种不同口味的,所有行中的`Price`条目的总和。 三个`chocolate`行的价格共计`$16.55`(您可以假设价格是以美元计量的)。 两个`strawberry`行的价格共计`8.80`。 +为了创建这个新表格,`group`已经计算了对应于每种不同口味的,所有行中的`Price`条目的总和。 三个`chocolate`行的价格共计`$16.55`(你可以假设价格是以美元计量的)。 两个`strawberry`行的价格共计`8.80`。 新创建的“总和”列的标签是`Price sum`,它通过使用被求和列的标签,并且附加单词`sum`创建。 由于`group`计算除了类别之外的所有列的`sum`,因此不需要指定必须对价格求和。 -为了更详细地了解`group`在做什么,请注意,您可以自己计算总价格,不仅可以通过心算,还可以使用代码。 例如,要查找所有巧克力圆筒的总价格,您可以开始创建一个仅包含巧克力圆筒的新表,然后访问价格列: +为了更详细地了解`group`在做什么,请注意,你可以自己计算总价格,不仅可以通过心算,还可以使用代码。 例如,要查找所有巧克力圆筒的总价格,你可以开始创建一个仅包含巧克力圆筒的新表,然后访问价格列: ```py cones.where('Flavor', are.equal_to('chocolate')).column('Price') @@ -540,7 +540,7 @@ price_totals | chocolate | [ 4.75 6.55 5.25] | 16.55 | | strawberry | [ 3.55 5.25] | 8.8 | -你可以用任何其他可以用于数组的函数来替换`sum`。 例如,您可以使用`max`来查找每个类别中的最大价格: +你可以用任何其他可以用于数组的函数来替换`sum`。 例如,你可以使用`max`来查找每个类别中的最大价格: ```py cones.group('Flavor', max) @@ -763,7 +763,7 @@ more_cones.group(['Flavor', 'Color'], sum) | strawberry | pink | 8.8 | -三个或更多的变量。 您可以使用`group`,按三个或更多类别变量对行分类。 只要将它们全部包含列表中,它是第一个参数。 但是由多个变量交叉分类可能会变得复杂,因为不同类别组合的数量可能相当大。 +三个或更多的变量。 你可以使用`group`,按三个或更多类别变量对行分类。 只要将它们全部包含列表中,它是第一个参数。 但是由多个变量交叉分类可能会变得复杂,因为不同类别组合的数量可能相当大。 ### 数据透视表:重新排列`group`的输出 @@ -1048,7 +1048,7 @@ table1.join(table1_column_for_joining, table2, table2_column_for_joining) ``` -新的`rated`表使我们能够计算出价格与星星的比值,您可以把它看作是非正式的价值衡量标准。 低的值更好 - 它们意味着你为评分的每个星星花费更少。 +新的`rated`表使我们能够计算出价格与星星的比值,你可以把它看作是非正式的价值衡量标准。 低的值更好 - 它们意味着你为评分的每个星星花费更少。 ```py rated.with_column('$/Star', rated.column('Price') / rated.column('Stars')).sort(3) @@ -1122,3 +1122,301 @@ cones.join('Flavor', average_review, 'Flavor') | vanilla | 4.75 | 5 | 注意草莓圆筒是如何消失的。 没有草莓圆筒的评价,所以没有草莓的行可以连接的东西。 这可能是一个问题,也可能不是 - 这取决于我们试图使用连接表执行的分析。 + +## 湾区共享单车 + +在本章结尾,我们通过使用我们学过的所有方法,来检验新的大型数据集。 我们还将介绍一个强大的可视化工具`map_table`。 + +湾区自行车共享服务公司在其系统中发布了一个数据集,描述了 2014 年 9 月到 2015 年 8 月期间的每个自行车的租赁。 总共有 354152 次出租。 表的列是: + ++ 租赁 ID ++ 租赁的时间,以秒为单位 ++ 开始日期 ++ 起点站的名称和起始终端的代码 ++ 终点站的名称和终止终端的代码 ++ 自行车的序列号 ++ 订阅者类型和邮政编码 + +```py +trips = Table.read_table('trip.csv') +trips +``` + + +| Trip ID | Duration | Start Date | Start Station | Start Terminal | End Date | End Station | End Terminal | Bike # | Subscriber Type | Zip Code | +| --- | --- | +| 913460 | 765 | 8/31/2015 23:26 | Harry Bridges Plaza (Ferry Building) | 50 | 8/31/2015 23:39 | San Francisco Caltrain (Townsend at 4th) | 70 | 288 | Subscriber | 2139 | +| 913459 | 1036 | 8/31/2015 23:11 | San Antonio Shopping Center | 31 | 8/31/2015 23:28 | Mountain View City Hall | 27 | 35 | Subscriber | 95032 | +| 913455 | 307 | 8/31/2015 23:13 | Post at Kearny | 47 | 8/31/2015 23:18 | 2nd at South Park | 64 | 468 | Subscriber | 94107 | +| 913454 | 409 | 8/31/2015 23:10 | San Jose City Hall | 10 | 8/31/2015 23:17 | San Salvador at 1st | 8 | 68 | Subscriber | 95113 | +| 913453 | 789 | 8/31/2015 23:09 | Embarcadero at Folsom | 51 | 8/31/2015 23:22 | Embarcadero at Sansome | 60 | 487 | Customer | 9069 | +| 913452 | 293 | 8/31/2015 23:07 | Yerba Buena Center of the Arts (3rd @ Howard) | 68 | 8/31/2015 23:12 | San Francisco Caltrain (Townsend at 4th) | 70 | 538 | Subscriber | 94118 | +| 913451 | 896 | 8/31/2015 23:07 | Embarcadero at Folsom | 51 | 8/31/2015 23:22 | Embarcadero at Sansome | 60 | 363 | Customer | 92562 | +| 913450 | 255 | 8/31/2015 22:16 | Embarcadero at Sansome | 60 | 8/31/2015 22:20 | Steuart at Market | 74 | 470 | Subscriber | 94111 | +| 913449 | 126 | 8/31/2015 22:12 | Beale at Market | 56 | 8/31/2015 22:15 | Temporary Transbay Terminal (Howard at Beale) | 55 | 439 | Subscriber | 94130 | +| 913448 | 932 | 8/31/2015 21:57 | Post at Kearny | 47 | 8/31/2015 22:12 | South Van Ness at Market | 66 | 472 | Subscriber | 94702 | + +(省略了 354142 行) + +我们只专注于免费行程,这是持续不到 1800 秒(半小时)的行程。 长途行程需要付费。 + +下面的直方图显示,大部分行程需要大约 10 分钟(600 秒)左右。 很少有人花了近 30 分钟(1800 秒),可能是因为人们试图在截止时间之前退还自行车,以免付费。 + +```py +commute = trips.where('Duration', are.below(1800)) +commute.hist('Duration', unit='Second') +``` + +我们可以通过指定更多的桶来获得更多的细节。 但整体形状并没有太大变化。 + +```py +commute.hist('Duration', bins=60, unit='Second') +``` + +### 使用`group `和`pivot`探索数据 + +我们可以使用`group `来识别最常用的起点站。 + +```py +starts = commute.group('Start Station').sort('count', descending=True) +starts +``` + + +| Start Station | count | +| --- | --- | +| San Francisco Caltrain (Townsend at 4th) | 25858 | +| San Francisco Caltrain 2 (330 Townsend) | 21523 | +| Harry Bridges Plaza (Ferry Building) | 15543 | +| Temporary Transbay Terminal (Howard at Beale) | 14298 | +| 2nd at Townsend | 13674 | +| Townsend at 7th | 13579 | +| Steuart at Market | 13215 | +| Embarcadero at Sansome | 12842 | +| Market at 10th | 11523 | +| Market at Sansome | 11023 | + +(省略了 60 行) + +大多数行程起始于 Townsend 的 Caltrain 站,和旧金山的四号车站。 人们乘坐火车进入城市,然后使用共享单车到达下一个目的地。 + +`group`方法也可以用于按照起点站和终点站,对租赁进行分类。 + +```py +commute.group(['Start Station', 'End Station']) +``` + + +| Start Station | End Station | count | +| --- | --- | +| 2nd at Folsom | 2nd at Folsom | 54 | +| 2nd at Folsom | 2nd at South Park | 295 | +| 2nd at Folsom | 2nd at Townsend | 437 | +| 2nd at Folsom | 5th at Howard | 113 | +| 2nd at Folsom | Beale at Market | 127 | +| 2nd at Folsom | Broadway St at Battery St | 67 | +| 2nd at Folsom | Civic Center BART (7th at Market) | 47 | +| 2nd at Folsom | Clay at Battery | 240 | +| 2nd at Folsom | Commercial at Montgomery | 128 | +| 2nd at Folsom | Davis at Jackson | 28 | + +(省略了 1619 行) + +共有五十四次行程开始和结束于在 Folsom 二号车站,。 有很多人(437 人)往返于 Folsom 二号和 Townsend 二号车站之间。 + +`pivot`方法执行相同的分类,但将结果显示在一个透视表中,该表显示了起点和终点站的所有可能组合,即使其中一些不对应任何行程。 请记住,`pivot`函数的第一个参数指定了数据透视表的列标签;第二个参数指定行标签。 + +在 Beale at Market 附近有一个火车站以及一个湾区快速公交(BART)站,解释了从那里开始和结束的大量行程。 + +```py +commute.pivot('Start Station', 'End Station') +``` + + +| End Station | 2nd at Folsom | 2nd at South Park | 2nd at Townsend | 5th at Howard | Adobe on Almaden | Arena Green / SAP Center | Beale at Market | Broadway St at Battery St | California Ave Caltrain Station | Castro Street and El Camino Real | Civic Center BART (7th at Market) | Clay at Battery | Commercial at Montgomery | Cowper at University | Davis at Jackson | Embarcadero at Bryant | Embarcadero at Folsom | Embarcadero at Sansome | Embarcadero at Vallejo | Evelyn Park and Ride | Franklin at Maple | Golden Gate at Polk | Grant Avenue at Columbus Avenue | Harry Bridges Plaza (Ferry Building) | Howard at 2nd | Japantown | MLK Library | Market at 10th | Market at 4th | Market at Sansome | Mechanics Plaza (Market at Battery) | Mezes Park | Mountain View Caltrain Station | Mountain View City Hall | Palo Alto Caltrain Station | Park at Olive | Paseo de San Antonio | Post at Kearny | Powell Street BART | Powell at Post (Union Square) | Redwood City Caltrain Station | Redwood City Medical Center | Redwood City Public Library | Rengstorff Avenue / California Street | Ryland Park | SJSU - San Salvador at 9th | SJSU 4th at San Carlos | San Antonio Caltrain Station | San Antonio Shopping Center | San Francisco Caltrain (Townsend at 4th) | San Francisco Caltrain 2 (330 Townsend) | San Francisco City Hall | San Jose City Hall | San Jose Civic Center | San Jose Diridon Caltrain Station | San Mateo County Center | San Pedro Square | San Salvador at 1st | Santa Clara County Civic Center | Santa Clara at Almaden | South Van Ness at Market | Spear at Folsom | St James Park | Stanford in Redwood City | Steuart at Market | Temporary Transbay Terminal (Howard at Beale) | Townsend at 7th | University and Emerson | Washington at Kearny | Yerba Buena Center of the Arts (3rd @ Howard) | +| --- | --- | +| 2nd at Folsom | 54 | 190 | 554 | 107 | 0 | 0 | 40 | 21 | 0 | 0 | 44 | 78 | 54 | 0 | 9 | 77 | 32 | 41 | 14 | 0 | 0 | 11 | 30 | 416 | 53 | 0 | 0 | 169 | 114 | 302 | 33 | 0 | 0 | 0 | 0 | 0 | 0 | 60 | 121 | 88 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 694 | 445 | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 38 | 57 | 0 | 0 | 39 | 237 | 342 | 0 | 17 | 31 | +| 2nd at South Park | 295 | 164 | 71 | 180 | 0 | 0 | 208 | 85 | 0 | 0 | 112 | 87 | 160 | 0 | 37 | 56 | 178 | 83 | 116 | 0 | 0 | 57 | 73 | 574 | 500 | 0 | 0 | 139 | 199 | 1633 | 119 | 0 | 0 | 0 | 0 | 0 | 0 | 299 | 84 | 113 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 559 | 480 | 48 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 66 | 152 | 0 | 0 | 374 | 429 | 143 | 0 | 63 | 209 | +| 2nd at Townsend | 437 | 151 | 185 | 92 | 0 | 0 | 608 | 350 | 0 | 0 | 80 | 329 | 168 | 0 | 386 | 361 | 658 | 506 | 254 | 0 | 0 | 27 | 315 | 2607 | 295 | 0 | 0 | 110 | 225 | 845 | 177 | 0 | 0 | 0 | 0 | 0 | 0 | 120 | 100 | 141 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 905 | 299 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 72 | 508 | 0 | 0 | 2349 | 784 | 417 | 0 | 57 | 166 | +| 5th at Howard | 113 | 177 | 148 | 83 | 0 | 0 | 59 | 130 | 0 | 0 | 203 | 76 | 129 | 0 | 30 | 57 | 49 | 166 | 54 | 0 | 0 | 85 | 78 | 371 | 478 | 0 | 0 | 303 | 158 | 168 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 93 | 183 | 169 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 690 | 1859 | 48 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 116 | 102 | 0 | 0 | 182 | 750 | 200 | 0 | 43 | 267 | +| Adobe on Almaden | 0 | 0 | 0 | 0 | 11 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 17 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 16 | 0 | 0 | 0 | 0 | 0 | 19 | 23 | 265 | 0 | 20 | 4 | 5 | 10 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| Arena Green / SAP Center | 0 | 0 | 0 | 0 | 7 | 64 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 16 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 24 | 3 | 7 | 0 | 0 | 0 | 0 | 0 | 6 | 20 | 7 | 0 | 56 | 12 | 38 | 259 | 0 | 0 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| Beale at Market | 127 | 79 | 183 | 59 | 0 | 0 | 59 | 661 | 0 | 0 | 201 | 75 | 101 | 0 | 247 | 178 | 38 | 590 | 165 | 0 | 0 | 54 | 435 | 57 | 72 | 0 | 0 | 286 | 236 | 163 | 26 | 0 | 0 | 0 | 0 | 0 | 0 | 49 | 227 | 179 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 640 | 269 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 243 | 128 | 0 | 0 | 16 | 167 | 35 | 0 | 64 | 45 | +| Broadway St at Battery St | 67 | 89 | 279 | 119 | 0 | 0 | 1022 | 110 | 0 | 0 | 62 | 283 | 226 | 0 | 191 | 198 | 79 | 231 | 35 | 0 | 0 | 5 | 70 | 168 | 49 | 0 | 0 | 32 | 97 | 341 | 214 | 0 | 0 | 0 | 0 | 0 | 0 | 169 | 71 | 218 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 685 | 438 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 18 | 106 | 0 | 0 | 344 | 748 | 50 | 0 | 79 | 47 | +| California Ave Caltrain Station | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 38 | 1 | 0 | 0 | 0 | 29 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 192 | 40 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 17 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 57 | 0 | 0 | +| Castro Street and El Camino Real | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 931 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 4 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + +(省略了 60 行) + +我们也可以使用`pivot`来寻找起点和终点站之间的最短骑行时间。 在这里,`pivot`已经接受了可选参数`Duration`,以及`min`函数,它对每个单元格中的值执行。 + +```py +commute.pivot('Start Station', 'End Station', 'Duration', min) +``` + + +| End Station | 2nd at Folsom | 2nd at South Park | 2nd at Townsend | 5th at Howard | Adobe on Almaden | Arena Green / SAP Center | Beale at Market | Broadway St at Battery St | California Ave Caltrain Station | Castro Street and El Camino Real | Civic Center BART (7th at Market) | Clay at Battery | Commercial at Montgomery | Cowper at University | Davis at Jackson | Embarcadero at Bryant | Embarcadero at Folsom | Embarcadero at Sansome | Embarcadero at Vallejo | Evelyn Park and Ride | Franklin at Maple | Golden Gate at Polk | Grant Avenue at Columbus Avenue | Harry Bridges Plaza (Ferry Building) | Howard at 2nd | Japantown | MLK Library | Market at 10th | Market at 4th | Market at Sansome | Mechanics Plaza (Market at Battery) | Mezes Park | Mountain View Caltrain Station | Mountain View City Hall | Palo Alto Caltrain Station | Park at Olive | Paseo de San Antonio | Post at Kearny | Powell Street BART | Powell at Post (Union Square) | Redwood City Caltrain Station | Redwood City Medical Center | Redwood City Public Library | Rengstorff Avenue / California Street | Ryland Park | SJSU - San Salvador at 9th | SJSU 4th at San Carlos | San Antonio Caltrain Station | San Antonio Shopping Center | San Francisco Caltrain (Townsend at 4th) | San Francisco Caltrain 2 (330 Townsend) | San Francisco City Hall | San Jose City Hall | San Jose Civic Center | San Jose Diridon Caltrain Station | San Mateo County Center | San Pedro Square | San Salvador at 1st | Santa Clara County Civic Center | Santa Clara at Almaden | South Van Ness at Market | Spear at Folsom | St James Park | Stanford in Redwood City | Steuart at Market | Temporary Transbay Terminal (Howard at Beale) | Townsend at 7th | University and Emerson | Washington at Kearny | Yerba Buena Center of the Arts (3rd @ Howard) | +| --- | --- | +| 2nd at Folsom | 61 | 97 | 164 | 268 | 0 | 0 | 271 | 407 | 0 | 0 | 483 | 329 | 306 | 0 | 494 | 239 | 262 | 687 | 599 | 0 | 0 | 639 | 416 | 282 | 80 | 0 | 0 | 506 | 237 | 167 | 250 | 0 | 0 | 0 | 0 | 0 | 0 | 208 | 264 | 290 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 300 | 303 | 584 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 590 | 208 | 0 | 0 | 318 | 149 | 448 | 0 | 429 | 165 | +| 2nd at South Park | 61 | 60 | 77 | 86 | 0 | 0 | 78 | 345 | 0 | 0 | 290 | 188 | 171 | 0 | 357 | 104 | 81 | 490 | 341 | 0 | 0 | 369 | 278 | 122 | 60 | 0 | 0 | 416 | 142 | 61 | 68 | 0 | 0 | 0 | 0 | 0 | 0 | 60 | 237 | 106 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 63 | 66 | 458 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 399 | 63 | 0 | 0 | 79 | 61 | 78 | 0 | 270 | 96 | +| 2nd at Townsend | 137 | 67 | 60 | 423 | 0 | 0 | 311 | 469 | 0 | 0 | 546 | 520 | 474 | 0 | 436 | 145 | 232 | 509 | 494 | 0 | 0 | 773 | 549 | 325 | 221 | 0 | 0 | 667 | 367 | 265 | 395 | 0 | 0 | 0 | 0 | 0 | 0 | 319 | 455 | 398 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 125 | 133 | 742 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 777 | 241 | 0 | 0 | 291 | 249 | 259 | 0 | 610 | 284 | +| 5th at Howard | 215 | 300 | 384 | 68 | 0 | 0 | 357 | 530 | 0 | 0 | 179 | 412 | 364 | 0 | 543 | 419 | 359 | 695 | 609 | 0 | 0 | 235 | 474 | 453 | 145 | 0 | 0 | 269 | 161 | 250 | 306 | 0 | 0 | 0 | 0 | 0 | 0 | 234 | 89 | 202 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 256 | 221 | 347 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 375 | 402 | 0 | 0 | 455 | 265 | 357 | 0 | 553 | 109 | +| Adobe on Almaden | 0 | 0 | 0 | 0 | 84 | 275 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 701 | 387 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 229 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 441 | 452 | 318 | 0 | 0 | 0 | 0 | 0 | 309 | 146 | 182 | 0 | 207 | 358 | 876 | 101 | 0 | 0 | 369 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| Arena Green / SAP Center | 0 | 0 | 0 | 0 | 305 | 62 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 526 | 546 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 403 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 288 | 875 | 685 | 0 | 0 | 0 | 0 | 0 | 440 | 420 | 153 | 0 | 166 | 624 | 759 | 116 | 0 | 0 | 301 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| Beale at Market | 219 | 343 | 417 | 387 | 0 | 0 | 60 | 155 | 0 | 0 | 343 | 122 | 153 | 0 | 115 | 216 | 170 | 303 | 198 | 0 | 0 | 437 | 235 | 149 | 204 | 0 | 0 | 535 | 203 | 88 | 72 | 0 | 0 | 0 | 0 | 0 | 0 | 191 | 316 | 191 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 499 | 395 | 526 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 575 | 173 | 0 | 0 | 87 | 94 | 619 | 0 | 222 | 264 | +| Broadway St at Battery St | 351 | 424 | 499 | 555 | 0 | 0 | 195 | 62 | 0 | 0 | 520 | 90 | 129 | 0 | 70 | 340 | 284 | 128 | 101 | 0 | 0 | 961 | 148 | 168 | 357 | 0 | 0 | 652 | 351 | 218 | 221 | 0 | 0 | 0 | 0 | 0 | 0 | 255 | 376 | 316 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 611 | 599 | 799 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 738 | 336 | 0 | 0 | 169 | 291 | 885 | 0 | 134 | 411 | +| California Ave Caltrain Station | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 82 | 1645 | 0 | 0 | 0 | 628 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1771 | 0 | 484 | 131 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1077 | 0 | 0 | 0 | 870 | 911 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 531 | 0 | 0 | +| Castro Street and El Camino Real | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 74 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 499 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 201 | 108 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 654 | 0 | 0 | 0 | 953 | 696 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + +(省略了 60 行) + +有人从 Folsom 的二号车站骑到 Beale at Market,距离大约五个街区,非常快(271 秒,约 4.5 分钟)。 2nd Avenue 和 Almaden 的 Adobe 之间没有自行车行程,因为后者在不同的城市。 + +### 绘制地图 + +`stations`表包含每个自行车站的地理信息,包括纬度,经度和“地标”,它是该站所在城市的名称。 + +```py +stations = Table.read_table('station.csv') +stations +``` + +| station_id | name | lat | long | dockcount | landmark | installation | +| --- | --- | +| 2 | San Jose Diridon Caltrain Station | 37.3297 | -121.902 | 27 | San Jose | 8/6/2013 | +| 3 | San Jose Civic Center | 37.3307 | -121.889 | 15 | San Jose | 8/5/2013 | +| 4 | Santa Clara at Almaden | 37.334 | -121.895 | 11 | San Jose | 8/6/2013 | +| 5 | Adobe on Almaden | 37.3314 | -121.893 | 19 | San Jose | 8/5/2013 | +| 6 | San Pedro Square | 37.3367 | -121.894 | 15 | San Jose | 8/7/2013 | +| 7 | Paseo de San Antonio | 37.3338 | -121.887 | 15 | San Jose | 8/7/2013 | +| 8 | San Salvador at 1st | 37.3302 | -121.886 | 15 | San Jose | 8/5/2013 | +| 9 | Japantown | 37.3487 | -121.895 | 15 | San Jose | 8/5/2013 | +| 10 | San Jose City Hall | 37.3374 | -121.887 | 15 | San Jose | 8/6/2013 | +| 11 | MLK Library | 37.3359 | -121.886 | 19 | San Jose | 8/6/2013 | + +(省略了 60 行) + +我们可以使用`Marker.map_table`来绘制一个车站所在位置的地图。 该函数在一个表格上进行操作,该表格的列依次是纬度,经度以及每个点的可选标识符。 + +```py +Marker.map_table(stations.select('lat', 'long', 'name')) +``` + +地图使用 OpenStreetMap 创建的,OpenStreetMap 是一个开放的在线地图系统,你可以像使用 Google 地图或任何其他在线地图一样使用。 放大到旧金山,看看车站如何分布。 点击一个标记,看看它是哪个站。 + +你也可以用彩色圆圈表示地图上的点。 这是旧金山自行车站的地图。 + +```py +sf = stations.where('landmark', are.equal_to('San Francisco')) +sf_map_data = sf.select('lat', 'long', 'name') +Circle.map_table(sf_map_data, color='green', radius=200) +``` + +### 更多信息的地图:`join`的应用 + +自行车站位于湾区五个不同的城市。 为了区分每个城市,通过使用不同的颜色,我们首先使用`group`来标识所有城市,并为每个城市分配一个颜色。 + +```py +cities = stations.group('landmark').relabeled('landmark', 'city') +cities +``` + + +| city | count | +| --- | --- | +| Mountain View | 7 | +| Palo Alto | 5 | +| Redwood City | 7 | +| San Francisco | 35 | +| San Jose | 16 | + +```py +colors = cities.with_column('color', make_array('blue', 'red', 'green', 'orange', 'purple')) +colors +``` + +| city | count | color | +| --- | --- | +| Mountain View | 7 | blue | +| Palo Alto | 5 | red | +| Redwood City | 7 | green | +| San Francisco | 35 | orange | +| San Jose | 16 | purple | + +现在我们可以按照`landmark`连接`stations`和`colors`,之后选取绘制地图所需的列。 + +```py +joined = stations.join('landmark', colors, 'city') +colored = joined.select('lat', 'long', 'name', 'color') +Marker.map_table(colored) + +``` + +现在五个不同城市由五种不同颜色标记。 + +要查看大部分自行车租赁的来源,让我们确定起点站: + +```py +starts = commute.group('Start Station').sort('count', descending=True) +starts +``` + +| Start Station | count | +| --- | --- | +| San Francisco Caltrain (Townsend at 4th) | 25858 | +| San Francisco Caltrain 2 (330 Townsend) | 21523 | +| Harry Bridges Plaza (Ferry Building) | 15543 | +| Temporary Transbay Terminal (Howard at Beale) | 14298 | +| 2nd at Townsend | 13674 | +| Townsend at 7th | 13579 | +| Steuart at Market | 13215 | +| Embarcadero at Sansome | 12842 | +| Market at 10th | 11523 | +| Market at Sansome | 11023 | + +(省略了 60 行) + +我们可以包含映射这些车站所需的地理数据,首先连接`starts `的`stations`: + +```py +station_starts = stations.join('name', starts, 'Start Station') +station_starts +``` + + +| name | station_id | lat | long | dockcount | landmark | installation | count | +| --- | --- | +| 2nd at Folsom | 62 | 37.7853 | -122.396 | 19 | San Francisco | 8/22/2013 | 7841 | +| 2nd at South Park | 64 | 37.7823 | -122.393 | 15 | San Francisco | 8/22/2013 | 9274 | +| 2nd at Townsend | 61 | 37.7805 | -122.39 | 27 | San Francisco | 8/22/2013 | 13674 | +| 5th at Howard | 57 | 37.7818 | -122.405 | 15 | San Francisco | 8/21/2013 | 7394 | +| Adobe on Almaden | 5 | 37.3314 | -121.893 | 19 | San Jose | 8/5/2013 | 522 | +| Arena Green / SAP Center | 14 | 37.3327 | -121.9 | 19 | San Jose | 8/5/2013 | 590 | +| Beale at Market | 56 | 37.7923 | -122.397 | 19 | San Francisco | 8/20/2013 | 8135 | +| Broadway St at Battery St | 82 | 37.7985 | -122.401 | 15 | San Francisco | 1/22/2014 | 7460 | +| California Ave Caltrain Station | 36 | 37.4291 | -122.143 | 15 | Palo Alto | 8/14/2013 | 300 | +| Castro Street and El Camino Real | 32 | 37.386 | -122.084 | 11 | Mountain View | 12/31/2013 | 1137 | + +(省略了 58 行) + +现在我们只提取绘制地图所需的数据,为每个站添加一个颜色和一个面积。 面积是起始于每个站点的租用次数的 1000 倍,其中选择了常数 1000,以便在地图上以适当的比例绘制圆圈。 + +```py +starts_map_data = station_starts.select('lat', 'long', 'name').with_columns( + 'color', 'blue', + 'area', station_starts.column('count') * 1000 +) +starts_map_data.show(3) +Circle.map_table(starts_map_data) +``` + + +| lat | long | name | color | area | +| --- | --- | +| 37.7853 | -122.396 | 2nd at Folsom | blue | 7841000 | +| 37.7823 | -122.393 | 2nd at South Park | blue | 9274000 | +| 37.7805 | -122.39 | 2nd at Townsend | blue | 13674000 | + +(省略了 65 行) + +旧金山的一大块表明,这个城市的东部是湾区自行车租赁的重点区域。