```
res9: Long = 15
15
```
It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is that these same functions can be used on very large data sets, even when they are striped across tens or hundreds of nodes. You can also do this interactively by connecting `bin/pyspark` to a cluster, as described in the [RDD programming guide](rdd-programming-guide.html#using-the-shell).
# Self-Contained Applications
```
$ YOUR_SPARK_HOME/bin/spark-submit \
  ...
Lines with a: 46, Lines with b: 23
```
Now we will show how to write an application using the Python API (PySpark).

As an example, we'll create a simple Spark application, `SimpleApp.py`:
```
"""SimpleApp.py"""
from pyspark.sql import SparkSession

logFile = "YOUR_SPARK_HOME/README.md"  # Should be some file on your system
spark = SparkSession.builder.appName("SimpleApp").getOrCreate()
logData = spark.read.text(logFile).cache()

numAs = logData.filter(logData.value.contains('a')).count()
numBs = logData.filter(logData.value.contains('b')).count()

print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
spark.stop()
```
This program just counts the number of lines containing 'a' and the number containing 'b' in a text file. Note that you'll need to replace YOUR_SPARK_HOME with the location where Spark is installed. As with the Scala and Java examples, we use a SparkSession to create Datasets. For applications that use custom classes or third-party libraries, we can also add code dependencies to `spark-submit` through its `--py-files` argument by packaging them into a .zip file (see `spark-submit --help` for details). `SimpleApp` is simple enough that we do not need to specify any code dependencies.
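To make the filter-and-count logic concrete before running it on a cluster, here is a minimal pure-Python sketch of the same computation, using a small hard-coded list of lines as an illustrative stand-in for the text file (the sample lines are hypothetical, not from the Spark README):

```python
# Illustrative stand-in for the lines of a text file.
lines = [
    "Apache Spark is a unified analytics engine",
    "It provides high-level APIs",
    "and an optimized engine",
]

# Count lines containing 'a' and lines containing 'b',
# mirroring what SimpleApp.py computes with Dataset filters.
num_as = sum(1 for line in lines if "a" in line)
num_bs = sum(1 for line in lines if "b" in line)

print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
# → Lines with a: 2, lines with b: 0
```

In the real application the same counts come from `logData.filter(...).count()` on a cached Dataset, so Spark can distribute the identical computation across a cluster.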
We can run this application using the `bin/spark-submit` script:
```
# Use spark-submit to run your application
$ YOUR_SPARK_HOME/bin/spark-submit \
  --master local[4] \
  SimpleApp.py
...
Lines with a: 46, Lines with b: 23
```