From 8ec47f17be0b20e5204d309f72b0bec9b234a7fb Mon Sep 17 00:00:00 2001
From: Maximilian Michels
Date: Wed, 4 May 2016 19:44:57 +0200
Subject: [PATCH] [FLINK-3876] improve documentation of Scala Shell

- restructure sections
- improve readability
---
 docs/apis/scala_shell.md | 138 +++++++++++++++++++++++++++++++--------
 1 file changed, 112 insertions(+), 26 deletions(-)

diff --git a/docs/apis/scala_shell.md b/docs/apis/scala_shell.md
index a815f18aa70..0377f5a4aea 100644
--- a/docs/apis/scala_shell.md
+++ b/docs/apis/scala_shell.md
@@ -25,10 +25,8 @@ under the License.
 
 Flink comes with an integrated interactive Scala Shell.
-It can be used in a local setup as well as in a cluster setup. To get started with downloading
-Flink and setting up a cluster please refer to
-[local setup]({{ site.baseurl }}/setup/local_setup.html) or
-[cluster setup]({{ site.baseurl }}/setup/cluster_setup.html)
+It can be used in a local setup as well as in a cluster setup.
+
 To use the shell with an integrated Flink cluster just execute:
 
@@ -36,14 +34,9 @@ To use the shell with an integrated Flink cluster just execute:
 ~~~bash
 bin/start-scala-shell.sh local
 ~~~
 
-in the root directory of your binary Flink directory.
-
-To use it with a running cluster start the scala shell with the keyword `remote`
-and supply the host and port of the JobManager with:
+in the root directory of your Flink binary distribution. To run the shell on a
+cluster, please see the Setup section below.
 
-~~~bash
-bin/start-scala-shell.sh remote <hostname> <portnumber>
-~~~
 
 ## Usage
 
@@ -51,6 +44,8 @@ The shell supports Batch and Streaming.
 Two different ExecutionEnvironments are automatically prebound after startup.
 Use "benv" and "senv" to access the Batch and Streaming environment respectively.
 
+### DataSet API
+
 The following example will execute the wordcount program in the Scala shell:
 
 ~~~scala
@@ -59,7 +54,9 @@ Scala-Flink> val text = benv.fromElements(
   "Whether 'tis nobler in the mind to suffer",
   "The slings and arrows of outrageous fortune",
   "Or to take arms against a sea of troubles,")
-Scala-Flink> val counts = text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
+Scala-Flink> val counts = text
+    .flatMap { _.toLowerCase.split("\\W+") }
+    .map { (_, 1) }.groupBy(0).sum(1)
 Scala-Flink> counts.print()
 ~~~
 
@@ -71,7 +68,9 @@ It is possbile to write results to a file. However, in this case you need to cal
 
 ~~~scala
 Scala-Flink> benv.execute("MyProgram")
 ~~~
 
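+For example, to write the word counts from above to a file instead of
+printing them, something like the following can be used (a minimal sketch;
+the output path is only an illustration):
+
+~~~scala
+// writeAsText only declares the sink; the job runs once benv.execute is called
+Scala-Flink> counts.writeAsText("/tmp/wordcount-out")
+Scala-Flink> benv.execute("WordCount to file")
+~~~
+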
-The Batch program above can be executed using the Streaming API through:
+### DataStream API
+
+Similar to the batch program above, we can execute a streaming program through the DataStream API:
 
 ~~~scala
 Scala-Flink> val textStreaming = senv.fromElements(
@@ -79,33 +78,120 @@ Scala-Flink> val textStreaming = senv.fromElements(
   "Whether 'tis nobler in the mind to suffer",
   "The slings and arrows of outrageous fortune",
   "Or to take arms against a sea of troubles,")
- Scala-Flink> val countsStreaming = textStreaming.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.keyBy(0).sum(1)
- Scala-Flink> countsStreaming.print()
- Scala-Flink> senv.execute("Streaming Wordcount")
+Scala-Flink> val countsStreaming = textStreaming
+    .flatMap { _.toLowerCase.split("\\W+") }
+    .map { (_, 1) }.keyBy(0).sum(1)
+Scala-Flink> countsStreaming.print()
+Scala-Flink> senv.execute("Streaming Wordcount")
 ~~~
 
-Note, that in the Streaming case, the print operation does not trigger execution directly.
+Note that in the streaming case, the print operation does not trigger execution directly.
 
-The Flink Shell comes with command history and autocompletion.
+The Flink Shell comes with command history and auto-completion.
+
+
+## Adding external dependencies
 
-## Scala Shell with Flink on YARN
+It is possible to add external classpaths to the Scala shell. These are sent to the JobManager automatically, alongside your shell program, when calling `execute`.
 
-The Scala shell can connect Flink cluster on YARN. To connect deployed Flink cluster on YARN, use following command:
+Use the parameter `-a <path/to/jar.jar>` or `--addclasspath <path/to/jar.jar>` to load additional classes.
 
 ~~~bash
-bin/start-scala-shell.sh yarn
+bin/start-scala-shell.sh [local | remote | yarn] --addclasspath <path/to/jar.jar>
 ~~~
 
-The shell reads the connection information of the deployed Flink cluster from the `.yarn-properties` file, which is created in the configured `yarn.properties-file.location` directory or the temporary directory. If there is no deployed Flink cluster on YARN, the shell prints an error message.
 
-The shell can deploy a Flink cluster to YARN, which is used exclusively by the shell. The number of YARN containers can be controlled by the parameter `-n <arg>`. The shell deploys a new Flink cluster on YARN and connects the cluster. You can also specify options for YARN cluster such as memory for JobManager, name of YARN application, etc..
+## Setup
 
-## Adding external dependencies
+To get an overview of what options the Scala Shell provides, please use:
 
-It is possible to add external classpaths to the Scala-shell. These will be sent to the Jobmanager automatically alongside your shell program, when calling execute.
+~~~bash
+bin/start-scala-shell.sh --help
+~~~
 
-Use the parameter `-a <path/to/jar.jar>` or `--addclasspath <path/to/jar.jar>` to load additional classes.
+### Local
+
+To use the shell with an integrated Flink cluster just execute:
 
 ~~~bash
-bin/start-scala-shell.sh [local | remote | yarn] --addclasspath <path/to/jar.jar>
+bin/start-scala-shell.sh local
+~~~
+
+
+### Remote
+
+To use it with a running cluster, start the Scala shell with the keyword `remote`
+and supply the host and port of the JobManager:
+
+~~~bash
+bin/start-scala-shell.sh remote <hostname> <portnumber>
+~~~
+
+### Yarn Scala Shell cluster
+
+The shell can deploy a Flink cluster to YARN, which is used exclusively by the
+shell. The number of YARN containers can be controlled by the parameter `-n <arg>`.
+The shell deploys a new Flink cluster on YARN and connects to the
+cluster. You can also specify options for the YARN cluster, such as the memory
+for the JobManager or the name of the YARN application.
+
+For example, to start a Yarn cluster for the Scala Shell with two TaskManagers,
+use the following:
+
+~~~bash
+bin/start-scala-shell.sh yarn -n 2
+~~~
+
+For all other options, see the full reference at the bottom.
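+
+Several of these options can be combined in one invocation, for instance along
+the following lines (a sketch; the values shown are only illustrative):
+
+~~~bash
+# 2 TaskManagers with 2 slots and 1024 MB each; values are illustrative
+bin/start-scala-shell.sh yarn -n 2 -s 2 -tm 1024 --name "Scala Shell on YARN"
+~~~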
+
+
+### Yarn Session
+
+If you have previously deployed a Flink cluster using the Flink Yarn Session,
+the Scala shell can connect to it with the following command:
+
+~~~bash
+bin/start-scala-shell.sh yarn
+~~~
+
+
+## Full Reference
+
+~~~bash
+Flink Scala Shell
+Usage: start-scala-shell.sh [local|remote|yarn] [options] <args>...
+
+Command: local [options]
+Starts Flink scala shell with a local Flink cluster
+  -a <path/to/jar> | --addclasspath <path/to/jar>
+        Specifies additional jars to be used in Flink
+Command: remote [options] <host> <port>
+Starts Flink scala shell connecting to a remote cluster
+  <host>
+        Remote host name as string
+  <port>
+        Remote port as integer
+
+  -a <path/to/jar> | --addclasspath <path/to/jar>
+        Specifies additional jars to be used in Flink
+Command: yarn [options]
+Starts Flink scala shell connecting to a yarn cluster
+  -n arg | --container arg
+        Number of YARN container to allocate (= Number of TaskManagers)
+  -jm arg | --jobManagerMemory arg
+        Memory for JobManager container [in MB]
+  -nm <value> | --name <value>
+        Set a custom name for the application on YARN
+  -qu <arg> | --queue <arg>
+        Specifies YARN queue
+  -s <arg> | --slots <arg>
+        Number of slots per TaskManager
+  -tm <arg> | --taskManagerMemory <arg>
+        Memory per TaskManager container [in MB]
+  -a <path/to/jar> | --addclasspath <path/to/jar>
+        Specifies additional jars to be used in Flink
+  --configDir <value>
+        The configuration directory.
+  -h | --help
+        Prints this usage text
+~~~
-- 
GitLab