val df = spark.read.format("org.apache.iotdb.sparkdb").option("url","jdbc:iotdb://127.0.0.1:6667/").option("sql","select * from root").
val df = spark.read.format("org.apache.iotdb.spark.db").option("url","jdbc:iotdb://127.0.0.1:6667/").option("sql","select * from root").
option("lowerBound", [lower bound of time that you want query(include)]).option("upperBound", [upper bound of time that you want query(include)]).
option("numPartition", [the partition number you want]).load
...
...
You can also use the narrow table form as follows (see Part 4 about how to use the narrow form):
```
import org.apache.iotdb.spark.db._
val wide_df = spark.read.format("org.apache.iotdb.sparkdb").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root where time < 1100 and time > 1000").load
val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root where time < 1100 and time > 1000").load
val narrow_df = Transformer.toNarrowForm(spark, wide_df)
```
This project provides a connector that reads data from IoTDB and sends it to Grafana (https://grafana.com/). Before using this tool, make sure Grafana and IoTDB are correctly installed and started.
If you use Unix, Grafana will start automatically after installation, or you can run the `sudo service grafana-server start` command. See more information [here](http://docs.grafana.org/installation/debian/).
If you use Mac and `homebrew` to install Grafana, you can use `homebrew` to start Grafana.
First make sure homebrew/services is installed by running `brew tap homebrew/services`, then start Grafana using: `brew services start grafana`.
See more information [here](http://docs.grafana.org/installation/mac/).
If you use Windows, start Grafana by executing grafana-server.exe, located in the bin directory, preferably from the command line. See more information [here](http://docs.grafana.org/installation/windows/).
Copy `application.properties` from the `conf/` directory to the `target` directory. (Or just make sure that `application.properties` and `iotdb-grafana-{version}.war` are in the same directory.)
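For reference, a minimal `application.properties` for this connector usually looks something like the sketch below. The keys are the standard Spring Boot datasource settings; the port and credentials shown here are illustrative defaults, so check the file shipped under `conf/` for the exact values.
```
# IoTDB JDBC connection used by the Grafana connector (values are illustrative)
spring.datasource.url=jdbc:iotdb://127.0.0.1:6667/
spring.datasource.username=root
spring.datasource.password=root
spring.datasource.driver-class-name=org.apache.iotdb.jdbc.IoTDBDriver
# port on which the connector itself listens
server.port=8888
```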
TsFile-Hadoop-Connector implements Hadoop support for external data sources of the TsFile type. This enables users to read, write, and query TsFiles with Hadoop.
With this connector, you can
* load a single TsFile, from either the local file system or HDFS, into Hadoop
* load all files in a specific directory, from either the local file system or HDFS, into Hadoop
* write data from Hadoop into TsFile
## System Requirements
|Hadoop Version | Java Version | TsFile Version|
|------------- | ------------ |------------ |
| `2.7.3` | `1.8` | `0.8.0-SNAPSHOT`|
> Note: For more information about how to download and use TsFile, please see the following link: https://github.com/apache/incubator-iotdb/tree/master/tsfile.
## Data Type Correspondence
| TsFile data type | Hadoop writable |
| ---------------- | --------------- |
| BOOLEAN | BooleanWritable |
| INT32 | IntWritable |
| INT64 | LongWritable |
| FLOAT | FloatWritable |
| DOUBLE | DoubleWritable |
| TEXT | Text |
## TSFInputFormat Explanation
TSFInputFormat extracts data from TsFile and formats it into records of `MapWritable`.
Suppose we want to extract data of the device named `d1`, which has three sensors named `s1`, `s2`, and `s3`.
`s1`'s type is `BOOLEAN`, `s2`'s type is `DOUBLE`, and `s3`'s type is `TEXT`.
The `MapWritable` record will look like:
```
{
"time_stamp": 10000000,
"device_id": d1,
"s1": true,
"s2": 3.14,
"s3": "middle"
}
```
In the Hadoop map task, you can get any value you want by its key, as follows:
`mapwritable.get(new Text("s1"))`
> Note: All the keys in `MapWritable` are of type `Text`.
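For illustration, a map task that pulls the `device_id` and `s2` fields out of each record could look like the sketch below. The class and variable names are ours, and the value types follow the data type correspondence table above.
```
import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// illustrative mapper: emits (device_id, s2) pairs read from TsFile records
public class SensorMapper extends Mapper<NullWritable, MapWritable, Text, DoubleWritable> {

  @Override
  protected void map(NullWritable key, MapWritable value, Context context)
      throws IOException, InterruptedException {
    // all keys in the MapWritable are Text, so look values up with new Text(...)
    Text deviceId = (Text) value.get(new Text("device_id"));
    DoubleWritable s2 = (DoubleWritable) value.get(new Text("s2"));
    if (deviceId != null && s2 != null) {
      context.write(deviceId, s2);
    }
  }
}
```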
## Examples
### Read Example: calculate the sum
First of all, we need to tell the InputFormat what kind of data we want from the TsFile.
> Note: For the complete code, please see the following link: https://github.com/apache/incubator-iotdb/blob/master/example/hadoop/src/main/java/org/apache/iotdb//hadoop/tsfile/TSFMRReadExample.java
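As a rough sketch of the job wiring (paths and class names are ours, and the package of `TSFInputFormat` is inferred from the linked example; the setters that select which devices and measurements to read are omitted here and shown in the complete code):
```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.iotdb.hadoop.tsfile.TSFInputFormat;

public class TSFReadSketch {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "TsFile read example");
    job.setJarByClass(TSFReadSketch.class);

    // read records from a TsFile instead of plain text
    job.setInputFormatClass(TSFInputFormat.class);
    FileInputFormat.setInputPaths(job, new Path("hdfs:///input/example.tsfile"));
    FileOutputFormat.setOutputPath(job, new Path("hdfs:///output/sum"));

    // SensorMapper is the illustrative mapper sketched above;
    // a reducer that sums the DoubleWritable values per device would be set here as well
    job.setMapperClass(SensorMapper.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(DoubleWritable.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(DoubleWritable.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```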
### Write Example: write the average into TsFile
Except for the `OutputFormatClass`, the rest of the configuration code for the Hadoop MapReduce job is almost the same as above.
```
job.setOutputFormatClass(TSFOutputFormat.class);
// set reducer output key and value
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(HDFSTSRecord.class);
```
Then, in the `mapper` and `reducer` classes, you decide how to deal with the `MapWritable` records produced by the `TSFInputFormat` class. The reducer below averages each sensor and writes the result as an `HDFSTSRecord`; the mapper is elided here (see the complete code linked below).
```
public static class TSReducer extends Reducer<Text, MapWritable, NullWritable, HDFSTSRecord> {
  @Override
  protected void reduce(Text key, Iterable<MapWritable> values, Context context)
      throws IOException, InterruptedException {
    long sensor1_value_sum = 0, sensor2_value_sum = 0, num = 0;
    double sensor3_value_sum = 0;
    for (MapWritable value : values) {
      num++;
      sensor1_value_sum += ((LongWritable) value.get(new Text("sensor_1"))).get();
      sensor2_value_sum += ((LongWritable) value.get(new Text("sensor_2"))).get();
      sensor3_value_sum += ((DoubleWritable) value.get(new Text("sensor_3"))).get();
    }
    // write one averaged record per device (the timestamp 1L mirrors the original example)
    HDFSTSRecord tsRecord = new HDFSTSRecord(1L, key.toString());
    DataPoint dPoint1 = new LongDataPoint("sensor_1", sensor1_value_sum / num);
    DataPoint dPoint2 = new LongDataPoint("sensor_2", sensor2_value_sum / num);
    DataPoint dPoint3 = new DoubleDataPoint("sensor_3", sensor3_value_sum / num);
    tsRecord.addTuple(dPoint1);
    tsRecord.addTuple(dPoint2);
    tsRecord.addTuple(dPoint3);
    context.write(NullWritable.get(), tsRecord);
  }
}
```
> Note: For the complete code, please see the following link: https://github.com/apache/incubator-iotdb/blob/master/example/hadoop/src/main/java/org/apache/iotdb//hadoop/tsfile/TSMRWriteExample.java
This chapter provides an example of how to open a database connection and execute a SQL statement. It requires that you include the packages containing the JDBC classes needed for database programming.
**NOTE: For faster insertion, the insertBatch() in Session is recommended.**
```Java
import java.sql.*;
import org.apache.iotdb.jdbc.IoTDBSQLException;
public class JDBCExample {
  /**
   * Before executing a SQL statement with a Statement object, you need to create a Statement object
   * using the createStatement() method of the Connection object.
   * After creating a Statement object, you can use its execute() method to execute a SQL statement.
   */
  // ... (the body of the example is elided here; a runnable sketch follows below)
}
```
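For orientation, a minimal end-to-end sketch (assuming a local IoTDB instance with the default `root`/`root` credentials; the class name and the query string are illustrative) could look like this:
```Java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JDBCQuerySketch {
  public static void main(String[] args) throws ClassNotFoundException, SQLException {
    // register the IoTDB JDBC driver
    Class.forName("org.apache.iotdb.jdbc.IoTDBDriver");
    try (Connection connection =
            DriverManager.getConnection("jdbc:iotdb://127.0.0.1:6667/", "root", "root");
        Statement statement = connection.createStatement()) {
      // execute() returns true when the statement produces a ResultSet
      if (statement.execute("SHOW TIMESERIES root")) {
        try (ResultSet resultSet = statement.getResultSet()) {
          int columnCount = resultSet.getMetaData().getColumnCount();
          while (resultSet.next()) {
            StringBuilder row = new StringBuilder();
            for (int i = 1; i <= columnCount; i++) {
              row.append(resultSet.getString(i)).append('\t');
            }
            System.out.println(row);
          }
        }
      }
    }
  }
}
```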
**Status Code** is introduced in the latest version. For example, since IoTDB requires registering a time series before writing data to it, one possible solution is:
```
try {
  writeData();
} catch (SQLException e) {
  // the most likely case is that the time series does not exist
  if (e.getMessage().contains("exist")) {
    // however, relying on the content of the error message is fragile and inefficient
    registerTimeSeries();
    // write the data once again
    writeData();
  }
}
```
With the Status Code, instead of writing code like `if (e.getMessage().contains("exist"))`, we can simply check `e.getErrorCode() == TSStatusCode.TIMESERIES_NOT_EXIST_ERROR.getStatusCode()`.
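As a minimal sketch of that pattern (assuming `TSStatusCode` from the IoTDB RPC module is on the classpath; `writeData()` and `registerTimeSeries()` are the same placeholders as above):
```
try {
  writeData();
} catch (SQLException e) {
  // compare against the machine-readable status code instead of parsing the message text
  if (e.getErrorCode() == TSStatusCode.TIMESERIES_NOT_EXIST_ERROR.getStatusCode()) {
    registerTimeSeries();
    // write the data once again
    writeData();
  } else {
    throw e;
  }
}
```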
Here is a list of Status Codes and related messages:
|Status Code|Status Type|Meaning|
|:---|:---|:---|
|200|SUCCESS_STATUS||
|201|STILL_EXECUTING_STATUS||
|202|INVALID_HANDLE_STATUS||
|301|TIMESERIES_NOT_EXIST_ERROR|Timeseries does not exist|
val df = spark.read.format("org.apache.iotdb.sparkdb").option("url","jdbc:iotdb://127.0.0.1:6667/").option("sql","select * from root").
val df = spark.read.format("org.apache.iotdb.spark.db").option("url","jdbc:iotdb://127.0.0.1:6667/").option("sql","select * from root").
option("lowerBound", [lower bound of time that you want query(include)]).option("upperBound", [upper bound of time that you want query(include)]).
option("numPartition", [the partition number you want]).load
...
...
## from wide to narrow
```
import org.apache.iotdb.spark.db._
val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root where time < 1100 and time > 1000").load
val narrow_df = Transformer.toNarrowForm(spark, wide_df)
```
## from narrow to wide
```
import org.apache.iotdb.spark.db._
val wide_df = Transformer.toWideForm(spark, narrow_df)
```
...
...
TsFile-Spark-Connector implements Spark support for external data sources of the TsFile type. This enables users to read, write, and query TsFiles with Spark.
...
...
With this connector, you can
* load all files in a specific directory, from either the local file system or HDFS, into Spark
* write data from Spark into TsFile
<aid="2-system-requirements"></a>
## 2. System Requirements
|Spark Version | Scala Version | Java Version | TsFile |
...
...
> Note: For more information about how to download and use TsFile, please see the following link: https://github.com/apache/incubator-iotdb/tree/master/tsfile.
> Note: OpenJDK may not be compatible with Scala. Use Oracle JDK instead.
<a id="3-quick-start"></a>
## 3. Quick Start
<aid="local-mode"></a>
### Local Mode
Start Spark with TsFile-Spark-Connector in local mode:
...
...
Note:
* See https://github.com/apache/incubator-iotdb/tree/master/tsfile for how to get TsFile.
<aid="distributed-mode"></a>
### Distributed Mode
Start Spark with TsFile-Spark-Connector in distributed mode (that is, connect to the Spark cluster via spark-shell):
...
...
Note:
* Multiple jar packages are separated by commas without any spaces.
* See https://github.com/apache/incubator-iotdb/tree/master/tsfile for how to get TsFile.
<aid="4-data-type-correspondence"></a>
## 4. Data Type Correspondence
| TsFile data type | SparkSQL data type|
...
...
| DOUBLE | DoubleType |
| TEXT | StringType |
<aid="5-schema-inference"></a>
## 5. Schema Inference
The way to display TsFile is dependent on the schema. Take the following TsFile structure as an example: There are three Measurements in the TsFile schema: status, temperature, and hardware. The basic information of these three measurements is as follows:
...
...
The corresponding SparkSQL table is as follows:
| 5 | null | null | null | null | false | null |
| 6 | null | null | ccc | null | null | null |
You can also use the narrow table form as follows (see Part 6 about how to use the narrow form):
| time | device_name | status | hardware | temperature |
...
...
There are two ways to display it:
<aid="the-default-way"></a>
#### the default way
Two columns are created: time (LongType) and delta_object (StringType), where delta_object stores the full path of the device.
...
...
Next, a column is created for each Measurement to store the specific data.
<aid="unfolding-delta_object-column"></a>
#### unfolding delta_object column
The device column is expanded by "." into multiple columns, ignoring the root directory "root", which is convenient for richer aggregation operations. To use this display mode, the parameter "delta\_object\_name" needs to be set in the table creation statement (refer to Example 5 in Section 5.1 of this manual); in this example, the parameter "delta\_object\_name" is set to "root.device.turbine". The number of levels in the parameter must correspond one-to-one to the number of levels in the device path. One column is then created for each level of the device path except the "root" level; the column name is the corresponding name in the parameter, and the value is the name of the corresponding level of the device path. Next, one column is created for each Measurement to store the specific data.
...
...
The writing process is to write a DataFrame as one or more TsFiles. By default, two columns need to be included: time and delta_object. The rest of the columns are used as Measurements. If the user wants to write the second table structure back to TsFile, the user can set the "delta\_object\_name" parameter (refer to Section 5.1 of this manual).
<aid="appendix-b-old-note"></a>
## Appendix B: Old Note
NOTE: Check the jar packages in the root directory of your Spark and replace libthrift-0.9.2.jar and libfb303-0.9.2.jar with libthrift-0.9.1.jar and libfb303-0.9.1.jar respectively.