---
layout: doc_page
---
Examples
========

The examples on this page are set up to give you a feel for what Druid does in practice. They are quick demos of Druid based on [CliRealtimeExample](https://github.com/metamx/druid/blob/master/services/src/main/java/io/druid/cli/CliRealtimeExample.java). While you wouldn’t run Druid this way in production, you should be able to see how ingestion works and the kinds of exploratory queries that are possible. Everything that can be done on your box here can be scaled out to tens of billions of events and terabytes of data per day in a production cluster while still supporting snappy, responsive exploratory queries.

Installing Standalone Druid
---------------------------

There are two options for installing standalone Druid: building from source, or downloading the Druid Standalone Kit (DSK).

### Building from source

Clone Druid and build it:

``` bash
git clone https://github.com/metamx/druid.git druid
cd druid
git fetch --tags
git checkout druid-0.6.57
./build.sh
```

### Downloading the DSK (Druid Standalone Kit)

[Download](http://static.druid.io/artifacts/releases/druid-services-0.6.57-bin.tar.gz) a stand-alone tarball and run it:

``` bash
# Replace 0.X.X with the version you downloaded (0.6.57 for the link above)
tar -xzf druid-services-0.X.X-bin.tar.gz
cd druid-services-0.X.X
```

Twitter Example
---------------

For a full tutorial based on the Twitter example, check out this [Twitter Tutorial](Twitter-Tutorial.html).

This example uses a Twitter feature that allows sampling of its stream. We sample the Twitter stream via our [TwitterSpritzerFirehoseFactory](https://github.com/metamx/druid/blob/master/examples/src/main/java/druid/examples/twitter/TwitterSpritzerFirehoseFactory.java) class and use it to simulate the kinds of data you might ingest into Druid. The client part of the example then shows the kinds of analytics explorations you can do during and after data loading.

### What you’ll learn
* See how large amounts of data get ingested into Druid in real time
* Learn how to run fast, interactive analytics queries on that real-time data

### What you need
* A build of standalone Druid with the Twitter example (see above)
* A Twitter username and password.

### What you’ll do

See the [Twitter Tutorial](Twitter-Tutorial.html).

Rand Example
------------

This example uses `RandomFirehoseFactory`, which emits a stream of random numbers (`outColumn`, a positive double) with timestamps, along with an associated token (`target`). This provides a time series that requires no network access for demonstration, characterization, and testing. The generated tuples can be thought of as asynchronously produced triples (timestamp, outColumn, target), where the timestamp varies depending on the speed of processing.
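
To make the shape of these triples concrete, here is a minimal sketch that prints a few of them as JSON lines. It is an illustration only and not how `RandomFirehoseFactory` is implemented: the function name `emit_rand_events`, the JSON layout, and the token value `a` are all assumptions made for this example.

``` bash
#!/usr/bin/env bash
# Illustration only: prints tuples shaped like (timestamp, outColumn, target).
# This is NOT RandomFirehoseFactory; the function name, the JSON layout,
# and the token value "a" are assumptions made for this sketch.
emit_rand_events() {
  local count="$1"
  awk -v n="$count" -v now="$(date +%s)" -v target="a" 'BEGIN {
    srand()
    for (i = 0; i < n; i++) {
      # rand() is in [0, 1); the small offset keeps outColumn strictly positive
      printf "{\"timestamp\": %d, \"outColumn\": %.6f, \"target\": \"%s\"}\n",
             now + i, rand() + 0.000001, target
    }
  }'
}

emit_rand_events 3
```

Each printed line stands for one event. In the real example the timestamps come from processing time rather than a fixed one-second step.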

In a terminal window, start the server (note: if you are using the cloned GitHub repository, these scripts are in `./examples/bin`):

``` bash
./run_example_server.sh # type rand when prompted
```

In another terminal window:

``` bash
./run_example_client.sh # type rand when prompted
```

The result of the client query is in JSON format. The client makes a REST request using `curl`, which is usually installed by default on Linux, Unix, and OS X.
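
If you would like to issue a similar request by hand, the sketch below shows roughly what such a `curl` call might look like. The endpoint (`http://localhost:8083/druid/v2/`), the data source name (`randSeq`), and the query body are assumptions for illustration and are not taken from this page; check `run_example_client.sh` for the values the example actually uses.

``` bash
#!/usr/bin/env bash
# Sketch of a hand-written query against the rand example.
# ASSUMPTIONS (not stated on this page): the example server listens on
# localhost:8083, serves queries at /druid/v2/, and names its data source
# "randSeq". Check run_example_client.sh for the real values.
DRUID_URL="${DRUID_URL:-http://localhost:8083/druid/v2/?pretty}"

# A timeseries query that counts rows and sums outColumn per minute.
QUERY_FILE="$(mktemp)"
cat > "$QUERY_FILE" <<'EOF'
{
  "queryType": "timeseries",
  "dataSource": "randSeq",
  "granularity": "minute",
  "intervals": ["2010-01-01/2020-01-01"],
  "aggregations": [
    {"type": "count", "name": "rows"},
    {"type": "doubleSum", "name": "total", "fieldName": "outColumn"}
  ]
}
EOF

# POST the query; print a hint instead of aborting if no server answers.
if ! curl -s --max-time 5 -X POST "$DRUID_URL" \
     -H 'Content-Type: application/json' \
     -d @"$QUERY_FILE"; then
  echo "No response from $DRUID_URL (is the example server running?)" >&2
fi
```

Setting `DRUID_URL` in the environment overrides the assumed endpoint without editing the script.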