Unverified commit c4d7eef4, authored by Avi Aryan

add importer docs

Parent 22d5358a
@@ -31,6 +31,8 @@ At the time of writing, the list of parameters supported looks like -
Note that you only need to set the parameters that are required for the source database type. For example, you don't set `replication_slot` when taking CSV as the source.
**Note** - Help for writing a transform file is available in [transform-file](../importer/transform_file.md).
## Examples
......
# Transform file
A transform file can be passed to the `import` command. It defines the transforms that are applied while data moves from source to sink.
The most basic transform file is shown below. It does nothing but move everything from source to sink.
```js
t.Source("source", source, "/.*/").Save("sink", sink, "/.*/")
```
We can add [transforms](transforms/) to it to manipulate the data on its way to the sink.
```js
t.Source("source", source, "/.*/")
  .Transform(pretty({"spaces":0}))
  // more transforms
  .Save("sink", sink, "/.*/")
```
A transform file can also specify the mappings to use in Elasticsearch.
To specify a mapping, use the `Mapping` method. It takes a single argument: an object containing the mapping data.
```js
t.Source("source", source, "/.*/")
  .Mapping({
    "TypeName": {
      "properties": {
        "name": { "type": "string" },
        "age": { "type": "integer" },
        // more properties
      }
    },
    "AnotherType": {
      "properties": {
        // ....
      }
    }
  })
  .Transform(pretty({"spaces":0}))
  // transforms
  .Save("sink", sink, "/.*/")
```
Note that mappings are set at the type level, so the mapping object should contain each type name and the properties to apply to that type (such as `TypeName` and `AnotherType` here).
Also, the type name refers to the sink, so it should be consistent with the namespace that results after all the [transforms](transforms/) have run. That is, if a transform changes the namespace in any way, the type names used in the mapping must account for that.
# goja function
`goja()` creates a JavaScript VM that passes each message through the JavaScript function you define for processing. The parameter passed to the function is converted from a Go `map[string]interface{}` to a JS object of the following form:
```JSON
{
  "ns":"message.namespace",
  "ts":12345, // time represented in milliseconds since epoch
  "op":"insert",
  "data": {
    "id": "abcdef",
    "name": "hello world"
  }
}
```
***NOTE*** When working with data from MongoDB, the `_id` field is represented as follows:
```JSON
{
  "ns":"message.namespace",
  "ts":12345, // time represented in milliseconds since epoch
  "op":"insert",
  "data": {
    "_id": {
      "$oid": "54a4420502a14b9641000001"
    },
    "name": "hello world"
  }
}
```
### configuration
```javascript
goja({"filename": "/path/to/transform.js"})
// js() is aliased to goja
// js({"filename": "/path/to/transform.js"})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function"
}
```
config
```javascript
goja({"filename":"transform.js"})
```
transform function (i.e. `transform.js`)
```javascript
function transform(doc) {
  doc["data"]["name_type"] = doc["data"]["name"] + " " + doc["data"]["type"];
  return doc;
}
```
message out
```JSON
{
"_id": 0,
"name": "abc",
"type": "function",
"name_type": "abc function"
}
```
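Because MongoDB's `_id` arrives as an object with a `$oid` key (see the note above), a transform that needs a plain string identifier has to unwrap it first. The following is a minimal sketch that assumes the message shape documented above; the `id_str` field name is hypothetical and chosen only for illustration.

```javascript
// transform.js (sketch): copy MongoDB's {"$oid": "..."} identifier into a plain string field.
// "id_str" is a hypothetical field name, not something the importer defines.
function transform(doc) {
  var id = doc["data"]["_id"];
  // Only unwrap when _id has the MongoDB shape; other sources are left untouched.
  if (id !== null && typeof id === "object" && typeof id["$oid"] === "string") {
    doc["data"]["id_str"] = id["$oid"];
  }
  return doc;
}
```

Checking the shape before unwrapping keeps the same file usable for sources whose `_id` is already a scalar, as in the example above.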
# omit function
`omit()` removes the specified fields from the message and then sends it down the pipeline. It currently only works for top-level fields (e.g. `address.street` would not work).
### configuration
```javascript
omit({"fields": ["name"]})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function"
}
```
config
```javascript
omit({"fields":["type"]})
```
message out
```JSON
{
"_id": 0,
"name": "abc"
}
```
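To make the top-level restriction concrete, here is a plain-JavaScript illustration of the behavior described above. It is a sketch, not the importer's implementation; `omitTopLevel` and the sample `address` field are hypothetical.

```javascript
// Illustration only: models "omit works on top-level keys" in plain JavaScript.
// omitTopLevel is a hypothetical helper, not part of the importer.
function omitTopLevel(data, fields) {
  var out = {};
  for (var key in data) {
    if (Object.prototype.hasOwnProperty.call(data, key) && fields.indexOf(key) === -1) {
      out[key] = data[key];
    }
  }
  return out;
}

var message = { "_id": 0, "name": "abc", "address": { "street": "Main St", "city": "X" } };

// "name" is a top-level key, so it is removed.
var withoutName = omitTopLevel(message, ["name"]);

// "address.street" is compared against top-level keys only, so nothing matches
// and the nested street field is still present.
var unchanged = omitTopLevel(message, ["address.street"]);
```

To drop a nested field, one option is a JavaScript transform such as `goja()`, which receives the whole document.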
# otto function
`otto()` creates a JavaScript VM that passes each message through the JavaScript function you define for processing. The parameter passed to the function is converted from a Go `map[string]interface{}` to a JS object of the following form:
```JSON
{
  "ns":"message.namespace",
  "ts":12345, // time represented in milliseconds since epoch
  "op":"insert",
  "data": {
    "id": "abcdef",
    "name": "hello world"
  }
}
```
***NOTE*** When working with data from MongoDB, the `_id` field is represented as follows:
```JSON
{
  "ns":"message.namespace",
  "ts":12345, // time represented in milliseconds since epoch
  "op":"insert",
  "data": {
    "_id": {
      "$oid": "54a4420502a14b9641000001"
    },
    "name": "hello world"
  }
}
```
### configuration
```javascript
otto({"filename": "/path/to/transform.js"})
// transform() is also available for backwards compatibility reasons but may be removed in future versions
// transform({"filename": "/path/to/transform.js"})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function"
}
```
config
```javascript
otto({"filename":"transform.js"})
```
transform function (i.e. `transform.js`)
```javascript
module.exports = function(doc) {
  doc["data"]["name_type"] = doc["data"]["name"] + " " + doc["data"]["type"];
  return doc;
}
```
message out
```JSON
{
"_id": 0,
"name": "abc",
"type": "function",
"name_type": "abc function"
}
```
# pick function
`pick()` keeps only the specified fields in the message before sending it down the pipeline. It currently only works for top-level fields (e.g. `address.street` would not work).
### configuration
```javascript
pick({"fields": ["name"]})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function"
}
```
config
```javascript
pick({"fields":["_id", "name"]})
```
message out
```JSON
{
"_id": 0,
"name": "abc"
}
```
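The following plain-JavaScript illustration models the top-level behavior described above. It is a sketch rather than the importer's implementation; `pickTopLevel` and the sample `address` field are hypothetical, and the result for a dotted name reflects this sketch only.

```javascript
// Illustration only: models "pick works on top-level keys" in plain JavaScript.
// pickTopLevel is a hypothetical helper, not part of the importer.
function pickTopLevel(data, fields) {
  var out = {};
  for (var i = 0; i < fields.length; i++) {
    var key = fields[i];
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      out[key] = data[key];
    }
  }
  return out;
}

var message = { "_id": 0, "name": "abc", "address": { "street": "Main St", "city": "X" } };

// Both names are top-level keys, so both are kept and "address" is dropped.
var picked = pickTopLevel(message, ["_id", "name"]);

// "address.street" is not a top-level key, so in this sketch nothing is picked.
var nestedAttempt = pickTopLevel(message, ["address.street"]);
```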
# pretty function
`pretty()` marshals the data to JSON and logs it at the `INFO` level. The default indentation is `2` spaces; if set to `0`, the JSON is printed on a single line.
### configuration
```javascript
pretty({"spaces": 2})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function"
}
```
config
```javascript
pretty({"spaces":0})
```
log line
```shell
INFO[0000]
{"_id":0,"name":"abc","type":"function"}
```
config
```javascript
pretty({"spaces":2})
```
log line
```shell
INFO[0000]
{
"_id":0,
"name":"abc",
"type":"function"
}
```
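The effect of the `spaces` setting is similar to the indent argument of JavaScript's `JSON.stringify`. The sketch below is only an analogy: the importer marshals in Go, so exact whitespace may differ, and `prettyLine` is a hypothetical helper.

```javascript
// Analogy only: JSON.stringify's third argument plays the role of "spaces".
// prettyLine is a hypothetical helper, not part of the importer.
function prettyLine(data, spaces) {
  return JSON.stringify(data, null, spaces);
}

var message = { "_id": 0, "name": "abc", "type": "function" };

var compact = prettyLine(message, 0);  // whole object on a single line
var indented = prettyLine(message, 2); // one key per line, indented by two spaces

console.log(compact);
console.log(indented);
```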
# rename function
`rename()` replaces existing key names with new ones based on the provided configuration. It currently only works for top-level fields (e.g. `address.street` would not work).
### configuration
```javascript
rename({"field_map": {"test":"renamed"}})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function",
"count": 10
}
```
config
```javascript
rename({"field_map": {"count":"total"}})
```
message out
```JSON
{
"_id": 0,
"name": "abc",
"type": "function",
"total": 10
}
```
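The `field_map` object maps each existing key to its new name. A plain-JavaScript illustration of that mapping follows; it is a sketch, not the importer's implementation, and `renameTopLevel` is a hypothetical helper.

```javascript
// Illustration only: models field_map renaming of top-level keys in plain JavaScript.
// renameTopLevel is a hypothetical helper, not part of the importer.
function renameTopLevel(data, fieldMap) {
  var out = {};
  for (var key in data) {
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      // Use the mapped name when one exists; otherwise keep the original key.
      var mapped = Object.prototype.hasOwnProperty.call(fieldMap, key);
      out[mapped ? fieldMap[key] : key] = data[key];
    }
  }
  return out;
}

var renamed = renameTopLevel(
  { "_id": 0, "name": "abc", "type": "function", "count": 10 },
  { "count": "total" }
);
```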
# skip function
`skip()` evaluates the data against the configured criteria and determines whether the message continues down the pipeline or is skipped. When the evaluation yields `true`, the message is sent down the pipeline; when it yields `false`, the message is skipped. See the [tests](skipper_test.go) for all currently supported configurations. It currently only works for top-level fields (e.g. `address.street` would not work).
### configuration
```javascript
skip({"field": "test", "operator": "==", "match": 10})
```
### example
message in
```JSON
{
"_id": 0,
"name": "abc",
"type": "function",
"count": 10
}
```
config
```javascript
skip({"field": "count", "operator": "==", "match": 10})
```
message out
```JSON
{
"_id": 0,
"name": "abc",
"type": "function",
"count": 10
}
```
config
```javascript
skip({"field": "count", "operator": ">", "match": 20})
```
message would be skipped
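
The true/false rule above is easy to misread, since a matching condition keeps the message rather than skipping it. The plain-JavaScript sketch below models it for the two operators shown in this document only; it is not the importer's implementation, and `shouldContinue` is a hypothetical helper.

```javascript
// Illustration only: true means the message continues, false means it is skipped.
// shouldContinue is a hypothetical helper; only "==" and ">" are modelled here.
function shouldContinue(data, config) {
  var value = data[config.field];
  switch (config.operator) {
    case "==":
      return value === config.match;
    case ">":
      return value > config.match;
    default:
      throw new Error("operator not modelled in this sketch: " + config.operator);
  }
}

var message = { "_id": 0, "name": "abc", "type": "function", "count": 10 };

// count == 10 evaluates to true, so the message continues down the pipeline.
var kept = shouldContinue(message, { "field": "count", "operator": "==", "match": 10 });

// count > 20 evaluates to false, so the message is skipped.
var skipped = !shouldContinue(message, { "field": "count", "operator": ">", "match": 20 });
```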