more apoc.import

70884cfb · Mark Needham · 7519f76b
6 changed files
--- a/docs/asciidoc/modules/ROOT/partials/usage/apoc.import.graphml.adoc
+++ b/docs/asciidoc/modules/ROOT/partials/usage/apoc.import.graphml.adoc
+This procedure imports CSV files that comply with the link:https://neo4j.com/docs/operations-manual/current/tools/neo4j-admin-import/#import-tool-header-format/[Neo4j import tool's header format].
+
+=== Nodes
+
+The following file contains two people:
+
+.persons.csv
+[source,text]
+----
+id:ID,name:STRING
+1,John
+2,Jane
+----
+
+We'll place this file into the `import` directory of our Neo4j instance.
+
+We can create two `Person` nodes with their `name` properties set, by running the following query:
+
+[source,cypher]
+----
+CALL apoc.import.csv([{fileName: 'file:/persons.csv', labels: ['Person']}], [], {});
+----
+
+.Results
+[opts="header"]
+|===
+| file           | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "progress.csv" | "file" | "csv"  | 2     | 0             | 4          | 7    | 0    | -1        | 0       | TRUE | NULL
+|===
+
+We can check what's been imported by running the following query:
+
+[source,cypher]
+----
+MATCH (p:Person)
+RETURN p;
+----
+
+.Results
+[opts="header"]
+|===
+| p
+| (:Person {name: "John", id: "1"})
+| (:Person {name: "Jane", id: "2"})
+|===
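+
+With the default `stringIds: true` setting, the `id` column is stored as a string property, as the `"1"` and `"2"` values above show.
+As a quick check (a sketch that assumes the two nodes above are the only `Person` nodes in the database), matching on the string value finds the node, whereas matching on the integer `1` would not:
+
+[source,cypher]
+----
+// ids were imported as strings, so compare against '1' rather than 1
+MATCH (p:Person {id: '1'})
+RETURN p.name AS name;
+----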
+
+
+=== Nodes and relationships
+
+The following files contain nodes and relationships in CSV format:
+
+.people-nodes.csv
+[source,text]
+----
+:ID|name:STRING|speaks:STRING[]
+1|John|en,fr
+2|Jane|en,de
+----
+
+.knows-rels.csv
+[source,text]
+----
+:START_ID|:END_ID|since:INT
+1|2|2016
+----
+
+We will import two `Person` nodes and a `KNOWS` relationship between them (with the value of the `since` property set).
+The field terminator and the array delimiter are changed from their default values, and the CSVs use numeric ids.
+
+[source,cypher]
+----
+CALL apoc.import.csv(
+  [{fileName: 'file:/people-nodes.csv', labels: ['Person']}],
+  [{fileName: 'file:/knows-rels.csv', type: 'KNOWS'}],
+  {delimiter: '|', arrayDelimiter: ',', stringIds: false}
+);
+----
+
+.Results
+[opts="header"]
+|===
+| file           | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "progress.csv" | "file" | "csv"  | 2     | 1             | 7          | 7    | 0    | -1        | 0       | TRUE | NULL
+|===
+
+We can check what's been imported by running the following query:
+
+[source,cypher]
+----
+MATCH path = (p1:Person)-[:KNOWS]->(p2:Person)
+RETURN path;
+----
+
+.Results
+[opts="header"]
+|===
+| path
+| (:Person {name: "John", speaks: ["en", "fr"], __csv_id: 1})-[:KNOWS {since: 2016}]->(:Person {name: "Jane", speaks: ["en", "de"], __csv_id: 2})
+|===
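+
+Because `arrayDelimiter` was set to `,`, the `speaks` column is stored as a list of strings rather than a single string.
+The following query is a small sketch of how that list can be filtered; it assumes only the two nodes imported above are present:
+
+[source,cypher]
+----
+// find people whose speaks list contains 'fr'
+MATCH (p:Person)
+WHERE 'fr' IN p.speaks
+RETURN p.name AS name;
+----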
\ No newline at end of file
--- a/docs/asciidoc/modules/ROOT/partials/usage/apoc.import.json.adoc
+++ b/docs/asciidoc/modules/ROOT/partials/usage/apoc.import.json.adoc
+[[import-graphml-simple]]
+=== Import simple GraphML file
+
+The `simple.graphml` file contains a graph representation from the http://graphml.graphdrawing.org/primer/graphml-primer.html[GraphML primer^].
+
+image::apoc.import.graphml.simple-diagram.png[]
+
+.simple.graphml
+[source,xml]
+----
+<?xml version="1.0" encoding="UTF-8"?>
+<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
+     http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
+  <graph id="G" edgedefault="undirected">
+    <node id="n0"/>
+    <node id="n1"/>
+    <node id="n2"/>
+    <node id="n3"/>
+    <node id="n4"/>
+    <node id="n5"/>
+    <node id="n6"/>
+    <node id="n7"/>
+    <node id="n8"/>
+    <node id="n9"/>
+    <node id="n10"/>
+    <edge source="n0" target="n2"/>
+    <edge source="n1" target="n2"/>
+    <edge source="n2" target="n3"/>
+    <edge source="n3" target="n5"/>
+    <edge source="n3" target="n4"/>
+    <edge source="n4" target="n6"/>
+    <edge source="n6" target="n5"/>
+    <edge source="n5" target="n7"/>
+    <edge source="n6" target="n8"/>
+    <edge source="n8" target="n7"/>
+    <edge source="n8" target="n9"/>
+    <edge source="n8" target="n10"/>
+  </graph>
+</graphml>
+----
+
+
+.The following imports a graph based on `simple.graphml`
+[source,cypher]
+----
+CALL apoc.import.graphml("http://graphml.graphdrawing.org/primer/simple.graphml", {})
+----
+
+If we run this query, we'll see the following output:
+
+.Results
+[opts="header"]
+|===
+| file                                                    | source | format    | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "http://graphml.graphdrawing.org/primer/simple.graphml" | "file" | "graphml" | 11    | 12            | 0          | 618  | 0    | -1        | 0       | TRUE | NULL
+|===
+
+We could also copy `simple.graphml` into Neo4j's `import` directory, and import the file from there.
+
+We can then run the import procedure in the following way:
+
+.The following imports a graph based on `simple.graphml`
+[source,cypher]
+----
+CALL apoc.import.graphml("file://simple.graphml", {})
+----
+
+The Neo4j Browser visualization below shows the imported graph:
+
+image::apoc.import.graphml.simple.png[title="Simple Graph Visualization"]
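+
+The edges in `simple.graphml` carry no label, so we would expect them to be created with the default relationship type (`RELATED`, unless `defaultRelationshipType` is set).
+As a sanity check, and assuming the database was empty before the import, the following query counts the imported relationships by type:
+
+[source,cypher]
+----
+// group every relationship in the database by its type
+MATCH ()-[r]->()
+RETURN type(r) AS type, count(*) AS count;
+----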
+
+[[import-graphml-apoc]]
+=== Import GraphML file created by Export GraphML procedures
+
+`movies.graphml` contains a subset of Neo4j's movies graph, and was generated by the xref::export/graphml.adoc#export-graphml-whole-database[Export GraphML procedure].
+
+.movies.graphml
+[source,xml]
+----
+<?xml version="1.0" encoding="UTF-8"?>
+<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
+<key id="born" for="node" attr.name="born"/>
+<key id="name" for="node" attr.name="name"/>
+<key id="tagline" for="node" attr.name="tagline"/>
+<key id="label" for="node" attr.name="label"/>
+<key id="title" for="node" attr.name="title"/>
+<key id="released" for="node" attr.name="released"/>
+<key id="roles" for="edge" attr.name="roles"/>
+<key id="label" for="edge" attr.name="label"/>
+<graph id="G" edgedefault="directed">
+<node id="n188" labels=":Movie"><data key="labels">:Movie</data><data key="title">The Matrix</data><data key="tagline">Welcome to the Real World</data><data key="released">1999</data></node>
+<node id="n189" labels=":Person"><data key="labels">:Person</data><data key="born">1964</data><data key="name">Keanu Reeves</data></node>
+<node id="n190" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Carrie-Anne Moss</data></node>
+<node id="n191" labels=":Person"><data key="labels">:Person</data><data key="born">1961</data><data key="name">Laurence Fishburne</data></node>
+<node id="n192" labels=":Person"><data key="labels">:Person</data><data key="born">1960</data><data key="name">Hugo Weaving</data></node>
+<node id="n193" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Lilly Wachowski</data></node>
+<node id="n194" labels=":Person"><data key="labels">:Person</data><data key="born">1965</data><data key="name">Lana Wachowski</data></node>
+<node id="n195" labels=":Person"><data key="labels">:Person</data><data key="born">1952</data><data key="name">Joel Silver</data></node>
+<edge id="e267" source="n189" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Neo"]</data></edge>
+<edge id="e268" source="n190" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Trinity"]</data></edge>
+<edge id="e269" source="n191" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Morpheus"]</data></edge>
+<edge id="e270" source="n192" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Agent Smith"]</data></edge>
+<edge id="e271" source="n193" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
+<edge id="e272" source="n194" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
+<edge id="e273" source="n195" target="n188" label="PRODUCED"><data key="label">PRODUCED</data></edge>
+</graph>
+</graphml>
+----
+
+
+.The following imports a graph based on `movies.graphml`
+[source,cypher]
+----
+CALL apoc.import.graphml("movies.graphml", {})
+----
+
+If we run this query, we'll see the following output:
+
+.Results
+[opts="header"]
+|===
+| file                                                    | source | format    | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "movies.graphml" | "file" | "graphml" | 8     | 7             | 36         | 23   | 0    | -1        | 0       | TRUE | NULL
+|===
+
+We can run the following query to see the imported graph:
+
+[source,cypher]
+----
+MATCH p=()-->()
+RETURN p
+----
+
+.Results
+[opts="header"]
+|===
+| p
+| ({name: "Laurence Fishburne", born: "1961", labels: ":Person"})-[:ACTED_IN {roles: "[\"Morpheus\"]", label: "ACTED_IN"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+| ({name: "Carrie-Anne Moss", born: "1967", labels: ":Person"})-[:ACTED_IN {roles: "[\"Trinity\"]", label: "ACTED_IN"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+| ({name: "Lana Wachowski", born: "1965", labels: ":Person"})-[:DIRECTED {label: "DIRECTED"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+| ({name: "Joel Silver", born: "1952", labels: ":Person"})-[:PRODUCED {label: "PRODUCED"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+| ({name: "Lilly Wachowski", born: "1967", labels: ":Person"})-[:DIRECTED {label: "DIRECTED"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+| ({name: "Keanu Reeves", born: "1964", labels: ":Person"})-[:ACTED_IN {roles: "[\"Neo\"]", label: "ACTED_IN"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+| ({name: "Hugo Weaving", born: "1960", labels: ":Person"})-[:ACTED_IN {roles: "[\"Agent Smith\"]", label: "ACTED_IN"}]->({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"})
+|===
+
+The labels defined in the GraphML file have been added to the `labels` property on each node, rather than being added as a node label.
+We can set the config property `readLabels: true` to import native labels:
+
+.The following imports a graph based on `movies.graphml` and stores node labels
+[source,cypher]
+----
+CALL apoc.import.graphml("movies.graphml", {readLabels: true})
+----
+
+.Results
+[opts="header"]
+|===
+| file                                                    | source | format    | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "movies.graphml" | "file" | "graphml" | 8     | 7             | 21         | 23   | 0    | -1        | 0       | TRUE | NULL
+|===
+
+And now let's re-run the query to see the imported graph:
+
+[source,cypher]
+----
+MATCH p=()-->()
+RETURN p;
+----
+
+.Results
+[opts="header"]
+|===
+| p
+| (:Person {name: "Lilly Wachowski", born: "1967"})-[:DIRECTED]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+| (:Person {name: "Carrie-Anne Moss", born: "1967"})-[:ACTED_IN {roles: "[\"Trinity\"]"}]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+| (:Person {name: "Hugo Weaving", born: "1960"})-[:ACTED_IN {roles: "[\"Agent Smith\"]"}]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+| (:Person {name: "Laurence Fishburne", born: "1961"})-[:ACTED_IN {roles: "[\"Morpheus\"]"}]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+| (:Person {name: "Keanu Reeves", born: "1964"})-[:ACTED_IN {roles: "[\"Neo\"]"}]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+| (:Person {name: "Joel Silver", born: "1952"})-[:PRODUCED]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+| (:Person {name: "Lana Wachowski", born: "1965"})-[:DIRECTED]->(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"})
+|===
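+
+The `roles` property is imported as a JSON-encoded string (for example `"[\"Neo\"]"`) rather than a list.
+If a list is needed, one option is to parse the string at query time. The sketch below assumes the `apoc.convert.fromJsonList` function is available in your APOC installation:
+
+[source,cypher]
+----
+MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
+// parse the JSON string into a Cypher list
+RETURN p.name AS actor, apoc.convert.fromJsonList(r.roles) AS roles, m.title AS movie;
+----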
--- a/docs/asciidoc/modules/ROOT/partials/usage/apoc.import.xml.adoc
+++ b/docs/asciidoc/modules/ROOT/partials/usage/apoc.import.xml.adoc
+The `apoc.import.json` procedure can be used to import JSON files created by the xref::overview/apoc.export/index.adoc[`apoc.export.json.*`] procedures.
+
+`all.json` contains a subset of Neo4j's movies graph, and was generated by xref::overview/apoc.export/apoc.export.json.all.adoc[].
+
+.all.json
+[source,json]
+----
+include::example$data/exportJSON/all.json[leveloffset]
+----
+
+We can import this file using `apoc.import.json`.
+
+[source,cypher]
+----
+CALL apoc.import.json("file:///all.json")
+----
+
+.Results
+[opts=header]
+|===
+| file               | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "file:///all.json" | "file" | "json" | 3     | 1             | 15         | 105  | 4    | -1        | 0       | TRUE | NULL
+|===
\ No newline at end of file
--- a/docs/asciidoc/modules/ROOT/partials/usage/config/apoc.import.csv.adoc
+++ b/docs/asciidoc/modules/ROOT/partials/usage/config/apoc.import.csv.adoc
+The procedure supports the following config parameters:
+
+.Config parameters
+[opts=header]
+|===
+| name | type | default | description
+| readLabels | Boolean | false | Creates node labels based on the value in the `labels` property of `node` elements
+| defaultRelationshipType | String | RELATED | The default relationship type to use if none is specified in the GraphML file
+| storeNodeIds | Boolean | false | Stores the `id` property of `node` elements
+| batchSize | Integer | 20000 | The number of elements to process per transaction
+|===
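+
+As an illustration, several of these parameters can be combined in one call. The following is a sketch only; the `movies.graphml` file name and the chosen values (including the `LINKED` relationship type) are assumptions made for the example:
+
+[source,cypher]
+----
+CALL apoc.import.graphml("movies.graphml", {
+  readLabels: true,                   // create node labels from the labels property of node elements
+  storeNodeIds: true,                 // keep the id property of node elements
+  defaultRelationshipType: "LINKED",  // hypothetical fallback type when none is specified
+  batchSize: 5000                     // process 5000 elements per transaction
+});
+----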
\ No newline at end of file
--- a/docs/asciidoc/modules/ROOT/partials/usage/config/apoc.import.graphml.adoc
+++ b/docs/asciidoc/modules/ROOT/partials/usage/config/apoc.import.graphml.adoc
+The procedure supports the following config parameters:
+
+.Config parameters
+[opts=header]
+|===
+| name | type | default | description
+| nodes | Map<String, List<String>> | {} | properties to include for each node label e.g. `{Movie: ['title']}`
+| rels | Map<String, List<String>> | {} | properties to include for each relationship type e.g. `{ACTED_IN: ['roles']}`
+|===
\ No newline at end of file
--- a/docs/asciidoc/modules/ROOT/partials/usage/config/apoc.import.json.adoc
+++ b/docs/asciidoc/modules/ROOT/partials/usage/config/apoc.import.json.adoc
+The procedure supports the following config parameters:
+
+.Config parameters
+[opts=header]
+|===
+| name | type | default | description | https://neo4j.com/docs/operations-manual/current/tools/import/options/[import tool counterpart]
+| delimiter | String | ,  |delimiter character between columns  | `--delimiter=,`
+| arrayDelimiter | String | ; | delimiter character in arrays | `--array-delimiter=;`
+| ignoreDuplicateNodes | Boolean | false | for duplicate nodes, only load the first one and skip the rest (true) or fail the import (false)  | `--ignore-duplicate-nodes=false`
+| quotationCharacter | String | " | quotation character   | `--quote='"'`
+| stringIds | Boolean | true | treat ids as strings  | `--id-type=STRING`
+| skipLines | Integer | 1 | lines to skip (incl. header)  | N/A
+|===
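+
+For example, the following sketch reuses a hypothetical `persons.csv` file and overrides two of the defaults, so that duplicate nodes are skipped rather than failing the import and ids are not treated as strings:
+
+[source,cypher]
+----
+CALL apoc.import.csv(
+  [{fileName: 'file:/persons.csv', labels: ['Person']}],
+  [],
+  {ignoreDuplicateNodes: true, stringIds: false}
+);
+----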
\ No newline at end of file