### Streaming Read

<details>
  <summary><b>Why is there no Streaming Read API?</b> (click to show)</summary>

The most common and interesting formats (XLS, XLSX/M, XLSB, ODS) are ultimately
ZIP or CFB containers of files.  Neither format puts the directory structure at
the beginning of the file: ZIP files place the Central Directory records at the
end of the logical file, while CFB files can place the storage info anywhere in
the file! As a result, to properly handle these formats, a streaming function
would have to buffer the entire file before it could start parsing.  That
defeats the purpose of streaming, so we do not provide any streaming read API.
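
To make the ZIP case concrete, here is a minimal sketch of how a reader locates the Central Directory. The helper names `find_eocd` and `central_directory_offset` are hypothetical and are not part of the library. The sketch assumes a plain (non-ZIP64) archive held entirely in a `Buffer`. The search starts from the last bytes of the data, which is why nothing useful can happen until the whole file has arrived:

```js
/* Hypothetical helper: find the End of Central Directory (EOCD) record */
function find_eocd(buf/*:Buffer*/)/*:number*/ {
  /* the EOCD record is at least 22 bytes long and sits at the end of the file, */
  /* so scan backwards for its signature PK\x05\x06 (0x06054b50 little-endian) */
  for(var i = buf.length - 22; i >= 0; --i) {
    if(buf.readUInt32LE(i) === 0x06054b50) return i;
  }
  return -1; /* not a ZIP file, or the data is truncated */
}

/* Hypothetical helper: read where the Central Directory starts */
function central_directory_offset(buf/*:Buffer*/)/*:number*/ {
  var eocd = find_eocd(buf);
  if(eocd < 0) return -1;
  /* bytes 16..19 of the EOCD record hold the Central Directory offset */
  return buf.readUInt32LE(eocd + 16);
}
```

A partial buffer that is missing the tail of the file returns `-1` here, so the list of entries cannot be read at all until the final chunk is available.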

</details>

When dealing with Readable Streams, the easiest approach is to buffer the stream
and process the whole thing at the end.  This can be done with a temporary file
or by explicitly concatenating the stream:

<details>
  <summary><b>Explicitly concatenating streams</b> (click to show)</summary>

```js
var fs = require('fs');
var XLSX = require('xlsx');
function process_RS(stream/*:ReadStream*/, cb/*:(wb:Workbook)=>void*/)/*:void*/{
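  var buffers = [];
  /* collect each chunk as it arrives */
  stream.on('data', function(data) { buffers.push(data); });
  stream.on('end', function() {
    /* join the chunks into one Buffer and parse the whole file at once */
    var buffer = Buffer.concat(buffers);
    var workbook = XLSX.read(buffer, {type:"buffer"});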

    /* DO SOMETHING WITH workbook IN THE CALLBACK */
    cb(workbook);
  });
}
```

More robust solutions are available using modules like `concat-stream`.
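
For a dependency-free variant, the following is a minimal sketch that also forwards stream errors. `buffer_stream` is a hypothetical helper name and the error-first callback is an assumption of this sketch, not part of the library. It assumes the stream emits `Buffer` chunks (no encoding has been set):

```js
/* Hypothetical helper: collect a Readable stream into a single Buffer */
function buffer_stream(stream/*:ReadStream*/, cb/*:(err:?Error, buf:?Buffer)=>void*/)/*:void*/ {
  var buffers = [], done = false;
  /* make sure the callback runs only once, even if 'error' and 'end' both fire */
  function finish(err, buf) { if(done) return; done = true; cb(err, buf); }
  stream.on('data', function(data) { buffers.push(data); });
  stream.on('error', function(err) { finish(err); });
  stream.on('end', function() { finish(null, Buffer.concat(buffers)); });
}

/* Usage sketch (assumes XLSX = require('xlsx') as in the example above):
buffer_stream(stream, function(err, buf) {
  if(err) return console.error(err);
  var workbook = XLSX.read(buf, {type:"buffer"});
});
*/
```

The parse step still runs only after the last chunk arrives, so this changes how failures are reported, not when parsing starts.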

</details>

<details>
  <summary><b>Writing to filesystem first</b> (click to show)</summary>

This example uses [`tempfile`](https://npm.im/tempfile) to generate file names:

```js
var fs = require('fs'), tempfile = require('tempfile');
var XLSX = require('xlsx');
function process_RS(stream/*:ReadStream*/, cb/*:(wb:Workbook)=>void*/)/*:void*/{
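  var fname = tempfile('.sheetjs');
  console.log(fname);
  /* write the incoming stream to the temporary file */
  var ostream = fs.createWriteStream(fname);
  stream.pipe(ostream);
  ostream.on('finish', function() {
    /* the file is complete: parse it, then remove the temporary file */
    var workbook = XLSX.readFile(fname);
    fs.unlinkSync(fname);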

    /* DO SOMETHING WITH workbook IN THE CALLBACK */
    cb(workbook);
  });
}
```

</details>