Commit 1b27a59b authored by Yuxin Wu

update docs

parent 0f0a9ad7
@@ -140,7 +140,7 @@ We can also dump the dataset into one single LMDB file and read it sequentially.
 ```python
 from tensorpack.dataflow import *
-class BinaryILSVRC12(ILSVRCFiles):
+class BinaryILSVRC12(dataset.ILSVRCFiles):
     def get_data(self):
         for fname, label in super(BinaryILSVRC12, self).get_data():
             with open(fname, 'rb') as f:
@@ -155,7 +155,7 @@ The above script builds a DataFlow which produces jpeg-encoded ImageNet data.
 We store the jpeg string as a numpy array because the function `cv2.imdecode` later expect this format.
 Please note we can only use 1 prefetch process to speed up. If `nr_proc>1`, `ds1` will take data
 from several forks of `ds0`, then neither the content nor the order of `ds1` will be the same as `ds0`.
-See [documentation](http://localhost:8000/modules/dataflow.html#tensorpack.dataflow.PrefetchDataZMQ)
+See [documentation](http://tensorpack.readthedocs.io/en/latest/modules/dataflow.html#tensorpack.dataflow.PrefetchDataZMQ)
 about caveats of `PrefetchDataZMQ`.
 It will generate a database file of 140G. We build a DataFlow to read this LMDB file sequentially:
......
@@ -126,9 +126,9 @@ class PrefetchDataZMQ(ProxyDataFlow):
 collect datapoints from `ds` in each process by ZeroMQ IPC pipe.
 Note:
-1. An iterator cannot run faster automatically -- the underlying dataflow worker
-will be forked ``nr_proc`` times. As a result, we have the following
-guarantee on the dataflow correctness:
+1. An iterator cannot run faster automatically -- what's happenning is
+that the underlying dataflow will be forked ``nr_proc`` times.
+As a result, we have the following guarantee on the dataflow correctness:
 a. When ``nr_proc=1``, the dataflow produces the same data as ``ds`` in the same order.
 b. When ``nr_proc>1``, the dataflow produces the same distribution
......
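
Note on the `PrefetchDataZMQ` docstring change: the ordering guarantee it describes can be illustrated with a small pure-Python simulation. This is only a sketch and does not use tensorpack or real processes. `base_dataflow` and `simulate_prefetch` are hypothetical names introduced here, each "worker" is modeled as an independent copy of the same generator, and a seeded random choice stands in for nondeterministic scheduling. It also assumes each copy is iterated exactly once, which is a simplification.

```python
import random
from collections import Counter


def base_dataflow():
    """Hypothetical stand-in for the underlying dataflow: yields 0..4 in order."""
    for i in range(5):
        yield i


def simulate_prefetch(make_ds, nr_proc, seed=0):
    """Merge nr_proc independent copies of a dataflow in simulated arrival order.

    Each copy models one fork that iterates the whole dataflow on its own.
    The seeded random choice models which fork delivers its next datapoint.
    """
    rng = random.Random(seed)
    workers = [make_ds() for _ in range(nr_proc)]
    merged = []
    while workers:
        worker = rng.choice(workers)
        try:
            merged.append(next(worker))
        except StopIteration:
            # This copy is exhausted; stop drawing from it.
            workers.remove(worker)
    return merged


if __name__ == "__main__":
    # One fork: same data, same order as the underlying dataflow.
    print(simulate_prefetch(base_dataflow, nr_proc=1))
    # Several forks: every datapoint appears once per fork, interleaved.
    merged = simulate_prefetch(base_dataflow, nr_proc=3)
    print(merged)
    print(Counter(merged))
```

With one copy the merged stream equals the original sequence. With three copies each datapoint shows up three times and the order is interleaved, so the distribution of values is preserved while content and order per pass are not. That is why the tutorial text in the second hunk restricts the LMDB dump to a single prefetch process.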