update docs

fb2a1f34 · Yuxin Wu · 13ee370c · fb2a1f34
Commit fb2a1f34 authored Aug 08, 2019 by Yuxin Wu

Showing with 12 additions and 8 deletions

docs/tutorial/parallel-dataflow.md +12 -8

--- a/docs/tutorial/parallel-dataflow.md
+++ b/docs/tutorial/parallel-dataflow.md
 # Parallel DataFlow

 This tutorial explains the parallel building blocks
-inside DataFlow, since most of the time they are the only thing
+inside DataFlow, since most of the time they are the only things
 needed to build an efficient dataflow.


@@ -23,7 +23,7 @@ d1 = MyDataFlow()   # some dataflow written by the user
 d2 = MultiProcessRunnerZMQ(d1, num_proc=20)
 ```

-The second line starts 25 processes running `d1`, and merge the results.
+The second line starts 20 processes running `d1`, and merges the results.
 You can then obtain the results in `d2`.

 Note that all the workers run independently in this pattern.
@@ -59,7 +59,7 @@ The main difference between this pattern and the first, is that:
 1. `d1` is not executed in parallel. Only `f` runs in parallel.
  Therefore you don't have to worry about randomness or data distribution shift.
  Also you need to make `d1` very efficient (e.g., just produce small metadata).
-2. More communication is required to send data to workers.
+2. More communication is required, because data must be sent to the workers.

 See its [API documentation](../modules/dataflow.html#tensorpack.dataflow.MultiProcessMapData)
 to learn more details.
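
The difference described in this hunk can be illustrated with a minimal standard-library sketch. It is not tensorpack's implementation: `cheap_source`, `f`, and `parallel_map` are hypothetical names, and a thread pool stands in for the worker processes. The source is consumed by a single producer, and only the mapping function runs in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def cheap_source(n):
    """Hypothetical stand-in for `d1`: runs serially, yields small metadata."""
    for i in range(n):
        yield i


def f(x):
    """Hypothetical stand-in for the expensive per-datapoint mapping."""
    return x * x


def parallel_map(source, func, num_workers=4):
    # Only `func` runs in the pool; `source` is iterated by one producer.
    # Every datapoint has to be handed to a worker, which is the extra
    # communication cost the tutorial mentions.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        yield from pool.map(func, source)


results = list(parallel_map(cheap_source(8), f))
```

This sketch happens to preserve input order because `Executor.map` does; a real parallel mapper need not make that guarantee.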
@@ -89,6 +89,10 @@ Using threads and processes have their pros and cons:
 The best choice of the above parallel utilities varies across machines and tasks.
 You can even combine threads and processes sometimes.

+Note that in tensorpack, every multiprocessing DataFlow with "ZMQ" in its name creates
+__zero Python threads__: this is a key implementation detail that makes tensorpack DataFlow
+faster than the alternatives in Keras or PyTorch.
+
 For a new task, you often need to do a quick benchmark to choose the best pattern.
 See [Performance Tuning Tutorial](performance-tuning.html)
 on how to effectively understand the performance of a DataFlow.
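
A quick benchmark of the kind suggested above can be as simple as timing how fast a pipeline yields datapoints. The following is a rough standard-library sketch, not the method from the linked tutorial; `measure_throughput` is a hypothetical helper, and `range` stands in for a real DataFlow iterator.

```python
import itertools
import time


def measure_throughput(iterable, num_items=1000):
    """Return datapoints per second over the first `num_items` items."""
    start = time.perf_counter()
    # Consume at most `num_items` datapoints and count how many arrived.
    count = sum(1 for _ in itertools.islice(iterable, num_items))
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Any iterable works here; swap in each candidate pipeline to compare them.
rate = measure_throughput(range(10**6), num_items=1000)
```

Running the same measurement on each candidate pattern, on the target machine, gives a first-order comparison before any deeper tuning.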