Commit 5772d5fd authored by Yuxin Wu

update docs

parent 3e0dffac
......@@ -23,7 +23,7 @@ Some typical questions that we DO NOT answer:
+ "The examples do not perform well after I change the models/dataset/parameters/etc."
Tensorpack maintainers make sure the examples perform well without modification.
But it's your job to pick the model and parameters that are suitable for your own situation.
We do not help with such questions unless they appear to be a bug in tensorpack.
We cannot help with such questions unless they appear to be a bug in tensorpack.
+ "Why my model doesn't work?", "I don't understand this paper you implement.",
"How should I change the examples for my own dataset?"
We do not answer machine learning questions.
......
......@@ -4,22 +4,40 @@ about: Report unexpected problems about Tensorpack or its examples.
---
__PLEASE ALWAYS INCLUDE__:
1. What you did:
+ If you're using examples:
+ What's the command you run:
+ Have you made any changes to the examples? Paste them if any:
+ If not, tell us what you did that may be relevant.
But we may not investigate it if there is no reproducible code.
+ Better to paste what you did instead of describing them.
2. What you observed, including but not limited to the __entire__ logs.
+ Better to paste what you observed instead of describing them.
3. What you expected, if not obvious.
4. Your environment:
+ Python version.
If you're asking about an unexpected problem you encountered, use this template.
__PLEASE DO NOT DELETE THIS TEMPLATE, FILL IT__:
### 1. What you did:
(1) **If you're using examples, what's the command you run:**
(2) **If you're using examples, have you made any changes to the examples? Paste them here:**
(3) **If not using examples, tell us what you did here:**
Note that we may not be able to investigate it if there is no reproducible code.
It's always better to paste what you did instead of describing it.
### 2. What you observed:
(1) **Include the ENTIRE logs here:**
It's always better to paste what you observed instead of describing it.
Part of the logs is sometimes enough, but it's always better to paste as much as possible.
You can run a command with `CMD 2>&1 | tee logs.txt` to save all stdout & stderr logs to one file.
(2) **Other observations, if any:**
For example, CPU/GPU utilization, output images, tensorboard curves, if relevant to your issue.
### 3. What you expected, if not obvious.
### 4. Your environment:
+ Python version:
+ TF version: `python -c 'import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)'`.
+ Tensorpack version: `python -c 'import tensorpack; print(tensorpack.__version__)'`.
You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`.:
You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`
and see if your issue is already solved.
+ Hardware information, e.g. number of GPUs used.
For efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
......@@ -4,7 +4,7 @@ about: More general questions about Tensorpack.
---
+ If you did something specific and it failed, please use the "Unexpected Problems /
+ If you did something with tensorpack and it failed, please use the "Unexpected Problems /
Bugs" category.
+ Your question is probably answered in [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#user-tutorials). Read it first.
......@@ -21,7 +21,7 @@ Some typical questions that we DO NOT answer:
+ "The examples do not perform well after I change the models/dataset/parameters/etc."
Tensorpack maintainers make sure the examples perform well without modification.
But it's your job to pick the model and parameters that are suitable for your own situation.
We do not help with such questions unless they appear to be a bug in tensorpack.
We cannot help with such questions unless they appear to be a bug in tensorpack.
+ "Why my model doesn't work?", "I don't understand this paper you implement.",
"How should I change the examples for my own dataset?"
We do not answer machine learning questions.
......
......@@ -30,7 +30,6 @@ MOCK_MODULES = ['tabulate', 'h5py',
'scipy', 'scipy.misc', 'scipy.io',
'tornado', 'tornado.concurrent',
'horovod', 'horovod.tensorflow',
'pyarrow',
'subprocess32', 'functools32']
for mod_name in MOCK_MODULES:
sys.modules[mod_name] = mock.Mock(name=mod_name)
......
......@@ -61,6 +61,7 @@ the rest of the data pipeline.
Nevertheless, tensorpack supports data loading with native TF operators / TF datasets as well.
### Use DataFlow outside Tensorpack
Normally, tensorpack `InputSource` interface links DataFlow to the graph for training.
If you use DataFlow in other places such as your custom code, call `reset_state()` first to initialize it,
and then use the generator however you like:
......@@ -71,3 +72,6 @@ df.reset_state()
for dp in df:
# dp is now a list. do whatever
```
Read the [API documentation](../../modules/dataflow.html#tensorpack.dataflow.DataFlow)
to see API details of DataFlow.
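For reference, here is a complete, self-contained version of this pattern. The use of `FakeData` below is just an assumption for illustration; any DataFlow is used the same way:
```python
# Minimal sketch: iterate a dataflow outside of any tensorpack trainer.
from tensorpack.dataflow import FakeData

df = FakeData([[28, 28], [1]], size=100)  # datapoints with two components
df.reset_state()  # must be called once before iterating
for dp in df:
    print(dp[0].shape, dp[1])  # dp is a list of components
```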
......@@ -28,21 +28,16 @@ class MyDataFlow(DataFlow):
digit = np.random.rand(28, 28)
label = np.random.randint(10)
yield [digit, label]
df = MyDataFlow()
df.reset_state()
for datapoint in df:
print(datapoint[0], datapoint[1])
```
Optionally, you can implement the following two methods:
+ `__len__()`. Return the number of elements the generator can produce. Certain tensorpack features might need it.
This is optional, and even when implemented, it is
not guaranteed to be an accurate length because it's impossible to know the length of certain generator.
+ `reset_state()`. It is guaranteed that the actual process which runs a DataFlow will invoke this method before using it.
So if this DataFlow needs to do something after a `fork()`, you should put it here.
`reset_state()` must be called once and only once for each DataFlow instance.
A typical example is when your DataFlow uses random number generator (RNG). Then you would need to reset the RNG here.
Otherwise, child processes will have the same random seed. The `RNGDataFlow` base class does this for you.
You can subclass `RNGDataFlow` to access `self.rng` whose seed has been taken care of.
Optionally, you can implement the `__len__` and `reset_state` methods.
The detailed semantics of these three methods are explained
in the [API documentation](../../modules/dataflow.html#tensorpack.dataflow.DataFlow).
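As a hedged sketch (the method bodies below are illustrative assumptions, not requirements), implementing both optional methods on the `MyDataFlow` example above could look like this:
```python
import numpy as np
from tensorpack.dataflow import DataFlow

class MyDataFlow(DataFlow):
    def __len__(self):
        return 100  # a rough size; tensorpack only uses it as guidance

    def reset_state(self):
        # Called once per process before iteration. Re-create the RNG here
        # so that forked worker processes don't share the same seed.
        self.rng = np.random.RandomState()

    def __iter__(self):
        for _ in range(100):
            digit = self.rng.rand(28, 28)
            label = self.rng.randint(10)
            yield [digit, label]
```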
DataFlow implementations for several well-known datasets are provided in the
[dataflow.dataset](../../modules/dataflow.dataset.html)
......@@ -66,5 +61,5 @@ class ProcessingDataFlow(DataFlow):
Some built-in dataflows, e.g.
[MapData](../../modules/dataflow.html#tensorpack.dataflow.MapData) and
[MapDataComponent](../../https://tensorpack.readthedocs.io/modules/dataflow.html#tensorpack.dataflow.MapDataComponent)
can do the above type of data processing for you.
[MapDataComponent](../../modules/dataflow.html#tensorpack.dataflow.MapDataComponent)
can do common types of data processing for you.
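For example (a minimal sketch, reusing the `MyDataFlow` example from above):
```python
from tensorpack.dataflow import MapData, MapDataComponent

df = MyDataFlow()
# Map a function over the whole datapoint (a list of components):
df = MapData(df, lambda dp: [dp[0] * 255, dp[1]])
# Or map a function over a single component, selected by index:
df = MapDataComponent(df, lambda img: img.astype('uint8'), index=0)
```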
......@@ -12,7 +12,7 @@ Reproduce ImageNet results of the following two papers:
+ [ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices](https://arxiv.org/abs/1707.01083)
+ [ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design](https://arxiv.org/abs/1807.11164)
| Model | Flops | Top-1 Error | Claimed Error | Flags |
| Model | Flops | Top-1 Error | Paper's Error | Flags |
|:---------------------------------------------------------------------------------------------------------|:------|:-----------:|:-------------:|:-------------:|
| ShuffleNetV1 0.5x [:arrow_down:](http://models.tensorpack.com/ImageNetModels/ShuffleNetV1-0.5x-g=8.npz) | 40M | 40.8% | 42.3% | `-r=0.5` |
| ShuffleNetV1 1x [:arrow_down:](http://models.tensorpack.com/ImageNetModels/ShuffleNetV1-1x-g=8.npz) | 140M | 32.6% | 32.4% | `-r=1` |
......
......@@ -9,7 +9,7 @@ __Training__ code of three variants of ResNet on ImageNet:
The training follows the __exact__ recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
and gets the same performance.
Models can be [downloaded here](http://models.tensorpack.com/ResNet/).
Models trained with 8 GPUs and a total batch size of 256 are listed in the table below.
This recipe has better performance than most open source implementations.
In fact, many papers that claim to "improve" ResNet only compete with a lower
......
......@@ -39,7 +39,6 @@ class DataFlowReentrantGuard(object):
return False
# NOTE: we cannot use six here
class DataFlowMeta(ABCMeta):
"""
DataFlow uses "__iter__()" and "__len__()" instead of
......@@ -64,7 +63,20 @@ class DataFlow(object):
@abstractmethod
def __iter__(self):
"""
The method to generate datapoints.
* A dataflow is an iterable. The :meth:`__iter__` method should yield a list each time.
Each element in the list should be either a number or a numpy array.
For now, tensorpack also partially supports dicts instead of lists.
* The :meth:`__iter__` method can be either finite (will stop iteration) or infinite
(will not stop iteration). For a finite dataflow, :meth:`__iter__` can be called
again after the previous call has returned.
* For many dataflows, the :meth:`__iter__` method is non-reentrant, which means that for a dataflow
instance ``df``, :meth:`df.__iter__` cannot be called before the previous
:meth:`df.__iter__` call has finished (iteration has stopped).
If a dataflow is non-reentrant, :meth:`df.__iter__` should throw an exception if
called before the previous call has finished.
If you need to use the same dataflow in two places, simply create two dataflow instances.
Yields:
list: The datapoint, i.e. list of components.
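A small sketch of the non-reentrancy rule above (``MyDataFlow`` here is a hypothetical implementation):
```python
df = MyDataFlow()
df.reset_state()
for dp in df:
    # For a non-reentrant dataflow, starting a second iteration of `df`
    # here (e.g. `for dp2 in df: ...`) may raise an exception.
    pass

# To consume the same data in two places, create two instances instead:
df_a, df_b = MyDataFlow(), MyDataFlow()
df_a.reset_state()
df_b.reset_state()
```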
......@@ -75,8 +87,31 @@ class DataFlow(object):
def __len__(self):
"""
* A dataflow can optionally implement :meth:`__len__`. If not implemented, calling it will
throw :class:`NotImplementedError`.
* It returns an integer representing the size of the dataflow.
The return value **may not be accurate or meaningful** at all.
When it's accurate, it means that :meth:`__iter__` will always yield this many datapoints.
* There could be many reasons why :meth:`__len__` is inaccurate.
For example, some dataflows have a dynamic size.
Some dataflows mix datapoints between consecutive epochs
due to parallelism and buffering; then it does not make sense to stop the
iteration anywhere.
* Due to the above reasons, the length is only rough guidance. Inside
tensorpack it's only used in these places:
+ As the default ``steps_per_epoch`` for training, but you probably want to customize
it yourself, especially when using a data-parallel trainer.
+ As the length of the progress bar when processing a dataflow.
+ By :class:`InferenceRunner` to get the number of iterations in inference.
In this case users are responsible for making sure that :meth:`__len__` is accurate.
This is to guarantee that inference is run on a fixed set of images.
Returns:
int: size of this data flow.
int: rough size of this dataflow.
Raises:
:class:`NotImplementedError` if this DataFlow doesn't have a size.
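Because of this, code that consumes an arbitrary dataflow typically guards against a missing size. A minimal sketch, for some dataflow instance ``df``:
```python
# __len__ is optional, so treat the size as rough guidance that may be absent.
try:
    size = len(df)
except NotImplementedError:
    size = None  # this dataflow has no known size
```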
......@@ -88,17 +123,21 @@ class DataFlow(object):
def reset_state(self):
"""
Reset state of the dataflow.
It **has to** be called once and only once before producing datapoints.
* :meth:`reset_state` must be called **once and only once**
by the **process that uses the dataflow**, before :meth:`__iter__` is called.
The caller thread of this method should stay alive to keep this dataflow alive.
* It is meant for initialization work that involves processes,
e.g., initializing a random number generator (RNG) or creating worker processes.
Because it's very common to use an RNG in data processing,
developers of dataflows can also subclass :class:`RNGDataFlow` to simplify the work.
Note:
1. If the dataflow is forked, each process will call this method
before producing datapoints.
2. The caller thread of this method must remain alive to keep this dataflow alive.
* A dataflow is not fork-safe after :meth:`reset_state` is called (forking it afterwards
would violate the above guarantee). A small number of dataflows are never fork-safe,
which is mentioned in their docs.
For example, RNG **has to** be reset if used in the DataFlow,
otherwise it won't work well with prefetching, because different
processes will have the same RNG state.
* You should follow the above guarantee if you're using a dataflow yourself
(either outside of tensorpack, or when writing a wrapper dataflow).
"""
pass
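For example, a wrapper dataflow written under these rules forwards the call to the dataflow it wraps. A hypothetical sketch (tensorpack's built-in wrappers behave similarly):
```python
from tensorpack.dataflow import DataFlow

class MyWrapper(DataFlow):
    """Hypothetical wrapper that forwards reset_state() so the wrapped
    dataflow still gets its once-and-only-once initialization."""
    def __init__(self, ds):
        self.ds = ds

    def reset_state(self):
        self.ds.reset_state()

    def __iter__(self):
        for dp in self.ds:
            yield dp
```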
......@@ -106,6 +145,12 @@ class DataFlow(object):
class RNGDataFlow(DataFlow):
""" A DataFlow with RNG"""
rng = None
"""
``self.rng`` is a ``np.random.RandomState`` instance that is initialized
correctly in ``RNGDataFlow.reset_state()``.
"""
def reset_state(self):
""" Reset the RNG """
self.rng = get_rng(self)
......
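For instance, a subclass only needs to read ``self.rng`` (a hedged sketch with a hypothetical class name):
```python
from tensorpack.dataflow import RNGDataFlow

class RandomImages(RNGDataFlow):
    def __iter__(self):
        while True:
            # self.rng is re-seeded per process by RNGDataFlow.reset_state()
            yield [self.rng.rand(28, 28)]
```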