Commit ba293da8 authored by Yuxin Wu

update docs

parent 6a0d33d1
@@ -22,8 +22,9 @@ It's Yet Another TF high-level API, with __speed__, __readability__ and __flexibility__
   some benchmark scripts.
 2. Focus on __large datasets__.
-   + It's unnecessary to read/preprocess data with a new language called TF.
-     Tensorpack helps you load large datasets (e.g. ImageNet) in __pure Python__ with autoparallelization.
+   + [You don't need `tf.data`](http://tensorpack.readthedocs.io/tutorial/input-source.html#tensorflow-reader-cons).
+     It's unnecessary and painful to process data with a new language called TF.
+     Tensorpack helps you efficiently load large datasets (e.g. ImageNet) in __pure Python__ with autoparallelization.
 3. It's not a model wrapper.
   + There are too many symbolic function wrappers in the world. Tensorpack includes only a few common models.
...
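For readers who wonder what "pure Python with autoparallelization" looks like in practice, here is a minimal sketch using `tensorpack.dataflow`. The directory path, augmentor, batch size and process count below are illustrative assumptions, not values from this commit:

```python
# Minimal sketch of a pure-Python ImageNet pipeline with tensorpack DataFlow.
from tensorpack.dataflow import dataset, imgaug, AugmentImageComponent, BatchData, PrefetchDataZMQ

ds = dataset.ILSVRC12('/path/to/ILSVRC12', 'train', shuffle=True)   # yields [image, label] datapoints
ds = AugmentImageComponent(ds, [imgaug.Resize((224, 224))])         # preprocessing in plain Python / OpenCV
ds = PrefetchDataZMQ(ds, nr_proc=25)                                # parallelize the above over 25 processes
ds = BatchData(ds, 64)

ds.reset_state()
for images, labels in ds.get_data():   # newer versions also support plain `for dp in ds`
    pass                               # hand the batch to the trainer / InputSource
```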
@@ -29,8 +29,9 @@ Assuming you have 5GB/s `memcpy` bandwidth (roughly like this if you run single-
 down your training by 10%. Think about how many more copies are made during your preprocessing.
 Failure to hide the data preparation latency is the major reason why people
-cannot see good GPU utilization. __Always choose a framework that allows latency hiding.__
+cannot see good GPU utilization. You should __always choose a framework that enables latency hiding.__
 However most other TensorFlow wrappers are designed to be `feed_dict` based.
+Tensorpack has built-in mechanisms to hide latency of the above stages.
 This is the major reason why tensorpack is [faster](https://github.com/tensorpack/benchmarks).
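To make the bandwidth argument above concrete, here is a back-of-envelope calculation with assumed numbers (the batch size and step time are illustrative, not taken from the original tutorial):

```python
# Illustrative arithmetic only: one extra memcpy of a batch can already cost ~10% of a step.
batch_bytes = 256 * 224 * 224 * 3 * 4        # a float32 batch of 256 ImageNet-sized crops, ~154 MB
copy_seconds = batch_bytes / (5 * 1024**3)   # at 5 GB/s memcpy bandwidth, ~0.03 s per copy
step_seconds = 0.3                           # assumed duration of one training iteration
print(copy_seconds / step_seconds)           # ~0.1, i.e. roughly a 10% slowdown per extra copy
```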
## Python Reader or TF Reader ?
@@ -40,32 +41,25 @@ either Python code or TensorFlow operators, or a mix of two.
 Both are supported in tensorpack, while we recommend using Python.

 ### TensorFlow Reader: Pros
-* Faster read/preprocessing.
-  * Often true, but not necessarily. With Python you have access to many other fast libraries, which might be unsupported in TF.
-  * Python may be just fast enough.
+People often think they should use `tf.data` because it's fast.
+* Indeed it's often fast, but not necessarily. With Python you have access to many other fast libraries, which might be unsupported in TF.
+* Python may be just fast enough.
   As long as data preparation keeps up with training, and the latency of all four blocks in the
-  above figure is hidden, running faster brings no more gains to overall throughput.
+  above figure is hidden, __faster reader brings no gains to overall throughput__.
   For most types of problems, up to the scale of multi-GPU ImageNet training,
   Python can offer enough speed if you use a fast library (e.g. `tensorpack.dataflow`).
   See the [Efficient DataFlow](efficient-dataflow.html) tutorial on how to build a fast Python reader with DataFlow.
 * No "Copy to TF" (i.e. `feed_dict`) stage.
   * True. But as mentioned above, the latency can usually be hidden.
     In tensorpack, TF queues are usually used to hide the "Copy to TF" latency,
     and TF `StagingArea` can help hide the "Copy to GPU" latency.
     They are used by most examples in tensorpack.

 ### TensorFlow Reader: Cons
 The disadvantage of TF reader is obvious and it's huge: it's __too complicated__.
-Unlike running a mathematical model, reading data is a complicated and poorly-structured task.
-You need to handle different formats, handle corner cases, noisy data, combination of data,
-which require condition operations, loops, data structures, sometimes even exception handling. These operations
-are __naturally not suitable__ for a symbolic graph.
+Unlike running a mathematical model, data processing is a complicated and poorly-structured task.
+You need to handle different formats, handle corner cases, noisy data, combination of data.
+Doing these require condition operations, loops, data structures, sometimes even exception handling.
+These operations are __naturally not the right task for a symbolic graph__.

 Let's take a look at what users are asking for `tf.data`:
 * Different ways to [pad data](https://github.com/tensorflow/tensorflow/issues/13969), [shuffle data](https://github.com/tensorflow/tensorflow/issues/14518)
@@ -75,14 +69,14 @@ Let's take a look at what users are asking for `tf.data`:
 * [Sort/skip some data](https://github.com/tensorflow/tensorflow/issues/14250)
 * [Write data to files](https://github.com/tensorflow/tensorflow/issues/15014)

-To support all these features which could've been done with __3 lines of code in Python __, you need either a new TF
+To support all these features which could've been done with __3 lines of code in Python__, you need either a new TF
 API, or ask [Dataset.from_generator](https://www.tensorflow.org/versions/r1.4/api_docs/python/tf/contrib/data/Dataset#from_generator)
 (i.e. Python again) to the rescue.

 It only makes sense to use TF to read data, if your data is originally very clean and well-formatted.
 If not, you may feel like writing a script to format your data, but then you're almost writing a Python loader already!

-Think about it: it's a waste of time to write a Python script to transform from raw data to TF-friendly format,
+Think about it: it's a waste of time to write a Python script to transform from some format to TF-friendly format,
 then a TF script to transform from this format to tensors.
 The intermediate format doesn't have to exist.
 You just need the right interface to connect Python to the graph directly, efficiently.
...
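The "right interface" mentioned above can be as thin as handing a Python generator to the graph. Below is a hedged sketch using the standard TF 1.x `Dataset.from_generator`; `python_reader` is a made-up stand-in for whatever Python loader (e.g. a DataFlow) you already have:

```python
# Sketch: connect an arbitrary Python reader to the TF graph without an intermediate format.
import tensorflow as tf

def python_reader():
    # arbitrary Python: loops, conditions, exception handling, any third-party library
    for i in range(1000):
        yield [i, i, i], i % 10   # (features, label)

ds = tf.data.Dataset.from_generator(
    python_reader,
    output_types=(tf.float32, tf.int64),
    output_shapes=(tf.TensorShape([3]), tf.TensorShape([])))
features, label = ds.make_one_shot_iterator().get_next()   # plain tensors for the rest of the graph
```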
@@ -315,7 +315,7 @@ def sample_fast_rcnn_targets(boxes, gt_boxes, gt_labels):
 @under_name_scope()
 def crop_and_resize(image, boxes, box_ind, crop_size, pad_border=True):
     """
-    Better-aligned version of tf.image.crop_and_resize, following our definition of floating point boxes.
+    Aligned version of tf.image.crop_and_resize, following our definition of floating point boxes.

     Args:
         image: NCHW
@@ -375,7 +375,7 @@ def crop_and_resize(image, boxes, box_ind, crop_size, pad_border=True):
     image_shape = tf.shape(image)[2:]
     boxes = transform_fpcoor_for_tf(boxes, image_shape, [crop_size, crop_size])
-    image = tf.transpose(image, [0, 2, 3, 1])   # 1hwc
+    image = tf.transpose(image, [0, 2, 3, 1])   # nhwc
     ret = tf.image.crop_and_resize(
         image, boxes, tf.to_int32(box_ind),
         crop_size=[crop_size, crop_size])
...
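For context on the `# nhwc` comment fixed above: `tf.image.crop_and_resize` only accepts NHWC input, so the NCHW feature map is transposed before the call and back afterwards. A stripped-down sketch of that pattern (it omits tensorpack's own box-coordinate transform, `transform_fpcoor_for_tf`):

```python
# Sketch of the NCHW <-> NHWC handling around tf.image.crop_and_resize (TF 1.x).
import tensorflow as tf

def crop_and_resize_nchw(image, boxes, box_ind, crop_size):
    """image: NCHW; boxes: normalized [y1, x1, y2, x2] per box; returns NCHW crops."""
    image = tf.transpose(image, [0, 2, 3, 1])        # NCHW -> NHWC, the only layout the op accepts
    crops = tf.image.crop_and_resize(
        image, boxes, tf.to_int32(box_ind),
        crop_size=[crop_size, crop_size])            # NHWC crops
    return tf.transpose(crops, [0, 3, 1, 2])         # back to NCHW
```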
@@ -9,7 +9,7 @@ The article [Towards Efficient Multi-GPU Training in Keras with TensorFlow](http
 has mentioned some of it.
 Even on a single GPU, tensorpack can run [1.2~2x faster](https://github.com/tensorpack/benchmarks/tree/master/other-wrappers)
-than the equivalent Keras code. The gap becomes larger when you scale.
+than the equivalent Keras code. The gap becomes larger when you scale to multiple GPUs.
 Tensorpack and [horovod](https://github.com/uber/horovod/blob/master/examples/keras_imagenet_resnet50.py)
 are the only two tools I know that can scale the training of a large Keras model.

@@ -28,15 +28,15 @@ It has:
 + ResNet-50 model modified from [keras.applications](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/_impl/keras/applications/resnet50.py).
   (We put stride on 3x3 conv in each bottleneck, which is different from certain other implementations).
 + Multi-GPU data-parallel __training and validation__ which scales
-  + Finished 100 epochs in 19.5 hours on 8 V100s, with >90% GPU utilization.
+  + Finished 100 epochs in 19 hours on 8 V100s, with >90% GPU utilization.
   + Still slightly slower than native tensorpack examples.
 + Good accuracy (same as [tensorpack ResNet example](../ResNet))

 ### Note:
-Keras support is __not official__. Keras does not use variable scopes or variable
-collections, which contradicts with tensorpack trainers.
-Therefore, not all Keras layers are supported in tensorpack.
+Keras support is __not official__. Keras does not respect variable scopes or variable
+collections, which contradicts with TensorFlow conventions and tensorpack trainers.
+Therefore, the support in tensorpack is experimental.
 These simple examples can run within tensorpack smoothly, but note that a future version
-of Keras may still break them (unlikely, though).
+of Keras may break them (unlikely, though).
@@ -211,4 +211,4 @@ class PeakMemoryTracker(Callback):
         results = rv.results
         if results is not None:
             for mem, dev in zip(results, self._devices):
-                self.trainer.monitors.put_scalar('PeakMemory(MB) ' + dev, mem / 1e6)
+                self.trainer.monitors.put_scalar('PeakMemory(MB)' + dev, mem / 1e6)
@@ -75,6 +75,7 @@ class KerasModelCaller(object):
                 "This was automatically corrected by tensorpack.".format(n))

         # Keras models might not use this collection at all (in some versions).
+        # This is a BC-breaking change of tf.keras: https://github.com/tensorflow/tensorflow/issues/19643
         restore_collection(update_ops_backup)
         for op in model.updates:
             tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, op)
...
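The comment added in this hunk refers to a behavior change in tf.keras (tensorflow#19643): layers such as BatchNorm may no longer register their update ops in `tf.GraphKeys.UPDATE_OPS`, so graph-level training code that reads this collection silently misses them. The snippet below shows the same pattern in isolation, with an arbitrary Keras model standing in for the user's model:

```python
# Generic illustration (not the tensorpack source): re-register Keras update ops
# into the collection that graph-based training loops usually read.
import tensorflow as tf

model = tf.keras.applications.ResNet50(weights=None)     # any model containing BatchNorm layers

for op in model.updates:                                  # update ops tracked by Keras itself
    tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, op)     # make them visible through the collection

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)   # now includes the moving-average updates
```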