Commit e323c134 authored by Yuxin Wu

docs update

parent 7dc54821
......@@ -5,7 +5,7 @@ Bug Reports: PLEASE always include
2. What you observed, e.g. logs.
3. What you expected, if not obvious.
4. Your environment (TF version, cudnn version, number & type of GPUs), if it matters.
5. About low performance, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
5. About efficiency, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
Feature Requests:
1. Improve an existing feature.
......
......@@ -57,7 +57,7 @@ See [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html)
Dependencies:
+ Python 2.7 or 3
+ TensorFlow >= 1.0.0 (>=1.1.0 for Multi-GPU)
+ TensorFlow >= 1.2.0
+ Python bindings for OpenCV (Optional, but required by a lot of features)
```
pip install -U git+https://github.com/ppwwyyxx/tensorpack.git
......
......@@ -36,7 +36,7 @@ callbacks=[
# schedule the learning rate based on epoch number
ScheduledHyperParamSetter('learning_rate',
[(30, 1e-2), (60, 1e-3), (85, 1e-4), (95, 1e-5)]),
# can manually change the learning rate through a file during training
# can manually change the learning rate through a file, without interrupting training
HumanHyperParamSetter('learning_rate'),
# send validation error to my phone through pushbullet
SendStat('curl -u your_id_xxx: https://api.pushbullet.com/v2/pushes \\
......@@ -45,12 +45,12 @@ callbacks=[
'val-error-top1'),
# record GPU utilizations during training
GPUUtilizationTracker(),
# can pause the training and start a debug shell, to observe what's going on
# Touch a file to pause the training and start a debug shell, to observe what's going on
InjectShell(shell='ipython')
] + [ # these callbacks are enabled by default already, though you can customize them
# maintain those moving average summaries already defined in the model (e.g. training loss, training error)
# maintain those moving average summaries defined in the model (e.g. training loss, training error)
MovingAverageSummary(),
# draw a nice progress bar
# draw a progress bar
ProgressBar(),
# run `tf.summary.merge_all` every epoch and log to monitors
MergeAllSummaries(),
......@@ -69,7 +69,7 @@ monitors=[ # monitors are a special kind of callbacks. these are also ena
Notice that callbacks cover every detail of training, ranging from graph operations to the progress bar.
This means you can customize every part of the training to your preference, e.g. display something
different in the progress bar, evaluating part of the summaries at a different frequency, etc.
different in the progress bar, evaluate part of the summaries at a different frequency, etc.
These features may not always be useful, but think about how messy the main loop would look if you
were to write this logic together with the loops, and how much easier your life would be if you could enable
......
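For context, callbacks like the ones diffed above are simply handed to the trainer through a config object. Below is a minimal sketch, assuming tensorpack's `TrainConfig` API; `MyModel` (a `ModelDesc`) and `my_dataflow` (a `DataFlow`) are hypothetical placeholders:
```python
from tensorpack import TrainConfig
from tensorpack.callbacks import (
    ScheduledHyperParamSetter, HumanHyperParamSetter,
    GPUUtilizationTracker, InjectShell)

# MyModel and my_dataflow are hypothetical placeholders for
# your own ModelDesc and DataFlow.
config = TrainConfig(
    model=MyModel(),
    dataflow=my_dataflow,
    callbacks=[
        # drop the learning rate on a fixed epoch schedule
        ScheduledHyperParamSetter('learning_rate',
            [(30, 1e-2), (60, 1e-3), (85, 1e-4), (95, 1e-5)]),
        # also allow editing it through a file, without interrupting training
        HumanHyperParamSetter('learning_rate'),
        # record GPU utilization; touch a file to drop into a debug shell
        GPUUtilizationTracker(),
        InjectShell(shell='ipython'),
    ],
    max_epoch=100,
)
# How `config` is then passed to a trainer depends on the tensorpack version.
```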
......@@ -24,13 +24,13 @@ They are sufficient to use with whatever deployment methods TensorFlow supports.
But you'll need to read TF docs and do it on your own.
Please note that the metagraph saved during training is the training graph.
But you may need a different one for inference.
But sometimes you need a different one for inference.
For example, you may need a different data layout for CPU inference,
or you may need placeholders in the inference graph, or the training graph contains multi-GPU replication
which you want to remove.
In this case, you can always create a new graph by yourself with TF symbolic functions.
In this case, you can always create a new graph with pure TensorFlow.
The only thing tensorpack has is `OfflinePredictor`,
The only thing tensorpack has for this purpose is `OfflinePredictor`,
a simple function that builds the graph and returns a callable for you.
It is mainly for quick demo purposes.
It only runs inference on Python data, so it may not be the most efficient option.
......
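As a sketch of that quick-demo path, here is roughly how `OfflinePredictor` is used; `MyModel`, the checkpoint path, and the tensor names are hypothetical placeholders that must match your own graph:
```python
import numpy as np
from tensorpack import PredictConfig, OfflinePredictor, SaverRestore

# MyModel, the checkpoint path, and the tensor names below are
# hypothetical placeholders; they must match your own training graph.
pred_config = PredictConfig(
    model=MyModel(),
    session_init=SaverRestore('train_log/mymodel/checkpoint'),
    input_names=['input'],      # input tensors to feed
    output_names=['output'])    # output tensors to fetch
predictor = OfflinePredictor(pred_config)

# The result is a plain callable on numpy data:
outputs = predictor(np.zeros((1, 224, 224, 3), dtype='float32'))
```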
# tensorpack examples
Training examples with __reproducible__ performance.
Training examples with __reproducible performance__.
Reproducing a method is usually easy, but you don't know whether you've made mistakes, because wrong code will often appear to work.
Reproducible performance results are what really matters.
See [Unawareness of Deep Learning Mistakes](https://medium.com/@ppwwyyxx/unawareness-of-deep-learning-mistakes-d5b5774da0ba).
__Reproducible performance is important__. Deep learning code is usually easy to write,
but its correctness is hard to verify -- wrong code will usually still converge.
Without a setup and performance comparable to someone else's, you cannot tell whether an implementation is correct.
## Getting Started:
+ [An illustrative mnist example with explanation of the framework](mnist-convnet.py)
+ The same mnist example using [tf-slim](mnist-tfslim.py), [Keras layers](mnist-keras.py), [Higher-level Keras](mnist-keras-v2.py) and [with weights visualizations](mnist-visualizations.py)
+ [A tiny SVHN ConvNet with 97.8% accuracy](svhn-digit-convnet.py)
+ A tiny [Cifar ConvNet](cifar-convnet.py) and [SVHN ConvNet](svhn-digit-convnet.py)
+ [A boilerplate file to start with, for your own tasks](boilerplate.py)
## Vision:
......@@ -26,14 +27,13 @@ Without a setting and performance comparable to someone else, you don't know if
| [Visualize CNN saliency maps](Saliency) | visually reproduce |
| [Similarity learning on MNIST](SimilarityLearning) | |
| Learn steering filters with [Dynamic Filter Networks](DynamicFilterNetwork) | visually reproduce |
| Load a pre-trained [AlexNet](load-alexnet.py) or [VGG16](load-vgg16.py) model | |
| Load a pre-trained [Convolutional Pose Machines](ConvolutionalPoseMachines/) | |
| Load a pre-trained [AlexNet](load-alexnet.py), [VGG16](load-vgg16.py), or [Convolutional Pose Machines](ConvolutionalPoseMachines/) | |
## Reinforcement Learning:
| Name | Performance |
| --- | --- |
| [Deep Q-Network(DQN) variants on Atari games](DeepQNetwork), including DQN, DoubleDQN, DuelingDQN. | reproduce paper |
| [Asynchronous Advantage Actor-Critic(A3C) with demos on OpenAI Gym](A3C-Gym) | reproduce paper |
| [Deep Q-Network(DQN) variants on Atari games](DeepQNetwork), including <br/> DQN, DoubleDQN, DuelingDQN. | reproduce paper |
| [Asynchronous Advantage Actor-Critic(A3C) on Atari games](A3C-Gym) | reproduce paper |
## Speech / NLP:
| Name | Performance |
......@@ -41,10 +41,3 @@ Without a setting and performance comparable to someone else, you don't know if
| [LSTM-CTC for speech recognition](CTC-TIMIT) | reproduce paper |
| [char-rnn for fun](Char-RNN) | fun |
| [LSTM language model on PennTreebank](PennTreebank) | reproduce reference code |
#### Note to contributors:
An example needs to satisfy one of the following:
+ Reproduce performance of a published or well-known paper.
+ Illustrate a new way of using the library that is currently not covered.
......@@ -4,10 +4,8 @@
Reproduce [ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices](https://arxiv.org/abs/1707.01083)
on ImageNet.
This is a 40Mflops ShuffleNet,
roughly corresponding to `ShuffleNet 0.5x (arch2) g=8` in the paper.
But detailed architecture may not be the same.
After 100 epochs it reaches top-1 error of 42.62.
This is a 40Mflops ShuffleNet, corresponding to `ShuffleNet 0.5x (arch2) g=8` in the paper.
After 100 epochs it reaches top-1 error of 42.62, matching the paper's number.
### Usage:
......@@ -15,7 +13,7 @@ Print flops with tensorflow:
```bash
./shufflenet.py --flops
```
It will print about 80Mflops, because TF counts FMA as 2 flops while the paper counts it as 1 flop.
It will print about 80Mflops: TF counts a multiply-add as 2 flops while the paper counts it as 1, so the paper's 40Mflops corresponds to the ~80Mflops TF reports.
Train (takes 24 hours on 8 Maxwell TitanX):
```bash
......
......@@ -30,7 +30,7 @@ try:
_version = tf.__version__.split('.')
assert int(_version[0]) >= 1, "TF>=1.0 is required!"
if int(_version[1]) < 2:
print("TF<1.2 support will be removed in the future!")
print("TF<1.2 support will be removed after 2018-02-28!")
_HAS_TF = True
except ImportError:
_HAS_TF = False
......