Commit a156f2bf authored by Yuxin Wu

update DoReFa readme

parent b5b22a85
...@@ -2,6 +2,22 @@ Code and model for the paper:
[DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients](http://arxiv.org/abs/1606.06160), by Zhou et al.
It also contains an implementation of the following papers:
+ [Binary Weight Network](https://arxiv.org/abs/1511.00363), with (W,A,G)=(1,32,32).
+ [Trained Ternary Quantization](https://arxiv.org/abs/1612.01064), with (W,A,G)=(t,32,32).
+ [Binarized Neural Networks](https://arxiv.org/abs/1602.02830), with (W,A,G)=(1,1,32).
These different quantization techniques achieve the following accuracies in this implementation:
| Model | W,A,G | Top 1 Error |
|:-------------------|-------------|------------:|
| Full Precision | 32,32,32 | 41.4% |
| TTQ | t,32,32 | 41.9% |
| BWN | 1,32,32 | 44.3% |
| BNN | 1,1,32 | 53.4% |
| DoReFa | 1,2,6 | 47.6% |
| DoReFa | 1,2,4 | 58.4% |
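The W column of this table corresponds to the weight quantizer described in the DoReFa-Net paper. A minimal NumPy sketch of it is below (function names are illustrative, not the repo's API; the actual TensorFlow implementation additionally wraps the rounding in a straight-through gradient so it is trainable):

```python
import numpy as np

def quantize_k(x, k):
    # Uniform k-bit quantizer for x in [0, 1]: snap to one of 2^k - 1 steps.
    n = float(2 ** k - 1)
    return np.round(x * n) / n

def quantize_weights(w, bitW):
    if bitW == 32:                      # full precision: leave weights alone
        return w
    if bitW == 1:                       # binary weights scaled by E[|w|] (BWN-style)
        return np.sign(w) * np.mean(np.abs(w))
    t = np.tanh(w)                      # squash weights, then map into [0, 1]
    t = t / (2 * np.max(np.abs(t))) + 0.5
    return 2 * quantize_k(t, bitW) - 1  # quantize, then map back to [-1, 1]
```

With `bitW` in {1, 2, ..., 32} this covers the "1" and "32" weight settings in the table; the ternary ("t") scheme of TTQ instead learns two per-layer scaling coefficients and is not shown here.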
We hosted a demo at CVPR16 on behalf of Megvii, Inc, running a real-time 1/4-VGG size DoReFa-Net on ARM and half-VGG size DoReFa-Net on FPGA.
We're not planning to release our C++ runtime for bit-operations.
In this repo, bit operations are performed through `tf.float32`.
...@@ -11,11 +27,6 @@ Pretrained model for (1,4,32)-ResNet18 and (1,2,6)-AlexNet are available at
They're provided in the format of numpy dictionary, so it should be very easy to port into other applications.
The __binary-weight 4-bit-activation ResNet-18__ model has 59.2% top-1 validation accuracy.
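For reference, such a numpy dictionary can be read back with plain NumPy. The file name and layer name below are made up for the demo (a `.npz` round-trip stands in for the released checkpoints, which define their own keys):

```python
import numpy as np

# Stand-in checkpoint: the released models are plain name -> ndarray
# dictionaries, so an .npz round-trip demonstrates the format.
np.savez('demo_params.npz', **{'conv1/W': np.zeros((3, 3, 3, 64), np.float32)})

params = dict(np.load('demo_params.npz'))   # name -> np.ndarray
print(params['conv1/W'].shape)              # (3, 3, 3, 64)
```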
Note that when (W,A,G) is set to (1,32,32), this code implements [Binary Weight Network](https://arxiv.org/abs/1511.00363).
When (W,A,G) is set to (t,32,32), this code implements
[Trained Ternary Quantization](https://arxiv.org/abs/1612.01064).
But with (W,A,G) set to (1,1,32), it is not equivalent to [XNOR-Net](https://arxiv.org/abs/1603.05279), although it would not be hard to implement.
Alternative link to this page: [http://dorefa.net](http://dorefa.net)
## Preparation:
...@@ -31,11 +31,10 @@ This is our attempt to reproduce it on tensorpack & TensorFlow.
Accuracy:
Trained with 4 GPUs and (W,A,G)=(1,2,6), it can reach top-1 single-crop validation error of 47.6%,
after 70 epochs. This number is better than what's in the paper due to more sophisticated augmentations.
With (W,A,G)=(32,32,32) -- full precision baseline, 41.4% error.
With (W,A,G)=(t,32,32) -- TTQ, 41.9% error.
With (W,A,G)=(1,32,32) -- BWN, 44.3% error.
With (W,A,G)=(1,1,32) -- BNN, 53.4% error.
With (W,A,G)=(1,2,6), 47.6% error.
...@@ -15,8 +15,10 @@ exec(open(libinfo_py, "rb").read())
try:
    import pypandoc
    long_description = pypandoc.convert_file('README.md', 'rst')
    description_type = 'text/x-rst'
except ImportError:
    long_description = open('README.md').read()
    description_type = 'text/markdown'

# configure requirements
reqfile = os.path.join(CURRENT_DIR, 'requirements.txt')
...@@ -27,7 +29,7 @@ setup(
    version=__version__,
    description='Neural Network Toolbox on TensorFlow',
    long_description=long_description,
    long_description_content_type=description_type,
    install_requires=req,
    tests_require=['flake8', 'scikit-image'],
    extras_require={