Commit afba8dee authored by Yuxin Wu

update docs

parent 0a0b387e
@@ -20,7 +20,7 @@ feel free to delete everything in this template.
 It's always better to copy-paste what you did than to describe them.
-Please try to provide enough information to let other __reproduce__ your issues.
+Please try to provide enough information to let others __reproduce__ your issues.
 Without reproducing the issue, we may not be able to investigate it.

 ### 2. What you observed:
@@ -44,11 +44,11 @@ If you expect higher speed, please read
 http://tensorpack.readthedocs.io/tutorial/performance-tuning.html
 before posting.
-If you expect certain accuracy, only in one of the two conditions can we help with it:
-(1) You're unable to reproduce the accuracy documented in tensorpack examples.
+If you expect certain training results (e.g., accuracy), only in one of the two conditions can we help with it:
+(1) You're unable to reproduce the results documented in tensorpack examples.
 (2) It appears to be a tensorpack bug.
-Otherwise, how to train a model to certain accuracy is a machine learning question.
+Otherwise, how to train a model is a machine learning question.
 We do not answer machine learning questions and it is your responsibility to
 figure out how to make your models more accurate.
...
@@ -60,21 +60,28 @@ Model:
 6. Another alternative to BatchNorm is GroupNorm (`BACKBONE.NORM=GN`) which has better performance.

-Speed:
+Efficiency:

 1. If CuDNN warmup is on, the training will start very slowly, until about
    10k steps (or more if scale augmentation is used) to reach a maximum speed.
    As a result, the ETA is also inaccurate at the beginning.
-   CuDNN warmup is by default on when no scale augmentation is used.
+   CuDNN warmup is by default enabled when no scale augmentation is used.
 1. After warmup, the training speed will slowly decrease due to more accurate proposals.
-1. The code should have around 70% GPU utilization on V100s, and 85%~90% scaling
+1. The code should have around 80~90% GPU utilization on V100s, and 85%~90% scaling
    efficiency from 1 V100 to 8 V100s.
 1. This implementation does not use specialized CUDA ops (e.g. AffineChannel, ROIAlign).
    Therefore it might be slower than other highly-optimized implementations.
+1. To reduce RAM usage on host: (1) make sure you're using the "spawn" method as
+   set in `train.py`; (2) reduce `buffer_size` or `NUM_WORKERS` in `data.py`
+   (which may negatively impact your throughput). The training needs <10G RAM if `NUM_WORKERS=0`.
+1. Inference is unoptimized. Tensorpack is a training interface, therefore it
+   does not help you on optimized inference.

 Possible Future Enhancements:
 1. Define a better interface to load different datasets.
...
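As background for the new RAM note in the hunk above: the "spawn" start method it mentions avoids fork-inherited memory in dataflow worker processes. A minimal sketch of the general pattern, assuming a typical entry point (the exact call site and configuration in `train.py` are not shown in this diff):

```python
import multiprocessing as mp

if __name__ == '__main__':
    # "spawn" starts each worker from a fresh interpreter instead of fork(),
    # so workers do not inherit the parent's (possibly large) heap.
    # This lowers host RAM usage at the cost of slower worker startup.
    mp.set_start_method('spawn')
    # ... build the dataflow / trainer here; a smaller NUM_WORKERS or
    # buffer_size trades throughput for further memory savings.
```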
@@ -92,9 +92,7 @@ class LMDBData(RNGDataFlow):
         logger.info("Found {} entries in {}".format(self._size, self._lmdb_path))
         # Clean them up after finding the list of keys, since we don't want to fork them
-        self._lmdb.close()
-        del self._lmdb
-        del self._txn
+        self._close_lmdb()

     def _set_keys(self, keys=None):
         def find_keys(txn, size):
@@ -131,6 +129,11 @@ class LMDBData(RNGDataFlow):
             map_size=1099511627776 * 2, max_readers=100)
         self._txn = self._lmdb.begin()

+    def _close_lmdb(self):
+        self._lmdb.close()
+        del self._lmdb
+        del self._txn
+
     def reset_state(self):
         self._guard = DataFlowReentrantGuard()
         super(LMDBData, self).reset_state()
...
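Read together, the two hunks above extract the inlined three-line LMDB teardown into a `_close_lmdb()` helper, so the same cleanup runs wherever the environment must be released before forking. A simplified, self-contained sketch of the resulting shape; the class name, constructor, and `lmdb.open` arguments other than `map_size`/`max_readers` are assumptions for illustration:

```python
import lmdb

class LMDBDataSketch:
    """Simplified sketch of LMDBData after the refactor (hypothetical name)."""

    def __init__(self, lmdb_path):
        self._lmdb_path = lmdb_path
        self._open_lmdb()
        self._size = self._txn.stat()['entries']
        # Close right after reading metadata, since open LMDB handles
        # must not be carried across a fork into worker processes.
        self._close_lmdb()

    def _open_lmdb(self):
        self._lmdb = lmdb.open(self._lmdb_path, readonly=True, lock=False,
                               map_size=1099511627776 * 2, max_readers=100)
        self._txn = self._lmdb.begin()

    def _close_lmdb(self):
        # Single place for teardown, replacing the inlined close/del trio.
        self._lmdb.close()
        del self._lmdb
        del self._txn

    def reset_state(self):
        # Each worker re-opens its own environment after process start.
        self._open_lmdb()
```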