Shashank Suhas / seminar-breakout
Commit afba8dee, authored May 27, 2019 by Yuxin Wu
update docs
Parent: 0a0b387e
Showing 3 changed files with 23 additions and 13 deletions
.github/ISSUE_TEMPLATE/unexpected-problems---bugs.md (+7 -7)
examples/FasterRCNN/NOTES.md (+10 -3)
tensorpack/dataflow/format.py (+6 -3)
.github/ISSUE_TEMPLATE/unexpected-problems---bugs.md
...
@@ -20,7 +20,7 @@ feel free to delete everything in this template.
 It's always better to copy-paste what you did than to describe them.
-Please try to provide enough information to let other __reproduce__ your issues.
+Please try to provide enough information to let others __reproduce__ your issues.
 Without reproducing the issue, we may not be able to investigate it.
 ### 2. What you observed:
...
@@ -44,11 +44,11 @@ If you expect higher speed, please read
 http://tensorpack.readthedocs.io/tutorial/performance-tuning.html
 before posting.
-If you expect certain accuracy, only in one of the two conditions can we help with it:
-(1) You're unable to reproduce the accuracy documented in tensorpack examples.
+If you expect certain training results (e.g., accuracy), only in one of the two conditions can we help with it:
+(1) You're unable to reproduce the results documented in tensorpack examples.
 (2) It appears to be a tensorpack bug.
-Otherwise, how to train a model to certain accuracy is a machine learning question.
+Otherwise, how to train a model is a machine learning question.
 We do not answer machine learning questions and it is your responsibility to
 figure out how to make your models more accurate.
...
examples/FasterRCNN/NOTES.md
...
@@ -60,21 +60,28 @@ Model:
 6. Another alternative to BatchNorm is GroupNorm (`BACKBONE.NORM=GN`) which has better performance.
-Speed:
+Efficiency:
 1. If CuDNN warmup is on, the training will start very slowly, until about
    10k steps (or more if scale augmentation is used) to reach a maximum speed.
    As a result, the ETA is also inaccurate at the beginning.
-   CuDNN warmup is by default on when no scale augmentation is used.
+   CuDNN warmup is by default enabled when no scale augmentation is used.
 1. After warmup, the training speed will slowly decrease due to more accurate proposals.
-1. The code should have around 70% GPU utilization on V100s, and 85%~90% scaling
+1. The code should have around 80~90% GPU utilization on V100s, and 85%~90% scaling
    efficiency from 1 V100 to 8 V100s.
 1. This implementation does not use specialized CUDA ops (e.g. AffineChannel, ROIAlign).
    Therefore it might be slower than other highly-optimized implementations.
+1. To reduce RAM usage on host: (1) make sure you're using the "spawn" method as
+   set in `train.py`; (2) reduce `buffer_size` or `NUM_WORKERS` in `data.py`
+   (which may negatively impact your throughput). The training needs <10G RAM if `NUM_WORKERS=0`.
+1. Inference is unoptimized. Tensorpack is a training interface, therefore it
+   does not help you on optimized inference.
 Possible Future Enhancements:
 1. Define a better interface to load different datasets.
...
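The new RAM note in NOTES.md refers to the "spawn" start method of Python's `multiprocessing`. The following is a minimal sketch of what selecting that method looks like with the standard library only; it is not the code in `train.py`, and the names `square`, `spawn_context` and `run_with_spawn` are hypothetical. The usual reason "spawn" lowers host RAM is that workers start from a fresh interpreter rather than inheriting a copy-on-write image of the parent.

```python
import multiprocessing as mp


def square(x):
    # Work functions must live at module top level so that
    # spawned workers can import them by name.
    return x * x


def spawn_context():
    # get_context returns an isolated context; unlike set_start_method,
    # it does not change the process-wide default.
    return mp.get_context("spawn")


def run_with_spawn(values, num_workers=2):
    # Each worker is a fresh interpreter, not a fork of the parent.
    ctx = spawn_context()
    with ctx.Pool(processes=num_workers) as pool:
        return pool.map(square, values)
```

With "spawn", the call to `run_with_spawn` should sit under an `if __name__ == "__main__":` guard in a script, because each worker re-imports the main module.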
tensorpack/dataflow/format.py
...
@@ -92,9 +92,7 @@ class LMDBData(RNGDataFlow):
         logger.info("Found {} entries in {}".format(self._size, self._lmdb_path))
         # Clean them up after finding the list of keys, since we don't want to fork them
-        self._lmdb.close()
-        del self._lmdb
-        del self._txn
+        self._close_lmdb()
     def _set_keys(self, keys=None):
         def find_keys(txn, size):
...
@@ -131,6 +129,11 @@ class LMDBData(RNGDataFlow):
                 map_size=1099511627776 * 2, max_readers=100)
         self._txn = self._lmdb.begin()
+    def _close_lmdb(self):
+        self._lmdb.close()
+        del self._lmdb
+        del self._txn
     def reset_state(self):
         self._guard = DataFlowReentrantGuard()
         super(LMDBData, self).reset_state()
...
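The format.py change moves the close-and-delete steps into a `_close_lmdb` helper so that no open handle is carried across a fork. The pattern can be sketched as follows, under stated assumptions: `sqlite3` stands in for LMDB so the example runs with the standard library alone, the class `KeyedStore` and the helper `make_demo_db` are hypothetical, and reopening the handle inside `reset_state` is an assumption the hunks above do not show.

```python
import os
import sqlite3
import tempfile


class KeyedStore:
    """Hypothetical reader; sqlite3 is a stand-in for LMDB."""

    def __init__(self, path):
        self._path = path
        self._open()
        # Enumerate the keys once, in the parent process.
        self.keys = [r[0] for r in self._conn.execute("SELECT k FROM kv ORDER BY k")]
        # Drop the handle so forked workers do not inherit it.
        self._close()

    def _open(self):
        self._conn = sqlite3.connect(self._path)

    def _close(self):
        self._conn.close()
        del self._conn

    def reset_state(self):
        # Assumed to run in each worker after the fork: open a fresh handle.
        self._open()

    def get(self, key):
        row = self._conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0]


def make_demo_db():
    # Build a small throwaway database file and return its path.
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    conn.executemany("INSERT INTO kv VALUES (?, ?)", [("a", "1"), ("b", "2")])
    conn.commit()
    conn.close()
    return path
```

Keeping the close logic in one helper means the constructor and any later teardown path release the handle the same way, which is the only behavior change the diff itself makes.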