seminar-breakout, commit 3c61c318
Authored Jan 19, 2018 by Yuxin Wu
some notes about TIMIT (#603)
Parent: bc8da700

Showing 3 changed files with 17 additions and 6 deletions:

examples/CTC-TIMIT/README.md (+14 -3)
examples/CTC-TIMIT/create-lmdb.py (+2 -2)
examples/CTC-TIMIT/train-timit.py (+1 -1)
examples/CTC-TIMIT/README.md

    @@ -5,10 +5,21 @@
     + python binding for `lmdb`
     + `pip install --user lmdb`
     + `bob.ap` package for MFCC extraction
    -+ install blitz and openblas as dependencies of bob.ap
    ++ install [blitz](https://github.com/blitzpp/blitz) and openblas as dependencies of bob.ap
    -+ `pip install --user bob.ap`
    ++ `pip install --user bob.extension bob.blitz bob.core bob.sp bob.ap`
     ### Prepare Data:
    +We assume the following file structure:
    +```
    +TRAIN/
    +    DR1/
    +        FCJF0/
    +            *.WAV # NIST WAV file
    +            *.TXT
    +            *.PHN
    +...
    +```
     Convert NIST wav format to RIFF wav format:
     ```
     cd /PATH/TO/TIMIT
    ...
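
The directory layout that the README hunk assumes can be checked before any conversion is attempted. The following is a minimal sketch and not part of the commit: `find_incomplete_utterances` is a hypothetical helper, and it assumes that every utterance consists of a `.WAV`, a `.TXT` and a `.PHN` file sharing one stem, which is what the listed tree suggests.

```python
import os
from collections import defaultdict

# Extensions listed in the README's assumed file structure.
REQUIRED_EXTS = {'.WAV', '.TXT', '.PHN'}


def find_incomplete_utterances(root):
    """Return {utterance_stem_path: missing_extensions} for files under `root`.

    Hypothetical helper: assumes the files of one utterance share a stem.
    """
    found = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.upper() in REQUIRED_EXTS:
                found[os.path.join(dirpath, stem)].add(ext.upper())
    # Keep only utterances that lack at least one required file.
    return {stem: REQUIRED_EXTS - exts
            for stem, exts in found.items() if exts != REQUIRED_EXTS}
```

An empty result means every utterance found under `root` has all three files. Extensions are compared case-insensitively here, so a converted lowercase `.wav` next to the original `.WAV` does not change the outcome.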
examples/CTC-TIMIT/create-lmdb.py

    @@ -8,12 +8,12 @@ import string
     import numpy as np
     import argparse
    -import bob.ap
     from tensorpack.dataflow import dftools, DataFlow, LMDBDataPoint
     from tensorpack.utils.argtools import memoized
     from tensorpack.utils.stats import OnlineMoments
     from tensorpack.utils import serialize, fs, logger
     from tensorpack.utils.utils import get_tqdm
    +import bob.ap
     CHARSET = set(string.ascii_lowercase + ' ')
     PHONEME_LIST = [
    ...
    @@ -85,7 +85,7 @@ class RawTIMIT(DataFlow):
             self.filelists = [k for k in fs.recursive_walk(self.dirname)
                               if k.endswith('.wav')]
             logger.info("Found {} wav files ...".format(len(self.filelists)))
    -        assert len(self.filelists), self.filelists
    +        assert len(self.filelists), "Found no '.wav' files!"
             assert label in ['phoneme', 'letter'], label
             self.label = label
    ...
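
The second hunk in `create-lmdb.py` replaces the assertion payload, which used to be the empty list itself, with a readable message. A minimal sketch of the same check outside tensorpack is given here; `collect_wav_files` is a hypothetical stand-in that uses `os.walk` where the script uses `fs.recursive_walk`. Note that `str.endswith('.wav')` is case-sensitive, so files named `*.WAV` are not matched by this filter.

```python
import os


def collect_wav_files(dirname):
    """Recursively list files ending in '.wav' (case-sensitive).

    Hypothetical helper mirroring the filter in RawTIMIT; fails with a
    readable message when nothing matches.
    """
    filelists = [os.path.join(dirpath, name)
                 for dirpath, _dirnames, filenames in os.walk(dirname)
                 for name in filenames
                 if name.endswith('.wav')]
    # With the list as payload, an empty result would only report "[]".
    assert len(filelists), "Found no '.wav' files!"
    return sorted(filelists)
```

If the directory contains only uppercase `.WAV` files, the assertion fires with the new message, which points more directly at a skipped conversion step than an empty list does.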
examples/CTC-TIMIT/train-timit.py

    @@ -79,7 +79,7 @@ class Model(ModelDesc):
     def get_data(path, isTrain, stat_file):
         ds = LMDBDataPoint(path, shuffle=isTrain)
    -    mean, std = serialize.loads(open(stat_file).read())
    +    mean, std = serialize.loads(open(stat_file, 'rb').read())
         ds = MapDataComponent(ds, lambda x: (x - mean) / std)
         ds = TIMITBatch(ds, BATCH)
         if isTrain:
    ...
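
The `'rb'` change in `get_data` matters on Python 3, where a file opened in text mode yields `str` and may fail to decode arbitrary bytes, while a deserializer expects `bytes`. The sketch below illustrates the pattern with the standard `pickle` module as a stand-in for `tensorpack.utils.serialize`, whose on-disk format is not shown in this commit. `load_stats` and `normalize` are hypothetical names, plain Python lists replace the feature arrays, and per-dimension statistics are an assumption.

```python
import pickle


def load_stats(stat_file):
    """Read (mean, std) from a file of serialized bytes.

    Stand-in for serialize.loads(open(stat_file, 'rb').read()); pickle is
    used here only for illustration.
    """
    with open(stat_file, 'rb') as f:  # binary mode returns bytes
        mean, std = pickle.loads(f.read())
    return mean, std


def normalize(features, mean, std):
    """Apply (x - mean) / std to each dimension of each feature frame."""
    return [[(x - m) / s for x, m, s in zip(frame, mean, std)]
            for frame in features]
```

The `with` block also closes the file handle, which the one-line form in the diff leaves to the garbage collector.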