Shashank Suhas / seminar-breakout
Commit 37a0f153, authored Dec 13, 2016 by Yuxin Wu
update readme for timit
Parent: cf1ca7ae
Showing 4 changed files with 39 additions and 2 deletions
README.md (+2, -1)
examples/README.md (+1, -0)
examples/TIMIT/README.md (+35, -0)
examples/TIMIT/train-timit.py (+1, -1)
README.md
@@ -12,7 +12,8 @@ They're not only for demonstration of the framework -- you can train them and re
 + [Generative Adversarial Network(GAN) variants (DCGAN,Image2Image,InfoGAN)](examples/GAN)
 + [Deep Q-Network(DQN) variants on Atari games](examples/Atari2600)
 + [Asynchronous Advantage Actor-Critic(A3C) with demos on OpenAI Gym](examples/OpenAIGym)
-+ [char-rnn language model](examples/char-rnn)
++ [LSTM-CTC for speech recognition](examples/TIMIT)
++ [char-RNN language model](examples/char-rnn)
 ## Features:
examples/README.md
@@ -22,4 +22,5 @@ Training examples with __reproducible__ and meaningful performance.
 + [Generative Adversarial Network(GAN) variants (DCGAN,Image2Image,InfoGAN)](examples/GAN)
 ## Speech / NLP:
++ [LSTM-CTC for speech recognition](TIMIT)
 + [char-rnn for fun](char-rnn)
examples/TIMIT/README.md (new file, mode 0 → 100644)
## LSTM + CTC on TIMIT speech recognition dataset

### Install Dependencies:
+ python binding for `lmdb`
    + `pip install --user lmdb`
+ `bob.ap` package for MFCC extraction
    + install blitz and openblas as dependencies of bob.ap
    + `pip install --user bob.ap`

### Prepare Data:
Convert NIST wav format to RIFF wav format:
```
cd /PATH/TO/TIMIT
find -name '*.WAV' | parallel -P20 sox {} '{.}.wav'
```
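For readers without GNU `parallel`, the conversion step above can be sketched in Python. This is an illustration only and not part of the commit: the function names `plan_conversions` and `convert_all` are hypothetical, and the sketch runs `sox` sequentially instead of with 20 parallel jobs. It assumes, as the shell command does, that `sox <input> <output>.wav` produces a RIFF file.

```python
from pathlib import Path
import shutil
import subprocess


def plan_conversions(root):
    """Return (source, target) pairs for every file ending in '.WAV' under root.

    Each target keeps the same directory and stem but uses a lowercase
    '.wav' extension, mirroring the '{.}.wav' pattern in the shell command.
    """
    pairs = []
    for path in sorted(Path(root).rglob("*")):
        # Compare the suffix explicitly so the match stays case-sensitive
        # even on file systems that ignore case in glob patterns.
        if path.is_file() and path.suffix == ".WAV":
            pairs.append((path, path.with_suffix(".wav")))
    return pairs


def convert_all(root, dry_run=True):
    """Build one 'sox <src> <dst>' command per file; run them unless dry_run."""
    commands = [["sox", str(src), str(dst)] for src, dst in plan_conversions(root)]
    if not dry_run:
        if shutil.which("sox") is None:
            raise RuntimeError("sox was not found on PATH")
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_all("/PATH/TO/TIMIT")` with the default `dry_run=True` only returns the planned commands, which is a cheap way to check the file list before converting anything.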
Extract MFCC features and save everything to LMDB database:
```
./create-lmdb.py build --dataset /PATH/TO/TIMIT/TRAIN --db train.mdb
./create-lmdb.py build --dataset /PATH/TO/TIMIT/TEST --db test.mdb
```
Compute mean/std of the training set (and save to `stats.data` by default):
```
./create-lmdb.py stat --db train.mdb
```
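The commit does not show how `create-lmdb.py stat` computes its statistics. The following is a minimal sketch of one common approach, assuming the statistics are a per-dimension mean and population standard deviation pooled over all frames of all utterances. The names `feature_stats` and `normalize` are hypothetical and do not come from the repository.

```python
import math


def feature_stats(utterances):
    """Per-dimension mean and population std over all frames of all utterances.

    `utterances` is an iterable of utterances; each utterance is a sequence of
    frames, and each frame is a sequence of floats of the same length.
    Running sums are used so utterances may have different numbers of frames.
    """
    count = 0
    total = None
    total_sq = None
    for frames in utterances:
        for frame in frames:
            if total is None:
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for i, x in enumerate(frame):
                total[i] += x
                total_sq[i] += x * x
            count += 1
    if count == 0:
        raise ValueError("no frames to compute statistics from")
    mean = [s / count for s in total]
    # Var[x] = E[x^2] - E[x]^2; clamp at zero to absorb rounding error.
    std = [math.sqrt(max(sq / count - m * m, 0.0)) for sq, m in zip(total_sq, mean)]
    return mean, std


def normalize(frames, mean, std, eps=1e-8):
    """Shift and scale each frame with the given statistics (eps avoids /0)."""
    return [[(x - m) / (s + eps) for x, m, s in zip(frame, mean, std)]
            for frame in frames]
```

Statistics of this kind are typically computed on the training set only and then applied unchanged to the test set, which matches the single `--stat stats.data` argument passed to the training script.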
### Train:
```
./train-timit.py --train train.mdb --test test.mdb --stat stats.data
```
### Results:
To be done
examples/TIMIT/train-timit.py
@@ -94,7 +94,7 @@ def get_config(ds_train, ds_test):
         ]),
         model=Model(),
         step_per_epoch=step_per_epoch,
-        max_epoch=500,
+        max_epoch=300,
     )
 if __name__ == '__main__':