update readme for timit

Commit 37a0f153 authored Dec 13, 2016 by Yuxin Wu
Showing 4 changed files with 39 additions and 2 deletions

README.md +2 -1
examples/README.md +1 -0
examples/TIMIT/README.md +35 -0
examples/TIMIT/train-timit.py +1 -1

--- a/README.md
+++ b/README.md
@@ -12,7 +12,8 @@ They're not only for demonstration of the framework -- you can train them and re
 + [Generative Adversarial Network(GAN) variants (DCGAN,Image2Image,InfoGAN)](examples/GAN)
 + [Deep Q-Network(DQN) variants on Atari games](examples/Atari2600)
 + [Asynchronous Advantage Actor-Critic(A3C) with demos on OpenAI Gym](examples/OpenAIGym)
-+ [char-rnn language model](examples/char-rnn)
++ [LSTM-CTC for speech recognition](examples/TIMIT)
++ [char-RNN language model](examples/char-rnn)

 ## Features:


--- a/examples/README.md
+++ b/examples/README.md
@@ -22,4 +22,5 @@ Training examples with __reproducible__ and meaningful performance.
 + [Generative Adversarial Network(GAN) variants (DCGAN,Image2Image,InfoGAN)](examples/GAN)

 ## Speech / NLP:
++ [LSTM-CTC for speech recognition](TIMIT)
 + [char-rnn for fun](char-rnn)
--- a/examples/TIMIT/README.md
+++ b/examples/TIMIT/README.md
+
+## LSTM + CTC on TIMIT speech recognition dataset
+
+### Install Dependencies:
++ python binding for `lmdb`
+	+ `pip install --user lmdb`
++ `bob.ap` package for MFCC extraction
+	+ install blitz and openblas as dependencies of bob.ap
+	+ `pip install --user bob.ap`
+
+### Prepare Data:
+Convert NIST wav format to RIFF wav format:
+```
+cd /PATH/TO/TIMIT
+find -name '*.WAV' | parallel -P20 sox {} '{.}.wav'
+```
+
+Extract MFCC features and save everything to LMDB database:
+```
+./create-lmdb.py build --dataset /PATH/TO/TIMIT/TRAIN --db train.mdb
+./create-lmdb.py build --dataset /PATH/TO/TIMIT/TEST --db test.mdb
+```
+
+Compute mean/std of the training set (and save to `stats.data` by default):
+```
+./create-lmdb.py stat --db train.mdb
+```
+
+### Train:
+```
+./train-timit.py --train train.mdb --test test.mdb --stat stats.data
+```
+
+### Results:
+To be done
--- a/examples/TIMIT/train-timit.py
+++ b/examples/TIMIT/train-timit.py
@@ -94,7 +94,7 @@ def get_config(ds_train, ds_test):
        ]),
        model=Model(),
        step_per_epoch=step_per_epoch,
-        max_epoch=500,
+        max_epoch=300,
    )

 if __name__ == '__main__':
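
Note on the `stat` step in the new TIMIT README: it computes the mean/std of the training set and saves them to `stats.data`. The commit does not show how `create-lmdb.py` does this, so the following is only a minimal sketch of one way such statistics could be computed and applied, assuming each utterance is a `(num_frames, feat_dim)` NumPy array of MFCC features. The names `compute_stats` and `normalize` are hypothetical and are not taken from the repository.

```python
import numpy as np


def compute_stats(utterances):
    """Return per-dimension (mean, std) over all frames of all utterances.

    `utterances` is an iterable of arrays shaped (num_frames, feat_dim);
    this layout is an assumption, not something the commit specifies.
    """
    total_frames = 0
    feat_sum = None
    feat_sq_sum = None
    for feat in utterances:
        feat = np.asarray(feat, dtype=np.float64)
        if feat_sum is None:
            feat_sum = np.zeros(feat.shape[1])
            feat_sq_sum = np.zeros(feat.shape[1])
        # Accumulate running sums so utterances of different lengths
        # never need to be held in memory at the same time.
        feat_sum += feat.sum(axis=0)
        feat_sq_sum += (feat ** 2).sum(axis=0)
        total_frames += feat.shape[0]
    if total_frames == 0:
        raise ValueError("no frames to compute statistics from")
    mean = feat_sum / total_frames
    # Clamp at zero to guard against tiny negative values from rounding.
    var = np.maximum(feat_sq_sum / total_frames - mean ** 2, 0.0)
    return mean, np.sqrt(var)


def normalize(feat, mean, std, eps=1e-8):
    """Standardize one utterance with training-set statistics."""
    return (np.asarray(feat, dtype=np.float64) - mean) / (std + eps)
```

Statistics computed this way would come from the training database only and then be reused unchanged for the test database, which matches the README passing a single `--stat stats.data` file to `train-timit.py`.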