Python por Komputilaj Ludoj: CREATING THE CD UNTIED MODEL DEFINITION FILE

TRAINING CONTINUOUS MODELS

The next step is the CD-untied training, in which HMMs are trained for all context-dependent phones (usually triphones) that are seen in the training corpus. For the CD-untied training, we first need to generate a model definition file for all the triphones occurring in the training set. This is done in several steps:

phone1 0 0 0 0
phone2 0 0 0 0
phone3 0 0 0 0
phone4 0 0 0 0
...

SIL    SIL

FLAG	DESCRIPTION
-q	mandatory flag to tell quick_count to consider all word pairs while constructing triphone list
-p	formatted phonelist
-b	temporary dictionary
-o	output triphone list

AA(AA,AA)s              1
AA(AA,AE)b              1
AA(AA,AO)1              1
AA(AA,AW)e              1

AA AA AA s
AA AA AE b
AA AA AO i
AA AA AW e

AA - - -
AE - - -
AO - - -
AW - - -
..
..                                                         
AA AA AA s
AA AA AE b
AA AA AO i
AA AA AW e

For example, if the output of quick_count is stored in a file named "quick_count.out", the following perl command will generate the phone list in the desired form:

perl -nae '$F[0] =~ s/\(|\)|\,/ /g; $F[0] =~ s/1/i/g; print $F[0]; if ($F[0] =~ /\s+$/){print "i"}; print "\n"' quick_count.out
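The same conversion can be sketched in Python for readers who prefer it to the Perl one-liner. This is a minimal sketch, not part of the Sphinx tools: the function names `convert_entry` and `convert_quick_count` are hypothetical, and the input format is assumed to be exactly the `base(left,right)position count` layout shown in the quick_count sample above. Unlike the Perl command, which replaces every "1" in the field, this version only rewrites the position marker.

```python
import re

# Matches quick_count entries such as "AA(AA,AO)1": base(left,right)position.
TRIPHONE_RE = re.compile(r"^([^(),\s]+)\(([^(),\s]+),([^(),\s]+)\)(\S*)$")


def convert_entry(entry):
    """Turn 'AA(AA,AO)1' into 'AA AA AO i' (hypothetical helper)."""
    match = TRIPHONE_RE.match(entry)
    if match is None:
        raise ValueError(f"unrecognized triphone entry: {entry!r}")
    base, left, right, position = match.groups()
    # Following the Perl command: a "1" marker or an empty marker
    # is written as "i" (word-internal) in the phone list.
    if position in ("", "1"):
        position = "i"
    return f"{base} {left} {right} {position}"


def convert_quick_count(lines):
    """Convert quick_count output lines, ignoring the count column."""
    converted = []
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            converted.append(convert_entry(fields[0]))
    return converted


if __name__ == "__main__":
    sample = [
        "AA(AA,AA)s              1",
        "AA(AA,AE)b              1",
        "AA(AA,AO)1              1",
        "AA(AA,AW)e              1",
    ]
    for row in convert_quick_count(sample):
        print(row)
```

Anchoring the regular expression to the whole field means a malformed line raises an error instead of silently producing a bad phone list entry.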

FLAG	DESCRIPTION
-moddeffn	model definition file with all possible triphones (alltriphones_mdef) to be written
-phonelstfn	list of all triphones
-n_state_pm	number of states per HMM

FLAG	DESCRIPTION
-moddeffn	model definition file with all possible triphones (alltriphones_mdef)
-ts2cbfn	takes the value ".cont." if you are building continuous models
-ctlfn	control file corresponding to your training transcripts
-lsnfn	transcript file for training
-dictfn	training dictionary
-fdictfn	filler dictionary
-paramtype	write "phone" here, without the double quotes
-segdir	/dev/null

(param_cnt [arguments] > triphone_count_file) >&! LOG

+GARBAGE+ - - - 98
+LAUGH+ - - - 29
SIL - - - 31694
AA - - - 0
AE - - - 0
...
AA AA AA s 1
AA AA AE s 0
AA AA AO s 4

AA - - -
AE - - -
AO - - -
AW - - -
..
..                                 
AA AA AO s
..
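The count file and the shortlisted phone list above suggest a simple filter: every CI phone is kept regardless of its count, and a triphone is kept only if it was seen often enough. The following is a minimal sketch under that reading. The function name `shortlist_triphones` and the cutoff `min_count` are assumptions for illustration; the text here does not state which threshold to use.

```python
def shortlist_triphones(count_lines, min_count):
    """Keep all CI phones and the triphones seen at least min_count times.

    Each input line is expected to look like 'AA AA AO s 4' (triphone)
    or 'AA - - - 0' (CI phone). min_count is a hypothetical cutoff.
    """
    ci_phones = []
    triphones = []
    for line in count_lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank lines and elisions such as "..."
        base, left, right, position, count = fields
        entry = f"{base} {left} {right} {position}"
        if left == "-" and right == "-":
            ci_phones.append(entry)  # CI phones are always listed
        elif int(count) >= min_count:
            triphones.append(entry)
    # CI phones come first, then the surviving triphones.
    return ci_phones + triphones


if __name__ == "__main__":
    counts = [
        "SIL - - - 31694",
        "AA - - - 0",
        "AA AA AA s 1",
        "AA AA AE s 0",
        "AA AA AO s 4",
    ]
    for row in shortlist_triphones(counts, min_count=2):
        print(row)
```

With the sample counts and a cutoff of 2, only `AA AA AO s` survives among the triphones, which matches the shortlisted list shown above.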

FLAG	DESCRIPTION
-moddeffn	model definition file for CD untied training
-phonelstfn	list of shortlisted triphones
-n_state_pm	number of states per HMM

Finally, a model definition file that lists all CI phones and seen triphones is constructed. This file, like the CI model-definition file, assigns unique ids to each HMM state and serves as a reference file for handling and identifying the CD-untied model parameters.

Here is an example of the CD-untied model-definition file. If you have listed five phones in your phones.list file,
SIL AE AX B T
and specify that you want to build three-state HMMs for each of these phones, and if you have one utterance listed in your transcript file:
<s> BAT A TAB </s>
for which your dictionary and fillerdict entries are:

Fillerdict:
<s>   SIL
</s>  SIL

Dictionary:
A      AX 
BAT    B AE T
TAB    T AE B

then your CD-untied model-definition file will look like this:

# Generated by /mk_model_def on Thu Aug 10 14:57:15 2000
0.3
5 n_base
7 n_tri
48 n_state_map
36 n_tied_state
15 n_tied_ci_state
5 n_tied_tmat                                                                  
#
# Columns definitions
#base lft  rt p attrib   tmat  ...state id's ...
SIL     -   -  - filler    0    0       1      2     N
AE      -   -  -    n/a    1    3       4      5     N
AX      -   -  -    n/a    2    6       7      8     N
B       -   -  -    n/a    3    9       10     11    N
T       -   -  -    n/a    4    12      13     14    N
AE      B   T  i    n/a    1    15      16     17    N
AE      T   B  i    n/a    1    18      19     20    N
AX      T   T  s    n/a    2    21      22     23    N
B       SIL AE b    n/a    3    24      25     26    N
B       AE  SIL e   n/a    3    27      28     29    N
T       AE  AX e    n/a    4    30      31     32    N
T       AX  AE b    n/a    4    33      34     35    N

The # lines are simply comments. The rest of the variables mean the following:

  n_base         : no. of CI phones (also called "base" phones), 5 here
  n_tri          : no. of triphones, 7 in this case
  n_state_map    : Total no. of HMM states (emitting and non-emitting)
                   The Sphinx appends an extra terminal non-emitting state
                   to every HMM, hence for 5+7 phones, each specified by
                   the user to be modeled by a 3-state HMM, this number
                   will be 12 phones * 4 states = 48
  n_tied_state   : no. of states of all phones after state-sharing is done.
                   We do not share states at this stage. Hence this number is
                   the same as the total number of emitting states, 12*3=36
  n_tied_ci_state: no. of states for your CI phones after state-sharing
                   is done. The CI states are not shared, now or later.
                   This number is thus again the total number of emitting CI
                   states, 5*3=15
  n_tied_tmat    : The total number of transition matrices is always the same
                   as the total number of CI phones being modeled. All triphones
                   for a given phone share the same transition matrix. This
                   number is thus 5.
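The header arithmetic described above can be checked with a few lines of Python. This is only a sketch of the relationships stated in the text for the CD-untied stage, where no states are shared yet; the function name `mdef_header_counts` is hypothetical and is not part of any Sphinx tool.

```python
def mdef_header_counts(n_base, n_tri, n_state_pm):
    """Header values of a CD-untied model definition file.

    n_base     : number of CI (base) phones
    n_tri      : number of triphones
    n_state_pm : number of emitting states per HMM
    Valid only before state-sharing, as described in the text.
    """
    n_phones = n_base + n_tri
    return {
        "n_base": n_base,
        "n_tri": n_tri,
        # one extra non-emitting terminal state per HMM
        "n_state_map": n_phones * (n_state_pm + 1),
        # untied: every emitting state is its own tied state
        "n_tied_state": n_phones * n_state_pm,
        "n_tied_ci_state": n_base * n_state_pm,
        # one transition matrix per CI phone
        "n_tied_tmat": n_base,
    }


if __name__ == "__main__":
    for name, value in mdef_header_counts(5, 7, 3).items():
        print(value, name)
```

For the example file (5 base phones, 7 triphones, 3 states per HMM) this gives 48, 36, 15 and 5, in agreement with the header shown earlier.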

Columns definitions: The following columns are defined:
  base       : name of each phone
  lft        : left-context of the phone (- if none)
  rt         : right-context of the phone (- if none)
  p          : position of a triphone. Four position markers are supported:
               b = word-beginning triphone
               e = word-ending triphone
               i = word-internal triphone
               s = single-word triphone
  attrib     : attribute of phone. In the phone list, if the phone is "SIL",
               or if the phone is enclosed by "+", as in "+BANG+", these
               phones are interpreted as non-speech events. These are
               also called "filler" phones, and the attribute "filler" is
               assigned to each such phone. The base phones and the
               triphones have no special attributes, and hence are
               labelled as "n/a", standing for "no attribute"
  tmat       : the id of the transition matrix associated with the phone
  state id's : the ids of the HMM states associated with any phone. This list
               is terminated by an "N" which stands for a non-emitting
               state. No id is assigned to it. However, it exists, and is
               listed.
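To see where the seven triphone rows and their position markers come from, the example utterance can be expanded by hand or with a short script. The sketch below is a simplification written for this example, not the Sphinx implementation: the function name `seen_triphones` is hypothetical, and it assumes that filler words only provide context (they produce no triphones of their own) and that a phone without a neighbour on both sides is skipped.

```python
def seen_triphones(transcript, dictionary, filler_dict):
    """Return the set of (base, left, right, position) seen in one utterance."""
    # Flatten the utterance into (phone, position, is_filler) entries.
    phones = []
    for word in transcript.split():
        if word in filler_dict:
            phones.extend((p, None, True) for p in filler_dict[word])
            continue
        pron = dictionary[word]
        for index, phone in enumerate(pron):
            if len(pron) == 1:
                position = "s"  # single-word triphone
            elif index == 0:
                position = "b"  # word-beginning
            elif index == len(pron) - 1:
                position = "e"  # word-ending
            else:
                position = "i"  # word-internal
            phones.append((phone, position, False))

    triphones = set()
    for index in range(1, len(phones) - 1):
        phone, position, is_filler = phones[index]
        if is_filler:
            continue  # assumption: fillers act as context only
        left = phones[index - 1][0]
        right = phones[index + 1][0]
        triphones.add((phone, left, right, position))
    return triphones


if __name__ == "__main__":
    dictionary = {"A": ["AX"], "BAT": ["B", "AE", "T"], "TAB": ["T", "AE", "B"]}
    filler_dict = {"<s>": ["SIL"], "</s>": ["SIL"]}
    found = seen_triphones("<s> BAT A TAB </s>", dictionary, filler_dict)
    for base, left, right, position in sorted(found):
        print(base, left, right, position)
```

For the utterance in the example this yields the same seven triphones that appear in the model definition file, although the order of the rows in a real file may differ from the sorted order printed here.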

Sunday, March 3, 2013
