TRAINING CONTINUOUS MODELS
Once the flat initialization is done, you are ready to begin training the acoustic models
for the base phones, also called "context-independent" or CI phones. This
step is called CI training. In CI training, the flat-initialized models
are re-estimated through the forward-backward re-estimation algorithm
called the Baum-Welch algorithm. This is an iterative re-estimation
process, so you have to run many "passes" of the Baum-Welch re-estimation
over your training data. Each of these passes, or iterations, results in a
slightly better set of models for the CI phones. However, since the
objective function maximized in each of these passes is the likelihood of
the training data, too many iterations would ultimately produce models that
overfit the training data and may generalize poorly to unseen data. You
usually do not want this to happen. Typically, 5-8 iterations of Baum-Welch
are sufficient for
getting good estimates of the CI models. You can automatically determine the
number of iterations that you need by looking at the total likelihood of the
training data at the end of the first iteration and deciding on a
"convergence ratio" of likelihoods. This is simply the ratio of the
total likelihood in the current iteration to that of the previous iteration.
As the models get more and more fitted to the training data in each
iteration, the training data likelihoods typically increase monotonically.
The convergence ratio is therefore a small positive number. The convergence
ratio becomes smaller and smaller as the iterations progress, since each
time the current models are a little less different from the previous ones.
Convergence ratios are data and task specific, but typical values at which
you may stop the Baum-Welch iterations for your CI training
range from 0.1 to 0.001. When the models are variance-normalized, the convergence ratios are much smaller.
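For instance, here is a minimal shell sketch of this stopping check,
assuming you have already pulled the total likelihoods of two successive
iterations out of the bw output (the likelihood values and the 0.01
threshold below are made up):

    # Hypothetical total log-likelihoods from iterations N-1 and N;
    # substitute the values from your own bw logs.
    prev_lik=-105234.7
    cur_lik=-104897.2
    # Convergence ratio: |current - previous| / |current|
    ratio=$(awk -v p="$prev_lik" -v c="$cur_lik" \
        'BEGIN { d = c - p; if (d < 0) d = -d;
                 a = c; if (a < 0) a = -a; print d / a }')
    echo "convergence ratio: $ratio"
    # Stop iterating once the ratio drops below your chosen threshold.
    awk -v r="$ratio" 'BEGIN { exit !(r < 0.01) }' && echo "converged; stop iterating"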
The executable used to run a Baum-Welch iteration is called "bw", and takes the
following arguments for training continuous CI models (an example invocation
follows the table):
FLAG | DESCRIPTION |
-moddeffn | Model definition file for the CI phones |
-ts2cbfn | This flag should be set to ".cont." if you are training continuous models, and to ".semi." if you are training semi-continuous models, without the double quotes |
-mixwfn | Name of the file in which the mixture weights from the previous iteration are stored. The full path must be provided |
-mwfloor | Floor value for the mixture weights. Any number below the floor value is set to the floor value |
-tmatfn | Name of the file in which the transition matrices from the previous iteration are stored. The full path must be provided |
-meanfn | Name of the file in which the means from the previous iteration are stored. The full path must be provided |
-varfn | Name of the file in which the variances from the previous iteration are stored. The full path must be provided |
-dictfn | Dictionary |
-fdictfn | Filler dictionary |
-ctlfn | Control file |
-part | You can split the training into N equal parts. If there are M utterances in your control file, this enables you to run the training separately on each (M/N)th part. This flag specifies which of those parts to train on currently. For example, if the total number of parts is 3, this flag can take the value 1, 2, or 3 |
-npart | Number of parts into which you have split the training |
-cepdir | Directory where your feature files are stored |
-cepext | The extension that comes after the name listed in the control file. For example, you may have a file called a/b/c.d and may have listed a/b/c in your control file. Then this flag must be given the argument "d", without the double quotes or the dot before it |
-lsnfn | Name of the transcript file |
-accumdir | Directory in which the intermediate results from each part of your training are written. If you have T means to estimate, then the mean buffer from the current part of your training will occupy on the order of T*4 bytes. There are likewise buffers for the variances, the mixture weights, and the transition matrices |
-varfloor | Minimum variance value allowed |
-topn | Number of Gaussians to consider when computing the likelihood of each state. For example, if you have models with 8 Gaussians per state and topn is 4, then the 4 most likely Gaussians are used |
-abeam | Forward beamwidth |
-bbeam | Backward beamwidth |
-agc | Automatic gain control |
-cmn | Cepstral mean normalization |
-varnorm | Variance normalization |
-meanreest | Mean re-estimation |
-varreest | Variance re-estimation |
-2passvar | Setting this flag to "yes" makes bw use the means from the previous iteration when estimating the variances; the current variance is then estimated as E[(x - prev_mean)^2]. If this flag is set to "no", the current estimates of the means are used, which requires estimating the variance as E[x^2] - (E[x])^2, an unstable estimator that sometimes yields negative variance estimates due to arithmetic imprecision |
-tmatreest | Whether or not to re-estimate the transition matrices |
-feat | Feature configuration |
-ceplen | Length of the basic feature vector |
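As an illustration, here is what a single bw pass might look like for
continuous CI models. The flags are exactly those described in the table
above, but every path, filename, and numeric value below is hypothetical;
substitute the ones from your own setup:

    bw \
     -moddeffn  model_architecture/ci.mdef \
     -ts2cbfn   .cont. \
     -mixwfn    model_parameters/ci_cont/mixture_weights \
     -mwfloor   1e-08 \
     -tmatfn    model_parameters/ci_cont/transition_matrices \
     -meanfn    model_parameters/ci_cont/means \
     -varfn     model_parameters/ci_cont/variances \
     -dictfn    lists/train.dic \
     -fdictfn   lists/filler.dic \
     -ctlfn     lists/train.ctl \
     -part      1 \
     -npart     1 \
     -cepdir    feature_files \
     -cepext    mfc \
     -lsnfn     lists/train.lsn \
     -accumdir  bwaccumdir/iter01_part1 \
     -varfloor  1e-04 \
     -topn      4 \
     -abeam     1e-90 \
     -bbeam     1e-40 \
     -agc       none \
     -cmn       current \
     -varnorm   no \
     -meanreest yes \
     -varreest  yes \
     -2passvar  no \
     -tmatreest yes \
     -feat      1s_c_d_dd \
     -ceplen    13

To run the pass in, say, 3 parts in parallel, you would launch three such
commands with -npart 3 and -part 1, 2, and 3, each writing to its own
-accumdir.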
Whether you have run the training in one part or in many, the bw
executable described above generates only intermediate buffer(s). The final
model parameters, namely the means, variances, mixture weights, and
transition matrices, have to be estimated from the values stored in these
buffers. This is done by the executable called "norm", which takes the
following arguments (an example invocation follows the table):
FLAG | DESCRIPTION |
-accumdir | Intermediate buffer directory |
-feat | Feature configuration |
-mixwfn | Name of the file in which you want to write the mixture weights. The full path must be provided |
-tmatfn | Name of the file in which you want to write the transition matrices. The full path must be provided |
-meanfn | Name of the file in which you want to write the means. The full path must be provided |
-varfn | Name of the file in which you want to write the variances. The full path must be provided |
-ceplen | Length of the basic feature vector |
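Continuing the hypothetical setup from the bw example above, the
corresponding norm run might look like this:

    norm \
     -accumdir bwaccumdir/iter01_part1 \
     -feat     1s_c_d_dd \
     -mixwfn   model_parameters/ci_cont/mixture_weights \
     -tmatfn   model_parameters/ci_cont/transition_matrices \
     -meanfn   model_parameters/ci_cont/means \
     -varfn    model_parameters/ci_cont/variances \
     -ceplen   13

If you ran bw in several parts, norm must see the buffers from every part;
in typical setups each part writes to its own buffer directory, and all of
those directories are passed to norm through -accumdir.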
If you have not re-estimated some of the model parameters in the bw step, then
the corresponding flag must be omitted from the arguments given to the
norm executable. Otherwise the executable will try to read a non-existent
buffer from the buffer directory and will fail. Thus if you have
set -meanreest to "no" in the arguments for bw, then the flag -meanfn must
not be given in the arguments for norm. This is useful mostly during adaptation.
Iterations of bw and norm finally result in the CI models. The iterations
can be stopped once the likelihood of the training data converges.
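To make the overall flow concrete, here is a skeletal driver for the CI
training stage. run_bw, run_norm, and get_total_likelihood are hypothetical
helpers standing in for the bw invocation, the norm invocation, and whatever
mechanism you use to extract an iteration's total likelihood from the bw
output:

    # Run up to 8 Baum-Welch passes, stopping early on convergence.
    threshold=0.01        # hypothetical convergence threshold
    prev_lik=
    for iter in 1 2 3 4 5 6 7 8; do
        run_bw "$iter"    # one bw run per part, as sketched above
        run_norm "$iter"  # pool the buffers into new model files
        cur_lik=$(get_total_likelihood "$iter")
        if [ -n "$prev_lik" ]; then
            ratio=$(awk -v p="$prev_lik" -v c="$cur_lik" \
                'BEGIN { d = c - p; if (d < 0) d = -d;
                         a = c; if (a < 0) a = -a; print d / a }')
            awk -v r="$ratio" -v t="$threshold" \
                'BEGIN { exit !(r < t) }' && break
        fi
        prev_lik=$cur_lik
    done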
The model parameters computed by norm in the final iteration are then used
to initialize the models for context-dependent phones (triphones) with
untied states. This is the next major step of the training process. We
refer to the process of training triphone HMMs with untied states as
"CD untied training".