java - CMUSphinx never recognizes any word from audio files


Sphinx doesn't seem to recognize or process audio files. It accepts the audio stream but spits out an empty result (SpeechResult result). I feel like there isn't an issue with the audio file I'm using, because I've tried several and it doesn't work on any of them. Does anyone have an audio file they know works? And is there anything that stands out as causing the stream not to produce a transcription?

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.api.SpeechResult;
    import edu.cmu.sphinx.api.StreamSpeechRecognizer;

    public class AutomaticSpeechRecognition {

        public static void main(String args[]) throws IOException {
            // Point the recognizer at the en-us models bundled in sphinx4-data.
            Configuration configuration = new Configuration();
            configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
            configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
            configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.dmp");

            StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
            //recognizer.startRecognition(new FileInputStream("e:/1video/hello-5.mp3"));

            File file = new File("e:/1video/bargain_not.wav");
            InputStream is = new FileInputStream(file);

            //is = AutomaticSpeechRecognition.class.getResourceAsStream("/edu/cmu/sphinx/demo/aligner/10001-90210-01803.wav");
            recognizer.startRecognition(is);
            SpeechResult result = null;
            // getResult() returns null once the end of the stream is reached.
            while ((result = recognizer.getResult()) != null) {
                System.out.println(result.getResult());
                System.out.println(result.getHypothesis());
                System.out.println(result.getWords());
            }
            //result = recognizer.getResult();
            //System.out.println(result);
            //System.out.println(result.toString());
            //System.out.println(result.getWords());
            /*for (WordResult wordResult : result.getWords())
            {
                System.out.println(wordResult);
            }*/
            recognizer.stopRecognition();
        }
    }

Here's the output from running it. There don't seem to be any failures:

09:31:13.430 info unitmanager          ci unit: *+nsn+
09:31:13.433 info unitmanager          ci unit: *+spn+
09:31:13.433 info unitmanager          ci unit: aa
09:31:13.434 info unitmanager          ci unit: ae
09:31:13.434 info unitmanager          ci unit: ah
09:31:13.434 info unitmanager          ci unit: ao
09:31:13.434 info unitmanager          ci unit: aw
09:31:13.434 info unitmanager          ci unit: ay
09:31:13.434 info unitmanager          ci unit: b
09:31:13.434 info unitmanager          ci unit: ch
09:31:13.434 info unitmanager          ci unit: d
09:31:13.434 info unitmanager          ci unit: dh
09:31:13.434 info unitmanager          ci unit: eh
09:31:13.435 info unitmanager          ci unit: er
09:31:13.435 info unitmanager          ci unit: ey
09:31:13.435 info unitmanager          ci unit: f
09:31:13.435 info unitmanager          ci unit: g
09:31:13.435 info unitmanager          ci unit: hh
09:31:13.435 info unitmanager          ci unit: ih
09:31:13.435 info unitmanager          ci unit: iy
09:31:13.435 info unitmanager          ci unit: jh
09:31:13.435 info unitmanager          ci unit: k
09:31:13.435 info unitmanager          ci unit: l
09:31:13.435 info unitmanager          ci unit: m
09:31:13.436 info unitmanager          ci unit: n
09:31:13.436 info unitmanager          ci unit: ng
09:31:13.436 info unitmanager          ci unit: ow
09:31:13.436 info unitmanager          ci unit: oy
09:31:13.436 info unitmanager          ci unit: p
09:31:13.436 info unitmanager          ci unit: r
09:31:13.436 info unitmanager          ci unit: s
09:31:13.436 info unitmanager          ci unit: sh
09:31:13.436 info unitmanager          ci unit: t
09:31:13.436 info unitmanager          ci unit: th
09:31:13.436 info unitmanager          ci unit: uh
09:31:13.437 info unitmanager          ci unit: uw
09:31:13.437 info unitmanager          ci unit: v
09:31:13.437 info unitmanager          ci unit: w
09:31:13.437 info unitmanager          ci unit: y
09:31:13.437 info unitmanager          ci unit: z
09:31:13.437 info unitmanager          ci unit: zh
09:31:14.014 info autocepstrum         cepstrum component auto-configured as follows: autocepstrum {melfrequencyfilterbank, denoise, discretecosinetransform2, lifter}
09:31:14.030 info dictionary           loading dictionary from: jar:file:/c:/users/kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-snapshot/sphinx4-data-1.0-snapshot.jar!/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict
09:31:14.132 info dictionary           loading filler dictionary from: jar:file:/c:/users/kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-snapshot/sphinx4-data-1.0-snapshot.jar!/edu/cmu/sphinx/models/en-us/en-us/noisedict
09:31:14.132 info acousticmodelloader  loading tied-state acoustic model from: jar:file:/c:/users/kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-snapshot/sphinx4-data-1.0-snapshot.jar!/edu/cmu/sphinx/models/en-us/en-us
09:31:14.133 info acousticmodelloader  pool means entries: 16128
09:31:14.133 info acousticmodelloader  pool variances entries: 16128
09:31:14.133 info acousticmodelloader  pool transition_matrices entries: 42
09:31:14.133 info acousticmodelloader  pool senones entries: 5126
09:31:14.133 info acousticmodelloader  gaussian weights: mixture_weights. entries: 15378
09:31:14.133 info acousticmodelloader  pool senones entries: 5126
09:31:14.133 info acousticmodelloader  context independent unit entries: 42
09:31:14.133 info acousticmodelloader  hmm manager: 137095 hmms
09:31:14.134 info acousticmodel        compositesenonesequences: 0
09:31:14.134 info largetrigrammodel    loading n-gram language model from: jar:file:/c:/users/kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-snapshot/sphinx4-data-1.0-snapshot.jar!/edu/cmu/sphinx/models/en-us/en-us.lm.dmp
09:31:14.807 info largetrigrammodel    1-grams: 19794
09:31:14.807 info largetrigrammodel    2-grams: 1377200
09:31:14.807 info largetrigrammodel    3-grams: 3178194
09:31:15.582 info lextreelinguist      max ci units 43
09:31:15.583 info lextreelinguist      unit table size 79507
09:31:15.585 info speedtracker         # ----------------------------- timers----------------------------------------
09:31:15.585 info speedtracker         # name               count   curtime   mintime   maxtime   avgtime   tottime
09:31:15.586 info speedtracker         load dictionary      1       0.1020s   0.1020s   0.1020s   0.1020s   0.1020s
09:31:15.586 info speedtracker         load lm              1       0.6730s   0.6730s   0.6730s   0.6730s   0.6730s
09:31:15.586 info speedtracker         compile              1       0.7760s   0.7760s   0.7760s   0.7760s   0.7760s
09:31:15.586 info speedtracker         load                 1       1.5450s   1.5450s   1.5450s   1.5450s   1.5450s
09:31:15.608 info speedtracker             time audio: 1.94s  proc: 0.01s  speed: 0.00 x real time
09:31:15.608 info speedtracker            total time audio: 1.94s  proc: 0.01s 0.00 x real time
09:31:15.609 info memorytracker           mem  total: 454.75 mb  free: 262.35 mb
09:31:15.609 info memorytracker           used: this: 192.40 mb  avg: 192.40 mb  max: 192.40 mb
09:31:15.610 info largetrigrammodel    lm cache size: 0 hits: 0 misses: 0
<s> </s>

As Nikolay Shmyrev said, the file must be a 16 kHz, 16-bit, mono MS WAV file. Such a file can be recorded with Audacity: set the rate to 16 kHz and the channels to mono.

Then use File > Export and make sure to pick WAV (Microsoft) signed 16-bit PCM.
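Before handing a file to the recognizer, it can help to confirm that it really has this format. The following is a minimal sketch using only the JDK's javax.sound.sampled API; the class name WavFormatCheck and the helper isSphinxCompatible are hypothetical names made up for this illustration and are not part of Sphinx. The default path is simply the one from the question.

```java
import java.io.File;
import java.io.IOException;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper (not part of Sphinx): reports whether an audio file
// is 16 kHz, 16-bit, mono, signed little-endian PCM.
public class WavFormatCheck {

    static boolean isSphinxCompatible(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException {
        // Path taken from the question; pass another path as the first argument.
        File file = new File(args.length > 0 ? args[0] : "e:/1video/bargain_not.wav");
        try (AudioInputStream in = AudioSystem.getAudioInputStream(file)) {
            AudioFormat format = in.getFormat();
            System.out.println("Detected format: " + format);
            System.out.println(isSphinxCompatible(format)
                    ? "Format matches 16 kHz, 16-bit, mono PCM."
                    : "Format differs; re-export as 16 kHz, 16-bit, mono WAV.");
        }
    }
}
```

An MP3 file, like the one in the commented-out line of the question, would typically make getAudioInputStream throw UnsupportedAudioFileException on a stock JDK, which is itself a hint that the file needs converting first.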

