Eric Riebling edited this page Feb 4, 2019 · 5 revisions

DiViMe Diarization Virtual Machine Tutorial

Assumptions

  • You are running Linux
  • You have installed
    • Git
    • Vagrant http://vagrantup.com
    • VirtualBox (Vagrant's default provider; it is installed separately, not bundled with Vagrant)
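
Before starting, it can help to confirm that the tools listed above are on your PATH. The following is a minimal sketch, not part of DiViMe. The executable names are assumptions: in particular, VirtualBox's command-line tool is assumed to be installed as VBoxManage, and the helper name missing_tools is made up for this example.

```python
import shutil

# Executable names are assumptions; VirtualBox's command-line tool is
# usually installed as "VBoxManage".
REQUIRED_TOOLS = ["git", "vagrant", "VBoxManage"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The lookup function is passed in as a parameter so the check can be exercised without the tools actually being installed.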

Steps

  • Download the DiViMe repository
git clone http://github.com/DiViMe
  • Download test audio
wget http://speech-kitchen.org/cough.wav -P data
  • Provision the VM (this takes a long time and prints a lot of output)
vagrant up
  • Run the Diarization with Noisemes script on the data directory (sample output follows the command)
vagrant ssh -c 'OpenSAT/runClasses.sh /vagrant/data'
Extracting features for cough.wav ...
(MSG) [2] in SMILExtract : openSMILE starting!
(MSG) [2] in SMILExtract : config file is: /vagrant/MED_2s_100ms_htk.conf
(MSG) [2] in cComponentManager : successfully registered 95 component types.
(MSG) [2] in cComponentManager : successfully finished createInstances
                                 (19 component instances were finalised, 1 data memories were finalised)
(MSG) [2] in cComponentManager : starting single thread processing loop
(MSG) [2] in cComponentManager : Processing finished! System ran for 1721 ticks.
DONE!
Filename cough.htk
Predicting for cough ...
Connection to 127.0.0.1 closed.
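
After a run over several files, you may want to check that every input produced a result. The sketch below assumes each data/<name>.wav yields data/hyp/<name>.rttm. That layout is generalized from the single cough.wav example on this page and is not confirmed for other inputs. The function name missing_rttm is hypothetical.

```python
from pathlib import Path


def missing_rttm(data_dir):
    """Return names of .wav files in data_dir that have no hyp/<name>.rttm.

    Assumes results are written to a "hyp" subdirectory of data_dir,
    as in the cough.wav example.
    """
    data_dir = Path(data_dir)
    hyp_dir = data_dir / "hyp"
    return sorted(
        wav.name
        for wav in data_dir.glob("*.wav")
        if not (hyp_dir / (wav.stem + ".rttm")).is_file()
    )


if __name__ == "__main__":
    for name in missing_rttm("data"):
        print("No RTTM output for " + name)
```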

Output

cat data/hyp/cough.rttm
SPEAKER	cough	1	0.0	0.0	music_sing  	<NA>	<NA>	0.367768079042
SPEAKER	cough	1	0.0	0.1	crowd        	<NA>	<NA>	0.419847130775
SPEAKER	cough	1	0.1	0.3	music_sing  	<NA>	<NA>	0.375354379416
SPEAKER	cough	1	0.4	0.8	music       	<NA>	<NA>	0.285134375095
SPEAKER	cough	1	1.2	1.5	background  	<NA>	<NA>	0.178366959095
SPEAKER	cough	1	2.7	0.5	music       	<NA>	<NA>	0.230460226536
SPEAKER	cough	1	3.2	0.2	background  	<NA>	<NA>	0.245576620102
SPEAKER	cough	1	3.4	0.4	music       	<NA>	<NA>	0.237979069352
SPEAKER	cough	1	3.8	0.1	music_sing  	<NA>	<NA>	0.253297328949
SPEAKER	cough	1	3.9	0.8	music       	<NA>	<NA>	0.307353198528
SPEAKER	cough	1	4.7	1.1	crowd        	<NA>	<NA>	0.279803872108
SPEAKER	cough	1	5.8	5.4	music_sing  	<NA>	<NA>	0.282180398703
SPEAKER	cough	1	11.2	1.1	music       	<NA>	<NA>	0.302925169468
SPEAKER	cough	1	12.3	4.0	background  	<NA>	<NA>	0.486689478159
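
The output above can be summarized per label with a few lines of Python. This is a minimal sketch, not part of DiViMe. The column order (type, file, channel, onset, duration, label, two <NA> placeholders, score) is inferred from the sample: each onset plus its duration equals the next onset. Reading the last column as a confidence score is an assumption. The sample lines are copied from the output above with whitespace normalized.

```python
from collections import defaultdict


def parse_rttm_line(line):
    """Split one line into (file_id, onset, duration, label, score).

    Assumed column order, inferred from the sample output:
    type, file, channel, onset, duration, label, <NA>, <NA>, score
    """
    fields = line.split()
    return (fields[1], float(fields[3]), float(fields[4]),
            fields[5], float(fields[8]))


def duration_per_label(lines):
    """Sum the duration column for each label, skipping blank lines."""
    totals = defaultdict(float)
    for line in lines:
        if not line.strip():
            continue
        _, _, duration, label, _ = parse_rttm_line(line)
        totals[label] += duration
    return dict(totals)


SAMPLE = """\
SPEAKER cough 1 0.4 0.8 music <NA> <NA> 0.285134375095
SPEAKER cough 1 1.2 1.5 background <NA> <NA> 0.178366959095
SPEAKER cough 1 2.7 0.5 music <NA> <NA> 0.230460226536
"""

if __name__ == "__main__":
    totals = duration_per_label(SAMPLE.splitlines())
    for label, total in sorted(totals.items()):
        print(f"{label}\t{total:.1f}")
```

To process a real result, pass the lines of the file instead of SAMPLE, for example open("data/hyp/cough.rttm").read().splitlines().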