[conference paper]

dc.contributor.author: Weninger, Felix [de]
dc.contributor.author: Wagner, Claudia [de]
dc.contributor.author: Wöllmer, Martin [de]
dc.contributor.author: Schuller, Björn [de]
dc.contributor.author: Morency, Louis-Philipp [de]
dc.date.accessioned: 2020-01-15T14:13:38Z
dc.date.available: 2020-01-15T14:13:38Z
dc.date.issued: 2013 [de]
dc.identifier.isbn: 978-1-4799-0356-6 [de]
dc.identifier.issn: 2379-190X [de]
dc.identifier.uri: https://www.ssoar.info/ssoar/handle/document/66084
dc.description.abstract: We present a multi-modal approach to speaker characterization using acoustic, visual and linguistic features. Full realism is provided by evaluation on a database of real-life web videos and automatic feature extraction including face and eye detection, and automatic speech recognition. Different segmentations are evaluated for the audio and video streams, and the statistical relevance of Linguistic Inquiry and Word Count (LIWC) features is confirmed. In the result, late multimodal fusion delivers 73, 92 and 73% average recall in binary age, gender and race classification on unseen test subjects, outperforming the best single modalities for age and race. [de]
dc.language: en [de]
dc.publisher: IEEE [de]
dc.subject.ddc: Naturwissenschaften [de]
dc.subject.ddc: Science [en]
dc.subject.other: speaker classification; computational paralinguistics; multi-modal fusion; Linguistic Inquiry and Word Count; LIWC [de]
dc.title: Speaker trait characterization in web videos: Uniting speech, language, and facial features [de]
dc.description.review: begutachtet (peer reviewed) [de]
dc.description.review: peer reviewed [en]
dc.source.collection: Proceedings of the 38th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013) [de]
dc.publisher.country: USA
dc.subject.classoz: Naturwissenschaften, Technik(wissenschaften), angewandte Wissenschaften [de]
dc.subject.classoz: Natural Science and Engineering, Applied Sciences [en]
dc.subject.thesoz: Video [de]
dc.subject.thesoz: video [en]
dc.subject.thesoz: Video-Clip [de]
dc.subject.thesoz: video clip [en]
dc.subject.thesoz: Aufzeichnung [de]
dc.subject.thesoz: recording [en]
dc.subject.thesoz: Computerlinguistik [de]
dc.subject.thesoz: computational linguistics [en]
dc.subject.thesoz: Internet [de]
dc.subject.thesoz: Internet [en]
dc.subject.thesoz: Evaluation [de]
dc.subject.thesoz: evaluation [en]
dc.subject.thesoz: Soziale Medien [de]
dc.subject.thesoz: social media [en]
dc.subject.thesoz: Experiment [de]
dc.subject.thesoz: experiment [en]
dc.subject.thesoz: audiovisuelle Medien [de]
dc.subject.thesoz: audiovisual media [en]
dc.identifier.urn: urn:nbn:de:0168-ssoar-66084-2
dc.rights.licence: Deposit Licence - Keine Weiterverbreitung, keine Bearbeitung [de]
dc.rights.licence: Deposit Licence - No Redistribution, No Modifications [en]
internal.status: noch nicht fertig erschlossen [de]
internal.identifier.thesoz: 10061598
internal.identifier.thesoz: 10063356
internal.identifier.thesoz: 10037027
internal.identifier.thesoz: 10040387
internal.identifier.thesoz: 10040528
internal.identifier.thesoz: 10039188
internal.identifier.thesoz: 10094228
internal.identifier.thesoz: 10043015
internal.identifier.thesoz: 10036934
dc.type.stock: incollection [de]
dc.type.document: Konferenzbeitrag [de]
dc.type.document: conference paper [en]
dc.source.pageinfo: 3647-3651 [de]
internal.identifier.classoz: 50200
internal.identifier.document: 16
dc.source.conference: International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013) [de]
dc.event.city: Vancouver [de]
internal.identifier.ddc: 500
dc.identifier.doi: https://doi.org/10.1109/ICASSP.2013.6638338 [de]
dc.date.conference: 2013 [de]
dc.source.conferencenumber: 38 [de]
dc.description.pubstatus: Veröffentlichungsversion [de]
dc.description.pubstatus: Published Version [en]
internal.identifier.licence: 3
internal.identifier.pubstatus: 1
internal.identifier.review: 1
internal.pdf.wellformed: false
internal.pdf.encrypted: false