TY - GEN
T1 - Production Strategies of Vocal Attitudes
AU - Salais, Léane
AU - Arias, Pablo
AU - Le Moine, Clément
AU - Rosi, Victor
AU - Teytaut, Yann
AU - Obin, Nicolas
AU - Roebel, Axel
PY - 2022
Y1 - 2022
AB - Humans have an impressive ability to communicate precise social intentions and desires with their voice - through vocal attitudes. Previous studies have shown how isolated acoustic features such as pitch can convey social attitudes, but have mostly worked with single attitudes and have not controlled for inter-speaker variability. Thus, the vocal behaviours used to produce social attitudes remain mostly unknown. That is the aim of the current study, to uncover the anatomic production strategies that speakers use to communicate vocal attitudes. To do this, we analysed recordings from N=20 French speakers producing dominant, friendly, seductive and distant speech. For each of these attitudes, we investigated their vocal fold behaviour, vocal tract actuation and phonetic speech structure, with the support of deep alignment methods, and compared them with group statistics. We notably produced high-level representations of speakers' articulation (e.g. Vowel Space Density) and speech rhythm. Our results reveal speakers' prototypical strategies to produce vocal attitudes, and highlight how vocal behaviours can communicate social signals. We expect these results to provide an objective validation method for deep voice attitude conversions.
KW - articulation
KW - speech production
KW - vocal social attitudes
U2 - 10.21437/Interspeech.2022-10947
DO - 10.21437/Interspeech.2022-10947
M3 - Paper in conference proceeding
AN - SCOPUS:85140054205
VL - 2022-September
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 4985
EP - 4989
BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022
Y2 - 18 September 2022 through 22 September 2022
ER -