Isabel Zollna



Marburg University



Comparing prosodic features in different languages: how to get a homogeneous corpus?



Abstract



Prosody, i.e. intonation, rhythm, tempo (speech rate), and accentuation, is perhaps the field of linguistic research most strongly influenced by extralinguistic conditions. Emotional and psychological factors as well as purely external factors (acoustic conditions, conversational setting) can have an immediate impact on the prosody of a text. Homogeneity, however, is one of the most important conditions for sound work with linguistic corpora. Hence, when comparing prosodic features of different languages, one has to look for homogeneous situations of text production. Investigators therefore very often use pre-defined, artificially constructed data produced in an artificial "studio-like" situation, where the informants read texts or produce isolated sentences. This kind of homogeneity is achieved only at the price of a great loss of authenticity, and the corpus does not represent "reality". "Authentic" texts recorded in a natural setting, by contrast, are influenced by all the external factors mentioned above, not to mention disturbing noises and technical problems. Our suggestion for comparing prosodic patterns in different languages is to investigate typical or stereotyped situations of communication with identical speech acts (more or less guaranteed in typical genres), where the influence of extralinguistic factors is reduced to a minimum. This is the case in repetitive, ritualised or stereotyped monologues: reaction to an interlocutor is excluded, and the text is almost the same in every performance. The homogeneity of our corpus hence comes from the identity of the speech acts (in our project: prayers, the Eucharist, announcements, and street cries), which all have one essential point in common: the speaker is the medium of the message and is not involved as a subject, that is, as an individual person who has his or her own "message".
The context and the time-and-space conditions are identical, or have to be made identical, for the features being compared. How the corpus had to be reorganized in certain details to maintain the highest degree of homogeneity in the three text types (genres) will be the subject of this talk.