Coordinators: Jean-Marie Marandin (LLF, Paris 7), Barbara Hemforth (LPNCog, Paris 5), Jonathan Ginzburg (LLF, Paris 7)
Participants in the Labex EFL: Judit Gervain, Thierry Nazzi, Willy Serniclaes, Liliane Sprenger-Charolles (LPP, Paris 5), Frédéric Isel, Boris New, Sébastien Pacton (LPNCog, Paris 5), Laure Sarda, Michel Charolles (Lattice, Paris 3)
In many cases, phenomena of interest for linguistic as well as applied questions are not easily accessible to corpus-based studies because of the so-called sparse-data problem: they occur too rarely in existing corpora to allow investigation of the factors driving them. One solution to this problem is to set up highly constrained laboratory experiments in which the types of utterances of interest can be elicited (see e.g. Beyssade et al., 2010). Empirical results obtained in this kind of experiment, however, always face the problem of ecological validity. Parameters of the experimental setup always carry the risk of producing artifacts, that is, production phenomena that are reproducible only in an artificial environment and not in natural dialogue. The goal of this work package is to set up and validate different experimental environments in order to arrive at corpora of utterances produced in settings that are as controlled as possible while still allowing for ecologically valid productions. The experimental environments will be set up with members of the Labex EFL working on a wide variety of research questions, from phonetics to pragmatics, so that the data obtained can be exploited as widely as possible. Once validated, the corpora will be made accessible to researchers beyond the Labex EFL, following our general policy of sharing resources.
Working plan:
We will start out by testing three approaches to the construction of experimentally constrained corpora:
i. setting up dialogue situations in which discourse topics, shared discourse models, etc. are constrained in order to elicit utterances as naturally as possible. An example of this kind of task is the Map Task (Anderson et al., 1991). The Map Task is a cooperative task involving two participants. The two speakers sit opposite one another and each has a map, which the other cannot see. One speaker -- designated the Instruction Giver -- has a route marked on her map; the other speaker -- the Instruction Follower -- has no route. The speakers are told that their goal is to reproduce the Instruction Giver's route on the Instruction Follower's map. The maps are not identical, and the speakers are told this explicitly at the beginning of their first session. No French corpus based on this kind of task is currently accessible to the national or international research community. Other tasks will be discussed and tested.
ii. having participants repeat naturally occurring dialogues (RepTask). Pilot studies (Laurens et al., 2009) show that participants reproduce prosodic patterns from the original dialogues when reading transcripts aloud, without having heard the original dialogue.
iii. providing participants with naturally occurring contexts that allow for a small variety of utterances. Predictions from probabilistic models about the choice of continuation will make it possible to test which factors are relevant for the generation of particular utterances (e.g., Bresnan, 2007). This task will be used regularly to validate the Map Task paradigm as well as the RepTask paradigm.
All experiments will be run in soundproof rooms so that the data obtained will be exploitable at all levels of analysis, from phonetics to pragmatics. Video cameras (including headsets) will be used for most experiments so that gestures and facial expressions, as well as visual focus of attention, can be included in our analyses.
Schedule
Deliverable | Main research area | Time frame |
French Map Task environment; RepTask environment | All levels of analysis | 2011-2014 |
Corpus French Map Task | All levels of analysis | 2015-2018 |
Corpus French Map Task / French RepTask including gestures and facial expressions | All levels of analysis | 2019-2021 |