Original field study: Patrick Boylan and Nadia Mari, Iniziativa Didattica Studentesca 40/82-83,
La Sapienza University of Rome, 1983.

The present paper was co-authored and circulated in photocopied form in 1983
and then, entirely rewritten by Patrick Boylan, presented in the form which follows
at the 6th International Pragmatics Association Congress, Reims, 1996.

Patrick Boylan and Nadia Mari

0. Abstract/Introduction

This paper analyzes how second-language learners and native speakers interact in small groups. It focuses on how apparently marginal behavior, such as eye and head movement, may contribute to determining each participant's status as an ‘insider’ or ‘outsider’. Data is taken from spontaneous small-group conversations among American and Italian college students, surreptitiously filmed and analyzed by a research team composed of teachers and students of English as a Second Language (henceforth ESL) in Rome.

To what degree, then, are eye/head movements culturally distinctive? What messages do they convey? How can we be sure we have understood them? Can adapting our eye/head movements improve communication? The paper argues that by studying conversations as ‘texts’, such questions cannot fully be answered. It then proposes the notion of ‘enactment of intent’ as the basic ‘unit’ making up conversations, and outlines an innovative research procedure based on experiential knowledge (phronesis), better suited to ascertaining the local meaning of somatic as well as verbal and prosodic messages. This, it claims, is the kind of knowledge that learners of a second language (henceforth L2) need and that conversation analysis (henceforth CA) should consider investigating.

1. Aims

The aims of the present research project were:

a. to observe and record spontaneous conversational interaction in informal settings between self-selected members of culturally heterogeneous small groups, specifically, between mixed-gender American and Italian university students and teachers surreptitiously filmed as they chatted together informally in English after a lecture/debate;
b. to take one of the features of the video-recorded interaction — here, gaze and head movement — and, using classical research methods (Kendon 1967, Birdwhistell 1970, Allen & Guy 1974, Argyle & Cook 1976), to verify, together with the participants:
   • whether each cultural group has a ‘typical somatic behavior’ in conversing and, if so,
   • whether a participant can be more successful in communicating with members of the other culture by adopting their conversational behavior;
c. to determine — this time with only the Italian participants, all of whom were ESL students — whether the kind of knowledge classical research methods furnish gives them a better insight into the communicative competence they wish to acquire, i.e., native-like conversational skill in multi-party multi-cultural encounters;
d. if the response to (c.) is negative, to determine what kind of research questions are not currently being asked in the field of conversation analysis (but that the ESL learners deem of interest) and to propose a methodology for answering those questions.

2. Experimental task and setup

Fifteen American third-year university students studying art history in Italy were invited by fifteen Italian ESL students to a lecture/debate in English on cultural differences between their two countries, held at the University of Rome. After the debate, the students were free to mill and converse in the lecture hall while a lawn picnic was being set up outside. A video camera was present in the room and was constantly pointed at the table up front where the lecturer and the discussion moderator sat. The camera was equipped with an Eibl-Eibesfeldt lens, i.e., a hidden mirror arrangement which permitted filming anywhere in the room unnoticed; audio came from a tiny radio microphone concealed on one of the students. By rotating the hidden mirror, the cameraman was able to keep the camera constantly focused on the ‘wired’ student as she wandered from one group to another engaging in conversation. To make sure the student acted as naturally as possible, she was told, when the lecture began, that the microphone batteries had gone dead and thus there would be no audio recording; since the microphone was sewn inside her dress, she had to leave it there. One wall of the lecture room had a panel mirror so that the faces of any group standing in front of it could be filmed either directly or as a reflection. Unfortunately, no group spontaneously formed in front of the mirror; thus in some cases, gaze must be reconstructed through the interactional dynamics. All students were informed afterwards of the video recording and consent was obtained for its use.

The reader is now invited to turn to Appendix 1 and inspect the drawings (called a ‘storyboard’ in cinema jargon) of a fragment of the filmed conversation. Appendix 2 gives a verbal description of the movements; Appendix 3, the transcription conventions.

3. Previous studies; new directions

3.1 Typical research into conversational interaction centers on culturally homogeneous dyads (see Allen & Guy 1974:54 for a justification of the preference for dyads). Less studied are triads (Kerbrat-Orecchioni 1990-4), large groups (Edelsky 1981) and, least of all, small groups (Berrier 1997:326). While studies of intercultural interaction are now fashionable in our multi-cultural societies, they also concern dyads or triads (Orletti 1992, Shea 1994, Jensen et al. 1995) and only rarely small groups (Berrier 1995).

Yet conversational competence in small-group situations is what learners of any L2 seem to find most difficult. In paired exchanges with members of the other culture (service encounters or one-to-one socializing), they automatically get individualized attention from their interlocutor, however begrudging it may be. In large groups, L2 learners usually have the option of creating a dyadic relationship to get individualized attention and to isolate themselves from the crowd. Not so in multi-cultural small group conversations: here they are mostly on their own and, in order to cope, need "a superior know-how" (Kerbrat-Orecchioni 1997:10). They either ‘fit in’ or get left out.

Unfortunately, current language teaching methodology gives L2 learners neither the know-how Kerbrat-Orecchioni speaks of, nor the intellectual tools to acquire such know-how empirically in real-life situations (Boylan 1998). Thus it is common for even moderately fluent L2 learners to report that, in small group conversations with native speakers, they often feel stifled, ignored or, worse yet, condescendingly listened to during the occasional lulls in the ‘real’ conversation. In other words, these L2 learners do not feel accepted as ‘one of the group’. But, of course, how could they be? They have no idea of what it takes to gain a "footing" (Goffman 1981) in the new culture.

3.2 What can we do to prepare L2 students to understand and assimilate the dynamics of the conversational interactions in which they find themselves?

A first step would be to determine what indeed is required to gain a footing in a given foreign culture. Frake (1964) suggests, rather convincingly, that this involves learning to feel as ‘real’ what is ‘real’ for one's interlocutors. There is no doubt, in fact, that in multi-cultural group conversations L2 learners will tend to be ignored if the topics they raise — ‘real’ to people back home — appear pointless to the group or out of place (see Tenny 1989 for rules of topic appropriateness). The same goes if these L2 learners, by lack of appropriate reaction, treat as pointless or irrelevant the topics that the rest of the group finds ‘really’ significant, controversial, humorous, scandalous or whatever.

Indeed, to use a metaphor from transplant surgery, can we really blame the group for rejecting a body felt as foreign? After all, why should the group take seriously people who don't take seriously things that matter? It may be instructive but it is certainly not pleasurable to let an outsider, by her apparent disregard, call into question what one always thought mattered; and it must be remembered that conversation (fun) is different from discussion (work) precisely because it operates on the pleasure principle. If, on top of it all, the outsider makes conversation management a chore — by sending confusing signals through ‘strange’ eye and head movements or by maintaining a speaking style diametrically opposite to that of her interlocutors on such culturally indicative scales as physical-contact ~ distance, one-speaker-at-a-time ~ multiple-flooring, explicitness ~ allusiveness (Hall 1966, Jensen et al. 1995) — she can hardly complain at being left out of the conversation. She has in fact opted out by not having opted in.

In a perfectly just world, of course, she would not have to ‘opt in.’ She and her conversational partners would want to learn from each other in order to create a terrain of authentically shared values and creative diversity (see the special issue of Pragmatics 4/3, September 1994). This is not, however, the world that many L2 learners live in.

What L2 learners need, then, is the capacity to meet their interlocutors a little more than half-way culturally. This does not mean memorizing lists of conversational topics and Do's & Don'ts. Actors get quickly unmasked. It means something much simpler but much more radical: sharing, at least in part, the existential value system at the bottom of the other culture. To be truly ‘one of the group’, L2 learners must truly feel they are.

3.3 Thus, the basic task facing L2 teachers is to get learners to internalize their interlocutors' culture as a value system or Weltanschauung (Boylan 1998). Doing so will allow them to forget the lists of rules and play by ear. Far from being cultural imperialism, this technique permits the L2 learner to communicate her values as a member of a different culture deserving respect much more effectively and convincingly than if she spoke as an ‘outsider’. In addition, it gives the L2 learner a negotiating edge, since she controls the flow of information. (These points are developed in Boylan, forthcoming/a.)

But getting learners to internalize a second language does not simply mean getting them to playact. It presupposes a thorough contrastive study of the cultures and communication modalities involved. In fact, to return to the topic at hand, one way to help L2 learners ‘enter into’ the new language and culture is to get them to formulate (and then try to imitate) how people manage conversations in the target culture — including how they use their eyes and heads — as the expression of a different, felt Weltanschauung. Kendon (1967) reported long ago that eye and head movements are our principal signaling devices in conversation. Allen & Guy (1974:142) confirm his assertion: "Eye contact is an important supplement to the conversational encounter. It is a good indicator of the degree to which the channel is open or closed, [...] the degree of bonding and the level of attention in the listening mode." Participation, bond of solidarity, interest: these are precisely the qualities we just mentioned as essential for ‘being one of the group’.

3.4 Several classification schemes have been devised over the years — see Section 1 b; for the most recent, see Pezzato & Poggi 1998 — which give a research group interested in replicating a typical ‘candid camera’ ethnographic experimentation a framework for ordering observations of eye and head movement. What remains to be seen, however, is the usefulness of these schemes to L2 learners or, for that matter, to anyone interested in communication. Classical research methods measure, for example, phenomena like the length of time a typical subject spends gazing at an interlocutor when speaking or when listening. But these observations are seldom related to what the recorded phenomena mean. Average "length of gaze" is an interesting concept; but when does a glance — in a foreign culture, or even in one’s own culture — become ‘fleeting’ or ‘overly insistent’? What non-average "length of gaze" enhances participation, creates bonds of solidarity, manifests interest or, on the contrary, appears brazen, standoffish or bored? These are the findings that people who use language need to have. More importantly, they need to know what procedures will permit them to learn these things in the culture in which they may one day have to live and work, if no studies have been made yet.
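By way of illustration, the kind of measurement referred to above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the record layout (`GazeInterval`), the two roles and the sample values are invented for the example and are not taken from the study or from any of the classification schemes cited.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class GazeInterval:
    """One coded stretch of gaze at an interlocutor (hypothetical layout)."""
    participant: str
    role: str        # "speaking" or "listening" while gazing
    start_s: float   # onset of the gaze, in seconds of tape
    end_s: float     # offset of the gaze, in seconds of tape

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def mean_gaze_by_role(intervals: List[GazeInterval]) -> Dict[str, float]:
    """Average gaze length per conversational role."""
    by_role: Dict[str, List[float]] = {}
    for interval in intervals:
        by_role.setdefault(interval.role, []).append(interval.duration_s)
    return {role: mean(durations) for role, durations in by_role.items()}


# Invented values, for illustration only.
sample = [
    GazeInterval("A", "speaking", 0.0, 1.0),
    GazeInterval("A", "listening", 2.0, 5.0),
    GazeInterval("B", "speaking", 1.0, 3.0),
    GazeInterval("B", "listening", 4.0, 9.0),
]
averages = mean_gaze_by_role(sample)
```

The figures such a procedure yields are averages and nothing more: they do not say whether any single glance was felt, by those present, to be fleeting or overly insistent.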

The fundamental problem, then, is how to assign meaning to behavior. And given that communication is multi-modal and holistic (Poggi 1997), not only must we take note of eye and head movements as potentially meaningful behavior, but we must also record behavior in every other modality (verbal included) and weave all the data together into a single fabric. Note that the fabric thus obtained is not the ‘meaning’ of the communicative event, but simply its material substratum, no more than the documentation of an automobile accident (photos, declarations, measurements) is the ‘event’ determining a claimant's legal rights. In the case of an accident, legal rights derive from an interpretation of available documentation by an insurance expert or a judge. They alone are authorized to decide what really happened, by using applicable law and jurisprudence to link intentionality, behavior, concurrent circumstances and social responsibility into a legally-binding whole (the ‘event’). In the same way, the fabric of a conversation that we weave when we analyze a video recording is not the ‘event’ but rather the material to which we assign a sense in the very act of weaving it, thereby creating the ‘event’ (Gadamer 1960). Like a judge, rightly or wrongly we determine what the conversational ‘event’ was through the hermeneutics, not of law and jurisprudence, but of psychology and cultural practices. It is not clear, however, what authorizes us to pass judgments.

Let us therefore look for a moment at how we go about ‘making sense’ of talk-in-interaction. This will allow us to answer some of the crucial methodological questions that Grossen & Orvig (1998:153-154) raise concerning the validity of clinical interviews, and which apply to any exchange that involves ‘judging’ what an interlocutor means.

4. General considerations: making sense of a conversational event

4.1 A key step in making sense of any event is defining the units that go to make it up. This of course depends on the traits we have selected to foreground in viewing the event: in any empirical science, what we discover depends on what we have chosen to notice — or, more exactly, not to notice. Indeed, as Gadamer (1960) argues, researchers (such as ourselves in this very paper) may be said to take de facto a social and ethical stand in finding ‘interesting’ certain phenomena which they then elaborate as knowledge.

4.2 For Berrier (1997), revisiting Sacks et al. (1974), the basic unit of conversational interaction is the utterance, defined as "any verbal attempt at turn-taking" (p. 326). However, Ford et al. (1996) question whether the floor is in fact held or ceded on purely verbal grounds. Real-life conversational turns, they assert (p. 449), are "constellations of convergence and divergence" of verbal/prosodic/somatic signals, the weightings of which co-determine possible turn junctures dynamically. In this perspective, then, conversations should be seen as composed not of utterances but rather of moves (Goffman 1981) to gain or hold the floor by whatever means available, verbal, prosodic or somatic.

4.2.1 The two definitions just given offer useful insights. However, both follow a consolidated but questionable tradition: that of treating conversations as ‘texts’, i.e., bounded, finalized semantic representations, transferable onto paper for easy inspection and dividable into verbal (or verbal/prosodic/somatic) units with the stroke of a pen. This approach, we hold, while applicable to text productions proper (a sonnet, a joke, ritual insulting) is misleading when applied to conversation. Conversations are not simply ‘texts’ — neither verbal texts, nor multi-modal texts (i.e., concurrent verbal/prosodic/somatic realizations pictured as a musical score: Poggi 1997), nor semiotic texts (e.g., in the way some semiologists attempt to reduce culture to a text). This is because a ‘text’, by definition, is a framed representation to be viewed from without, while conversations (like cultures) are boundless events, the sense of which can only be experienced from within. Conversations (and cultures) produce ‘texts’ as their residue. While much can be learned by studying that residue, to fully grasp a conversation (or a culture), a procedure other than textual reconstruction is needed. Let us briefly clarify these assertions.

4.2.2 When one delineates (‘frames’) an object in order to observe it as a ‘text’, one endows it with meaning potential. Pop Art did that with soup cans. It is also what CA does in framing conversation fragments as ‘adjacency pairs’ or ‘side sequences’.

If a stretch of speech has been conventionally bounded and finalized by the speakers (even unconsciously) — for example, a ‘side sequence’ — and if the subsequent framing operation by the linguist respects those boundaries and puts that finalization into perspective, then textual analysis of the framed object is both practicable and instructive. If on the other hand, as in (Pop) Art, the framing does not correspond to the way whoever created the object sees it, but rather to how the framer sees it — i.e., to the personal mental world the framer projects upon the object — then the analysis will end up telling us more about the framer than about the object. This, too, can be a perfectly legitimate and instructive operation, if we are interested in the framer’s ‘perceptions’ or metadiscourse. It must not, however, be confused with scientific inquiry, which requires us to accept the dictates of the object as other in order to come to grips with it (7.1.8.).

The first question to resolve, then, is whether conversations, as such, are bounded and finalized and thus frameable as ‘texts’. If they are, we can then move on to defining procedures that guarantee that our framing respects the object. If they are not, we have a more difficult task — discovering a non-textual approach for coming to grips with them.

4.2.3 Conversations, as Garfinkel (1967) argues convincingly, are on-going, situated, collegial, intentional attempts at existential meaning-making (or meaning-confirmation); they are the very stuff out of which our conscious lives are made. That is why Aristotle, long before ethnomethodology, considered societies to be communities of discourse more than communities of people (Lo Piparo 1996:40). In this perspective, therefore, the stretches of talk ordinarily called ‘conversations’ — e.g., the talk that occurs between the lifting and the lowering of a phone receiver — are simply fragments of a single, uninterrupted conversation punctuating the lifetime of individuals and their community (and still unachieved when both pass away). Chats on the phone will of course have, like all conversational fragments, recognizable textual features (phone calls are partly ritual, as CA has shown) and even goals. But insofar as the call is ‘conversation’, it will not be a ‘text’. Thus, a specific kind of non-textual competence is needed for analyzing a conversation as a researcher, as well as for co-creating one as a participant.

Travelers, for example, know that, while mastering a community's conversational rituals is sufficient for service exchanges (e.g., knowing "Excuse me; how do I get to..." is sufficient for asking directions), it is rarely sufficient for entering into communion, through conversation, with the members of that community — no more than mastering the rituals of prayer in a given religion puts one in communion with the god being prayed to. Conversation is different from ‘information exchanges’ precisely because it is an ‘entering into communion’. As such it requires making a community's existential value system one's own, in order to spontaneously recognize and react to the units of intent that make up ‘free talk’ in that community. Recognizing existential units of intent as an outsider is not easy. This is why it is harder, in a foreign language, to learn how to converse as ‘one of the group’ than to create sophisticated forms of bounded, explicitly finalized discourse (i.e., ‘texts’: academic papers, sales talks, literary reviews...), the structures of which can be thoroughly described by teachers or learned from handbooks.

4.2.4 If conversation proper is not a ‘text’, then what is it concretely? In this paper, it will be treated as an ‘intentional event’ — specifically, as an on-going, situated enactment of a collegial will to make sense of ordinary experience, accompanied by verbal/prosodic/somatic improvisations. Thus, while any verbal interaction has textual features which must certainly be accounted for, we shall consider the basic units of conversation to be neither physical objects (‘utterances’) nor discourse functions (‘turns’) nor logical constructs (‘felicitous acts’) but rather intuitively-grasped enactments of will, or ‘stances’. A stance is the perceived intent behind a basic behavioral pattern.

How can such a subjective criterion be scientific? For now we simply appeal to the ordinary experience of the reader. In general, when we encounter manifestations of intent — a strange movement made by a passer-by in a dark alley or human-like cooing coming from the apartment next door — we recognize them as willful acts on the strength of intuition alone, i.e., without explicitly calculating the degree of their deviation from randomly generated acts of similar nature. In other words, an intent ‘makes itself felt’ to us clearly and distinctly, even if we are not sure of what it means. Of course at times we are mistaken: we project intentionality on phenomena we later find have none; thus we discover the pitfalls of intuition and learn to practice preventive strategies, like inner as well as outer listening. Whatever level of expertise we reach, however, sensing intentionality still remains the soul of our conversational ability. It is what conversation analysts rely on all the time when they confidently assert the ‘meaning’ of other people’s talk.

4.2.5 Treating conversations as interplays of ‘intentional events’ — and not exclusively as verbal or verbal/prosodic/somatic ‘texts’ — is not only closer to our experience of them but also more useful. First, it permits us to use a wider range of data in our textual analyses of the frameable parts of a conversation (to be specified in 4.5). This allows us to explain interactions like a TV talk show on politics in which the skirting of a certain issue by A provokes a protest from A’s rival B (reticence is, in fact, an ‘intentional event’); or one in which A respectfully lets B talk at length without interruption (deference also constitutes an ‘intentional event’); or one in which A sadistically sits in silence letting his expensive clothes show as B, stammering, constantly tugs at his overly-short jacket cuffs (this normal exhibition of self by A is just as much an ‘intentional event’ as was the act of putting those clothes on at home); or, finally, one in which A chooses to remain in silence after B, having made a terrible admission, relinquishes his turn (if A’s silence lasts a long time, the turn goes back to B; if B has nothing to add, his silence, with A’s, becomes a simultaneous silent turn). Occurrences such as these constitute genuine communicative acts by A, through doing nothing (see Hall 1966 and Watzlawick et al. 1967 on covert communication; novelists, too, regularly note occurrences like these in describing conversations). And yet they are not "verbal attempts at turn-taking" nor even, strictly speaking, multi-modal attempts. If, however, we consider conversations to be made up of units of intent, then we can treat them structurally, in order, as: alternating turns in a side sequence, alternating turns with immediate turn relinquishment, backchanneling, and an alternating turn followed by a simultaneous turn.

More importantly, by considering experiential investigation through cultural assimilation as a valid research tool for grasping units of intent (the procedure we shall suggest in Section 7), we can gain access to forms of communication that remain obscure if analyzed as ‘text’. Obvious examples are certain Native American cultural practices:

When used in a special way by Blackfeet, the term 'listening' refers to a form of communication that is unique to them; when enacted in its special way, 'listening' connects participants intimately to a specific physical place. [...] 'Listening' this way can involve the listener in an intense, efficacious, and complex set of communicative acts in which one is not speaking, discussing, or disclosing, but sitting quietly, watching, and feeling-the-place [communally], through all the senses. — Carbaugh (1998)

Indeed, the complementary heuristic we shall be proposing at the end of this paper can give investigators better access to any intercultural interaction (native/non-native, boss/worker, woman/man...). Only by learning how to sense, represent and introject the Weltanschauung of their conversational interlocutors can investigators grasp the ‘enactments of intent’ that direct the flow of the conversation in which they seek to participate.

4.3 This paper, then, will view conversations as boundless meaning-making events, moved forward by bounded sequences of stances, i.e., perceived minimal enactments of intentionality. (See ahead for the bounding criteria.) It is possible to sense a stance without fully grasping the intent behind it, as we have seen, simply by noting a suggestive pattern of behavior. But to understand a stance fully, we must grasp the intent giving it shape within the context of both the whole conversation and the particular sequence in which it appears. (See Gumperz 1992, 1995, on contextualization.) As a unit of effect — what we are ‘summoned’ to feel — a stance is the equivalent, in the realm of intentionality, to a ‘speech act’ in the realm of discursive thought (Austin 1962: see ahead). As a unit of perceived expression, a stance is a bundle of signals, i.e., triggering occurrences. Examples of signals: stammering, keeping silent, letting one's expensive clothes show... Example of a stance: while B stammers, A seems to keep purposefully silent, letting his expensive clothes show... (which is seen as a sadistic stance only if the dynamic of the whole conversation is grasped). The basic unit of conversations is thus not the signal but the bundle of signals (triggering occurrences) within which each occurrence may acquire its presumed intentional value — i.e., the stance.

4.4 The Oxford Ordinary Language philosophers, and many pragmaticists with them, have adopted a different, quasi-legal perspective in defining the basic units of communication. To converse is to ‘felicitously’ do something with words and/or gestures (Austin 1962). This is clearly an oversimplification, as Goffman (1981) has pointed out: people do much more with language than accomplish speech acts; they indulge in phatic communion, for instance, or talk to keep from thinking. Thus, while Austin's ‘speech acts’ may constitute the building blocks of the world of discursive talk, the world in which some philosophers choose to live, they do not qualify as the building blocks of all talk, much less conversation. Moreover, a definition of the basic units of communication should establish at least some correlation between units of expression and units of effect. But speech act theory can do so only with propositions, not with the co-constructions and multi-purpose utterance-turns typical of conversation (see Goffman 1981 for examples). Nonetheless, Austin's basic notion of speech as ‘doing’ (1. creating a value; 2. modifying/advancing a state of affairs) seems worth keeping. In fact, applied to enactments of intent, the first sense defines ‘stance’; the second, ‘sequence of stances’.

But what exactly constitutes a ‘sequence’? Taking inspiration from Kerbrat-Orecchioni (1997), Colas & Vion (1998) define the minimal unit of conversation as a multi-phase contribution. (Also see the notion of ‘mutual adjustment’ in Clark & Wilkes-Gibbs 1986, ‘contribution’ in Clark & Schaefer 1989, and ‘exchange’ in Kerbrat-Orecchioni 1994.) Each ‘contribution’ has a three-part structure: A makes a verbal offering; B acknowledges the offering; A acknowledges the acknowledgment (explicitly or tacitly).

This schema elegantly explains how a whole series of utterances can constitute a single functional unit. Nonetheless, as worded, it clarifies conversational exchanges solely as exchanges of information — more precisely, as epistemic or deontic transactions (a tribute to the Austinian tradition?). Our objective, on the other hand, is to clarify how conversation works as an alogical world-building practice. Thus, we shall borrow the term ‘contribution’, conserve its primitive sense ("finalized transaction made up of a series of turns"), but define it teleologically as a "perceived claim to value" (Stuart Hall, lecture). Contributions make a conversation advance, as political struggles make a nation advance. They modify a state of affairs. The entire storyboard in Appendix 1, for example, represents a single contribution: PHIL’s defense of an American idiom (a claim to value), with acknowledgment by the GROUP and PHIL’s acknowledgment of the acknowledgment.

We shall not consider here units superior to the ‘contribution’. Our thesis is that intentionality drives conversations. Larger units (cognitive ‘enhancements’) are derivative.

4.5 Drawing on Goffman's notions of ‘frames’ and ‘moves’ (1974, 1981), Schank's notion of ‘scripts’ (revisited in Schank & Leake 1989) and the existentialist notion of ‘will affirming itself’ (Sartre 1943), we have defined stances as the basic units of intent which go to make up contributions and thus conversations. But what are they in practice?

For the moment the reader can get some idea by glancing once again at the conversational fragment given in Appendix 1 in the form of a cinema storyboard. Clearly, in order to choose a video sequence for the storyboard, the authors of this paper must have had some idea of what had struck the participants as a ‘claim to value’ (or ‘contribution’) during the conversation. Debriefing revealed that PHIL’s irruption into the conversation to comment on the expression "Give me a break!" had struck everyone. Next, the authors had to define the beginning and end of that contribution and divide it into stances. Those choices were not automatic. A video tape is a continuous flow of visual and audio phenomena: some principle must have therefore guided the authors not only in subdividing that flow, before and after PHIL’s guffaws, into a coherent unit with a start and a finish (the ‘contribution’), but also in subdividing that contribution into single units to be pictured as ‘frames’, i.e., the six boxes making up the storyboard. Whatever that principle was — and we shall try to describe it further on — Appendix 1 illustrates it.

4.6 We may therefore define a ‘frame’ as the pictorial representation of a stance. It is a bundle of individual/collective signals (e.g., changes in gaze and head position) seen as enacting a personal/group intent. This is represented pictorially by keeping all visual elements intact from one frame to the other, except those in which change is seen as potentially significant and indicative of intent (see Note 6). Since all other changes are perceived by the participants as either background or as noise, they need not be pictured. As a rule of thumb, frames — like stances themselves — represent the shortest sequences (or ‘flashes’) that a Video Editor might cut from the tape of the interaction to make a brief TV spot to publicize the conversation, like spots made for films. Minimal units give enough (not necessarily all) of an utterance or gesture to make it seem like the concrete, intelligible expression of some will directed to moving events forward.

The storyboard makes it clear that, in any case, the minimal units of a conversation are neither utterances/utterance-turns, nor speech acts, nor transactions. The frames were not chosen on linguistic grounds (one contains no utterance and one contains two that work as a single enactment) or to represent discourse functions (although all, including the silence in frame 6, perform one or more ‘illocutionary acts’ and frame 2 contains a topic/comment exchange). The criterion used was existential: every frame had to capture one minimal aspect of the overall attempt (on the part of the participants in the conversation both as individuals and as a group) to ‘get a hold on things,’ and thus to create or reinforce a world-view. To borrow an expression from an Italian theoretician of literature inspired by Althusser, every frame shows, in miniature, an "attempt to implement a project of acting on the world" (Liborio 1979:9, our transl.).

4.7 Now that we have ‘units’ suitable for analyzing conversations as existential meaning-making events, we may proceed to examine a segment of our video recording. We propose to: 1. identify all phenomenological regularities using traditional nomenclature (‘posture’...) and procedure (distributional analysis...) and, simultaneously, 2. assign meaning to these regularities using our new nomenclature (‘stance’...) and a yet-to-be-defined method for grasping experientially the overall sense of the conversation.

The need for such a method is evident. We have asserted, in fact, that signals can be perceived as intentional only within a stance; a stance can be perceived as such only within a contribution; and a contribution, only within the development of the conversation-up-to-that-point. Moreover, we have asserted that a conversation is not frameable and can be grasped only by experiencing it. These assertions would seem to imply that to be able to recognize a ‘stance’ in the verbal/prosodic/somatic ‘posture’ of a person, we must actively participate in the conversation in which the posture is struck; what is more, we must be a bona fide member of the interaction, able to ‘get the feel’ of ‘what’s going on’ and, insofar as possible, to work out meanings dynamically with the group.

If instead we try to make sense of a conversation from the outside, we may fail to notice many of the stances and contributions as perceived by the participants: but these are what determine (subjectively) the development of the conversation! The risk is especially great with covert stances (4.2.5). In addition, we may misinterpret the various movements we notice by projecting onto them our own world of values (‘projective interpretation’), imagining ‘stances’ and ‘contributions’ where there are none (for the participants). In other words, we may manage to make sense of the conversation for us, but fail to capture what is going on "from the natives’ point of view" (Malinowski 1944).

To conclude, it would seem that there is no way we can reliably establish the idiosyncratic meanings of the interactions we recorded. Current methodology can legitimately uncover only their prototypical meanings by using psychological/cultural universals to frame the interactions (e.g., Grice’s conversation ‘maxims’, Goffman’s conversation ‘system requirements’, Sacks’ ‘turns’, Brown & Levinson’s ‘face-work’...).

This dilemma, called the hermeneutic circle, has various solutions (Gadamer 1960:312 seq.); we shall propose ours in Section 7. In carrying out the present experiment, however, we chose to take a shortcut in order to expedite analysis. We substituted actual participation in the conversation with ‘virtual participation’ obtained by debriefing the participants and then attempting to view the video with their eyes (as though we were living it as they had lived it). How well this procedure works is what we shall now see.

5. Hypotheses, predictions

5.1 The present research hypothesized that:

1. at least a few kinds of eye/head movements would be identifiable as either typically American or typically Italian; moreover it would be possible to grasp the idiosyncratic meaning of these movements by vicariously ‘willing the enactment’ in which they appear, on the basis of how the participants reported having lived that enactment;
2. participants who used meaningfully and appropriately the eye/head movements typical of a given culture would be perceived as ‘insiders’ by the members of that culture: they would be included in the web of intersecting gazes uniting the members of the culture in phatic communion and would receive through gazes, although to a lesser extent, more invitations to speak and greater attention when they spoke.

5.2 To test these extremely complex hypotheses, we formulated two simple, narrowly-focused predictions dealing with how one value, assent, is communicated through eye/head movement. On the basis of informal observations made during a pre-encounter held with the American students, we predicted that, in our video-recorded encounter:

1. in a majority of the enactments of assent, most Americans would nod amply and continually; most Italians would use brief, contained nods or jerk their heads backward while opening their mouths slightly (individuality would be expressed, if at all, by varying the other signals bundled with the nod or jerk);
2. during each enactment of assent, participants would gaze principally at whoever gesticulated as they did, regardless of that person’s nationality. This means, given the first prediction, that during most of the choruses of assent typical of friendly group conversations, the Americans would gaze at whoever was nodding amply (in practice, at their fellow Americans, plus any Italian adopting such behavior) and vice versa, creating de facto two ‘phatic communion’ groups based on communicative style.

6. Results

6.1 The data gathered was initially compared with classical studies on gaze and head positions (see Section 1.b) to give the student-researchers practice in noting and collocating visual cues. The nomenclature and data categories for describing eye/head movement in Kendon (1967) and Argyle & Cook (1976) were found applicable to our data; no observations disconfirmed these authors’ findings. Allen & Guy's (1974) findings matched observations, too, except for the following conjecture that left us flabbergasted (p. 137):

We do not claim that the various head movements make any distinctive communication, and for the most part they do not. [...] If head tosses are not part of the signal system, then what part do they play? We think it is a form of exercise which is inherently rewarding to the actor. It also provides variety and interest for the partner because he has a mobile object of attention.

Commenting on Birdwhistell's (1970) stimulating text would require a separate article. Pezzato and Poggi's (1998) typology of gaze was not available for verification.

6.2 As for confirmation of our two predictions, results were disappointing. No regularities of the kind predicted were found. The causes: there were too many dependent variables involved; the video-recording was much too brief; and ‘typical’ gestures, we discovered, are not exhibited as regularly as we had thought they would be. As a consequence, we were not able to furnish quantitative evidence for our two hypotheses. We could only try to verify them through case-by-case studies of ‘presumably’ culturally-marked behavior (not necessarily the predicted eye/head movements) associated with ‘apparent’ inclusion/exclusion (not necessarily manifested as predicted, i.e., through a communion of gaze). In other words, we had to fall back on the hermeneutic practices we had hoped to avoid having to use exclusively. The example which follows shows the kind of ‘knowledge’ we were able to obtain by examining one of the frames we created for our storyboard (frame 4; duration: 0.5 seconds). We framed that particular one-half-second sequence of video because in it we vaguely sensed various enactments of intentionality, among them assent. Let us now try to give them more specific meaning.

6.3 In frame 4, EMILY and BOB nod at PHIL, while BARBARA turns toward him and starts to nod (amply in frame 5). The Italians finish turning their heads toward PHIL and remain immobile; NADIA, immobile, smiles faintly (EMANUELA and MINO, too?).

The group stance captured in this frame seems to be one of general assent. So why are the postures different? One explanation is that there are culturally-different ways of enacting assent: by ‘nodding’ (USA) and by a more reserved style, ‘immobility-with-a-polite-smile’ (Italy), one which we had failed to predict for the Italians. But are the Italians really assenting? Is not their immobility simply due to the fact that PHIL is addressing his question to the Americans (by implication and by gazing at BOB)? And is not immobility the way any bona fide member of the group-as-a-whole would behave, when not questioned directly? In that case, there is indeed a cultural division here, but not because of different styles of assenting. The Italians are simply awaiting their turn.

On the other hand, it is not really clear that PHIL’s question was in fact directed only to the Americans. ‘Keying into’ the spirit of the conversation, through debriefing, we felt the question to be fairly open. And in any case the Italians still could have shrugged, cocked their heads or expressed bystander assent through semi-nodding (back-channeling). In other words, the evident division here into two cultural sub-groups is not simply due to turn assignments. The Italian sub-group is manifesting disinterest through lack of somatic participation (although their gaze is on PHIL, in unison with the Americans).

Or perhaps not. Additional debriefing made us sense two intentionalities in the Italians’ stance: assent to PHIL’s implicit assertion that "Idioms can certainly be curious"; non-assent to PHIL’s implicit claim that "Being American makes an idiom OK" (we still felt "Give me a break!" to be illogical). If this is so, then the Italians are indeed participating: they are enacting only partial assent. Note that we are not confirming our initial explanation: ‘nodding’ and ‘immobility-with-polite-smile’ are not two culturally-different ways of expressing the same assent. They are enactments of different kinds of assent, i.e., different positions on the acceptability of a certain idiom (undoubtedly due to culture, but that is another question). Our second prediction is therefore disconfirmed with respect to gaze and unverifiable as to the creation of sub-groups by gestural affinity.

As for the first prediction (head movements as culturally typical, at least among Americans), our data offered no real confirmations. Classical studies of assent, using larger samplings (Argyle & Cook 1976), tell us that, in fact, most of the (American) subjects in the investigators' experiment nodded to say "yes". On closer examination, however, this finding turns out to be fairly worthless: it does not tell us who nodded, how and in what circumstances. Nodding in relaxed laboratory conditions does not make nodding universal: had they felt intimidated by a touchy question, Argyle & Cook’s subjects might have simply smiled to assent (like our Italian students here). Most of all, the findings do not tell us the intent of the subjects who ‘nodded to assent’. Perhaps assent was not the message they intended to communicate at all. For example, debriefing revealed that BARBARA's stance (frame 5), facing PHIL head on and nodding, was one of confrontation; she was not assenting but saying: "OK, you've spoken; now let me get on with chatting up BOB; you've already stopped me twice!" One could even look for finer shadings. EMILY's tilting nod differs from BOB's ‘typical’ straight nod: she may therefore mean something other than "yes" or, if "yes", she may be adding a touch of personal warmth. (This would disconfirm our prediction that ample straight nodding is standard among Americans, with idiosyncrasies appearing in the bundled signals, e.g., squinting, which in fact EMILY does). EMILY was unable to tell us later what she had meant.

To conclude, in frame 4 it is probable that the Italian students, although they did not nod or jerk their heads back, were assenting in some way; but it is not easy to say in what way or how they would have enacted assent if the implied question had been simply "Aren’t idioms curious!?" It is also probable that one or more of the Americans who nodded meant something other than assent (but what?) and that the ample nod used by one of them is not so typical of Americans as we had thought (but why did we think so?). This is all the ‘knowledge’ we were able to obtain: not very much and not very certain.

Yet the web of subtle signals creating the group stance in frame 4 is, we would argue, both perceived and ‘understood’ (unconsciously) by everyone. In making up both group and individual stances, these signals acquire connotations of intent. Sensing them is part of the conversational competence that an L2 learner, or any speaker, must acquire (4.2.3). Not only is classical (distributional) research methodology unable to define them, it does not, we claim, constitute a valid means of ever understanding them in situ.

6.4 Paradoxically, our major research finding was that the kind of study we had initially undertaken could not furnish the kind of knowledge we were seeking. Even if we had had a longer video tape to examine, it could only have allowed us to confirm (or disconfirm) that, in general: 1. Americans nod more amply than Italians; 2. Italians who nod amply get looked at more often by Americans. These would be interesting findings, of course. But even using debriefing to add qualitative data to these quantitative findings, we still would be unable to tell L2 learners how to nod when conversing with Americans. Certainly not all the time, nor for every assent, nor always amply. When, then, should they nod? in what circumstances, in what manner, to what effect? With what intent?

Thus, our failure to find answers to our research questions turned out to be relatively unimportant as soon as we discovered that the questions we had asked were relatively unimportant. The debriefing technique used to give meaning to our analyses had had the effect of making us realize how much we were ignoring (or misunderstanding) in our video-recorded conversations by considering them simply as texts with behavioral regularities to be catalogued. Debriefing alone, however, could not furnish the knowledge we sought. Another research tool — or even paradigm — was felt to be necessary.

6.5 To see in what terms, let us leave our research questions aside and examine another frame with the sole purpose of trying to assign meaning to the eye/head movements we perceive. In frame 6 PHIL's posture can be labeled head droop with hands in pockets. But what is his stance? What is that posture saying in (and to) the group?

6.5.1 If we were to forget the considerations raised in Section 4.7, we might be tempted to indulge in the kind of projective interpretation that is typical of second-rate literary critics: we would project upon frame 6 our vision of what Americans like PHIL must be like and then look around for American cultural icons that ‘prove’ we are right. For example, our minds filled with Norman Rockwell paintings and John Ford films, we might interpret PHIL's stance as expressing typical American small-town informality (and as our Sociocultural Questionnaire showed, PHIL had in fact lived in a small town). Or perhaps we might have declared, with equal certitude, that PHIL's stance marks typical American male shyness in the company of women (according to the movie cowboy stereotype PHIL's posture suggests). Or, remembering the "I-have-spoken" American ‘Indian’ stereotype, we might have interpreted PHIL's posture as a culturally-marked turn-relinquishment cue (although this is contradicted by BARBARA who, in frame 3, seems to stick her hands in her pockets as a bid for the floor in order to speak to BOB).

6.5.2 To avoid the arbitrariness of superficial hermeneutic analyses like these, we might assign to PHIL's posture only the very general meaning it has in any circumstance (its ‘dictionary meaning’). Birdwhistell (1970) claims that ‘downward’ always indicates some kind of ‘diminishment’. So to play safe, we might want to limit ourselves to saying that PHIL is signaling an ‘easing-up’. But while such an affirmation is undoubtedly true, it tells us almost nothing. Is the ‘easing-up’ a proclamation of greater informality, a manifestation of embarrassment after having guffawed before women, or a signal of turn-relinquishment (to repeat the three previous hypotheses)? Or is it something else again?

According to our premises (Section 4), there can be no answer if we study a gesture in isolation. PHIL's posture will become a ‘stance’ only when we see it as part of a contribution, i.e., as part of a perceived enactment of individual/collective historical will.

6.5.3 The concept of "game" (Goffman 1981) would seem to offer just such a perspective. Games, in fact, link a series of events temporally and causally. This allows us to avoid the trap of studying a phenomenon in isolation; it also gives every move historical density. Moreover, games simplify human activity: behavior is explained as the rational maximization of gains, a principle that characterizes human activity (at least in part) universally. This allows us to avoid the trap of cultural stereotyping.

All of PHIL’s actions in frames 1-6 can, in fact, be viewed as game-like responses to a stimulus in the previous frame. In frames 1 and 2 MINO expresses surprise at the strangeness of the American expression "Give me a break!" If we hypothesize that calling into question the reasonableness of people's language, especially by a foreign speaker of that language, is tantamount to calling into question the reasonableness of their culture, then we can read PHIL's movements in frames 2 and 4 as a challenge to MINO's attack. PHIL is using his voice and gaze to rally his fellow Americans around the flag. In frames 4 and 5, PHIL's compatriots nod their assent while the Italians at least acquiesce by their immobility. Thus, in frame 6 PHIL is responding with a self-congratulatory stance: he is taking a modest bow for his victory over MINO.

But this ‘game model’, to work, requires creating a ‘story’ based exclusively on rules of competition that ignore other psychological and cultural drives possibly at work. Above all, it views the interaction as a ‘text’, the sense of which is reconstructed from the outside — just like the other interpretations. It is therefore just as (potentially) arbitrary.

6.5.4 What we need is to be able to interact with the events (the interplays of minds and wills) that produce meaning in a conversation. We could then test reactively the sense that a ‘word+tone+gesture’ acquires reactively. Hermeneutics (6.5.1), inductive systematics (6.5.2), formalism (6.5.3) are clearly not interactive. A mixed hermeneutic/empirical method, that uses debriefing to countercheck interpretations, comes closer. It is, in fact, the method used in this project (6.3). Still, it does not offer experimental validation of falsifiable claims, like experimental science. Debriefing gives, not ‘proofs’, but clues that require interpretation as much as the conversation itself.

The reader will recall, for example, the sense we assigned to the Italian students’ immobile heads in frame 4. Debriefing (supposedly) revealed that the students’ intended message was only partial assent. But were the students sincere and accurate in recalling, during debriefing, their intentional states? Perhaps if we had given more weight to certain clues (their hesitations and hypercriticism), we might have come up with a less charitable but possibly more realistic interpretation of their immobility during PHIL’s call for consensus in frame 4. We might have concluded that, having such poor conversational skills in English, the Italian students simply did not understand what PHIL wanted and therefore limited themselves to a polite smile. In other words, the intent to distance themselves from PHIL’s call to rally around American English was something that the Italians felt only during debriefing, as a rationalization. This interpretation does not change the conclusions we reached in 6.3; but it shows how unreliable debriefing can be.

In short, debriefing requires subjects with a rare capacity for minute recall, can provoke false memories, works only if participants are always willing and available to be debriefed (PHIL, for example, wasn't) and requires interpretation anyway. What we needed, then, was a technique that would allow us to participate in the genesis of meaning as it happens. For if conversations are simply texts, then they only need to be analyzed to be understood. But if conversations are events, then they must be lived to be understood.

7. Lessons learned

7.1 Three lessons were learned from the present experimentation. The most important was a clarification of the notion of sense in discourse. This question may seem futile to a lay person who sees conversations as: 1. purely epistemic transactions (information exchanges), 2. conducted entirely through words (frameable as text), 3. by interlocutors who are culturally/psychologically unproblematic (for themselves and for the observer). But as soon as that person engages in ‘small talk’ within some group as an outsider (by age, nationality, social class, tastes/interests, etc.) — and fails to ‘key in’ — these three idealizations quickly crumble. Indeed, not even we, as bilingual insiders, were able to make complete sense of the behavior in our videos (analyzed postmortem as ‘texts’) until we hit upon an alternative research method for CA. Let us now try to describe it.

We just recalled how debriefing showed us the limits of studying conversations as ‘texts’; and yet debriefing itself can be quite unreliable, due to memory lapses and false recollections. Why not then, we started asking ourselves, turn the researcher into a conversationalist trained to assimilate her interlocutors’ Weltanschauung, appropriate their stances, and then debrief herself on the spot? Memory would no longer be a problem and, more important, the researcher could experiment with the sense of the stances she feels she has grasped. This is what ethnographers and psychotherapists do all the time. We would only have to shift the focus from words/deeds as revelatory of a cultural system (ethnography) or a psyche (psychotherapy), to words/deeds as revelatory of the existential meaning-making event we call conversation. In fact, this is what conversation analysts do all the time, too, except that, in practicing introspection to ascertain the probable meaning of someone else’s utterances, they do not always use systematic strategies to limit projection and accept the ‘dictates of the object’ (see 4.2.2).

Our alternative CA research paradigm therefore sees the analyst as a co-conversationalist, someone who creates existential meaning-making events conjointly with her informants as a peer, i.e., adhering to their values (but see 7.1.3) and thereby learning to interpret the events from within their culture. Like an ethnographer, the researcher agrees to be a ‘dumb but willing learner’ in the eyes of her interlocutors. Like a therapist, she uses transference and countertransference as her investigative tools. Unlike either, she does not seek to fit her interactions with her interlocutors into a system. Her holistic, collegial procedure (like conversation itself) aims at obtaining, not ‘laws’ or ‘models’, but rather transient, situated, non-formalized intuitions of the intentionalities enacted.

At first glance, this kind of knowledge may not seem ‘scientific’. It is in fact a non-epistemic variety that Aristotle calls phronesis (Nicomachean Ethics, VI, 1140a), translated generically as ‘wisdom’ but best rendered by ‘discernment that generates rules of procedure’. It is knowledge as ‘sure’ as that of any social science. An example will make its specificity clearer. Phronesis is the knowledge of, say, political history that a top-rate diplomat has: he ‘sees’ history in the choices to be made...and is usually right; his books on history, however, are often judged by Academia as accurate but ‘impressionistic’. (Note that phronesis is not practical expertise, i.e., techne in Aristotle’s trilogy: the diplomat may be a brilliant analyst but a poor negotiator.) Epistemic knowledge, on the other hand, is the knowledge of political history that a top-rate historian has: in his books he can demonstrate his assertions...and is usually right; but his capacity to ‘see’ history in the making may be judged by seasoned diplomats as ‘schematic’ and ‘inaccurate’. (Note that epistemic knowledge is not necessarily theoretical: the historian, if pedantic, may infer facts brilliantly from available data but have no theories to offer. Note, too, that it goes together with technical know-how: every great historian is also a paleographer.) Universities traditionally teach episteme and techne, not phronesis, and thus produce knowledgeable graduates who don’t know how to use their knowledge. As we shall suggest in our conclusions, the kind of student-led research that we are proposing here may provide an answer. In any case, let us now try to justify experiential/procedural knowledge (phronesis) as a valid research tool in conversation analysis.

7.1.1 As is well known, Saussure distinguished two objects of linguistic inquiry: parole, i.e., what people effectively say and mean using language, and langue, i.e., the semantic/linguistic forms inferable from what people have actually said and presumably meant. He then proceeded, as is equally well known, to treat only the latter and three generations of linguists have followed him. (Among the isolated exceptions was Saussure's co-editor Bally himself.) But what many linguists seem not to have noticed is that parole is not the mere application of langue to a specific context. Instead, langue is an abstraction from parole and a partial abstraction at that: a frail skeleton that gives only a hint of what the real body, parole, is like. Parole is "the sum of individual cases" (Saussure, in De Mauro 1972:30,395, our transl.) and thus cannot be reduced to langue.

One would have thought that pragmaticists, intent on studying real acts of speech, might have redressed the imbalance that has favored the study of langue. To some extent they have, of course. But this has not led to the study of parole as immanence, i.e., as discourse produced and apprehended through phronesis. Indeed, some pragmaticists seek to create, as it were, a langue of pragmatic effects and thus, once again, a ‘disembodiment’ of parole. Their efforts are commendable in that they enable us to put handles on regularities. But communication specialists (discourse analysts, international negotiators, L2 learners) need to come to grips with what is unrepeatable in a communicative event, not simply with what is generalizable. If a musicologist were to show us that melodies in every part of the globe can be reduced to a few combinations of notes, recursively expandable, she would undoubtedly help us to understand why music is a universal language. But if we were musicians looking for new styles, we would still need to know what makes the traditional music of Dakar or Bali or Sofia unique.

The majority of pragmaticists, of course, do study conversational interactions as parole; but they tend to treat those interactions as ‘texts’, seldom as ‘events’. (For an example of an ‘event’ approach, see Contento 1998.) Like historians, they carefully transcribe the exchange (usually only the words, however); take note of settings, roles, and relationships; then use these materials to explain how the ‘text’ — which, we claim, is not the event — develops. To discover meaning on a deeper level, some even proceed to analyze the ‘text’ hermeneutically: they reconstruct it, as literary critics do with a novel.

7.1.2 But why take for models the methods of historians and literary critics, specialists in the study of words imprisoned on a page? Pragmatics, after all, deals with live communication. Surely we can learn from specialists in the experiential/procedural comprehension of discourse events: sociologists engaged in participant observation, ethnographers in the field, group therapists, social workers, diplomats, trial lawyers, even the archeologists of Lejre, Denmark (Bibby 1970), who, to grasp the meaning of the Neolithic tools they had unearthed, created a prehistoric-like camp in which they dressed and lived as the users of those tools presumably did. Indeed, their example illustrates admirably the central, neo-Saussurean thesis of this paper: language as parole is not ‘words’, nor even ‘words+tones+gestures’, but the enactment of a historical will in a communicative event. Words, tools, empires are but the residue of that will: the meager formalizations studied by linguistics, anthropology and history (4.2.1). Parole itself can be grasped only experientially, by making the will that created it one’s own.

7.1.3 The professionals just listed all possess heuristics for grasping their interlocutor's meaning from her standpoint, heuristics which vary considerably. For example, testing the real values of one's interlocutor by provoking her verbally is something diplomats and trial lawyers do commonly, social workers and group therapists do less, and participant observers or ethnographers only rarely. On the other hand, participant observers and ethnographers usually adapt to the value system of the host population, group therapists and social workers remain themselves while creating a group identity with their interlocutors, and diplomats and trial lawyers usually keep their identities and their distance.

7.1.4 We hypothesize that it is possible to use combinations of these heuristics in order to grasp the meaning of conversational behavior from the point of view of an insider. To get an idea of what we mean concretely by ‘heuristic’, the reader may now turn to Appendix 4 and inspect one of the modules of the training program in participant observation that we developed subsequently for our student-researchers.

This article will not go into further detail in describing the acquisition of the various heuristics. Our aim here is to describe the kind of knowledge such heuristics can give. In Section 7.1 we defined it as phronesis: but what does that mean in practice? What, for example, would one of our student-researchers have learned if she had used the participant observation heuristic in the conversation pictured in our storyboard? Let us imagine that we are that student-researcher. We are an ESL learner with only a rudimentary knowledge of English, standing where NADIA is standing in the pictures.

7.1.5 So how would we perceive the conversational event and PHIL’s vocal, prosodic, and somatic behavior? The answer is easy. Even with no training in any heuristic, we could not help but sense PHIL’s behavior as a ‘contribution’ (see Note 8). Even if his words are unclear, his intent ("Hear this!") imposes itself upon us, as does the group’s intent ("Hear him!"). With appropriate training, on the other hand, we would be more attuned. For example, if we had internalized reciprocal rousing as a value (i.e., what young American males typically do when they greet each other boisterously or when they slap each other's hands after some kind of victory), we would intuitively feel PHIL’s stance as rousing and his /HUH?/ as a call, more than a question; our reaction might be to tilt our head up with a smile and with an exclamation already forming on our lips. This is what BOB is doing in frame 4 and his gesture is greeted with a twinkle in PHIL’s eye. But what if the sense of PHIL’s stance, for him, is not rousing, as it is for the group? What if, let’s say, PHIL has heartburn and, for some reason, has guffawed to express pain? In that case our glance and smile would be met with a wave of the hand ("No, that’s just me."); if we persevere, with a reprimand ("Forget it, forget it..."). In other words, by acting coherently with PHIL’s culture, we would get him at least to correct us (and not simply ‘look through’ us, as often happens in native/non-native encounters). This is what we mean by experimenting with stances in order to grasp the sense they have for the ‘natives’. The insight gained is what allows us to analyze the videos of our conversations more competently, seeing stances and contributions as they were seen.

7.1.6 But what do we do if we did not participate experimentally in a conversation we now wish to analyze? In that case we simply replicate the situation, like the archeologists of Lejre. This is, in fact, how we finally answered our question about PHIL’s posture in frame 6. So what does PHIL’s head droop with hands in pockets mean? Students from a following year agreed to try behaving like PHIL when conversing with American friends. Whenever someone expressed perplexity over some fact of life, they would erupt with a guffaw, claim normality ("That’s life!"), rouse consensus (/HUH?/), muse a moment and then go into the downward posture. Moreover, they would do so while temporarily adopting PHIL’s value system (reconstructed from our Questionnaire).

What PHIL’s head movement surely meant, according to our veteran experimenters, was something we might label "swallowing food for thought". In fact, after musing a moment over the fact for which they had claimed normality, the experimenters needed to break with their thoughts. Letting their heads droop (then brusquely rise: not pictured) helped them to do so; furthermore, stuffing their hands into their pockets was a way of steeling themselves for the break and returning to the group in a battle-ready stance.

7.1.7 As for the validity of our ‘game’ hypothesis (according to which PHIL’s posture is a self-congratulatory stance, i.e., "taking a modest bow": 6.5.3), it was judged dubious by our experimenters. They felt victorious in having normalized a perplexity, not victorious over whoever had expressed it. They did feel ‘strategic interaction’ with the group as a whole, however, which is in fact what Goffman meant. We are therefore considering developing a game-theoretical explanatory apparatus fed with data from our simulations (as in experimental microeconomics) to sift out the competitive behavioral constants from the communicative practices we observe in multi-cultural encounters.

7.1.8 As for the first two of our three projective interpretations ("typical American sign of informality"; "cowboy shyness in female company": 6.5.1), our experimenters judged them parochial. After all, PHIL certainly felt his posture as normal, not "typically American" and "highly informal" (as it would be seen in Italy). And, in fact, "normal" is how our experimenters felt that posture in their reenactments, thanks to cultural assimilation. Moreover, while PHIL’s body language in female company might be seen as "shy" in Italy, our experimenters lived it as ordinary self-control. Clearly, then, the first two projective interpretations tell us more about their (Italian) framers’ mentality than about their object: PHIL’s posture as meaningful to him and his compatriots (see 4.2.2.). An element of truth was recognized, however, in the third interpretation: "solemn American-’Indian’-like turn-relinquishment". In fact, the experimenters reported feeling under pressure after having roused consensus — all eyes were upon them (just as in our storyboard, frame 4) — and so they gazed downward to ‘think in peace’ about what they had said. Kendon (1967) rightly notes that a downward gaze while thinking liberates speakers from their interlocutors’ stare and does not count as a turn relinquishment. But our more introspective experimenters reported that their intent to muse contained a hidden intent to cede the floor (in contrast with a concurrent, more conscious intent to keep the floor in case they thought of some quip to make). In frames 5 and 6 of our storyboard, PHIL’s stance seems to share this ambivalence: for EMILY, PHIL’s stance says he has ceded the turn to MINO (she gazes at MINO who apparently looks blank; so she looks away but not at PHIL); for the others PHIL is still claiming the floor.

7.1.9 In conclusion, our final answer is that, in frame 6, PHIL is signaling a break with his thoughts (and a quasi-relinquishment of turn), together with a brusque return to invigilating the conversation; his is a retake-control stance.

The experimenters added that, as Italians, they would never put their hands in their pockets to muse or to conclude their thoughts although, after having adopted PHIL’s Weltanschauung, they felt natural doing so in English. Indeed, the gesture gave them a greater feeling of self-possession and self-control, two of PHIL’s values they had chosen to internalize to be more in tune with their American conversation partners. And feeling those values more, they found that they communicated better with their partners.

Our second hypothesis (5.1) thus receives an unexpected (albeit partial) confirmation. Adopting PHIL’s hands-in-pockets stance every so often actually helped these Italian students to feel, and to be, ‘one of the group’. But their success, let us quickly add, does not prove that ‘similar conversational behavior’ and ‘acceptance as an insider’ are mechanically linked. Something more subtle had occurred in the interpersonal and group dynamics. As one of the student-experimenters reported: "My American friends did not consider me ‘closer’ because I put my hands in my pockets; but by putting my hands in my pockets, I considered myself ‘closer’ to them... and so they treated me that way."

7.2 A second lesson was learned from the research project illustrated in this paper: a reciprocal relationship between research and teaching can be extremely enriching, not only at the graduate level (as is currently practiced), but at the undergraduate level as well.

Normally, undergraduate students are not considered motivated enough to want to learn the basic concepts and tools of a discipline through conducting a research project, since this means a lot of self-study (terminology, etc.) to save class time for original investigations. Thus, university systems have students study, for four years, the rote ‘factual’ knowledge of all the various disciplines. Then if they go on to graduate studies, students get to study the discipline ‘hands on’ by carrying out research on a topic of interest.

This educational practice, we suggest, may actually be counterproductive: knowledge is either ‘hands on’ or it is just words. Wouldn't it be better to have biology graduates with perhaps less encyclopedic knowledge of history and language but with real knowledge (phronesis) of what documenting a historical fact and learning a natural language mean? The same applies, naturally, to history or language students with respect to biology.

In any case, it is worth noting that the research questions asked in the present study, and so the perspective given to the study of conversational interaction (what we "chose to notice": 4.1), were dictated by the needs and curiosities of the ESL students who actually carried out the project. In other words, without the bi-directional interaction between teaching and research, this would have been a paper on statistically relevant concordances between gaze and head movements in some sample population.

7.3 Finally, the research procedure described here calls into question ‘obvious’ contradictions like those said to exist between theoretical research, applied research and technological implementation. The very terms are full of ambiguity and mask issues worthier of discussion (for instance, the interplay with experiential knowledge, phronesis).

Was this project, then, an example of ‘applied research’? It did in fact spring from a real need to know expressed by the participating L2 teachers and students. Still, the methodological issues that this project raised (and that are discussed throughout this paper) were, if anything, theoretical. The question, then, is not whether we need more theoretical or more applied research. The question is whose needs should we try to satisfy better by creating new elaborations and procedures resulting in what? Only when we fail to answer that question — when our ends become simply furthering some tradition or innovation (or simply some interest) — do the mental elaborations called "theoretical" seem disjointed from those called "applied" or "technological". The ends of the research project presented here were to furnish L2 students and instructors — in a university system which had no consideration for second language studies as an academic discipline, and provided little or no support for teaching and research in that field — with tools to acquire (and to help others acquire) intercultural conversational competence of the kind needed in today’s global village. The parole-centered, experiential investigative apparatus hypothesized here was created in response to that need.



The project reported here, Spontaneous Conversation in English, was financed with funds for extra-curricular student activities (Iniziativa Didattica Studentesca 40, 1982-83) provided by the University of Rome La Sapienza. This paper is a rewriting by P. Boylan of N. Mari's 1983 eight-page unpublished report on eye/head movement in the video-recorded interaction. The other student reports, also to be rewritten for publication, treat other aspects of kinesics (G. De Lorenzo, P. Noce), prosody (I. Avvenente, A. Lamanna, S. Meli), oral syntax (F. Franchi, G. Panico), and conversation routines (E. Agostini, N. De Cillis, G. Runcio). Thanks to Adam Kendon for suggestions on the final draft of this paper. Belated thanks to the student volunteers who, in 1983, helped with the staging (A. Roselli), lighting (M. Cassano), filming (G. Concetti), sound recording (G. Saporaro), editing (C. Mosticone) and transportation (N. Fioravanti). A very warm grazie to the students from Pitzer College, Temple University and Trinity College and to their teachers, respectively L. Marquis, E. Miller and P. De Martino.



Appendix 1: Storyboard of a fragment of a multi-cultural multi-party conversation

Appendix 2: Description of the Storyboard (Contribution) — Duration: 6 seconds

Participants, clockwise from far left (numbers give position on an imaginary dial):

9: PHIL / USA; 10: EMILY / USA; 12: NADIA / Italy; 2: BOB / USA; 3: BARBARA / USA (a young ESL teacher in Rome; she knew only Emanuela); 6: EMANUELA / Italy; 7: MINO / Italy

FRAME 1 (1 second): MINO is speaking to BOB, while gazing at him. The OTHERS are looking at MINO but with their heads pointing toward the center of the circle, thus obliquely in most cases. Only EMANUELA turns her head to look at MINO directly.

Note: "Give me a break!" is an idiomatic American English expression meaning "That’s enough!"

FRAME 2 (1 second): MINO is repeating his utterance. EMANUELA is turning her head back to the center and probably glancing at BOB, as are the OTHERS (except NADIA). BOB is lowering his eyes, smiling, and emitting a very soft, constrained chuckle. After MINO finishes speaking, PHIL intervenes with a guffaw and attracts NADIA's gaze.



FRAME 3 (0.5 seconds): Now EMILY is looking at PHIL, too, without moving her head. So is BOB who, lifting his eyes, is passing from a low chuckle to a chortle. PHIL is cocking his head toward MINO while uttering something out of the corner of his mouth. BARBARA, sticking her hands in her dress pockets, is turning her head toward BOB.

PHIL: =THAT’s -aMERican .

FRAME 4 (0.5 seconds): PHIL, turning his head back to the center of the circle (and looking directly at BOB), is uttering an even louder guffaw mixed with a monosyllabic utterance. BOB is nodding and uttering another low, less constrained chuckle. EMILY is cocking her head toward PHIL and nodding slightly with a wide smile, closing her eyes partially. Both MINO and EMANUELA finish turning their heads toward PHIL and remain immobile. NADIA, immobile, has a faint smile. BARBARA, who was turning her body to face BOB directly, brusquely begins turning back to her original position while smiling.


FRAME 5 (1 second): PHIL is lowering his eyes and repeating MINO's utterance (with a citation tone). EMILY is turning her gaze toward MINO who is lowering his head (and gaze?) slightly. BARBARA, finishing turning 180°, is now facing PHIL head on. She nods amply.

PHIL: +GIVE me a BREA::K .

FRAME 6 (2 seconds): EMILY, after glancing at BARBARA, is turning her head back and upward, looking into space. NADIA is beginning to smile more broadly as her eyes follow the direction EMILY is indicating with her glance. PHIL, smiling with a closed mouth now, lowers his head, stoops his shoulders while moving them slightly, and sticks his hands in his pockets.

GROUP: ' '

Appendix 3: Transcription conventions

Utterances are aligned along a horizontal time bar to show overlapping/chaining. Paralinguistic realizations are given in the description in Appendix 2. Silence is indicated by apostrophes: ' ' (2 seconds); a prolonged phoneme, with colons: No::: (0.3 seconds). A pretonic syllable is indicated by caps and a tonic syllable by underlined caps. A pretonic or tonic higher than usual for an utterance-type is marked by a plus sign + ; if lower by a minus sign - ; if much lower by a double minus sign = . The tonic syllable can have one of six possible tone movements, indicated by using six punctuation marks:

1. Fall from middle or middle-high to very low (the conclusive tone): .

2. The same as above but with a fall to middle low (non conclusive tone): —

3. Middle or middle high tone slightly rising at the end (introductory phrase tone): ...

4. Middle high to (very) low, then sharply to very high (question tone): ? (??)

5. Rise from middle high to high and then sharp fall to very low (exclamatory tone): !

6. Middle low or very low tone slightly rising at the end (parenthetical tone): ,

"," can also follow "." or "!" to attenuate affirmations or express worried surprise.
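As an illustration only, the conventions above can be read mechanically. The following minimal Python sketch is not part of the original study: the function name `decode`, the dictionary layout, and the assumption that the tone mark stands as a separate final token (as in the Appendix 2 transcript lines) are all hypothetical. The minus sign is taken to be the hyphen seen in "-aMERican"; underlining, the "??" variant and combined marks such as "., " are not handled.

```python
# Hypothetical decoder for the pitch and tone marks of Appendix 3.
# Assumption: the tone mark is the last whitespace-separated token.
TONE_MARKS = {
    ".": "conclusive",
    "\u2014": "non-conclusive",   # the long dash used for tone 2
    "...": "introductory phrase",
    "?": "question",
    "!": "exclamatory",
    ",": "parenthetical",
}
PITCH_MARKS = {"+": "higher", "-": "lower", "=": "much lower"}


def decode(utterance):
    """Split a transcribed utterance into words, pitch marks and final tone."""
    tokens = utterance.split()
    tone = None
    if tokens and tokens[-1] in TONE_MARKS:
        tone = TONE_MARKS[tokens.pop()]
    words = []
    for token in tokens:
        # A pitch mark, if present, is the first character of the word.
        pitch = PITCH_MARKS.get(token[0])
        text = token[1:] if pitch else token
        # Colons inside a word signal a prolonged phoneme.
        words.append({"text": text, "pitch": pitch, "prolonged": ":" in text})
    return {"tone": tone, "words": words}


if __name__ == "__main__":
    print(decode("+GIVE me a BREA::K ."))
```

Applied to PHIL's utterance in frame 5, the sketch would report a higher-than-usual onset on GIVE, a prolonged phoneme in BREA::K and the conclusive tone; it says nothing, of course, about what the utterance meant to the participants.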

Appendix 4: Preparation for field research / Participant observation: Empathy

The module below seeks to enhance the student-researcher’s capacity for empathy, usually considered a "rather vague notion" (Mondada 1998:159) or, at best, a ‘gift of nature’. It is taken from a training program developed between 1984 and 1991 as part of an ‘alternative’ English course at the Teachers’ College of the University of Rome. The course taught English in a perspective akin to what is now called Intercultural Communication (Jensen et al. 1995). Further details may be found in Boylan (forthcoming/b).

1. Observe foreign ‘twin’; playact her or him
Change: subjects define target values, see how their ‘self’ hinders perception
Tools: ethnographic checklists/practices à la Malinowski
2. Formulate intuited values as maxims
Change: from an epistemic to a volitional stance
Tools: Stanislavski's State of "I am" and Through Action
3. Divest (existentially)
Change: from willfulness to anomie
Tools: Bracketing à la Husserl
4. Invest (existentially)
Change: from anomie to new willfulness
Tools: Guided associations (Freud) using maxims
5. Act and verify
Change: new needs, intents, perceptions
Tools: Simulations with colleagues, then real-life interaction, first in controlled situations; subsequent debriefing and, if necessary, reformulation of target values.

