ASPECTS OF GESTURAL ALIGNMENT IN TASK-ORIENTED DIALOGUES

Interlocutors in a conversation influence each other in a number of dimensions. This process may lead to observable changes in their communicative behaviour. The directions and profiles of these changes are often correlated with the quality of interaction and may predict its success. In the present study, the gestural component of communication is scrutinised for changes that may reflect the process of alignment. Two types of task-oriented dialogues between teenagers are recorded and annotated for gestures and their features. We hypothesize that the dialogue task type (collaborative vs. competitive), as well as certain culture-specific properties of alignment that differ between German and Polish pairs, may significantly influence the process of communication. In order to explore the data and detect tendencies in gestural behaviour, automatised annotation mining and statistical exploration have been used, including a moving frame approach aimed at the investigation of co-occurring strokes as well as re-occurring strokes and their features. Significant differences between German and Polish speakers, as well as between the two dialogue types, have been found in the number of gestures, stroke duration and amplitude.

1 Introduction (state-of-the-art, background)
Participants of conversational interaction mutually adjust their behaviour in time and in form. Recent research shows that external, visible accommodation is based on internal processes related to participation in a dialogue. Conversational partners adjust their cognitive, emotional and interactive processes. As a result, they align their mental representations on many levels of language organisation, as well as in facial expression and body movements, including hand gestures and head movements. In the framework of the mechanistic theory of dialogue (Chartrand & Bargh, 1999; Pickering & Garrod, 2004), syntactic and semantic structures of utterances are subject to gradual adjustment and unification as the participants follow a common aim in interaction. Alignment is considered to be a predictor of success in the process of dialogue (Chartrand & Bargh, 1999; Pickering & Garrod, 2004).
In an interactive alignment account (Chartrand & Bargh, 1999; Pickering & Garrod, 2004), each subsequent turn of each speaker involves certain representations of linguistic structures: the semantic, syntactic and sound features of the utterance. These representations become aligned in both speakers as they construct a shared representation of their situation and form, referred to as a conceptual pact (Clark & Brennan, 1991). Interlocutors are thought to "prime each other to speak about things in the same way, and people who speak about things in the same way are more likely to think about them in the same way as well" (Garrod & Pickering, 2013). Alignment on one level provides the basis for alignment on another; lexical alignment, for example, supports syntactic alignment (Chartrand & Bargh, 1999; Pickering & Garrod, 2004).
Human communication is multimodal by nature (Bonacchi & Karpiński, 2014; Karpiński, 2014). People communicate via linguistic units accompanied by hand movements, gaze, head movements and posture. Recent research has shown that all modalities of communication tend to align in time and in form. The process of aligning gestures or body postures is called motor mimicry or imitation, and it has been shown to be largely unconscious and automatic during interaction (Chartrand & Bargh, 1999; Pickering & Garrod, 2004). Empirical studies have shown that motor mimicry is linked to mutual liking and to the quality of relationships between interactants (Chartrand & Bargh, 1999; Chartrand, Maddux, & Lakin, 2006; Kulesza, 2016). Dialogue participants who repeat each other's lexical units and syntactic structures are priming each other and have a greater probability of reaching understanding (conversational success). In other words, effective dialogue depends on interactive alignment (Garrod & Pickering, 2013).
Co-speech gestures are an especially important channel for communication as they allow speakers to express both propositional and affective content closely related to speech or add extra information (Kendon, 2004; McNeill, 1992). Alignment or adaptation in representational gestures resembles adaptation in verbal references. Importantly, gesture forms were only repeated across speakers if they had occurred in a meaningful context, whereas other gestures seemed to be neglected (Mol, Krahmer, Maes, & Swerts, 2012). Gestural alignment is stronger in the case of co-speech gestures (Garrod & Pickering, 2013; Mol et al., 2012). Systems of gesture coding divide hand movements into emphatic, deictic, iconic and emblematic (McNeill, 1992). However, in coding systems such as NEUROGES (Lausberg, 2013), emphatic hand movements can be superimposed on other types of gestures, such as deictic or form presentation. In most cases these emphatic gestures are performed repetitively in a gesture space and are hence called repetitive in space in NEUROGES labels (Lausberg, 2013).
There is a strong but flexible relationship between emotional and communicative alignment in interaction. Communicative alignment is often linked to empathy or an affective theory of mind (Jaecks et al., 2013). One interactant needs to know the internal state of the other, including his or her thoughts and feelings, in order to adjust his or her own behaviour to the other's current state of mind. Recognition of the internal state of the other is based mainly on observing how similar the behaviour of the other is to one's own behaviour. This process is assumed to be automatic and unconscious for both interactants (Chartrand & Bargh, 1999), but it is visible to a patient observer and researcher. It is widely referred to as the "Chameleon effect" (Chartrand & Bargh, 1999) and it has been shown to facilitate the smoothness of interactions and increase liking between interaction partners (Chartrand & Bargh, 1999).
Similarity in behaviour between two interactants is sometimes called "mimicry" or "contagion", while the mental representation of the other's internal state is referred to as "alignment" or "simulation". Similar behaviour aligned in time is called synchrony (Jaecks et al., 2013; Ramseyer & Tschacher, 2011). However, this kind of co-ordination may not be obvious as co-ordinated acts may be sequential by nature or form parts of larger structures. The alignment between interactants happens on all modalities of communication and all levels of language. Research provides evidence for alignment in words for naming objects (Brennan & Clark, 1996) or the same syntactic structure (Cleland & Pickering, 2003), segmental phonetic (Pardo, 2006) or prosodic features (Guitar & Marchinkoski, 2001; Truong & Heylen, 2012).
Following earlier studies in speech alignment by the authors (e.g. Czoska, Klessa, Karpiński, & Nowikow-Jarmołowicz, 2015; Karpiński, Klessa, & Czoska, 2014), this paper analyses gesture units for their alignment in the course of two types of dialogue tasks: a collaborative and a competitive task. All the analyses described in the present paper are based on the Borderland corpus data, collected as part of a project dedicated to the investigation of the paralinguistic features of interpersonal communication in the borderland region of Słubice (Poland) and Frankfurt (Oder) (Germany), on the border of languages and cultures (e.g., Karpiński & Klessa, 2018). The paper hypothesises that in the cooperative condition the participants' communicative behaviour will be more compatible with one another and, consequently, they will be more likely to mimic their interlocutor's gestures. Conversely, in the competitive condition the dialogue parties will display more differences in the usage of gestures and their functions. Although some of these processes may be immediate, the paper focuses on those that are more stretched in time and possible to discover by comparing the properties of gestural behaviour on the macro scale (e.g. in the initial and in the final part of dialogue).

Idea and aims of the study
Based on the findings reported in the previous section, it may be hypothesized that dialogue parties tend to align locally or globally, and that this process may involve convergence in various domains, including gesture and head movements. Interaction and alignment in the gesture domain may reflect important aspects of dialogue flow and the quality of interpersonal communication (Karpiński, 2014). The aim of the present study is to explore this type of convergence in task-oriented dialogues by Polish and German teenagers (see Section 3 for more details about data collection).
Although the concept of accommodation emerged in the studies of interpersonal communication in the early 1970s (Giles & Smith, 1979;Giles, Taylor, & Bourhis, 1973), measuring mutual influences in the behaviour of dialogue participants remains a challenging task even if only observable parameters are taken into consideration (Campbell & Scherer, 2010;Ward & Litman, 2007), with some recent studies even showing tendencies contrary to earlier expectations (Healey, Purver, & Howes, 2014).
In order to capture the process of gestural alignment in conversation, several approaches are combined in the present work. One of them is the moving time window approach, inspired by Kousidis et al. (Kousidis, 2010; Kousidis et al., 2008). The phenomena under study are observed and measured within a moving time window of a fixed duration, shifted by a fixed time interval. Measurements are taken and compared for all the frames to search for changes in selected behavioural parameters. The directions and sizes of these changes are compared between the dialogue participants in order to detect similarities or divergence. The time window size and step size are normally dependent on varying distributions of conversational activities, the duration of the entire dialogue under analysis, as well as the size of the units to be studied (e.g., gesture phrases or phases; Karpiński et al., 2014). A range of factors may influence the distribution of conversational units, their sizes and other properties. They include individual speaking style, as well as situation-specific speaking styles. Spontaneous, conversational speech may require broader time frames than read speech or elicited speech prepared in advance.
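The moving time window procedure described above can be illustrated with a minimal sketch. It is not the Annotation Pro plugin used in the study; the function name, the 10-second step and the example onset times are assumptions made for illustration, and a stroke is counted in a window if its onset falls inside that window.

```python
from typing import List


def strokes_per_window(onsets: List[float], total_dur: float,
                       width: float = 30.0, step: float = 10.0) -> List[int]:
    """Count stroke onsets in each window [start, start + width).

    `width` and `step` are in seconds; the step value is a hypothetical choice.
    Only windows that fit entirely inside the recording are evaluated.
    """
    counts = []
    start = 0.0
    while start + width <= total_dur:
        counts.append(sum(start <= t < start + width for t in onsets))
        start += step
    return counts


# Hypothetical stroke onset times (seconds) for two speakers in a 60 s excerpt.
speaker_a = [2.0, 5.5, 12.0, 31.0, 44.0, 58.0]
speaker_b = [3.0, 14.0, 29.5, 40.0]

profile_a = strokes_per_window(speaker_a, total_dur=60.0)
profile_b = strokes_per_window(speaker_b, total_dur=60.0)
print(profile_a)  # [3, 2, 2, 3]
print(profile_b)  # [3, 2, 2, 1]
```

The two resulting profiles can then be compared frame by frame, for instance by correlating them, to see whether the speakers' gesture rates rise and fall together.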
Among other methods of gestural accommodation measurements, there is also a search for the re-occurrence of features of gestures produced by one person in the gestures produced afterwards by his or her conversational partner. While the usage of gestural categories may also be of interest, similarity is most often limited to certain features of gesticulation like handshape or gesture size.

Another area where increasing or decreasing similarity can be found is gesture sequencing: not only may a single gesture re-occur in a partner's stream of gesturing, but an entire sequence of gestures may be re-used.
The number of factors involved in the process of alignment may be high and it may be difficult to capture statistically in fully spontaneous conversations. Specifically designed scenarios of task-oriented dialogues help to expose and isolate behaviour that may contribute to this process. One may expect that a number of modalities (semiotic modes) may be involved in the process and they may also interact among themselves, intra- or cross-modally (Karpiński, Jarmołowicz-Nowikow, & Czoska, 2015).

Participants
The participants of the study were quasi-randomly recruited from both a German and a Polish secondary school in Frankfurt (Oder) and Słubice, respectively. The group consisted of 15 girls and 5 boys aged from 12 to 15, who did not report any serious vision or hearing problems. All recording sessions were preceded by obtaining written consent from the participants' parents or legal guardians.

Recording scenarios and procedure
Pairs of pupils were recorded performing two types of task-oriented dialogues: a collaborative and a competitive dialogue. The collaborative task (Tower) involved building a tower of blocks that were only manipulated virtually, i.e. imagined by the participants. The interlocutors took turns in adding subsequent blocks and described their shapes and positions to their conversational partners so that they could both create and update a mental image of the virtual tower. In the end, they were both asked to draw (independently) the tower according to what they had managed to memorise. The competitive task (Gift) was focused on the selection of birthday gifts for an imaginary mutual friend. Before the task, participants were provided with photographs of a friend and lists of available gifts. However, the photographs handed to each of the participants depicted different people of vividly contrasting personalities and likes. As participants were not allowed to talk about the friend, it took some time before they realised that they were thinking of gifts for different people.
Before the recording session, the dialogue participants were invited to the room and informed, in general terms, about what they would be expected to do. They were shown to their places and listened to the formal task instructions. During the session, the participants stood facing each other from a distance of approximately 3 m. Each participant was filmed by a separate HD camcorder situated on one side of his or her conversational partner. Voices were recorded separately using a portable digital audio recorder and two large membrane condenser microphones. Recordings took place in regular classrooms. The choice of recording environment was motivated by technical and extralinguistic factors, namely, the location of the target schools, as well as the fact that the pupils could have been expected to feel more confident and at ease in well-known surroundings than would be the case in a recording studio. No time limits were imposed on the task-oriented dialogues.
The material under study consists of 10 sessions, each comprising two dialogues: Tower (collaborative) and Gift (competitive). The total duration of the recordings is approximately 77 minutes (Germans: 35 min; Poles: 42 min). The average duration of the collaborative task dialogue is 3 min 0 s (Germans) and 3 min 42 s (Poles), while for the competitive task it is 4 min 8 s (Germans) and 4 min 42 s (Poles). For the purposes of the present study, only the initial, middle and final sections, of one minute each, were annotated.

Data management and processing
The linguistic descriptions and analyses of the Borderland corpus were carried out using ELAN (gesture annotations; Wittenburg, Brugman, Russel, Klassmann, & Sloetjes, 2006) and Annotation Pro (orthographic transcriptions and speech segmentation; Klessa, Karpiński, & Wagner, 2013). Both tools were integrated within a common database system developed using a client-server architecture (Karpiński & Klessa, 2018) supporting consistent work management and annotation data access. Thanks to the interoperability of the tools, it is possible to import and export annotations from the format of one annotation tool to another and thus to analyse the combined effects of features from various domains within one multilayer environment.

Gesture annotation specification
Gesture annotation was based on a modified PAGE GAS scheme (Karpiński et al., 2015), described in more detail in Karpiński and Klessa (2018), and carried out in ELAN (Sloetjes & Wittenburg, 2008). Gestures were annotated for the left and the right hand of each participant independently, on separate tiers. The boundaries of Gesture Units and Gesture Phrases were tagged on respective tiers (GUnit and GPhrase) by experienced annotators. Further annotations were done by trained students and researchers, and revised by experienced annotators. For each GPhrase, its category was tagged (pragmatic vs. referential). Further, on GPhase (Gesture Phase) tiers, strokes were tagged as the obligatory and most meaningful gesture phases. Tags describing their features were arranged on separate tiers, hierarchically dependent on the GPhase tier. They included handshape, gesture location (in the gestural space), gesture size, and representation technique. Additionally, head movements were annotated for each dialogue participant.
Annotation was carried out by two trained annotators supervised by one of the authors. Gestures and head movements were annotated using silent footage so that no acoustic cues would influence the decisions of the annotators. The resulting annotations were scrutinised by an experienced researcher and doubtful cases were discussed.
A hierarchically organised ELAN template provided pop-up menus containing lists of available labels for respective annotation tiers. Together with relatively long annotator training, and support and control systems, these factors contributed to an increase in coherence among annotators, and a reduction in the number of accidental mistakes in annotations.
In order to check inter-rater agreement, kappa tests were conducted for selected tiers and stretches of dialogues, and some tags were adjusted afterwards. These samples constitute approximately 10% of all the data analysed.

Annotation consistency
Two annotators (A and B) performed the annotation of the material. In order to test the stability of the annotation scheme, annotator A returned to the material after six months, and annotated a randomly selected 10% of a sample that had originally been annotated by B. Cohen's Kappa coefficient was obtained using the EasyDIAG function implemented in ELAN (Holle & Rein, 2015). The global kappa value for all labels listed is 0.8, whereas the kappa value for Referential and Pragmatic gesture functions is above 0.8 (Good). The kappa value for HandShape is above 0.9 (Very Good) for OpenPalm and Fist (although there are only two Fist gestures), and good for OneFinger and ManyFingers (>0.7). The Representation Technique tier allows for the usage of four tag values and its kappa value is between 0.8 and 0.97. The most problematic labels are in GestureSpace, where there are as many as 14 of them, so the kappa values are between 0.1 and 1.0, but in most cases around 0.7, which is still a relatively good result. Overall, the results show that the annotation scheme is stable enough for a quantitative analysis, with the exception of some labels of Representation Technique and some labels of Gesture Space, due to the low number of segments representing those labels.
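For readers unfamiliar with the statistic, the following minimal sketch shows how Cohen's kappa is computed from two annotators' labels. It is not the EasyDIAG procedure, which additionally has to match time-aligned segments between annotators; the sketch assumes that the segments have already been paired one to one, and the function name and example labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: proportion of segments with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical gesture-function labels for ten paired gesture phrases.
annotator_a = ["referential"] * 5 + ["pragmatic"] * 5
annotator_b = ["referential"] * 4 + ["pragmatic"] * 6

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 2))  # 0.8
```

In this toy case the annotators agree on nine of ten segments (observed agreement 0.9) while chance agreement is 0.5, which gives a kappa of 0.8.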

Methods
The analyses conducted within the present study encompassed the following features of gestural behaviour: 1. gesture frequency; 2. gesture function (referential vs. pragmatic); 3. duration of gesture strokes; 4. features of strokes and their co-occurrence and re-occurrence.
The values of variables based on the measurements of these features were calculated and compared between the initial and the final stages of dialogues, between dialogue types (collaborative vs. competitive), and taking into account the nationality of the participants. The co-occurrence of gesture events and functions was understood as in the previous studies by the present authors referring to speech events (Karpiński et al., 2014) or interactions between gestures and speech (Czoska et al., 2015). Co-occurrence was inspected using the moving frame (or moving window) approach (see also: Fig. 2). The moving frame method was implemented as an Annotation Pro plugin (a C# script). The initial version of the plugin (SRMA) was tested and used beforehand to study the variability of speaking rates in two different corpora of task-oriented dialogues (Karpiński et al., 2014), as well as for the analysis of cross-modal interactions between interlocutors' hand gestures, gaze shifts and speaking rates (Czoska et al., 2015). Only data provided by adult speakers were used as study material for the preliminary analyses. The plugin enables the study of the rate of (co-)occurrence of any type of event represented by segments in time-aligned annotation layers. Segments may include not only transcriptions of linguistic and paralinguistic speech events or gesture labels, but also the results of measurements based on annotations, representing e.g., various rhythm metrics or pitch representations. Consequently, both local and global variability of the features in question can be tracked and analysed within and between the domains of prosody (primarily rhythm and pitch), gesture (hand gestures, gaze shift and head movement), and the lexical domain. Re-occurrence (mimicry) of gestures or their features is understood as the occurrence of a gesture or its feature in the behaviour of one of the participants within an arbitrarily defined time window, following the occurrence of a gesture of the same category, or of some of its features, in the gestural behaviour of the second participant.
In the present study, re-occurrence of gestures and gesture functions is measured by means of another Annotation Pro plugin (henceforth: the Re-Occurrence plugin), designed and implemented specifically for the purposes of the present project (Figure 3). The plugin counts the occurrences of an annotation label found in one annotation layer (e.g., the gesture annotation for Speaker 1) in another annotation layer (e.g., the gesture annotation for Speaker 2). The number of re-occurring segments is calculated within n segments appearing after the end boundary of the original segment. The number n can be defined by the user in the plugin code. The output of the plugin includes:
• the timestamps of both the original annotation segment for Speaker 1 and of each of the segments annotated with the same label that re-occur in the layer for Speaker 2;
• the durations of both the original and re-occurring annotation segment(s);
• the number of re-occurring segments.
Two ANOVAs were conducted based on the sum of gestures produced by each of the participants in each dialogue task, i.e. 40 data points. The first shows an effect of the task factor (F = 10.54, p = 0.0024), and the second shows an effect of language as well (F = 5.774, p = 0.0217). MANOVA (whose results should be taken with caution because of the scarcity of data) on both task and language shows an effect of language (F = 8.012, p = 0.0077) and task (F = 12.357, p = 0.0012), but no interaction between the factors (F = 0.379, p = 0.54).
MANOVA on stroke duration (separate data on each stroke annotated in the dialogues, 1194 data points) showed an effect of the language factor (F = 32.464, p < 0.001) but not of task (F = 2.976, p = 0.084) or of the interaction between the factors (F = 1.604, p = 0.2); see Fig. 6 for details. Removing outliers (defined as strokes longer than 2000 ms, ca. 5% of the data) did not bring significant changes to the results of MANOVA.
The proportion of referential and pragmatic gestures differs between the two tasks, and this tendency is present in both the Polish and the German dialogues (Fig. 8). In the Gift condition there were 322 pragmatic and 76 referential gestures, while in the Tower condition there were 169 pragmatic and 623 referential ones. The difference measured with a chi-squared test is significant (χ² = 387.789, p < 0.001). The result remains significant when only German dialogues are taken into account (χ²(DE) = 114.0172, p < 0.001) as well as for Polish data only (χ²(PL) = 361.152, p < 0.001).
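The chi-squared value reported for the pooled data can be checked directly from the four gesture counts given above. The sketch below computes Pearson's chi-squared statistic for the 2 x 2 table of task by gesture function; it assumes that no continuity correction was applied, which appears consistent with the reported figure. The function name is illustrative only.

```python
from typing import List


def pearson_chi_squared(table: List[List[int]]) -> float:
    """Pearson's chi-squared statistic for a contingency table (no correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of task and gesture function.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: Gift, Tower; columns: pragmatic, referential (counts from the text).
counts = [[322, 76],
          [169, 623]]

chi2 = pearson_chi_squared(counts)
print(round(chi2, 3))  # 387.789
```

With one degree of freedom, a statistic of this size corresponds to a p-value far below 0.001.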
Further analyses focused on the duration of original and re-occurring strokes of three sizes covered by the annotation scheme (1: small, 2: medium/regular, 3: large). Size and timing may belong to alignment-sensitive features as they simultaneously contribute to and are influenced by the overall rhythm of conversation, and can easily be observed by conversational partners. Using the Annotation Pro Re-Occurrence plugin described in Section 4, annotations were searched for strokes in one speaker that have the same size as a preceding stroke by the other speaker. Each stroke was checked for its size and duration, and each of the ten following strokes annotated for the other speaker was checked for an equal value of the size parameter. The matching strokes were, in turn, checked for duration. Additionally, the distance (lag) between the original segment and each of the re-occurring ones was measured.
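The search procedure described above can be sketched as follows. This is an illustration rather than the Re-Occurrence plugin itself: the `Stroke` class, the function name and the example data are hypothetical, and the lag is assumed here to be the time from the end of the original stroke to the onset of the re-occurring one, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stroke:
    start: float  # onset in seconds
    end: float    # offset in seconds
    size: int     # 1 = small, 2 = medium/regular, 3 = large


def find_reoccurrences(originals: List[Stroke], others: List[Stroke],
                       n: int = 10) -> List[Dict[str, float]]:
    """For each original stroke, inspect the next n strokes of the other
    speaker that begin after its end boundary and keep those of equal size."""
    ordered = sorted(others, key=lambda s: s.start)
    hits = []
    for orig in originals:
        following = [s for s in ordered if s.start >= orig.end][:n]
        for s in following:
            if s.size == orig.size:
                hits.append({
                    "orig_start": orig.start,
                    "re_start": s.start,
                    "orig_dur": orig.end - orig.start,
                    "re_dur": s.end - s.start,
                    "lag": s.start - orig.end,  # assumed definition of lag
                })
    return hits


# Hypothetical strokes for two speakers.
speaker_1 = [Stroke(0.0, 1.0, 2), Stroke(5.0, 5.5, 1)]
speaker_2 = [Stroke(1.5, 2.5, 2), Stroke(3.0, 3.5, 1), Stroke(6.0, 7.0, 1)]

hits = find_reoccurrences(speaker_1, speaker_2)
print(len(hits))  # 2
```

The same function could compare gesture function instead of size by swapping the attribute tested in the equality check.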
In Fig. 9, results are shown separately for the collaborative and competitive task. In both conditions, the mean stroke durations of the repeated strokes follow the pattern of the "original" (reference) ones. One exception is the case of Polish speakers in the collaborative task where large strokes in repetitions are longer than the original strokes. While duration may be, in principle, heavily influenced by gesture size, as larger gestures normally require more time to perform, one may notice that among German speakers in the competitive task (Fig. 9, bottom panel), as opposed to the collaborative task, the larger gestures (Size-2) are actually significantly shorter. In this task, the performance of Polish speakers is clearly different: the larger the strokes, the longer their average durations. The statistical significance of the differences was confirmed by the results of factorial ANOVA (F = 57.73, p < 0.0005 for the interaction of task and gesture size factors; F = 12.51, p < 0.0005 for the interaction of language and gesture size factors; and F = 52.87, p < 0.0005 for the interaction of all three factors).
A similar analysis was conducted with the same plugin for repeated gestures that have the same function, pragmatic or referential, as the original ones. As shown in Fig. 10, the original and repeated stroke durations for gestures of the same function are generally similar for the Polish speakers, while for Germans the repeated strokes of the same gesture functions are, on average, shorter than the original ones for each type of gesture and condition (dialogue type). Moreover,

Discussion and conclusions
In the present study, selected measures of gestural behaviour have been analysed in order to find observable correlates of gestural alignment between conversational partners. The analyses are based on a total of twenty task-oriented dialogues of two types (collaborative, referred to as Tower, vs. competitive, referred to as Gift) in two languages (German and Polish). The dialogues were recorded using a pair of camcorders and annotated for their gestural component in ELAN. A high coherence of annotation was achieved due to the design of the tagset and the process of annotation (including the training and controlling of annotators). Among other tiers, the time-aligned annotation used for the purposes of the study includes data on 1194 strokes (crucial phases of gestural phrases) and their properties. Selected annotation tiers were imported into Annotation Pro and processed using plugins designed specifically for co-occurrence and re-occurrence data extraction.
The Tower scenario seems to evoke more gestures, while greater differences in gesture usage intensity between the conversational partners are observed in the Gift scenario, although their roles were symmetrical in both scenarios. A significant difference in the mean number of gestures in the annotated parts of dialogues (minutes of interaction from the beginning, the middle and the end of each dialogue) is also found between German and Polish speakers. According to ANOVA results, the effects of both the task and the language factor are significant, but MANOVA shows no interaction between them. The proportion of referential and pragmatic gestures is reversed in the two conditions, with more pragmatic gestures in Gift (competitive) and more referential ones in Tower (collaborative). The difference is statistically significant (χ² = 387.789, p < 0.001) and the significance was preserved when Polish and German dialogues were analysed separately.
Differences between the languages have also been calculated. The sum of strokes performed by each speaker differs between the conditions for each language, with more gestures in the Polish dialogues and in the Tower condition. However, MANOVA shows no effect of interaction of the factors, which indicates that the tasks affected the number of gestures similarly in both languages. Stroke duration differs between the languages but not between tasks: the durations are more similar in the Tower condition than in Gift. MANOVA on stroke duration showed an effect of language but no effect of either task or interaction of the factors, independently of outliers (kept or removed).
In order to explore the co-occurrence of gestural events in the communicative behaviour of dialogue partners, a moving time frame approach based on a 30-second window was employed. The number of strokes per frame differs between languages but the difference between the conditions is even more striking. Co-occurrence analysis is based on a total of 173 data points. The correlation between speakers in the number of produced gestures differed significantly between the conditions, with a lower R² value in the competitive scenario, but not between the languages. In Polish speakers the correlation proved to be insignificant in both conditions, and the difference between the respective R² values was also not significant. In German speakers, the difference between correlation coefficients proved to be statistically significant.
Further exploration of alignment involved the analysis of re-occurrence of stroke parameters. For each stroke made by one speaker, the ten following strokes by the other were analysed for their size and function. In both cases, their durations were also measured. The mean stroke durations of the repeated strokes of three different size categories follow the pattern of the original ones with the exception of those produced by Polish speakers in the collaborative task. Among German speakers in the competitive condition, larger gestures are actually significantly shorter, in contrast to the collaborative condition. In this task, the performance of Polish speakers is clearly different: larger strokes are longer. It may also be noted that the German speakers did not produce any gestures categorised as large at all while, in some conditions, large gestures dominated in the gestural behaviour of Poles, which may be attributed to cultural differences (Müller, 1998). The durations of original and repeated strokes for gestures of the same function are similar for Poles, while for Germans the repeated strokes of the same gesture functions are, on average, shorter than the original ones. Polish speakers clearly show a higher mean duration of original and repeated strokes in referential gestures in the competitive dialogue.
The two tasks seem to evoke significantly different communicational behaviour. In the collaborative task (Tower) participants gesture more often and co-ordinate their gesticulation. Global measures for entire dialogues show that the number and duration of gestures are more similar in the collaborative task. Moreover, pairwise correlation between speakers is higher in the Tower task as well. Local measures also indicate better coordination between the speakers in this task. These results are even more striking because the Gift scenario was always recorded as the second one, with participants that had already been talking for some time and had become accustomed to each other. The competitive scenario evokes more pragmatic than referential gestures, which may result from compensating for the lack of coordination with explicit discourse management. In the collaborative scenario, pragmatic gestures were significantly less frequent, while communication was more fluent in terms of alignment. The differences in gesticulation by Germans and Poles (more gestures in Polish dialogues, longer strokes in German ones, larger gestures performed by Poles) may be due to cultural and linguistic differences. Even though some of them are strongly confirmed by statistics, it is difficult to exclude the impact of some other uncontrolled factors involved in the process of communication. All the aforementioned analyses show, however, that the difference between collaborative and competitive tasks affected both groups in the same direction, resulting in more gestures and the occurrence of gestural coordination in the Tower (collaborative) task. This outcome supports the initial hypothesis that in collaborative tasks dialogue participants will be more prone to align with each other.
Further research will include more detailed re-occurrence analyses, n-gram based gesture patterning analysis, as well as correlating results with other areas of communicative alignment currently explored in our data, including lexical and prosodic domains. Our results may contribute not only to the knowledge on the mechanisms of alignment, but also to a wider picture of their pragmatic anchoring, sources and consequences (e.g., Beňuš, Gravano, & Hirschberg, 2011; Gravano et al., 2011).
Figure 1: Configuration of annotation tiers in ELAN.

Figure 4: The number of gestures (strokes) performed by German (DE) and Polish (PL) speakers in two types of dialogues (Tower and Gift).

Figure 5: Mean stroke duration in gestures by German and Polish speakers in Gift and Tower tasks.

Figure 6: Mean stroke duration for German (DE) and Polish (PL) speakers calculated for language and task factors (left) and for language alone (right).

The numbers of strokes in moving time windows (frame width: 30 s) were calculated in Annotation Pro, which resulted in 426 data points. The mean number of strokes per window was M = 4.23, with some difference between the languages, M(DE) = 3.53 vs. M(PL) = 4.93, and between the conditions, M(Tower) = 5.94 vs. M(Gift) = 2.52. Values for individual speakers are presented in Fig. 7.

Figure 8: Referential and pragmatic gestures in German (DE) and Polish (PL) speakers in two dialogue tasks (Gift and Tower).

Figure 9: Mean duration of original and repeated strokes in the Tower (top) and Gift (bottom) condition.

Figure 10: Mean duration of original and repeated gestures depending on gesture function (pragmatic or referential), type of task (Tower vs. Gift) and the speakers' native language (DE: German, PL: Polish).