Classification of collaborative behaviors based on textual interactions

Abstract:
In a collaborative work environment, groups of students coordinated by a teacher can use the tools provided by the environment to carry out an assigned task. In this context, collaborative conflicts arise. If these conflicts are adequately detected and classified, they provide data that help the teacher to resolve them.
To this end, the environment must collect the relevant data on the participation and interaction of the students. For the teacher to intervene and resolve any conflicts that arise, these data must then be analyzed to detect and classify the conflicts. This analysis and classification work can take a significant amount of time and effort.
The purpose of this article is to present experimental results on classifying the free text of a chat into collaborative behaviors. Given a collaborative environment and a specific classification of conflicts, these results can be used to build a tool that helps to detect collaborative conflicts. Such a tool would significantly simplify the work of whoever conducts the analysis of interactions, allowing that person to focus on resolving the conflicts that arise.
(Adaptation of the work done with my workteam)

1-Introduction:
Currently, the large number and variety of platforms and resources focused on collaboration, participation and simplicity that exist within the framework of Web 2.0 (blogs, wikis, collaborative documents, video and audio sharing platforms, etc.) offer possibilities to enrich learning while posing new challenges for the learning and teaching processes. It has been found, for example, that classroom use of a wiki encourages collaborative learning among students [1]. Many internet-based collaborative activities have also been shown to facilitate teamwork [2] and to develop social skills and basic computer skills [3].
These tools also make it possible to record a large amount of data about the interactions between participants. These data can be analyzed both to detect and characterize the collaborative conflicts that arise while a task is carried out and to improve and shape the learning process itself, by developing new tools or improving existing ones. In addition to studies of how the different tools affect learning processes and student-teacher interaction, tools have been developed to improve these processes. One example is the development of intelligent assistants that operate within a platform given to students for collaborative work. Based on a work plan [4] or on group interactions [5], these assistants detect collaborative conflicts and either alert the teacher, who can intervene if necessary, or make recommendations to students so that they can take corrective action. An example of such platforms is Google Docs, the online collaborative tool for creating and editing text files that is used as the basis for this work.
In order to analyze the collaborative conflicts that can arise between students, a characterization of collaborative skills must first be available. One possible choice is IPA (Interaction Process Analysis) [6]. This method "is one of the most elaborate, best validated and most widely used since its appearance in 1950" [7].
Based on a tool that provides a space for collaborative work (Google Docs) and a model for detecting and classifying collaborative conflicts (IPA), this paper presents experimental results on classifying the free text of the platform's chat into the interaction analysis scheme proposed by the IPA model. The results can subsequently be used to build a tool that helps to detect collaborative conflicts and significantly simplifies the work of whoever carries out this stage of analysis.

2-Theoretical foundations and tools used:

2.1-Shared space for collaborative work
The collaborative work platform chosen to develop this work is Google Docs. As mentioned, it is an online tool that can be used by anyone who has access to the Internet [8] and allows working on a common task without the restrictions often imposed by traditional face-to-face contact [9; 10]. In addition to the basic functionality of a text editor, Google Docs allows multiple authors to work collaboratively on a document. This feature brings several advantages for the purposes of our work: real-time editing, comments and notes, a chat to facilitate communication, and so on.
Several studies have examined whether using this tool improves performance in collaborative activities. Zhou et al. [11] compared the performance on an assigned task of two groups, one using Google Docs and one not, and concluded that the tool is well received by students, who show a general tendency to adopt it once introduced. Another study [12] concluded that, when producing papers, students wrote longer essays and worked on collaborative writing more efficiently with Google Docs than with Microsoft Word. Brodahl et al. [13] analyze the characteristics of students using online writing applications and conclude that students with high digital competence and a positive attitude towards digital tools obtain more positive results.

2.2-Interaction analysis process
To detect possible collaborative conflicts, it must first be possible to identify which collaborative skills each student does or does not possess, since conflicts arise from the lack of these skills among the participating students. Analyzing the interactions with the IPA method proposed by Bales [6] allows this identification. The method codes group behaviors into two main categories, socio-emotional and task, which are subdivided into twelve types: six socio-emotional (C1: shows solidarity; C2: shows relaxation or moderation; C3: shows agreement or approval; C10: shows disagreement or disapproval; C11: shows tension or discomfort; C12: shows antagonism or aggressiveness) and six task-oriented (C4: gives suggestions or orientation; C5: gives opinions; C6: gives information; C7: asks for information; C8: asks for advice; C9: asks for suggestions or guidance). IPA thus provides an enumeration of the behaviors that can arise during collaborative activity and classifies each one by the type of reaction it represents (R1: positive, R2: answers, R3: questions, R4: negative) and by which of the two main categories it belongs to. Bales also identified a series of successive, typical phases through which any group carrying out a collaborative task passes, and established that collaboration problems manifest themselves as inappropriate amounts of the different types of interaction in each phase. He defined the ranges within which the amount of each type of interaction can be considered "appropriate".
Based on this classification and on the features that tools like Google Docs give the teacher for analyzing the participation of each student, the interactions can be mapped to the IPA categories in order to detect collaborative conflicts. This raises a new challenge: the participation data (work actions, suggestions, conversations, etc.) must be linked to the behaviors of the IPA model.

2.3-Interactions and categories of behavior
During the processing of interactions, two types of indicators are calculated: intragroup interaction indicators and individual contribution indicators (ICI). For the intragroup indicators, the IPA behaviors are taken into account: the number of interactions that the group exhibits in each of the twelve categories is counted, together with the associated percentage. To calculate the ICI, we count the number of interventions that each student made in each of the categories. In this way it is possible to evaluate the individual performance of each team member.
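The two kinds of indicators can be illustrated with a minimal Python sketch. It assumes that each interaction has already been labeled with an IPA category; the log format, the student names and the function names are hypothetical and are not taken from the original tooling.

```python
from collections import Counter, defaultdict

# Hypothetical labeled log: one (student, IPA category) pair per interaction.
interactions = [
    ("ana", "C4"), ("ana", "C5"), ("luis", "C3"),
    ("luis", "C10"), ("ana", "C4"), ("marta", "C7"),
]

CATEGORIES = [f"C{i}" for i in range(1, 13)]

def group_indicators(log):
    """Count and percentage of the group's interactions in each IPA category."""
    counts = Counter(category for _, category in log)
    total = len(log)
    return {
        category: (counts[category], 100.0 * counts[category] / total if total else 0.0)
        for category in CATEGORIES
    }

def individual_indicators(log):
    """Number of interventions of each student in each IPA category (ICI)."""
    per_student = defaultdict(Counter)
    for student, category in log:
        per_student[student][category] += 1
    return {
        student: {category: counts[category] for category in CATEGORIES}
        for student, counts in per_student.items()
    }

group = group_indicators(interactions)
ici = individual_indicators(interactions)
print(group["C4"])       # (count, percentage) of the group's C4 interactions
print(ici["ana"]["C4"])  # number of C4 interventions made by one student
```

Keeping the counts per category for both the group and each student makes it straightforward to compare them later against whatever ranges are considered "appropriate" for each phase.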
Processing the interactions involves classifying each interaction as an instance of a particular group behavior. Once a log has been processed, conflicts or disturbances in the collaborative dynamics of the student group can be recognized, so that corrective actions tailored to each student can be carried out. Regarding the mapping of interactions to IPA behaviors in particular, Costaguta & Amandi [5] propose recording the group's interactions in a format based on "opening sentences" related to the attributes of collaboration. The mapping is then performed in a one-to-one relationship with the behaviors. In that work, since students interact exclusively through these "opening sentences", they have a limited set of options for interacting with each other. Our work aims to overcome this limitation by allowing free interaction between the members of a group working on a common goal.
Starting from the characterization provided by the IPA method, a shared space for collaborative work, and a chat for interaction between the participants, the next section describes the experimental process carried out for the automatic classification of interactions.

3-Experimental results

This section is organized as follows. Section 3.1 details the set of data used to perform the experimental evaluation. Section 3.2 details the procedure for conducting the experiment. Finally, Section 3.3 shows the results obtained and an analysis of the results and their implications.

3.1-Data set
To carry out the experiments, data were collected from the group work of students of the Systems Engineering course of the National University of the Center of the Province of Bs. As., Argentina, during a 3rd-year curricular subject. A total of 82 students participated, divided into 17 groups of 5 or 6 members each, who had to collaboratively complete three practical assignments required to pass the subject. The data were obtained by monitoring and recording student interactions using Google Docs.
Once the three practical assignments were completed, the chats were analyzed and the "behavior" (see Section 2.2) most closely associated with each interaction, together with the context in which it was issued, was established manually. The resulting dataset has 5430 interactions. From this initial dataset, a second dataset was generated by pre-processing: invalid records were removed, stemming was applied and stopwords were eliminated. This reduced the number of interactions in the second dataset to 3634. In addition, two further datasets were generated in which some behaviors were grouped in order to predict the type of reaction and the type of collaborative conflict. For the third dataset, the behavior variable is grouped as follows: [C1, C2, C3] as "Positive", [C4, C5, C6] as "Response", [C7, C8, C9] as "Question", [C10, C11, C12] as "Negative". For the fourth dataset, the behavior variable is grouped according to the collaborative conflict affected: [C1, C2] as "Control", [C3, C4] as "Evaluation", [C5, C6] as "Communication", [C7, C12] as "Decision", [C8, C9] as "Tension reduction", [C10, C11] as "Reintegration".
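The two groupings can be written down as simple lookup tables. The following Python sketch shows how the third and fourth datasets could be derived from a behavior-labeled dataset. The dictionary names, the `regroup` helper and the sample rows are hypothetical; only the groupings themselves come from the description above.

```python
# Grouping by type of reaction (third dataset).
REACTION_BY_BEHAVIOR = {
    **dict.fromkeys(["C1", "C2", "C3"], "Positive"),
    **dict.fromkeys(["C4", "C5", "C6"], "Response"),
    **dict.fromkeys(["C7", "C8", "C9"], "Question"),
    **dict.fromkeys(["C10", "C11", "C12"], "Negative"),
}

# Grouping by affected collaborative conflict (fourth dataset).
CONFLICT_BY_BEHAVIOR = {
    **dict.fromkeys(["C1", "C2"], "Control"),
    **dict.fromkeys(["C3", "C4"], "Evaluation"),
    **dict.fromkeys(["C5", "C6"], "Communication"),
    **dict.fromkeys(["C7", "C12"], "Decision"),
    **dict.fromkeys(["C8", "C9"], "Tension reduction"),  # the tension-related conflict
    **dict.fromkeys(["C10", "C11"], "Reintegration"),
}

def regroup(dataset, mapping):
    """Replace each behavior label with its group label, keeping the text as is."""
    return [(text, mapping[behavior]) for text, behavior in dataset]

# Hypothetical rows of the behavior-labeled dataset: (chat text, behavior).
behavior_dataset = [
    ("good job everyone", "C1"),
    ("does anyone know the deadline?", "C7"),
    ("I do not agree with that section", "C10"),
]

reaction_dataset = regroup(behavior_dataset, REACTION_BY_BEHAVIOR)
conflict_dataset = regroup(behavior_dataset, CONFLICT_BY_BEHAVIOR)
print(reaction_dataset)
print(conflict_dataset)
```

Because both tables cover all twelve behaviors, the same labeled interactions can be reused for either prediction target without relabeling by hand.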

3.2-Process
The objective of this experiment is to find a model that automatically categorizes the students' interactions in order to speed up the IPA method, reducing the considerable time and human effort required when interactions are categorized by people trained in the subject. To achieve this objective, the following research questions were posed: (1) Which classification algorithm and which features yield the best classification results? (2) Is it possible to automate the detection of behavior directly? (3) What improvements does an alternative offer with an intermediate level of abstraction where the categories are grouped?
To answer these questions, we ran each algorithm on each of the datasets using the WEKA tool. We sought the configuration with the best results, for later implementation in an automatic tool that assists teachers, students, multi-agent systems and people working with the IPA method in the categorization of interactions.
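The experiments themselves were run in WEKA. Purely as an illustration of the evaluation scheme, the following Python sketch shows k-fold cross validation, the scheme reported in Section 3.3, applied to any classifier that offers `fit` and `predict`. The `MajorityClass` baseline, the function names and the toy data are hypothetical stand-ins and are not the algorithms or data used in the study.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

class MajorityClass:
    """Placeholder classifier: always predicts the most frequent training label."""

    def fit(self, texts, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, text):
        return self.label

def cross_val_accuracy(make_classifier, texts, labels, k=10, seed=0):
    """Fraction of instances classified correctly when each fold is held out once."""
    correct = 0
    for held_out in k_fold_indices(len(texts), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(texts)) if i not in held_out_set]
        classifier = make_classifier().fit(
            [texts[i] for i in train], [labels[i] for i in train]
        )
        correct += sum(classifier.predict(texts[i]) == labels[i] for i in held_out)
    return correct / len(texts)

# Toy data: 20 messages, 14 labeled "Response" and 6 labeled "Question".
texts = [f"message {i}" for i in range(20)]
labels = ["Response"] * 14 + ["Question"] * 6
accuracy = cross_val_accuracy(MajorityClass, texts, labels, k=10)
print(f"baseline accuracy: {accuracy:.2%}")
```

A majority-class baseline of this kind is a useful reference point when reading accuracy figures over imbalanced categories, since any real classifier should at least exceed it.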

3.3-Results
First, we evaluated the influence of pre-processing the data, that is, filtering stopwords and reducing words to their stems. Different classification algorithms were tested using 10-fold cross validation on the training dataset. Table 1 shows the accuracy obtained for each algorithm.

Table 1. Accuracy of different classifiers over datasets.

[Table image: steem tabla.png]

As can be seen in Table 1, all classifiers improved with the pre-processed dataset. For the non-preprocessed dataset, the SMO implementation of Support Vector Machines achieves an accuracy of 28.15% of correctly classified instances (over 12 classes), while the best results were obtained with the "Naive Bayes Multinomial Updateable" technique on the pre-processed dataset.
Secondly, we worked with the third dataset, in which the categories were grouped by type of reaction, because similar terms were observed in the interactions belonging to the categories within each type of reaction. With this third dataset, the accuracy of the classifiers improved significantly. The "Naive Bayes Multinomial Updateable" technique achieves the best result, with 55.42% of correct predictions. These data support the observation that motivated the grouping, and suggest that in this domain better results are obtained with pre-processing and grouping by type of reaction.
Finally, we worked with the fourth dataset to examine how grouping the categories by collaborative conflict affects the predictions of the classifiers. With this fourth dataset the results improved with respect to the first experiment. However, these improvements do not exceed the accuracy reached by the classifiers in the second experiment. The "Naive Bayes Multinomial Updateable" technique again achieves the best result, with 42.84% of correct predictions.

These experiments suggest then that better results will be obtained for this domain with the execution of a preprocessing and grouping according to the type of reaction. We can then answer the questions posed at the beginning of this section:

(1) The best results were obtained by combining the "Naive Bayes Multinomial Updateable" technique with the dataset that groups the behavior categories by type of reaction, achieving a classifier with an accuracy of 55.42%.
(2) The results suggest that it is not possible to achieve complete automation of behavior detection, since the best results achieved an accuracy of 35.27%. However, it is possible to suggest the most likely categories, thus reducing the burden on the person in charge of classifying the interactions. In addition, since Naive Bayes Multinomial Updateable is an incremental algorithm, the correct category selected from the suggested ones can be fed back into the model to increase its classification power.
(3) Grouping the categories offers higher levels of prediction, but not reliable enough to fully automate the classification process.
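The idea in the second answer, suggesting the most likely categories and feeding the chosen one back into the model, can be sketched with a small incremental multinomial Naive Bayes written in plain Python. This is not WEKA's implementation: the class name, its methods, the Laplace smoothing parameter and the sample tokens are assumptions made only for illustration.

```python
import math
from collections import Counter, defaultdict

class IncrementalMultinomialNB:
    """Multinomial Naive Bayes with Laplace smoothing, updatable one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()            # examples seen per class
        self.word_counts = defaultdict(Counter)  # token frequencies per class
        self.vocabulary = set()

    def update(self, tokens, label):
        """Fold one labeled interaction into the model (incremental training)."""
        self.class_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def _log_score(self, tokens, label):
        """Log prior of the class plus smoothed log likelihood of each token."""
        total_examples = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_examples)
        denominator = (
            sum(self.word_counts[label].values()) + self.alpha * len(self.vocabulary)
        )
        for token in tokens:
            score += math.log((self.word_counts[label][token] + self.alpha) / denominator)
        return score

    def suggest(self, tokens, k=3):
        """Return the k most likely labels, best first."""
        ranked = sorted(
            self.class_counts,
            key=lambda label: self._log_score(tokens, label),
            reverse=True,
        )
        return ranked[:k]

# Hypothetical pre-processed Spanish chat tokens with reaction-type labels.
model = IncrementalMultinomialNB()
model.update(["gracias", "buen", "trabajo"], "Positive")
model.update(["creo", "falta", "conclusion"], "Response")
model.update(["alguien", "sabe", "fecha"], "Question")
model.update(["no", "estoy", "acuerdo"], "Negative")

new_tokens = ["alguien", "sabe", "donde"]
suggestions = model.suggest(new_tokens, k=2)
print(suggestions)

# The person classifying picks the correct label among the suggestions,
# and that choice is fed back into the model.
model.update(new_tokens, "Question")
```

Ranking the categories instead of returning a single prediction fits the semi-automatic use described above: the person still decides, but only among a few candidates, and each decision becomes a new training example.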

4-Conclusions

In this post we present experimental results on classifying free text obtained from a chat into behaviors, reactions and conflicts. The accuracy values obtained show that, in this domain, working only with the lexical structure of the interactions is not enough to recognize the behaviors of a group of students working collaboratively.
The findings of our study may be used in future work as evidence of the need to complement the interactions with a semantic analysis of the text. The performance of the classifiers was better when working with a grouping of behaviors by type of reaction. We believe that this work makes an important contribution to the area of interaction analysis, because the literature contains many studies on the English language, but few that work with the Spanish language and with the classification of the behaviors established in IPA.
As future work, we will look for ways of grouping different lines of the chat that express the same idea raised by one individual. We also plan to incorporate other factors into the analysis that may improve the results, such as enriching the dataset with a semantic analysis of the interactions. Finally, new datasets will be collected, with different groups, to replicate the study and corroborate the results of this experience.

5-References

[1] Lamb, B. Wide open spaces: Wikis, ready or not. EDUCAUSE Review, 39, 36-49 (2004).
[2] Wood, S., Bragg, S. C., Mahler, P. H., & Blair, R. M. Beyond Crossroads: Implementing Mathematics Standards in the First Two Years of College. American Mathematical Association of Two-Year Colleges (2006).
[3] Bottge, B. A., Rueda, E., Kwon, J. M., Grant, T., & LaRoque, P. Assessing and tracking students' problem solving performances in anchored learning environments. Educational Technology Research and Development, 57(4), 529-552 (2009).
[4] Casamayor, A., Amandi, A., & Campo, M. Intelligent assistance for teachers in collaborative e-learning environments. Computers & Education, 53(4), 1147-1154 (2009).
[5] Costaguta, R., Garcia, P., & Amandi, A. Using Agents for Training Students Collaborative Skills. Latin America Transactions, IEEE (Revista IEEE America Latina), 9(7), 1118-1124 (2011).
[6] Bales, R. F. Interaction process analysis; a method for the study of small groups (1950).
[7] Costaguta, R. Entrenamiento de habilidades colaborativas. Facultad de Ciencias Exactas, Departamento de Computación y Sistemas, Universidad Nacional del Centro de la Pcia. de Bs.As. (2008).
[8] Oishi, L. Working Together: Google Apps Goes to School. Technology & Learning, 27(9), 46 (2007).
[9] Conner, N. Google Apps: The Missing Manual. O'Reilly Media, Inc. (2008).
[10] Holliman, R., & Scanlon, E. Investigating cooperation and collaboration in near synchronous computer mediated conferences. Computers & Education, 46(3), 322-335 (2006).
[11] Zhou, W., Simpson, E., & Domizi, D. P. Google Docs in an Out-of-Class Collaborative Writing Activity. International Journal of Teaching and Learning in Higher Education, 24(3), 359-375 (2012).
[12] Apple, K. J., Reis-Bergan, M., Adams, A. H., & Saunders, G. Online tools to promote student collaboration. Getting connected: Best practices for technology enhanced teaching and learning in high education, 239-252 (2011).
[13] Brodahl, C., Hadjerrouit, S., & Hansen, N. K. Collaborative writing with Web 2.0 technologies: education students' perceptions (2011).