Peer interaction in the teaching of mathematics: explanation and the coordination of knowledge

It has been argued that children’s performance in mathematics will be boosted by collaborative activities where they explain to peers how to solve problems. Empirical investigations have generally been supportive, although they identify dangers of separating the activity of problem solving from more strategic understanding of how problems should be approached. Background research has suggested that one route to avoiding the dangers might involve joint problem solving followed by a synthesis into teaching principles to be practised subsequently in dyadic tutorial sessions. The paper reports a study which attempted to test the suggestion with 135 nine- to ten-year-old children. Roughly half of the sample was taken through six cycles of teaching with the suggested format; the other half worked individually with the same problems as the «experimental» children. Change from a pre-test prior to the cycles to post-tests one and four weeks afterwards indicated that the experimental intervention had resulted in more integrated performance, with problem solving and strategic understanding becoming more closely co-ordinated. Furthermore, this was achieved in a context of generally superior performance by the experimental children.


Introduction
There can be little doubt that for many children mathematics comprises two, largely unconnected systems. There is the formal system which they are exposed to at school, but there is also an informal system which they deploy in everyday life, including in certain cultures for commercial transactions (de Abreu, Bishop and Pompeu, 1997; Nunes, Schliemann and Carraher, 1993). It can be assumed that the informal system has its roots in skills which children acquire before they start school, skills which typically encompass counting (Gelman and Gallistel, 1978; Gelman and Meck, 1983), addition and subtraction (Hughes, 1981; Fuson and Hall, 1983; Smith, 1994), and the «sharing» which many see as the basis of division (Desforges and Desforges, 1980; Frydman and Bryant, 1988). From the educational point of view, the separation between the formal and informal systems has to be regretted. Although the informal system lacks the «uni- [...] not in everyday life cars which are neither blue nor red are implicitly excluded. Wistedt by contrast focused on a race where the faster runner was made to start behind the slower, and the problem was to determine (given certain other parameters) who would win. The context of a race led children to consider mathematically irrelevant information like fatigue, disqualification and anabolic steroids.
However, in addition to confirming the insufficiency of real-life problems, Forman and Wistedt's emphasis on semantics also provides clues as to what else might be needed. In particular, it suggests that real-life problems should be treated as contexts for the articulation and resolution of differences between everyday and formal meanings, a suggestion which Noss (1997) may have had in mind when he wrote «From a pedagogical point of view, the construction of environments in which children (and adults) play with mathematical structures (...) is attractive; but the extent to which interaction with such environments results in mathematical learning depends critically on the relationship between the environment and the ways in which objects and relationships within the environment are combined, spoken about and expressed». Considering how speaking about and expressing should proceed in practice, explanation of the formal meanings by teachers would appear to be crucial. Nevertheless, this too is unlikely to be enough. The issue of translation between informal and formal mathematics exemplifies what Vygotsky (1978) had in mind when he discussed the integration of «everyday» and «scientific» concepts, and for Vygotsky integration depended critically on active contributions from children. The implication is that children should supplement teacher instruction by explaining for themselves, a point also recognised by Chi, De Leeuw, Chiu and LaVancher (1994). Interestingly, although explanation by children is not acknowledged in the aforementioned National Curriculum, it is referred to in Mathematics 5-14 (Scottish Office Education Department, 1993), often seen as the National Curriculum's Scottish equivalent. Mathematics 5-14 states that children should be required «to think about what they are doing, to question and to explain».
How then should explanation by children be fostered? There would appear to be little advantage in encouraging children to explain back to teachers. As Piaget (1932) appears to have anticipated, the presumed expertise of teachers would make the exercise extremely artificial. By the same token though, there are attractions in having children explain to each other, and certainly the idea of explanation between peers is becoming popular in the mathematics education literature (Cobb et al., 1991; de Abreu et al., 1997; Forman, 1992). Indeed, Forman makes the further point that children appreciate the significance of offering explanations, perceiving themselves in effect as intellectual resources. Nevertheless, while such arguments may seem persuasive, convincing empirical evidence remains rather patchy. It is true that there are studies relating to mathematics which explore the association between explanation during peer interaction and subsequent individual performance (Webb, 1989). More often than not, the association is positive. The trouble is that the studies seldom take the integration of formal and informal systems as their starting point, and hence seldom follow a logic which leads to real-life problems, teacher explanation and peer interaction. As a result, they rarely deploy the control groups needed to show that peer interaction was adding something to the other components.
There is, however, one exception, a study by Davenport and Howe (1999). In this study, ten-year-olds were taken through «experimental» cycles of introductory lessons from teachers, problem solving exercises in groups using real-life material, and tutorial sessions in dyads where partners «taught» each other how to solve problems previously experienced in groups. Further ten-year-olds experienced «control» cycles of introductory lessons from teachers and standard, individual work on the real-life problems being used collaboratively in the experimental classes. Progress from individual pre-tests prior to the intervention cycles to individual post-tests shortly afterwards was greater amongst the experimental children than the control, and explanation was implicated positively. Nevertheless, uncertainties remained. In particular, when progress was measured against problem solving success, it was the frequency with which explanations were given that appeared to be helpful; when it was measured against strategies deployed, the crucial factor seemed to be the frequency with which explanations were received. Since (unsurprisingly) the two frequencies were negatively correlated, there is a suggestion that the children were obtaining partial benefits from peer interaction, receiving support for problem solving or strategic understanding but not for both as a function of their interactive roles. This carries the paradoxical (and worrying) implication that even if peer interaction does assist in the integration of formal and informal knowledge, it may by the same token inhibit coordination within the formal system. Recognising this, Davenport and Howe discuss the result at length. They note that although it may be an inevitable consequence of peer interaction in the context of mathematics, it is more likely specific to the form that interaction took in the course of their study.
They note Bargh and Schul's (1980) claim that preparing to teach another person produces highly organised cognitive structures, by invoking informal knowledge bases and triggering subject matter elaboration. This led them to propose that had their group sessions ended with explicit preparation for the teaching to take place in the dyads (rather than, as was the case, moving from one to the other without notice), the children may have been obliged to bring their strategic and problem solving skills into closer alignment, perhaps to the benefit of both.
Davenport and Howe's proposal is consistent with the view expressed subsequently by Vergnaud (1997) that «concepts do not derive purely from empirical regularities, [but] also derive from questions about the reasons for such regularities». Moreover, it squares also with evidence provided by Schubauer-Leoni and Perret-Clermont (1980, but see also 1997), from a study in which children planned collaboratively how to present mathematical problems to their peers. Such collaborative planning was extremely valuable, particularly when, as would be the case with Davenport and Howe's dyadic sessions, it was followed by genuine presentations with feedback from peers. However, the emphasis of Schubauer-Leoni and Perret-Clermont's work was problem solving, rather than the co-ordination of problem solving with strategic growth which is of interest here. Thus, further research is required, and this is what led to the study that is reported below. In essence, the study was an attempt to test Davenport and Howe's proposal by establishing whether group problem solving in mathematics which ends with a synthesis into teaching principles to be practised subsequently in dyads leads to co-ordinated progress in strategic knowledge and problem solving. Moreover, the study was also concerned with whether such effects could be achieved without sacrificing the apparent «added value» of peer interaction over real-life problems and teacher explanation. Thus, following the logic of the preceding paragraphs, the study was intended to supply one piece in a jigsaw whose final outcome may be the integration of informal and formal knowledge in the teaching of mathematics.

Method

Design
The study followed a pre-test - intervention - post-test design, with progress being defined with reference to pre- to post-test change. The pre-test was in two stages, with both stages involving children working individually. The first stage involved the solution in a paper-and-pencil test of a series of mathematical problems and was intended to tap problem solving. The second stage involved the explanation by children in face-to-face interviews of how the problems were approached, and was expected to provide information about both problem solving and strategic understanding. The extent to which interview explanations can tap strategic phenomena is of course thorny and long-debated. Suffice it to say that school mathematics, which is the focus of the research, is primarily concerned with explicitly articulated strategies, and Davenport and Howe (1999) show that strategies in this sense can be ascertained via the present procedures. The intervention took two forms, experimental and control. The experimental intervention involved six cycles of introductory lessons followed by problem solving in groups followed by tutorial sessions in dyads. Group and dyad activity was videotaped, to provide information about the dialogue features associated with progress. Given the results of the Davenport and Howe (1999) study, dialogue information was expected to prove crucial in exposing the extent of co-ordination between problem solving and strategic understanding. The control intervention involved six cycles of introductory lessons followed by individual work on the problems that were being used collaboratively in the experimental intervention. There were two post-tests, one during the week following the intervention and the other one month later. Both post-tests followed the two-stage format that was used in the pre-test, but deployed new problems.

Subjects
The study was conducted with children from four schools in the East End of Glasgow. Two of the schools (Schools A and B) were Catholic and two (Schools C and D) were non-denominational, but all served the same, extremely deprived catchment area. All of the children were from Primary Six classes, and were nine or ten years of age. This age group was selected on practical grounds: the children should have the reading skills required for the paper-and-pencil problems and group/dyad instructions, but would not be under the heavy curricular pressures associated with Primary Seven, the final year of primary schooling. Each school supplied two Primary Six classes, meaning that eight classes in total were involved.

Materials
Pre- and post-tests: The materials for the first stage of the pre- and post-tests comprised booklets containing addition and subtraction problems with «real-life» content. Addition and subtraction were utilised partly because they are central to mathematics teaching and probably perceived as such by children, and partly because they are, as noted earlier, precociously-acquired elements of the informal knowledge base. Some problems (referred to as «single-component») involved one calculation, e.g. Elizabeth had a long skipping rope measuring 14m 75cm. She cut a piece off measuring 6m 36cm to give to her friend. What is the length of her skipping rope now? Other problems (referred to as «multi-component») involved several calculations, e.g. The twins have been saving up their pocket money. David has £3.62 and James has £4.75. How much more money does James have than David? Later that same day, David was given an extra 85p from his aunt. How much has David saved now? James wants to buy a sports top costing £5.99. He only has £4.75. How much more money does he need to save up? The problems were chosen so that the pre-test and the two post-tests: a) achieved an equivalent balance between single- and multi-component problems; b) contained 20 components in total. The problems covered money, time, weight and distance, these being the topics which the routine Primary Six textbook was also covering.
In addition to topic, the problems differed in computational structure. A number of authors have shown that structure has relevance to both problem difficulty and solution strategy, and given the study's aims this indicated that structure should be considered. Although the authors show broad agreement over the crucial features of structure, they differ over detail (see Verschaffel and De Corte, 1997), and after some reflection the study followed the approach taken by Carpenter and Moser (1982). This involved including problems with the following six structures: a) separating, e.g. the «Elizabeth» problem above; b) comparison, e.g. the first «David and James» component above; c) joining, e.g. the second «David and James» component; d) equalising-take-away, e.g. the third «David and James» component; e) part-part-whole, e.g. Bobby got a new watch and decided to time how long it took him to get to school. He waited for the bus for 7 minutes 23 seconds. The ride lasted 15 minutes 39 seconds. How long did his journey take altogether? f) equalising-add-on, e.g. Mum set the microwave timer for 3 minutes 25 seconds to cook a pizza. This was less than the time on the packet, so the pizza was not properly cooked. Mum set the timer for another 1 minute 50 seconds, which made the cooking time the same as on the packet. How long did it say on the packet to cook the pizza? The sets of problems put together for the pre-test and two post-tests were selected from a pool of 250 items which covered all six structures. A pilot study with 60 Primary Six children from neighbouring schools allowed the components of these items to be classified as «easy» (solved by the majority of children regardless of (teacher-designated) ability), «moderate» (solved by the majority of high and average ability children but a minority of low ability children) and «difficult» (solved by the majority of high ability children only).
The pre- and post-tests were structured to contain five easy components, ten moderate components, and five difficult components, but apart from this the selection of problems was random. Despite the randomness, each set contained at least two instances of each topic and of each computational structure.
The materials for the second stage of the pre- and post-tests comprised interview schedules. The schedules started with a standard introduction to be presented to the children, that two of the problems would be read to them and they would be asked to explain how they solved each problem in turn. The schedules continued with a series of questions and associated prompts: a) What kind of sum did you do? (Prompts: Let's look at the sum/answer and see if you can remember); b) Why did you do an adding/taking away sum? (Prompts: Can you remember how/why you did that? How were you sure you'd get the right answer? What gave you a clue? What was it that told you to do an adding/taking away sum?); c) Imagine the person sitting next to you doesn't understand what type of sum to do. How would you explain to them why it was an adding/taking away sum? d) I see you've done a sum here/There's only an answer here - did you do it in your head? Can you talk through what you did? (Prompts: How did you add/take away? Did you use your fingers/a rule/know the answer? Which side did you start with? Which number went on top? Why?).
Experimental and control intervention: The materials for the experimental and control interventions included problem sheets, each sheet containing one multi-component problem. These problems were selected from the same pool as the pre- and post-test items, and hence varied in topic area and structure. The problems were printed on white sheets for individual work in both interventions, and for dyadic work in the experimental intervention. For the experimental intervention only, they were also printed on yellow sheets as «group decision problems».
In addition, the materials for the experimental intervention included instruction sheets intended to structure the group/dyad activity by indicating when to read, think, talk, write etc. Sample instruction sheets are presented in the Appendix. As can be seen, the group instructions had the children first working on their own to solve the problem, and then sharing answers and calculations. This is because work by Howe, Rodgers and Tolmie (1990) has shown that having children commit themselves individually in writing is a powerful stimulus to participation in subsequent group discussions. If there was agreement over the individual solutions, the children were instructed to transfer the calculations and answers to a group decision sheet. If there was disagreement, they were instructed to decide between themselves what the correct answers should be before transferring to a sheet. Once they had transferred, the children were told to check the answer card and see if they were correct. If incorrect, they had to identify through discussion where they had gone wrong and attempt to remedy this.
The dyadic sessions were planned such that one child, designated the «teacher», would instruct the other child, designated the «pupil», in how to solve the problem which the «teacher» had previously covered in the groups. Then the teacher-pupil roles would be switched with the new «teacher» working with the (different) problem which he/she had covered in the groups. The dyad instruction sheet was to be given to the «teacher», and directed him/her to have the «pupil» read the problem and then to ask the «pupil» what calculations had to be done and what numbers had to be used. The «teacher» was then to correct any mistakes and check if the «pupil» understood how to solve the problem. Next, the «pupil» was to work out the solution, with the «teacher» giving feedback on accuracy.
Although not intrinsic to the group/pair activity, the materials for the experimental intervention also included «perceptions sheets», to be administered at the end of the sixth cycle. These sheets contained four multiple choice questions: a) How much did you enjoy solving problems with other children? (Choices: More than/not as much as/the same as your usual number work); b) Which did you enjoy most out of the groups and the pairs? (Choices: Groups better than pairs/pairs better than groups/both the same); c) How much do you think you have learned from solving problems with other children? (Choices: More than/not as much as/the same as with your usual number work); d) Which did you learn most from out of the groups and the pairs? (Choices: Groups more than pairs/pairs more than groups/both the same).

Procedure
Pre-test: The first stage of the pre-test was administered on a whole-class basis to each of the eight classes in turn, with 135 children in total completing the stage. Two researchers were involved, one working with the classes in Schools A and C and the other with the classes in Schools B and D. The researchers began by introducing themselves and giving out the booklets. The children were asked to work on their own through the booklets, solving as many problems as possible and writing down how they had reached their answers. They were told that if they had calculated the answers «in their heads», this was perfectly acceptable but they should also write down how this was carried out. In addition, the children were told that if they were really stuck and could not work out an answer, they could miss the problem out and go on to the next one. It was re-iterated that they were to work on their own, asking help from no-one, not even the researchers.
The second stage began as soon as the first one had finished, and continued over the following day. The children went one-by-one to a quiet room where they were met by the researcher assigned to their school, and invited to sit down at a table. The researcher chatted to the children to put them at ease and guided the conversation round to the interview material by asking what they thought of the problems and how they felt they had done. After these preliminaries, the researcher worked through the interview schedules, taking a pair of multi-component (in practice, two-component) problems. The problems to be discussed with each child had been decided in advance, to allow each problem to be covered the same number of times within and across classes. While all interviews were audiorecorded, responses were noted in case of tape failure. Due to the absence of one child, 134 children completed the second stage of the pre-test.
Assignment for intervention: Following the procedures of Davenport and Howe (1999), the experimental intervention was planned on the assumption that the children would conduct the group activity in foursomes. Bearing this in mind, the class in each school whose size at pre-test provided the closest approximation to a multiple of four was assigned to the experimental intervention. The other class became the control. In fact, even with two classes to choose from, it was not always possible to find a multiple of four, and it was necessary therefore to have some experimental children work in groups of five. In particular, 13 groups of four and four groups of five were formed. Thus, 72 children were to be involved in the experimental intervention, as opposed to 63 in the control. Assignment to groups was the responsibility of a researcher who was not involved in the interventions themselves.
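The arithmetic of this allocation can be illustrated with a short sketch. The function name and the rule of using as few groups of five as possible are assumptions made for illustration; the paper reports only the resulting totals, and the class size used in the example is hypothetical.

```python
def split_into_groups(n_children):
    """Return (groups_of_four, groups_of_five) for a class of n_children.

    Assumed rule: prefer foursomes and use as few groups of five as possible.
    Each group of five absorbs one child left over after dividing by four.
    """
    fives = n_children % 4
    fours = (n_children - 5 * fives) // 4
    if fours < 0:
        raise ValueError(f"{n_children} children cannot be split into fours and fives")
    return fours, fives


# Hypothetical class of 18 children: two foursomes and two groups of five.
example_split = split_into_groups(18)

# Totals reported in the study: 13 groups of four plus four groups of five,
# and 72 experimental plus 63 control children making up the pre-test sample.
experimental_total = 13 * 4 + 4 * 5
sample_total = experimental_total + 63
```

With the reported figures, `experimental_total` is 72 and `sample_total` is 135, matching the number of children who completed the first stage of the pre-test.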
Grouping for the experimental intervention made reference to gender and ability, the latter defined by the number of problem components solved correctly during the first stage of the pre-test. There is evidence (Howe, 1997) that groups which are balanced in terms of gender composition are more effective than groups which are asymmetric, and thus all groups of four contained two boys and two girls. Two groups of five contained two boys and three girls, and the other two contained three boys and two girls. There is in addition a suggestion (Webb, 1989) of complex interactions between ability category of child (low; average; high), ability range of groups (narrow = low + average or high + average; wide = low (+ average) + high), gender and learning. Thus, recognising that there were five easy components, ten moderate components and five difficult components in the first stage of the pre-test, the children were designated «low ability» if they scored 0 to 5, «average ability» if they scored 6 to 15 and «high ability» if they scored 16 to 20, a set of distinctions which showed a close approximation to teacher nominations (χ² = 53.57, df = 4, p<0.001). Although it was beyond the scope of the study to include all the combinations of ability category and ability range that Webb's work implies, the groups were set up to give equal numbers of: a) narrow and wide range arrangements; b) boys higher ability than girls, girls higher ability than boys, and boys and girls equivalent.
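The ability designations and the narrow/wide distinction can be expressed as a minimal sketch. The cut-offs are those stated above; the function names are illustrative, and the treatment of a group containing only one ability category as «narrow» is an assumption, since the text does not cover that case.

```python
def ability_category(pretest_score):
    """Map a first-stage pre-test score (0 to 20 components correct) to a category."""
    if not 0 <= pretest_score <= 20:
        raise ValueError("score must lie between 0 and 20")
    if pretest_score <= 5:
        return "low"
    if pretest_score <= 15:
        return "average"
    return "high"


def ability_range(categories):
    """Classify a group as 'wide' if it holds both low and high ability children.

    Any other mix is treated as 'narrow' (assumption for single-category groups).
    """
    present = set(categories)
    return "wide" if {"low", "high"} <= present else "narrow"


# Hypothetical foursome, described by pre-test scores.
scores = [4, 9, 12, 17]
group_categories = [ability_category(s) for s in scores]
group_range = ability_range(group_categories)
```

For the hypothetical foursome, the categories are low, average, average and high, so the group counts as a wide range arrangement.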
Experimental and control interventions: The interventions were implemented by the researchers who had administered the pre-tests but with a switch of schools. In other words, the researcher who had pre-tested in Schools A and C ran the interventions in Schools B and D, and vice versa. Since neither researcher had been involved in group assignment, the interventions were therefore implemented in conditions of «blindness» over pre-test score/ability category. In both the experimental and control interventions, the first five of the six cycles began with introductory lessons to the whole class. Although these lessons were presented by the researchers, they were included in recognition of the point acknowledged earlier, that no matter what the merits of peer interaction, teachers also have a role to play in the explanation of meaning. The lessons covered in sequence: a) the identification of good stopping places in the text for checking that the problem is understood; b) the identification of clue words for deciding what kind of sum to do; c) the identification of what kind of answer is required; d) the need to check work carefully; e) the fact that there are different ways of obtaining correct answers.
After the introductory lesson, the experimental classes broke into groups to follow their instruction sheets through one multi-component problem, their activity being videorecorded throughout. During any one cycle, half of the groups worked on one problem and half on a different problem. During the days when the experimental classes were engaged with group work, the control classes were working individually on one of the two problems. In both cases, the researcher remained in the room to offer assistance if called upon by the children. On conclusion of each group session, the experimental children were helped to prepare for their teaching role in the dyadic sessions which were to follow later that week. Preparation involved the researcher telling/reminding the children that they would have to act as teachers, and asking them if they were certain how to solve the problems. They were asked as a group what type of sum to do and why, and what numbers to use in the calculation. Then one group member was asked to run through the explanation individually.
For purposes of the dyadic sessions, each child was placed with a classmate who had worked on a different group problem and who had therefore been in a different group. Apart from this, assignment to dyads was at random and varied from session to session. The dyads followed the instruction sheets through twice, with the children taking turns to teach their partner how to solve the problem which they had covered within their groups. Their activity was videotaped throughout. The control children by contrast worked individually on the group/dyad problem which they had not previously seen, meaning that by the end of each cycle all of the participating children had attempted the same two problems (with the same 12 problems attempted therefore over the duration of the interventions). One week was allowed for each cycle, and so the interventions spanned six weeks in total. Once the experimental children had completed the sixth cycle, they were given the perceptions sheets for completion. During any single cycle, each experimental child was occupied for two 40-minute teaching periods, one for the introductory lesson, group session and preparation for teaching, and the other for the dyad.
Post-test: As noted previously, the first post-test took place one week after the intervention, and the second post-test one month after the first one. For the first post-test, 66 experimental children were available for the first stage, and 65 for the second. Fifty-three control children were available throughout. For the second post-test, 62 experimental children were available for the first stage, and 59 for the second stage. Fifty-seven control children were available throughout. Post-test procedure was identical to pre-test, and every child was post-tested by the researcher who had administered their pre-test. Since there had been a switch of researchers for the interventions, this meant that post-testing was blind to the children's status as experimental or control.

Coding
Pre- and post-test coding: Coding the data obtained during the first stage of the pre- and post-tests was straightforward: a count was made of the number of problem components out of 20 which each child solved correctly. As noted already, this measure was applied to the pre-test data prior to grouping, and was considered in group assignment. It was also applied to the data from the two post-tests, and thus three totals were available for each child. These totals were referred to as «pre-test score», «post-test 1 score» and «post-test 2 score». By deducting pre-test score from the two post-test scores, two indices of change were also obtained, «pre- to post-test 1 score change» and «pre- to post-test 2 score change». As the measures were objective, no reliability assessment was necessary.
The «score» indices obtained from the first stage data provided one reasonably direct measure of problem solving success. Further indices of relevance to problem solving were obtained from the second stage data, specifically from the children's explanations of how they obtained their answers. By looking at these explanations, it was possible to give credit to children who knew the computational procedures but made errors in carrying these procedures out. Explanations were assigned values as follows: a) no answer or inappropriate answer = 0; b) some understanding of computational procedures but incomplete = 1; c) procedures explained appropriately = 2, meaning that with two problems and two components of each problem the range achievable was 0 to 8. The values were referred to as «pre-test execution», «post-test 1 execution» and «post-test 2 execution», with «pre- to post-test 1 execution change» and «pre- to post-test 2 execution change» being obtained by deduction. A 25% sample of data from the pre-test and the two post-tests was independently coded by two judges. Their agreement over execution coding was 82% (kappa = 0.65, p<0.001).
In addition, data obtained during the second stage of the pre- and post-tests were used as evidence of underlying strategies. In particular, the children's explanations of how they decided what kind of sum to do (and how they might help another person decide) were summarised and assigned values as follows: a) no answer or inappropriate answer = 0; b) appropriate clue word/s identified in the text but no account or inadequate account of the sum that it signified = 1; c) appropriate clue word/s identified and adequate account provided = 2. With two problems included and two components to each problem, the range of values achievable was again 0 to 8. These values were referred to as «pre-test strategy», «post-test 1 strategy» and «post-test 2 strategy» as appropriate, with «pre- to post-test 1 strategy change» and «pre- to post-test 2 strategy change» obtained as before by deduction. Based on a 25% sample, the interjudge agreement over strategy coding was 88% (kappa = 0.81, p<0.001).
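For readers unfamiliar with the two agreement statistics quoted for execution and strategy coding, the following minimal sketch shows how percentage agreement and Cohen's kappa are computed from two judges' codes. The codes below are invented for illustration and are not the study's data; the paper also does not state whether agreement was assessed per component or per child, so the unit of coding here is an assumption.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of items to which both judges assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both judges must code the same non-empty set of items")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the judges' marginal proportions, summed over codes.
    expected = sum(counts_a[c] * counts_b[c] for c in set(codes_a) | set(codes_b)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Invented codes (0, 1 or 2) for eight problem components.
judge_1 = [0, 1, 2, 2, 1, 0, 2, 1]
judge_2 = [0, 1, 2, 1, 1, 0, 2, 2]

agreement = percent_agreement(judge_1, judge_2)  # 6 of 8 items match
kappa = cohens_kappa(judge_1, judge_2)
```

In this invented example the judges agree on 75% of items, while kappa is lower (about 0.62) because some of that agreement would be expected by chance alone. The same relationship explains why the kappa values reported above fall below the corresponding percentage agreements.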
Experimental intervention: In the interests of manageability, it was decided to analyse 50% of the group and dyad sessions in the experimental intervention. Thus, for half of the experimental sample (selected at random), coding was applied to the first, third and fifth group sessions and associated dyads. For the other half, it was applied to the second, fourth and sixth group sessions and associated dyads. As a consequence, the coded sessions were referred to as «group 1/2», «group 3/4», «group 5/6», «dyads 1/2», «dyads 3/4» and «dyads 5/6». Coding focused on dialogue, but attention was also paid to the initial (and individual) problem solving which took place at the start of the group sessions since this was regarded as a potential influence on dialogue and/or learning. In relation to initial problem solving, each child's answers were coded for «solution accuracy» (no components correct = 0; some but not all components correct = 1; all components correct = 2) and «solution agreement» (no overlap between group members = 0; some differences between group members but some overlap = 1; identical solutions across group members = 2). No reliability checks were necessary since the measures were objective.
As regards dialogue, two factors were considered, content and addressee. Five content categories were deployed with both the group data and the dyads: a) requests help, e.g. I'm stuck, how do you get that?; b) gives a full explanation, e.g. Put the higher number at the top and the lower number at the bottom and take the lower one away starting with the units; c) gives a partial explanation, e.g. It's about taking away - 10's, 100's and units; d) disputes or challenges, e.g. Don't do it that way; do it her way, she's usually right; e) comments on the activity, e.g. This is hard. This is a takeaway and I can't do those. With the groups, the addressee could be any of several other children or the researcher. Alternatively the remark could have no specific addressee. With the dyads, the options were the one other child, the researcher, or non-specific. As the groups were therefore more complex in terms of interaction than the dyads, the reliability checks were carried out on these, with 24 sessions (22% of the total recorded; 44% of the total coded) being independently coded by two judges. The correlations between the judges' content frequencies ranged from +0.74 to +0.96 with a mean of +0.85. The correlations between their addressee frequencies ranged from +0.63 to +0.93 with a mean of +0.80. Treating the figures as acceptable, ten values were derived for purposes of analysis: a) the number of times each child produced a request for help, a full explanation, a partial explanation, a dispute or challenge and a comment on the activity, when his/her remarks were summed across addressees; b) the number of times each child received a request for help, a full explanation, a partial explanation, a dispute or challenge and a comment on the activity, when remarks addressed to him/her were summed across members of the group or dyad.
The focus on frequencies was motivated by the Davenport and Howe (1999) study which, as noted earlier, showed these to be important predictors of learning outcomes. In all cases, the values for each child were divided by the duration of the session in seconds to correct for slight variability in overall length.
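As an illustration of how the ten dialogue values per child might be derived from a coded transcript, the following sketch counts remarks produced and received in each of the five content categories and divides by session length in seconds. The data layout (tuples of speaker, addressee and category), the category labels and the function name are assumptions made for the example, not the study's actual coding format.

```python
# Short labels for the five content categories (hypothetical names).
CATEGORIES = (
    "request_help",
    "full_explanation",
    "partial_explanation",
    "challenge",
    "comment",
)


def dialogue_rates(remarks, child, duration_seconds):
    """Return ten per-second rates for one child in one session.

    remarks: list of (speaker, addressee, category) tuples; addressee is
    None when a remark had no specific addressee.
    """
    if duration_seconds <= 0:
        raise ValueError("session duration must be positive")
    rates = {}
    for category in CATEGORIES:
        # Produced: the child's own remarks, summed across all addressees.
        produced = sum(1 for s, _, c in remarks if s == child and c == category)
        # Received: remarks addressed to the child by any other member.
        received = sum(1 for _, a, c in remarks if a == child and c == category)
        rates["gives_" + category] = produced / duration_seconds
        rates["gets_" + category] = received / duration_seconds
    return rates


# Hypothetical ten-minute session with three coded remarks.
session = [
    ("Ann", "Ben", "request_help"),
    ("Ben", "Ann", "partial_explanation"),
    ("Ann", None, "comment"),
]
print(dialogue_rates(session, "Ann", 600))
```

Dividing by duration turns raw counts into rates, so that a child in a slightly longer session is not credited with more dialogue simply because there was more time to talk.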

Results
The major issues, it will be recalled, were: a) whether the refinements made in the present study to Davenport and Howe's (1999) procedures would bring about more co-ordinated progress within the experimental children; b) whether this could be achieved without compromising the value that peer interaction appears to add to real-life problems and teacher explanation. These issues will be the focus of what is to follow, but it should be noted that the analyses relating to them constituted only a proportion of the analyses actually conducted. Consideration was also given to the effects of school, gender and ability category, and all three variables proved to be associated with pre-test performance and/or pre- to post-test change. However, in no case was the association such as to compromise the picture to be painted below. The effects of school, gender and ability did not interact with the effects of experimental vs. control intervention, and the correlational evidence to be reported was robust across school, gender and ability differences. Thus, in the interests of brevity, nothing further will be said regarding these variables, with the expectation that details will be presented on a subsequent occasion.
Focusing then on the central issues, the first point to note is that progress as revealed in pre- to post-test change was relatively modest. As can be inferred from Table 1, both the experimental and control children made gains that in general were less than one scale point. Sometimes, there was regression from pre- to post-test. Nevertheless, despite the limitations, the experimental children performed better than the control on five of the six measures of change, with the differences reaching statistical significance on two occasions. In addition, there was a clear sense amongst the experimental children themselves that the intervention was of use: 38% responded to the «perceptions sheet» by saying that they had learned more than usual while only 9% responded by saying that they had learned less. 53% on the other hand perceived no difference. As regards enjoyment, the vote in favour of the experimental intervention was even more marked: 67% of the children said they had enjoyed the experience more than their usual number work, 17% said they had enjoyed it the same, and only 16% said they had enjoyed it less. Thus, even though the relatively modest learning gains must not be overlooked, the experimental intervention did have positive consequences. It can therefore be asserted with confidence that the intervention did not compromise the previously documented benefits of peer interaction.
However, did the intervention promote co-ordinated knowledge while otherwise remaining consistent with earlier research? One way of looking at the issue is by considering the correlations achieved by first the experimental children and then the control between the six measures of change. The relevant figures are presented in Table 2, and these give some indication that the procedures were working as anticipated. The experimental children were indistinguishable from the control children over the correlations relating to the same aspects of knowledge, e.g.
pre- to post-test 1 score change vs. pre- to post-test 2 score change. In all cases, the correlations were positive and statistically significant. However, there were marked differences between the experimental and control children over the correlations relating to different aspects of knowledge, e.g. pre- to post-test 1 score change vs. pre- to post-test 1 strategy change, and these correlations are the ones that bear upon knowledge co-ordination. The nature of the differences amounts to the experimental children being far more likely than the control children to produce correlations which were positive and statistically significant. This said, the correlations relating score change to strategy change and score change to execution change were stronger for the experimental children than the correlations relating strategy change to execution change. In addition, the correlations involving pre- to post-test 1 change were stronger than those involving pre- to post-test 2 change, perhaps reflecting some tailing away due to the return during the interval to traditional teaching methods but just as likely stemming from «test fatigue».
An additional way of approaching co-ordinated progress is by looking at how the change measures related to group or dyad interaction, and since this proved decisive in the Davenport and Howe study it was followed here. Accordingly, correlations were computed between the six pre- to post-test measures and what amounted to 66 interaction variables (two initial problem solving variables and ten dialogue variables for each of three group sessions, and ten dialogue variables for each of three dyad sessions). With so many correlations, it would be impossible to present every result. However, Table 3 shows all correlations which were either statistically significant (p<0.05) or, to avoid missing trends, approaching significance (p<0.1 and >0.05).
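The count of 66 interaction variables follows from (2 + 10) × 3 for the group sessions plus 10 × 3 for the dyads, and with six change measures this gives 6 × 66 = 396 correlations. The sketch below simply enumerates the variables and applies the two reporting thresholds. The variable labels are illustrative, and the treatment of a p-value of exactly 0.05 as approaching significance is an assumption, since the text does not say how that boundary was handled.

```python
SESSIONS = ("1/2", "3/4", "5/6")
PROBLEM_SOLVING = ("solution accuracy", "solution agreement")
CONTENT = (
    "request for help",
    "full explanation",
    "partial explanation",
    "dispute or challenge",
    "comment on the activity",
)
# Ten dialogue variables: each content category, produced and received.
DIALOGUE = tuple(
    f"{direction} {category}" for direction in ("gives", "gets") for category in CONTENT
)


def interaction_variables():
    """List every group and dyad variable entered into the correlations."""
    names = []
    for session in SESSIONS:
        # Group sessions: two initial problem solving variables plus dialogue.
        names += [f"group {session}: {v}" for v in PROBLEM_SOLVING + DIALOGUE]
        # Dyad sessions: dialogue variables only.
        names += [f"dyads {session}: {v}" for v in DIALOGUE]
    return names


def classify(p_value):
    """Label a correlation by the thresholds used for reporting."""
    if p_value < 0.05:
        return "significant"
    if p_value < 0.1:
        return "approaching significance"
    return None  # not reported


variables = interaction_variables()
print(len(variables))       # 66
print(6 * len(variables))   # 396 correlations in total
```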
It is clear from Table 3 that as regards the group sessions, a small number of dialogue variables were consistently and positively related to score change and execution change. These variables included asking for help, giving partial explanations, and making comments. None of the group session dialogue variables showed consistent negative correlations with score change and execution change, and of the variables which fluctuated between negative and positive correlations only getting a full explanation appears with any frequency within Table 3. Thus, there is a suggestion from Table 3 that if the group interaction had an impact, it tended to be productive rather than inhibitory. However, while this may have been the case for score and execution, group session dialogue can hardly have had any impact, either productive or inhibitory, upon strategy change. Getting a full explanation in group 5/6 was negatively related to strategy change, but no other group dialogue variables appear to have been relevant. Had the study been limited to a single post-test, it might not have been legitimate to talk about dialogue influences on learning on the basis of Table 3. With the first post-test data, solution accuracy was, as Table 3 makes clear, negatively correlated with score change and execution change. Since solution accuracy was also negatively related to the key dialogue variables (significantly for all three sessions with giving partial explanations but more variably with the others), it might have been the case from the first post-test alone that the children whose initial solutions were weak progressed more from pre- to post-test and engaged in the identified dialogue more frequently. On that reading, the correlations between pre- to post-test change and the dialogue variables would have been an artefact of this.
However, it will be apparent from Table 3 that solution accuracy had no bearing on score and execution change from pre-to second post-test: the correlations there were almost entirely limited to dialogue. This suggests that dialogue was in reality instrumental in learning, and if solution accuracy was implicated it was as a trigger to the crucial variables. This said, the point raised earlier must be re-iterated: the impact of dialogue was limited to score and execution change, with strategy change seemingly unaffected. This of course belies co-ordination at least at the early stages.
Assuming though that dialogue was important in the group sessions (if only for score and execution), the three variables of significance were ones where the children produced the relevant behaviour. In the dyad sessions, three dialogue variables were also important, but this time the variables were ones where the children were addressed by others: gets a full explanation, gets challenged, and gets a comment. However, on all but one occasion, these variables were negatively correlated with pre-to post-test change, suggesting that learning was undermined by the features of dialogue rather than promoted. This implies that if the dyad sessions had any value (and this needs discussion) it was to the extent that dialogue was minimal. However, it was not simply score change and execution change that were negatively related to dialogue but also strategy change implying that any effects of the dyadic interaction must have been equivalent across all three aspects of knowledge. Yet if this was the case, it means that the three aspects must have been brought together, despite their evident separation while the group sessions were in progress. The implication is then that co-ordination was achieved but via post-group processes, an implication which confirms the picture in Table 2 but goes beyond it. As such there was clearly a complex interplay between what happened during the group sessions and what happened during the dyads, and this may have had some bearing on the experimental children's perceptions, for there were no clear «favourites» as regards reported enjoyment or reported learning. 26% said that they had enjoyed the groups more, 33% said that they had enjoyed the dyads more, and 41% said that they had no preference. 39% said that they had learned more from the groups, 31% said that they had learned more from the dyads, and 30% said that they had learned the same from both.

Discussion
No matter how complex the interplay between group sessions and dyads and no matter how this challenged the children's powers of perception, the overall picture as regards co-ordinated knowledge can surely be treated as encouraging. There are signs in all analyses that supplementing group problem solving with a synthesis into teaching principles leads to co-ordinated change in both strategic knowledge and problem solving routines. Correlations between the six change measures were more often positive with the experimental children than they were with the control, and more often statistically significant. Correlations between the change measures and the dialogue variables were uninterpretable unless co-ordination can be assumed to have occurred. More importantly perhaps, the co-ordination can be directly attributed to the supplementation of group problem solving with a concluding synthesis. The synthesis was the only substantive difference between the procedures used in the present study and those deployed by Davenport and Howe (1999). As mentioned earlier, Davenport and Howe failed to achieve co-ordinated progress. In addition, the correlations between the change measures and the dialogue variables suggested that co-ordination was achieved subsequent to the group problem solving and prior to the dyads. This concurs with change being triggered by events which took place at the session's conclusions. This said, it is unclear from the results whether co-ordination was achieved during the concluding synthesis or after its completion. Certainly, the synthesis would have covered elements of relevance, for in deciding how to teach someone else the children were obliged to discuss how the sum should be recognised (strategic information) and how having been recognised it should be carried out (solution information). 
Nevertheless, the possibility of post-group processes cannot be excluded, for evidence already exists (Howe, Tolmie and Rodgers, 1992;Tolmie, Howe, Mackenzie and Greer, 1993) to implicate such processes in knowledge change.
Wherever co-ordination occurred, the results suggest that it was a key element in producing strategic change. Events taking place during the group sessions had limited bearing upon strategic understanding. It was only in group 5/6 that a feature of dialogue related to strategic change. However, by group 5/6 a degree of co-ordination can be assumed to have occurred, and the feature in question, gets a full explanation, was also associated with score change and therefore with problem solving. As a result, the association with strategic change may have been a consequence of co-ordination rather than a cause. In addition though, the association was negative, and this was also true of the aspects of the dyadic sessions which were relevant to strategies. Taking all this together, the message is that insofar as strategic understanding progressed in the experimental children, it was almost certainly as an indirect consequence of something else and the most plausible trigger is the problem solving ability that strategic understanding became co-ordinated with, an ability which was of course directly influenced by the group session dialogue. Thus co-ordination was crucial to strategic change. However, if that was the case, it implies that strategic growth depended upon problem solving success, and this would be of considerable theoretical significance. It would not only clarify Bargh and Schul's (1980) claim that preparing to teach another person enhances the organisation within cognitive structures, by showing how organisation is actually achieved. It would also square with Piaget's (1974) contention that success is a pre-requisite for understanding, and Douady's (1991; but see also Vergnaud, 1997) related claim that mastery of mathematical «objects» is dependent upon mastery of mathematical «tools». By contrast, it would call into question the alternative view favoured, e.g., by Anderson (1983) that procedural knowledge is dependent upon declarative.
The suggestion is then that strategic understanding was in some sense parasitic upon problem solving. What though of problem solving itself as revealed in score and execution change, and in particular of the group interaction which seemed to bring change about? It was clear from the results that group interaction had to take a certain form to stimulate progress, with three dialogue variables (requesting help, giving a partial explanation, and commenting on the activity) appearing especially important. Conceptually, there are points of similarity between giving a partial explanation and commenting on the activity, and thus it is not surprising that they operated equivalently. It is perhaps more surprising that giving a full explanation was not similarly involved, although it was not of course negatively related to change, merely unrelated. Be that as it may, the involvement of explaining and commenting concurs closely with what Davenport and Howe (1999) found for the dialogue correlates of problem solving, as indeed it does with the research which as mentioned earlier is summarised by Webb (1989). As such the study can be seen as offering further support to the theoretical position outlined by way of introduction, that getting children to engage in explanation is helpful to their learning. More generally though, it serves to emphasise the importance of active engagement in dialogue if progress is to be made. It was after all explanatory dialogue that was produced that proved to be important; explanatory dialogue that was received had no relevance. It has been suggested (by McKendree and Mayes, 1997; but see also Gagné, 1987) that children can learn vicariously from dialogue, by observation or even from scrutiny of transcripts. The present results indicate that the suggestion has limited generality.
However, while certain group dialogue variables appeared to be important for problem solving success, these variables were themselves also subject to group task effects. In particular, there were strong negative correlations between the variables of significance and solution accuracy during the initial, individual stage of the group task. This too has important consequences. On the theoretical side, it means that unlike strategic understanding which, as noted above, depends on success, problem solving is driven by failure, albeit failure mediated by dialogue. In addition, it provides support for a point made by Grossen (1994) and Howe and Tolmie (1994), that task factors exert a powerful influence over peer dynamics and subsequent learning. From a more practical angle, the relevance of solution accuracy suggests that to optimise peer interaction great care needs to be taken over choice of problems. Problems should be challenging so that initial solutions are inaccurate, but they should probably not be too difficult or children will be demoralised. The implication is that problems should be carefully tailored to the group's capabilities, and it has to be admitted that in the (arguably inappropriate) pursuit of standardness the present study did not do this. This may help to explain the more disappointing aspect of the study's results: although co-ordinated progress was achieved, the absolute amount of progress was modest.
Assuming progress could be boosted by strategic choice of problems, the situation looks reasonably promising for the use of peer interaction as a classroom technique. After all, even with the present limitations on absolute progress, pre- to post-test change in the experimental children was generally superior to what was observed in the control, albeit with the differences reaching statistical significance on two occasions only. In addition, it should not be forgotten that, as revealed through their perception sheets, the experimental children themselves saw benefits in peer interaction. This confirms Forman's (1992) point that children appreciate their value as intellectual resources. Furthermore, the vast majority of experimental children claimed to enjoy the intervention more than routine teaching. This is perhaps particularly important given the deprived nature of the sample, and the association of its catchment area with low attainment and early drop-out. In any event, the message from the study is clearly encouraging when it comes to educational practice, suggesting that peer interaction may have something significant to offer.
This said, anyone fleshing the educational implications out might be forgiven for thinking that the focus should be on the group sessions alone: while the group interaction could be productive, the dyadic interaction served only to undermine what the group sessions and the subsequent synthesis had apparently achieved. In particular, to the extent that the dyadic sessions led to children receiving full explanations, they resulted in dialogue which imposed strong barriers on learning. On the face of it, these results are surprising. As noted earlier, Schubauer-Leoni and Perret-Clermont (1980) found that a positive outcome was achieved when children prepared to teach others and, equivalently to the dyadic sessions, implemented their preparation. Davenport and Howe (1999) found that receiving explanations was a positive predictor of strategic growth. However, Schubauer-Leoni and Perret-Clermont were concerned with problem solving alone while the present concern was with the co-ordination of knowledge, and this may have been crucial. There is growing evidence (see, e.g., Damon and Phelps, 1989) that tutoring dialogues are particularly helpful when the emphasis is on practical problems, and less so when the focus is conceptual. It may be that the end-of-group synthesis and/or subsequent co-ordination had taken the children onto a different and broader plane, such that the feedback from teaching was distracting rather than helpful. Likewise, Davenport and Howe failed to achieve knowledge co-ordination, and explanations may therefore have been essential to compensate for the absence of input from other sources. It could be that once co-ordination has been achieved explanations are not only rendered unnecessary; they also intrude counter-productively into processes which have been triggered by problem solving.
In any event, the discrepancies between the present results and those obtained by Schubauer-Leoni and Perret-Clermont and Davenport and Howe draw attention to a more general point: the interplay between dialogue and learning is extremely subtle, such that what applies in one context may be eradicated or even reversed in another.
Of course, if the dyads were distracting as argued above, we are faced with a dilemma, since preparation for the dyads was most definitely of benefit. How would it be possible to create a meaningful activity in classrooms where children are asked to prepare for teaching but do not actually teach? The solution probably lies with more hypothetical preparation where children are told «Suppose that you wanted to teach someone else in your class how to solve the problems, what would you do?» As it happens, Duchak (1996) has achieved success with mathematics teaching using precisely this approach, finding that group problem solving plus hypothetical preparation was more helpful than group problem solving alone. This then is where the study seems to have led. Peer interaction may well prove useful in the teaching of mathematics, but to make the most of it careful attention needs to be paid to the focal activity. As regards activity, the group task used here appears to be along the right lines, but consideration should be given to ending it with more hypothetical preparation. There may be no need for the dyads. In addition, the group task should focus on problems that are just beyond their members' competence. This second suggestion allows the paper to return to the Vygotskyan note on which it began, for problems that are just too difficult hint strongly of working within «zones of proximal development» (Vygotsky, 1978). Perhaps though this should not seem surprising. The rationale for exploring peer interaction and indeed for locating it within a context of real-life problems and teacher explanation was the copious evidence that children are not «blank slates» when they embark upon formal mathematics. On the contrary, they approach the subject with extensive informal knowledge which needs to be extended and transformed.
However, as noted earlier, this tension between the formal and the informal exemplifies what Vygotsky had in mind when he introduced the distinction between scientific and everyday concepts. The tension has inevitably been implicit rather than explicit in what has preceded, but nevertheless it is only to be expected that the mechanisms which have emerged resonate closely with ideas that Vygotsky himself expressed.

Appendix
Group Instructions
1) Do you all have the sheet with Problem 2 in front of you? Stop reading and check.
2) Get one person in the group to carefully read the problem out loud to everyone else. Stop reading and listen.
3) Now work out the answers to the problem on your own. You have 2 answers to find. Let each other know when you are finished. Re-read the problem first of all. Stop reading and work.
4) Has everyone finished or done all that they can do? Stop reading and check.
5) Now take it in turns to show the rest of the group your answers starting with whoever is in charge of the yellow sheet. Stop reading and check.
6) Did you all come to the same answers? If no, go to number 7. (Go to number 7 now). If yes, write the sums and the answers on the yellow sheet. Now go to number 9.
7) If you did not all get the same answer, talk to each other and decide what the correct answer is. Tell each other what kind of sum you did and why. Stop reading and talk.
8) Write down on the yellow paper what the correct answer is and how you worked it out. Stop reading and write.
9) Check the answer card to see if you are correct. Stop reading and check.
10) If you are correct, let the teacher know you are finished.
11) If you are wrong, try and work out the correct answer again. Stop reading and work.
12) When you have done this go back to number 8.

Dyads Instructions
1) You are the teacher. First of all, re-read the problem to remind yourself what it was about. Stop and read.
2) Turn the problem over and tell your partner what the problem is about. Stop and speak.
3) Give the problem to your partner and ask him or her to read it. Tell them to let you know when they have finished reading it. Stop and wait.
4) Now you are going to ask your partner some questions.
• Ask your partner: What is the first thing you have to find out?
If the sum is wrong, help them until they get it right.
6) When you have finished, let the teacher know.