1 Introduction

The use of collaborative learning methods has increased in recent decades (Ansari & Khan, 2020; Qureshi et al., 2023). Peer feedback, which provides opportunities for students to learn from one another, has become increasingly popular, especially in higher education (Huisman et al., 2019). By reflecting critically on their peers’ comments, students develop the capacity to consider problems from several perspectives (Chen et al., 2020). Comments as a type of peer feedback may therefore benefit students’ writing by bridging the gap between what they already know and their prospective performance (Hattie & Clarke, 2018). During peer review, students can provide suggestions, offer opinions, and respond to others using the “embedded comment” feature in Google Docs (Zhu & Carless, 2018). The effect of comments is commonly considered from the individual perspective; however, how comments affect student writing performance at the group level cannot be overlooked, as this activity involves a great deal of peer interaction in providing and receiving information, as well as negotiation within groups (Tajabadi et al., 2023). In such interaction, received comments may influence each partner’s ideas as well as subsequent reflective revisions (Nicol, 2021).

Previous studies have demonstrated that the quality of comments depends on their content (Carless, 2022; Langfeldt et al., 2021; Zaharie & Osoian, 2016). Studies have shown that not all kinds of comments are beneficial to student writing (Latifi et al., 2021; Rasheed et al., 2020); for example, ambiguous and poorly organised peer feedback may prevent students from improving their writing performance (Huisman et al., 2019). In addition, only a few studies have addressed the content of comments in the online context. Therefore, it is essential to investigate what defines the quality of comment content and which type of comment content has the greatest effect on student writing performance in the online context.

Moreover, few studies have examined the influence of comments on different manuscript sections (Dawadi et al., 2021), even though the characteristics of these sections differ from one another. For instance, the Introduction section discusses the study’s goal, background, and technical significance. In the Methodology section, students have to describe data collection, analysis, and method limitations. In the Discussion & Conclusion section, students should also include personal opinions on data interpretation, results interpretation, study limitations, and future research. These three sections are therefore subjective and contain more personal interpretation (Dawadi et al., 2021). The Results section, by contrast, should present only statistical evaluations and experimental results, without personal explanation, which can make it challenging for students, especially when they are unfamiliar with each other’s topics (Costley et al., 2023). It would therefore be beneficial to understand how various peer feedback categories affect student writing in different manuscript sections. The aim of the current research is to explore what types of peer feedback are associated with academic writing performance across five manuscript sections at both the dyadic and individual levels.

2 Literature review

2.1 How peer feedback affects student writing

Peer feedback is an efficient method for enhancing students’ learning, particularly when it comes to improving their writing performance (Patchan & Schunn, 2015; Yu & Liu, 2021). Both giving and receiving feedback encourage students to make more revisions, contributing to their writing development (Tajabadi et al., 2023). Encouraging students to actively engage in the feedback process, such as by explaining problems and reflecting on received feedback, may help them become more autonomous, independent, and proficient in their academic writing (Wu & Schunn, 2021). In the present study, peer feedback refers to the interactive dialogue among students using the embedded comment function within Google Docs. This approach allows students to engage in thoughtful critique and make suggestions based on criteria set in rubrics (Liu et al., 2023). However, not all kinds of comments are beneficial for student writing performance; for instance, peer comments that lack justification can make subsequent analysis difficult and reduce students’ motivation for further reflection and text revision (Nicol et al., 2014). Additionally, peer comments often do not include suggestions for improving text quality, which limits their effectiveness in fostering writing improvement (Latifi et al., 2021). While some studies suggest that feedback content may be crucial (Carless, 2022), in-depth research on how different types of peer feedback affect students’ writing performance is limited. Accordingly, a number of articles have called for further research to clarify the structure needed to assemble high-quality peer feedback.

2.2 Overview of feedback and broad feedback style

Peer feedback can be organised in different ways. One way to organise it is as a unit of information that is categorised into one of three different types, as can be seen in Fig. 1 (Gielen & De Wever, 2015): elaboration, verification, and general. Specifically, elaboration refers to comments including detailed information and guidance about the authors’ performance, while verification is a judgement of whether what the author has done is right or wrong (Panadero & Lipnevich, 2022). “General” feedback only represents overview statements (Gielen & De Wever, 2015).

Fig. 1 The three broad types of peer feedback (Gielen & De Wever, 2015)

Elaboration feedback not only shows whether original authors are meeting certain standards, but also explains why they are not and suggests strategies for improving their writing (Hyland, 2019). Consequently, successful feedback should consist of both verification and elaboration (Maier et al., 2016). However, Wu and Schunn (2020b) suggested that elaboration with suggestions can occasionally fail to solve problems or create additional trouble in the writing. Peer feedback is therefore varied and does not always help student learning, and more research is required to determine how these three broad types of peer feedback interact and impact student writing performance.

2.3 Verification and its subtypes

Verification can be defined as a judgement indicating whether what the student has written is correct or not (Dinsmore & Parkinson, 2013; Gan & Hattie, 2014). Such feedback may point out very specific previous behaviours, such as spelling or grammatical errors. According to Gielen and De Wever (2015), verification can be subdivided into three categories, which are positive, negative, and neutral, as can be seen in Fig. 2. In the present study, the comment stating, “This part is a bit confusing” is an example of negative verification, while the comment, “Good overview. It introduces materials and methods clearly” is an example of positive verification. When the reviewer gives a comment such as, “This part may be related to the aim of this paper,” this is defined as neutral verification.

Fig. 2 The three subtypes of verification (Gielen & De Wever, 2015)

Although students may generally experience higher levels of satisfaction when they receive positive comments as opposed to negative comments (Carless, 2020), previous research has demonstrated that student writing performance may be significantly impacted by both positive and negative feedback (Shang, 2022). It is generally found that students who receive positive feedback make greater progress in their subsequent writing performance because positive feedback signals that the original author’s writing met the editor’s expectations and may encourage the author’s pursuit of better writing goals (Sprouls et al., 2015). In contrast, negative feedback indicates that the author’s writing needs improvement and that changes need to be made (Wisniewski et al., 2020). However, Novakovich (2016) pointed out that negative feedback may also help authors clearly identify problems, so students may be more likely to accept it. Some studies have also pointed out that the amount of praise received is either unrelated to later writing quality (Wu & Schunn, 2020a) or negatively associated with it (Wu & Schunn, 2021). In this light, locating problems through negative verification may be expected to improve the implementation of feedback.

2.4 Elaboration and its subtypes

Compared to verification, elaboration consists of substantive messages and includes some relevant information to guide writers to correct errors (Mayer & Alexander, 2016). Specifically, elaboration feedback provides detailed information about how to meet the criteria, conceptual knowledge, writing errors, procedural knowledge, and meta-cognitive knowledge (Panadero & Lipnevich, 2022). According to the paradigm of Gielen and De Wever (2015), elaboration feedback can be further divided into informative and suggestive categories based on whether the feedback provides guidance on future actions. Figure 3 provides a breakdown of how the elaboration feedback was classified.

Fig. 3 The two subtypes of elaboration (Gielen & De Wever, 2015)

Specifically, suggestive feedback explains how to improve future performance. For instance, in the present study, “In these parts, you can mention some limitations about the papers that you have reviewed” is an example of suggestive elaboration. The focus of suggestive feedback can be specific and instructive (e.g., addressing errors, topics, or responses) or general and facilitating (e.g., offering broad instructions) (Lipnevich & Panadero, 2021). Informative feedback is another subtype of elaborative feedback that offers a more detailed description of previous performance without providing direction on what to do (Gielen & De Wever, 2015). In other words, these informative statements only provide information about prior performance without prompting students to modify their work. For example, the comment, “Is it related to reference #1?” is a type of informative elaboration. Generally speaking, students tend to provide slightly more suggestive explanations in their comments than informative explanations (Gielen & De Wever, 2015).

Previous research has suggested that elaboration is more commonly associated with improved academic writing performance than verification (Fyfe & Rittle-Johnson, 2016; Maier et al., 2016). This may be because elaboration not only provides an explanation of the problem but also suggests potential ways to make the writing better (Latifi et al., 2023). Similarly, Zhang (2020) revealed that students are more inclined to make revisions when they are given suggestions for improvement, or when the problems and suggestions are explained to them (Wu & Schunn, 2020a). Noroozi et al. (2016) also showed that students improve their writing performance through interacting with detailed, specific, and actionable feedback. However, if the elaboration is too long or too complex, it may reduce the author’s attention and hinder them from understanding the information, resulting in less implementation of this type of feedback (Lipnevich & Panadero, 2021).

2.5 General peer feedback

The aim of general peer feedback is to convey some general claims that are neither verification nor elaboration (Gielen & De Wever, 2015). For instance, in our research, the comment, “Our group tends to describe the energy profiles with this style. We focus on the relative energy values, so the scale of the y-axis is often omitted” is a type of feedback that demonstrates general issues. There is no subtype of General peer feedback according to the paradigm generated by Gielen and De Wever (2015). Numerous studies have found that detailed feedback that directs the writer’s attention to the intended outcome is more helpful than general comments in motivating students to revise their work (Strijbos & Wichmann, 2018).

2.6 Feedback focus

As described, feedback can be broadly classified into three types (elaboration, verification, and general), and further classified into five subtypes (suggestive elaboration, informative elaboration, positive verification, neutral verification, and negative verification). However, in terms of content, some comments may focus on deeper concepts, while others simply modify the written syntax or provide surface-level suggestions. Therefore, according to the paradigm of Gielen and De Wever (2015), the aforementioned five subtypes can be further subdivided by focus into four categories: abstract general, criteria general, criteria specific, and language. Detailed examples for each comment category can be seen in Appendix Table 7.

Specifically, abstract general refers to general descriptions of the overall writing performance without mentioning any specific standards. For instance, the comment, “Currently doing more experiments on NQ and TriviaQA” is an example of abstract general feedback. Criteria general feedback refers to a general explanation of the minimum details of a specific standard, not just demonstrating whether the specific standard is correct or not. For example, the comment, “Please make it more clear” is a type of criteria general feedback. Compared to criteria general feedback, criteria specific feedback provides more in-depth specific information about past writing performance in meeting certain criteria. An example comment is, “You should attach Tables 1 and 2 to this document.” Language feedback, in turn, involves making judgements on elements of the language, such as word choice, spelling, punctuation, or sentence organisation. Language is the focus of the comment, “change capital to small letter,” for instance.

Table 1 Breakdown of sample gender and level of study
Table 2 Description of the rubrics used to evaluate each of the five writing assignments in the scientific writing course examined in the present study

Generally, specific comments have been shown to be more beneficial than general ones in peer review (Latifi et al., 2023). Authors are thought to be more willing to improve their writing when they receive specific comments from their peers, since such feedback provides them with guidance and suggestions for further improvement (Coffin, 2020; Luo et al., 2016; Sung et al., 2016).

2.7 Effects of comments during peer interaction

Giving and receiving comments can affect not only individual writing performance but also communication and interaction within a group of two students (Li et al., 2010; Wu & Schunn, 2021). Through interaction with their group members, students are prompted to reflect on their own learning process. Providing and receiving feedback through group interaction are both beneficial for student learning: offering comments helps develop students’ critical thinking abilities and improves their understanding of the subject matter, while receiving feedback provides valuable perspectives on their own work and identifies areas that require development (Li & Grion, 2019). Particularly when students are writing or revising collaboratively, they can encounter different and/or higher-level strategies from their partners (Li et al., 2012); therefore, students may imitate these strategies to gain a higher level of writing ability (Patchan & Schunn, 2015; Yim et al., 2017).

However, differences in writing abilities among peers may have an impact on dyadic writing performance. For instance, comments from less knowledgeable students are likely to share information that the author already has a basic understanding of. As a result, the author might read and waste time on unhelpful comments, which would reduce their learning opportunities (Fyfe & Rittle-Johnson, 2016). Moreover, how different categories of comments affect dyad-level writing performance in different sections is still unclear, and further research is needed to understand whether different types of feedback lead to different improvements in dyad-level writing performance.

2.8 The present study

Feedback content seems to be essential for the influence of peer feedback on learning and performance. In connection with this, previous studies have examined general versus specific feedback (Panadero & Lipnevich, 2022) and simple versus complex feedback (Narciss, 2008). According to Strijbos and Wichmann (2018), detailed and targeted feedback improves performance. The goal of the current study was to examine the content of peer feedback in further detail, and more precisely, the style, nature, and focus of the messages that peers send to one another when working on writing tasks in a CSCL context. The paradigm of Gielen and De Wever (2015) was applied because it has been shown to improve the quality of the content of peer feedback through an organised peer feedback process and has demonstrated high reliability in assessing how the level of structure affects the content of peer comments (Xiong et al., 2023). This framework helps ensure that feedback is accurate, relevant, and beneficial, thereby improving the effectiveness of peer assessment and consequently enhancing students’ overall learning experience and writing skills.

Because earlier studies have not classified the content of peer feedback into distinct categories, and no attempts have been made to systematically explore how the various peer feedback focuses influence student writing quality, the current study investigates, from both individual and dyadic perspectives, how various types of peer feedback impact student academic writing performance. Specifically, the current study aims to examine whether feedback content (elaboration, verification, general) matters in explaining student writing performance.

To achieve this, peer feedback was collected from 68 students in a scientific writing course offered at a university in South Korea, spanning a period of 10 instructional weeks. Table 1 provides a breakdown of the sample by gender and program. In our research, we employed a convenience sampling method. This approach involves selecting participants who are readily accessible and willing to participate, rather than using a random sampling technique. The sample size was chosen because it is large enough to allow basic statistical analysis, while being small enough to examine each individual student’s and partner’s work in detail. This type of in-depth analysis, in which individual comments are coded and analysed, is time-consuming and challenging from a research perspective; for this reason, this type of sample allows a more rigorous investigation. Despite some limitations, convenience sampling provides valuable insights and serves as a useful tool for preliminary research and exploratory studies.

The average age of the cohort at the completion of the study was 25.9 years (SD = 3.7; min = 22, max = 39). Students came from 16 different departments, ranging from the Graduate School of Artificial Intelligence (n = 13) and the Graduate School of Engineering (n = 12) to the School of Computing (n = 1) and the Aerospace Engineering Program (n = 1). The scientific writing course was designed to instruct students on how to write manuscripts for submission to scientific journals (Fanguy & Costley, 2021). It was delivered online through pre-recorded videos posted on the course learning management system, allowing learners to pause, re-start, rewind, and fast-forward the content at their leisure. There were 56 lecture videos in total across 10 instructional weeks, with each week consisting of four to eight lecture videos. Course videos varied in duration, with an average length of nearly 12 min, covering themes related to STEM graduate scientific writing. As for language proficiency, although most students at the university were non-native English speakers, all students were required to achieve an upper-intermediate or advanced English level (roughly B2/C1 in the Common European Framework of Reference for Languages) by passing the TOEFL exam before being admitted.

The course was structured into five consecutive two-week units, each dedicated to instruction on one of the sections of a manuscript. These sections included (1) Introduction, (2) Methodology, (3) Results, (4) Discussion & Conclusion, and (5) Abstract. During the first week of each unit, students were required to watch a series of videos specific to the section of interest for that unit. The first week’s videos provided content and instruction about the aim, function, features, and conventions of the specific section. After students had viewed these introductory videos, they used Zoom software to participate in live meetings with the course instructor. The instructor then led a brief discussion on the main content of the videos and answered students’ questions, after which the instructor broke the students into groups for peer interaction aimed at improving their understanding of the lecture videos.

During the second week of each two-week unit, students had to watch additional lecture videos that covered aspects of writing style, language usage, and grammar specific to the particular manuscript section under focus. Students were required to produce a first draft of the relevant manuscript section before the subsequent Zoom meeting. No word-count requirements were imposed, and students were encouraged to establish their own structure and length, learning from journal style guides and previously published papers in their respective academic fields. Then, in the live Zoom sessions, the instructor provided instructions for the online peer review process. At this juncture, students filled out a questionnaire providing basic information about their field of study, degree program, research interests, area of expertise, and the title of their projects. This information was then summarised in a spreadsheet and shared with the entire class so that students could choose partners with similar research interests during peer review. Students then formed groups among themselves, and the instructor arranged the different dyads of students into breakout rooms where they could conduct online peer review.

Detailed instructions on how students should approach reviewing their peer’s writing were included within the Google Docs, and to further assist students in their peer review process, they had the option to access instructional videos on peer review through the learning management system. Furthermore, students were encouraged to use the function of embedded comments, which allowed them to highlight the specific section of content they were providing feedback on. It was also possible for authors to give responses to the given comments so group members could participate in comment threads as needed. In addition to commenting on their peers’ writing, students were tasked with assessing the quality of their peers’ initial drafts using an assessment framework derived from Clabough and Clabough (2016). Table 2 presents a description of each of the five criteria that were evaluated in each of the five rubrics corresponding to the writing assignments.

After the second Zoom meeting for each unit, students were given two days to reflect on the comments they received from their peers on the first draft of their writing. Based on this feedback, they were tasked with creating a final draft and uploading it to the course learning management system where the course instructor provided feedback, advice, changes, and a final grade based on the same specialised grading criteria that the peer editors used.

The current study encompasses all comments provided to original authors across five manuscript sections to understand how the categories of feedback affect both groups and individuals within each section. Consequently, the following set of research questions is suggested:

  • RQ1: What kinds of comments are linked with increased (i) dyadic and (ii) individual writing quality in the Introduction section?

  • RQ2: What kinds of comments are linked with increased (i) dyadic and (ii) individual writing quality in the Methodology section?

  • RQ3: What kinds of comments are linked with increased (i) dyadic and (ii) individual writing quality in the Results section?

  • RQ4: What kinds of comments are linked with increased (i) dyadic and (ii) individual writing quality in the Discussion and Conclusion section?

  • RQ5: What kinds of comments are linked with increased (i) dyadic and (ii) individual writing quality in the Abstracts section?

3 Methodology

3.1 Research instruments

3.1.1 Comments

As students edit writing produced by their peers, they often provide suggestions or express their opinions regarding their peer’s written work. In this research, all embedded comments left by students in Google Docs were coded for feedback category according to the analytic framework identified by Gielen and De Wever (2015). A hierarchical tally was conducted to first determine whether the comments were elaboration, verification, or general. Thereafter, each elaboration comment was classified as suggestive elaboration or informative elaboration, and each verification comment was classified as negative verification, positive verification, or neutral verification. Finally, each comment was further categorised as abstract general, criteria general, criteria specific, or language focused. At this point, for each student, the quantity of feedback was tallied for all categories of comments. To ensure the reliability of the feedback coding, an additional coder was invited to independently code all students’ comments in the online peer review. When compared to the original coder’s results, the new coder agreed 96.26% of the time. Following this, the two coders discussed all disagreements one by one, and eventually, both coders agreed on a coding type for every comment. The template both coders initially worked from can be seen in Appendix Table 7.
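As a minimal illustration of the agreement check described above, the following R sketch computes the percentage of comments on which two coders assigned the same category; the data frame and its column names are hypothetical and stand in for the full comment-level coding records.

# Hypothetical coding records: one row per comment, one column per coder
coding <- data.frame(
  coder_original = c("suggestive", "positive", "informative", "neutral", "negative"),
  coder_second   = c("suggestive", "negative", "informative", "neutral", "negative")
)

# Percentage of comments on which the two coders assigned the same category
percent_agreement <- mean(coding$coder_original == coding$coder_second) * 100
round(percent_agreement, 2)  # 80 for this toy data; 96.26% agreement was observed in the study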

3.1.2 Writing quality

The five assignments for the course in this study corresponded to the five manuscript sections: (1) Introduction, (2) Methodology, (3) Results, (4) Discussion & Conclusion, and (5) Abstract. As previously noted, student written work was assessed using a grading rubric adapted from Clabough and Clabough (2016), and their performance was graded on a scale from 0 to 10. Each assignment was double scored, firstly by the course teaching assistant and secondly by the course instructor. To increase scoring reliability, course teaching assistants participated in five training sessions (one session corresponding to each section of the paper) with the course instructor in order to learn about the rubric and rate sample sections for norming purposes. After all scoring was completed, the course instructor and teaching assistant met to discuss where scores diverged, and the course instructor made final decisions about scores based on these discussions.

3.1.3 Effectiveness of peer feedback

Three intentions of feedback have been studied to explain effective peer feedback in previous research (Cui et al., 2022). First, praise can be a motivator that enhances writing or revision activities overall, because motivation can increase the amount of effort made by the student (Han & Xu, 2020). Second, feedback can encourage or discourage specific earlier behaviour, such as a particular spelling error or a particular writing style in the conclusion section (Benzie & Harper, 2020). Third, according to information-processing theories, feedback can be used to shift a learner’s performance in a particular direction, not only towards or away from a former behaviour (Yan & Wang, 2018). All three perspectives on peer feedback may be important in the context of writing, but the informational element is the most important and is examined in the most detail here. For example, if the original author disagrees with the problem assessment, comments may not truly encourage or discourage specific writing practices. In the current research, feedback that helps students improve their writing performance is considered effective.

3.1.4 Analysis

For the analysis, five separate datasets were prepared, one for each of the five research questions. Only students who remained in a fixed pair (dyad) for each section assignment were retained in each dataset; this resulted in a reduction in sample size for each question (see Table 3 for the reduction in data for each analysis). Retaining cases that were part of dyads meant that the coefficients for the individual and dyadic analyses were based on contributions from the same sets of students, allowing for more logical comparisons.

Table 3 Descriptive statistics for the dependent variables in the study

The data were hierarchical, with students nested in their respective dyads. However, multilevel modelling was not undertaken because there was generally limited variance within dyads (sufficient within-dyad variance being a necessary condition for the more sophisticated multilevel modelling). Students in dyads often received the same level of feedback. For example, for the broad feedback style of Verification for the Abstract section, 14 of the total 22 groups received the same frequency of such feedback, i.e., none. In addition, there were multiple instances in which the academic writing scores of the dyad were identical. The removal of participants that exhibited no variance in the independent and dependent variables resulted in massively reduced datasets and an associated loss of statistical power. For this reason, all data were analysed separately using multiple regression in terms of (i) the mean of the variables for each dyad, and (ii) individuals. For example, for RQ1(a, i), the arithmetic means for each dyad were treated as unique cases, and for RQ1(a, ii), all individual participants were treated as unique cases. For student performance on the Introduction section, for instance, the individual analysis involves a dataset (n = 56) that includes each student as an individual case (row), whereas the dyadic analysis involves a dataset (n = 28) that includes the average writing performance (and feedback) of each dyad as an individual case (row).
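To illustrate how the dyadic dataset is derived from the individual-level one, a minimal R sketch is shown below; the variable names and values are hypothetical and serve only to show the structure of the two datasets.

# Hypothetical individual-level data for the Introduction assignment (toy values)
individual <- data.frame(
  dyad_id      = rep(1:6, each = 2),
  intro_score  = c(8.5, 9.0, 7.5, 8.0, 9.5, 9.0, 8.0, 8.5, 9.0, 9.5, 7.0, 7.5),
  elaboration  = c(3, 2, 4, 1, 2, 3, 5, 2, 1, 1, 4, 3),
  verification = c(1, 0, 2, 1, 0, 1, 1, 2, 0, 0, 3, 1),
  general      = c(0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0)
)

# Dyadic dataset: the within-dyad mean of each variable becomes a single case (row)
dyadic <- aggregate(cbind(intro_score, elaboration, verification, general) ~ dyad_id,
                    data = individual, FUN = mean)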

The statistical analysis itself involved the specification of three separate multiple regression models for each sub-question (note the three categorical levels in Appendix Table 7). For example, for RQ1(a, i), the effect of the frequency of the broad feedback types, i.e., Elaboration, Verification, and General, is examined for dyadic Introduction writing performance. For RQ1(b, i), the effect of the frequency of the general feedback subtypes, i.e., the verification and elaboration subtypes, is examined for dyadic Introduction writing performance. Finally, for RQ1(c, i), at the third level, the effect of the frequency of the specific verification and elaboration focuses, i.e., AG, CG, CS, and L, is examined for dyadic Introduction writing performance. The same series of models was specified for individual Introduction writing performance for RQ1(a to c, ii).

All data were prepared and analysed with the R programming language (R Core Team, 2021). Linear regression analysis was performed using the base R lm function. To interpret the practical significance of the effect of the independent variables on academic writing performance, the R² and f² values were computed, with f² = R² / (1 − R²) (Cohen, 1992), and interpreted as follows: above 0.02 = small, above 0.15 = medium, and above 0.35 = large.
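Continuing the hypothetical dyadic dataset sketched above, the following lines illustrate how one of the broad-level models and the corresponding effect size might be computed with the base R lm function; this is a sketch under assumed variable names, not the study’s actual analysis script.

# Broad-level model, e.g., RQ1(a, i): dyadic Introduction score regressed on the
# frequencies of the three broad feedback types
broad_model <- lm(intro_score ~ elaboration + verification + general, data = dyadic)
summary(broad_model)

# Practical significance: Cohen's f-squared, f2 = R2 / (1 - R2)
r2 <- summary(broad_model)$r.squared
f2 <- r2 / (1 - r2)
f2  # above 0.02 = small, above 0.15 = medium, above 0.35 = large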

4 Results

As can be seen in Table 3, the total number of comments provided by the students varied for each section, with a maximum of 155 for the students’ Introduction sections and a minimum of 84 for the students’ Abstract sections. The descriptive statistics also reveal that students performed well in writing, with average scores above 8.00 for each section. Also note that while 68 total students participated in the study, missing data meant that the number of cases was slightly reduced for each analysis (see Table 3, n).

Tables 4, 5 and 6 illustrate the results of the modelling for each manuscript section of interest (RQ1, RQ3, and RQ5, respectively). The results reveal that no statistically significant effects were exhibited for the individual or dyadic analysis for Methodology writing (RQ2) or for Discussion & Conclusion writing (RQ4) (see Appendix Tables 8 and 9). Within each table, results are presented for the six tests. For example, for Table 4 (RQ1), results pertaining to the effect of the (a) broad, (b) general, and (c) specific classifications are presented for (i) dyadic and (ii) individual student Introduction writing performance. In each table, the individual-level mean and standard deviation for the independent variables are also presented.

Table 4 RQ1: Effect of feedback for dyadic and individual academic writing performance in introduction
Table 5 RQ3: Effect of feedback for dyadic and individual academic writing performance in results
Table 6 RQ5: Effect of feedback for dyadic and individual academic writing performance in abstracts

A general review of the mean number of comments received reveals that comments were predominantly elaborative. It is also noteworthy that students only gave “General” feedback to their peers for the Results section.

The findings related to the academic writing quality of the Introduction section are displayed in Table 4. The findings suggest that peer feedback has some impact in both the dyadic and individual assessments. At the dyadic level, abstract general verification (β = -5.43) was negatively associated with student writing. For the individual analysis, neutral verification (β = 1.29) was positively associated with student writing, while abstract general verification (β = -4.21) was also negatively associated with the student’s ability to write an Introduction.

Table 5 shows the academic writing performance results for the Results section. Findings indicate that verification (β = -1.02), positive verification (β = -2.71), and negative verification (β = -0.96) had a statistically significant negative association with student dyadic writing scores in the Results section. However, no statistically significant effects were exhibited for the individual-level analysis.

Table 6 presents the results for academic writing performance for the Abstract section. For the dyadic analysis, negative verification (β = 1.27) was positively associated with student writing. In addition, criteria specific verification (β = 2.35) was also positively associated with student Abstract writing scores. In terms of the individual analysis, informative elaboration feedback (β = -0.42) had a statistically significant negative association with student writing scores in the Abstract section, and negative verification feedback (β = 0.81) had a statistically significant positive association with Abstract writing scores.

5 Discussion

As explained, elaboration can be characterised as substantive feedback that includes relevant information guiding authors to correct errors in their writing. One of the main findings in this study is that much of the feedback on academic writing, at least for the current sample, can be classified as elaboration-type feedback. However, the frequency of verification-type comments received by the author, though generally far lower, tended to be more systematically associated with student academic writing performance, with the direction of the effect depending on the characteristics of the section. In the first part of the present Discussion section, there is an extensive focus on the type and focus of verifications, elaborations, and general feedback in order to answer all research questions. In the second part, the theoretical practice and pedagogical implications of this study are discussed.

For RQ1, specific to the Introduction section, the findings indicate that abstract general verification feedback is associated with substantially lower levels of writing performance for this section (for both individuals and dyads). Such verification feedback is associated with corrective judgements in which the editor identifies general, overall (abstract) issues with the written text. As a result, these evaluations could point to fundamental problems. The findings here are contrary to previous research, which has suggested that specific comments can be more beneficial than general comments (Lipnevich & Panadero, 2021), but suggest that such verification feedback may indicate more systemic violations of written academic conventions. It should be noted that these findings may reflect the cross-sectional design of the research. With a pretest-posttest design, where the quality of student draft work is accounted for prior to receiving comments, we might be better able to identify forms of editing that are especially beneficial to the improvement of student work.

Neutral verification was also associated with improved individual writing in the Introduction section. This finding conflicts with prior research suggesting that students who receive content-level comments spend more time revising their essays to improve their writing than those who receive only surface-level feedback (Coffin, 2020). It is possible that original authors include background information, definitions, and conceptual illustrations in the Introduction (Öchsner, 2013). It may therefore be difficult for students to provide and receive detailed, in-depth feedback on such material, especially when they are not yet familiar with the objectives of the paper. As such, neutral and general statements are commonly offered.

As for RQ2, the results suggest that no comment category had an effect on either individual or dyadic Methodology writing scores. This finding may have occurred because the aim of the Methodology section is to explain how the research might be replicated and to allow potential readers to judge the reliability and validity of the study (Dawadi et al., 2021). That is, student editors can only provide comments regarding the context of data collection and analysis after going through the Methodology section. Although students were encouraged to pair themselves with others from the same field, they may not have been familiar with the procedures of data collection and analysis in the pieces of writing. In this case, it would have been difficult for them to make suggestions to their peers for improving this section. More research is needed to verify this claim, though.

To answer RQ3, concerning the Results section, the findings reveal that verification, especially positive verification and negative verification feedback, is associated with lower dyadic Results writing scores. This finding conflicts with evidence suggesting that positive verification may increase students’ motivation for writing revision activities (Carless, 2020; Sprouls et al., 2015). Multiple studies have suggested that students who receive positive verification tend to exhibit higher levels of academic writing performance, as this kind of comment is thought to reflect that the author’s writing meets the writing requirements and expectations, and this positive encouragement may reinforce continued steps toward writing goals (Carless, 2020; Wisniewski et al., 2020). In the current study, we speculate that positive verification may lead to complacency, which may reduce deeper reflection on the content of the paper, thereby preventing students from further improving their writing. These findings also conflict with the work of Novakovich (2016), who claimed that verification feedback may assist students in identifying their problems and improving subsequent performance. One possible explanation is that pointing out problems through verification, especially negative verification, is not equivalent to giving instructions on how to change the work to match the standards, and thus may not improve students’ writing performance. It is also possible that negative verification causes students to lose confidence in their own writing, which affects their motivation to revise, thereby inhibiting improvement in quality.

As for individual writing performance, the present results suggest that no category of comments was associated with students’ writing scores in the Results section. Although students received a considerable number of comments on the Results section in the present study, it may be the case that the main aim of this part of the unit was to arrange the respective research findings in a logical sequence (Maxwell, 2008), so there were not many comments on how to make improvements. Another possible explanation is that the Results section is the most straightforward section to write, so students did not struggle much with it.

As for RQ4, pertaining to the Discussion and Conclusion, there was also no relationship between any category of comments and student writing scores. This finding goes against the claim that both verification and elaboration feedback have a significant impact on learner performance (Shang, 2022; Maier et al., 2016). Generally speaking, in the Discussion and Conclusion, the author needs to describe the reasons leading to the experimental results, often according to their own subjective judgement (Öchsner, 2013); therefore, it is much easier for commenters to provide comments on these subjective descriptions. In the present study, students offered a large amount of elaboration feedback on the Discussion and Conclusion, and these comments contained a great deal of detailed information. However, none of these forms of feedback appeared to be associated with an improved Discussion and Conclusion. It may be the case that comments on these two sections happened to be too wordy or too complex to be understood, which may have decreased the authors’ subsequent reflection, preventing them from understanding the information and leading to a lower adoption rate of such feedback (Lipnevich & Panadero, 2021).

Finally, for RQ5, the findings suggest that negative verification, especially criteria-specific verification in dyads, is associated with improved writing performance on Abstracts. This result echoes evidence suggesting that negative verification indicates that the author’s writing is not on target, and after these errors are pointed out, the author may know that they need to take certain steps to change their writing in order to improve their performance (Wisniewski et al., 2020). In terms of feedback focus, the finding is in line with the research of Lipnevich and Panadero (2021), which found that specific comments are considered more helpful than general comments in peer review. This is because it is more beneficial for authors to reflect on their own writing when problems are identified and called out and when editing choices are made explicit. Authors are more willing to correct their errors when they receive specific comments as opposed to general and superficial feedback (Panadero & Lipnevich, 2022).

Negative verification also appeared to have a positive association with individual writing of Abstracts. Concurrently, informative elaboration had a negative effect on individual student Abstract writing scores, which was in stark contrast with the work of Fyfe and Rittle-Johnson (2016) who stated that interpretive and understandable elaboration feedback has a greater positive impact on student writing than right or wrong verification feedback. A possible explanation for the reversed finding in the current paper is that informative elaboration does not include advice and future guidance for students to improve their writing. With a deeper understanding of the topic described in the paper, students likely prefer to receive more specific and in-depth suggestions from their peers to help them to improve their work; thus, informative elaboration does not meet their expectations.

Another general finding of this study is that the mean frequency of verification feedback received by authors tended to increase as students moved through the subsequent units. Mean levels of verification were 0.27, 0.30, 0.50, 0.48, and 0.77 as time progressed. Though this finding was tangential and further tests would be needed to verify it, this pattern may be related to the fixed structure of academic papers, which include an Introduction, Methodology, Results, Discussion and Conclusion, and Abstract (Lin & Evans, 2012). Students may be more hesitant about making in-depth comments when they approach their peers’ writing on unfamiliar subjects (Santana, 2011). It is also possible that students need time to familiarise themselves with the content of the paper; therefore, it is challenging for students to receive specific suggestions from peers who do not yet have a thorough understanding of the topic in the Introduction section. Instead, authors receive more comments that focus on general features of the overall performance without mentioning any specific standards. Therefore, before beginning the online peer review, it may be preferable to provide some sort of orientation for the commenter to the paper’s content in order to elicit more effective comments.

5.1 Theoretical practice

The current study showed that various forms of feedback had similar effects on dyadic and individual writing performance. The results also revealed that the same type of feedback affected different sections of the writing differently and that verification had a greater effect on student academic writing than elaboration. It is clear that analysing the feedback provided by peers is an effective educational intervention in the process of online peer review (Chen et al., 2020; Tajabadi et al., 2023). This is consistent with current research that emphasises the necessity of regulating feedback content to provide effective feedback (Langfeldt et al., 2021; Zaharie & Osoian, 2016). The current study provides fresh insight into this problem: notably, when students provide too much detailed information in a single comment, the author may spend more time reflecting on it or may ignore lengthy peer comments altogether.

5.2 Pedagogical implications

The findings of this research are significant for both research and practice in education. They show that students in an online assessment environment gradually increase the quantity of their peer feedback, demonstrating the need for practice in peer feedback implementations. According to Panadero et al. (2016), in addition to ensuring this practice, teachers also need to supervise the peer review procedure and guide the students. Educators should pay attention to the categories of feedback in different manuscript sections that had the biggest impacts on student writing performance in order to support students in doing their best work. Previous studies have shown that high-quality peer feedback includes information on why something was (in)correct (i.e., positive and negative verifications), along with recommendations for how to improve the writing (i.e., suggestive elaborations), in addition to stating whether something is accurate or not. One of the key conclusions of this study is that, at least for the sample used, a large portion of feedback on academic writing may be categorised as elaborative. Nevertheless, in contrast to the outcomes of earlier research, the current study does not identify advantages of providing extensive elaborations on student writing. This could be attributed to the potential risk that overly lengthy and intricate elaborations might decrease students’ attention and hinder their self-reflection during the peer review process. Moreover, Google Docs was shown to be sufficient to facilitate the reciprocal feedback processes; therefore, teachers are advised to use this tool to organise peer feedback practice in their classrooms, despite any practical limitations associated with implementing multiple peer review sessions.

6 Conclusion

In this research, an in-depth analysis was conducted on the peer review of documents authored by students at a Korean university to explore the connection between various peer feedback categories and student writing quality across different manuscript sections. The current study employed the categorization of peer feedback proposed by Gielen and De Wever (2015) to investigate the impact of these categories on writing performance in five distinct writing sections for both group and individual levels.

Additionally, this study offers valuable empirical evidence that highlights the varying effects of different types of comments on student writing. Notably, feedback categorised as “verification” was found to correlate with student writing scores. These insights hold significance for educators and students. In terms of practical implications derived from our research, it is suggested that instructors may consider advising students to provide diverse forms of feedback across various manuscript sections in order to enhance the overall quality of their performance when engaging in online peer review.

Although this study fills some of the gaps in the research regarding the relationship between feedback categories and student writing performance, there are several inherent limitations. First, the sample size of the current study was small. While the results point to the potential for optimising types of feedback, future studies should involve more participants to ensure adequate power to test hypotheses. In addition, the study did not employ a pretest-posttest design; hence, it could not account for student writing performance before feedback, which may impact the validity of the results. Furthermore, due to the way the class was taught, the current study explored this impact through five separate writing sections across time rather than through a complete piece of writing containing all five sections, which may also have affected the validity of the results. In order to improve the reliability and generalisability of the findings, future studies could include such a design. In addition, associations between peer feedback types and academic writing performance may exist but could depend on the manuscript section and cultural context, and further research is necessary in this direction.

However, the current findings suggest that offering comments is a valuable tool during online peer editing. Therefore, the authors recommend further research to develop a systematic approach to enhance the role of peer review. Such an approach would have great potential to enable students to extract and reflect on the numerous pieces of information from peer feedback to be able to improve their own writing performance.