
Open Education Studies 2024; 6: 20220209

Research Article

Maria Klar*, Josef Buchner, Michael Kerres

Limits of Metacognitive Prompts for Confidence Judgments in an Interactive Learning Environment
https://doi.org/10.1515/edu-2022-0209
received July 13, 2023; accepted November 3, 2023

Abstract: Metacognitive activities are reported to improve learning, but prompts to support metacognition have so far been investigated only with mixed results. In the present study, metacognitive prompts for confidence judgments were implemented in a learning platform to provide more insight into their effectiveness and their limits. Comparing the prompted group (n = 51) with the control group (n = 150), no benefits of the prompts are seen: Performance is not better with prompts, and there is no improvement in metacognitive accuracy over time within the prompted group. Notably, half of the prompted group did not use the metacognitive prompts as intended. Alternative ways to integrate such prompts are discussed.

Keywords: metacognitive prompts, metacognitive accuracy, confidence prompts, self-regulated learning, K-12

* Corresponding author: Maria Klar, Chair of Educational Technology & Instructional Design, University of Duisburg-Essen, Essen, 45141, Germany, e-mail: [email protected]
Josef Buchner: Institute for ICT and Media, St. Gallen University of Teacher Education, St. Gallen, 9000, Switzerland
Michael Kerres: Chair of Educational Technology & Instructional Design, University of Duisburg-Essen, Essen, 45141, Germany

Open Access. © 2024 the author(s), published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 International License.

1 Introduction

In interactive learning environments, students are often required to regulate their learning, for example, in self-paced courses. To many students, especially lower-performing students, this can pose a challenge (DiFrancesca, Nietfeld, & Cao, 2016). There are many models of self-regulated learning, and they usually assume that students must plan and monitor their learning to some degree (e.g., Winne, 1997; Zimmerman & Moylan, 2009). Monitoring is the observation of one's own thinking and learning behavior; together with its counterpart, control, which is the regulation of that behavior, it constitutes the two key processes of metacognition (Nelson & Narens, 1990). Therefore, metacognition is an integral part of self-regulated learning (Dinsmore, Alexander, & Loughlin, 2008).

As part of monitoring, learners assess whether they have already understood a concept and can move on, or whether they should review the material. Monitoring helps students find the sweet spots in their learning process: Overconfident students invest too little time and effort, while underconfident students might invest too much time (Son & Metcalfe, 2000). Monitoring helps learners control their learning behavior effectively, and thus, strengthening metacognition might result in improved performance (Ohtani & Hisasaka, 2018). If we can help learners by supporting their metacognitive skills and activities, the question is what shape this support could take and how it could be implemented in learning environments. One possible measure of support is prompts for confidence judgments, which might help students not to be over- or underconfident and to regulate their learning process effectively. To date, it is yet to be determined whether such metacognitive prompts have a beneficial effect on learning. Hence, this research examines this evidence gap.

1.1 Metacognitive Accuracy

Monitoring can lead to more or less accurate judgments about cognition. Metacognitive accuracy is the degree of correspondence between confidence and performance (Jang, Lee, Kim, & Min, 2020). It is further distinguished into several aspects, the most common ones being relative accuracy (resolution) and absolute accuracy (calibration) (Jang et al., 2020; Schraw, 2009). Absolute accuracy is low if a student works on a set of tasks and expresses a high confidence, for example, between 80 and 100%, for these tasks but then scores low.
However, for the same confidence levels and results, relative accuracy can be high if the results are better on the tasks that had a confidence rating of 100% than on the tasks with a confidence of 80%. Conversely, relative accuracy can be low while absolute accuracy is high. Therefore, these two aspects of metacognitive accuracy are distinct. In assessments, both measures should be used to complement each other (Schraw, 2009), and students ideally should be accurate in both dimensions (Schwartz & Efklides, 2012).

Generally, metacognitive accuracy is found to be far from perfect (Glenberg & Epstein, 1985). Overall, students tend to be overconfident in predictions of their performance (Dunning, 2011; Hacker, Bol, Horgan, & Rakow, 2000; Maki & Berry, 1984; Miller & Geraci, 2011). More specifically, Hacker et al. (2000) found that high-performing students were rather accurate in predicting and postdicting (i.e., predicting after task completion) their exam scores. The highest-performing group even showed some underconfidence. In contrast, the lower-performing groups showed overconfidence, which increased as performance decreased (see also Bol, Hacker, O'Shea, & Allen, 2005; Lingel, Lenhart, & Schneider, 2019; Maki & Berry, 1984). This is in line with the Dunning–Kruger effect (Dunning, 2011), which stipulates that less skilled people are overconfident in their self-judgments because their lack of skill keeps them from correctly assessing what they do not know. Highly skilled people show a weak tendency to underestimate their abilities (Schwartz & Efklides, 2012). Their more comprehensive knowledge of a domain enables them to better assess which aspects they might not know yet. However, since research shows that even high-performing K-12 students can be overconfident (Lingel et al., 2019), support structures for metacognition could benefit K-12 students from all achievement levels.

1.2 Metacognitive Prompts

For students to use metacognition effectively, they need the ability to monitor and control (Nelson & Narens, 1990), and they also need to use these abilities frequently enough (Bannert & Mengelkamp, 2013). Instruction can help establish the ability, while reminders in the form of prompts can help increase the frequency of metacognitive activities (Moser, Zumbach, & Deibl, 2017). Metacognitive prompts can take on different forms. In some studies, students are prompted to verbalize their thoughts and decisions while learning (the think-aloud method; Bannert & Mengelkamp, 2008), or prompts can ask the students to take notes on how they want to plan their learning (Zumbach, Rammerstorfer, & Deibl, 2020). Prompts can also ask the students how confident they feel in their answers (Feyzi-Behnagh et al., 2014). Such a prompt for a confidence judgment requires the learner to use cues in order to assess their own performance (Koriat, 1997). With practice, students should become more fluent in recognizing cues for understanding or a lack thereof. The more often students recognize a lack of understanding, the more often they have the chance to regulate their learning, for example, by rereading the instructions, and thus to improve their performance.

It is reported that metacognitive prompts can be beneficial for learning because they can make students reflect on their learning more often (Sonnenberg & Bannert, 2015), and with practice, confidence judgments can become more accurate (Feyzi-Behnagh et al., 2014). More accurate judgments help students to regulate their learning more effectively and thus perform better. However, performance does not improve when students are able to monitor their learning process but fail to take the step of regulating their learning behavior (Dunlosky et al., 2021). There is also evidence that metacognitive prompts evoke neither monitoring nor regulation of the learning process (Johnson, Azevedo, & D'Mello, 2011), and in several studies, students did not use the prompts as intended (Bannert & Mengelkamp, 2013; Lingel et al., 2019; Moser et al., 2017). Therefore, the evidence on the usefulness of metacognitive prompts is mixed.

2 Research Questions and Hypotheses

First, as described earlier, there is an evidence gap concerning the effectiveness of such prompts for learning. Second, there is ample research supporting the claim that high-performing students are better at assessing their learning than low-performing students. We aim to further test this claim with prompts in an authentic K-12 setting. Third, there is little research on whether prompts can contribute to improved metacognitive accuracy. Therefore, this study is guided by the following research questions:

RQ1: Does the use of metacognitive prompts lead to higher performance?
H1: The group that uses metacognitive prompts performs better than the group that does not.
RQ2: Is performance related to metacognitive accuracy?
H2: Higher-performing students show better metacognitive accuracy than lower-performing students.
RQ3: Does metacognitive accuracy improve with repeated responses to metacognitive prompts?
H3: Metacognitive accuracy is better for the last third of the tasks than for the first third of the tasks.

3 Methods

To test these hypotheses in an applied setting, a quasi-experimental study was conducted. Metacognitive prompts were implemented in a learning platform. The prompts asked for confidence judgments and were added to 33 multiple-choice questions from an introductory course on computational thinking. In the experimental group, 51 secondary school students worked through the course with prompts; 150 secondary school students had completed the same course before the prompts were implemented and functioned as a control group. Data on performance and confidence judgments were used to calculate absolute and relative metacognitive accuracy.

3.1 The Platform and the Course

The study was conducted in cooperation with the German learning platform PearUp,1 which was founded in 2017 and has since merged with the platform eduki.2 PearUp was a for-profit learning platform where teachers could create interactive material and design interactive courses. Inherent to the learning experience with PearUp was a gamified meta-narrative: Students started their own "start-up business," and whenever they solved tasks, they collected "PearCoins," which they could invest into their start-up. In terms of feedback, "PearCoins" were also a rough quantitative measure of how many tasks a student had completed compared to their peers. There was a leaderboard of the three students scoring highest on the indicators of the start-up. As part of the treatment implementation, two dashboards were developed, showing the rate of correct first attempts and metacognitive accuracy. Because these features were implemented in the live product, it was not possible to conduct randomized A/B testing; instead, the experimental data were collected after implementation, and data from before the implementation were used as the control.

The course, titled "Introduction to Computational Thinking," was designed by the content creators of PearUp. The average duration was stated as 2 h. There were six units: "Sequences and Algorithms," "While-Loops," "For-Loops," "Comparing While-Loops and For-Loops," "Conditions," and "Nested For-Loops." The course was designed as a fully online, self-paced course. The majority of the tasks were interactive, such as multiple-choice questions and coding tasks.

1 https://www.pearup.de/
2 https://eduki.com/

3.2 Design of the Prompts

As discussed earlier, in experiments, metacognition is often measured through verbalizations made by the learners after completing a set of tasks. However, online monitoring methods correlate more strongly with higher performance than offline measures (Ohtani & Hisasaka, 2018), so in this study, the metacognitive prompts were presented online, with each multiple-choice question.

The prompts consisted of four buttons representing a 4-point confidence scale: "sure, rather sure, unsure, no clue." The buttons took the place of the "Submit" button, so that in order to submit the task, students had to click one of the four buttons. Figure 1 shows the multiple-choice question design with the metacognitive prompt.

Figure 1: Exemplary MC task with metacognitive prompts below, from the left: "sure," "rather sure," "unsure," and "no clue" – experimental group (screenshot taken by the first author).

3.3 Sampling

Before the implementation of the prompts, 150 students had already completed the course, and their data were used for the control group. No personal data were gathered, but teachers had to give a "class name" when they wanted to use the material with their groups. These class names were provided by PearUp, and most of them implied the students' grade levels. Thus, it was inferred that the students in the control group were aged 11–18. The class names also implied that many students accessed this course as part of an extracurricular activity.

Data for the experimental group were gathered in several ways: One of the authors taught the course as part of classes and extracurricular activities; 28 students completed the course this way. These were students from grades 7 to 9, so they were aged 12–16. The rest of the data stem from classes taught by other teachers, who were recruited through PearUp and a call for participation on social media; 23 students from a similar age group completed the course in this way. The teachers reported an age range of 13–16. Together, this results in a sample size of 51 for the experimental group.

PearUp complies with the General Data Protection Regulation, and no personal data were gathered. The students were informed that the anonymous interaction data gathered by the platform would be used for research purposes, and they were given the option not to use the material. None of the students chose this option.

3.4 Data Processing and Analysis

PearUp provided the raw data as well as a Jupyter Notebook file with some preliminary pre-processing code written in Python. For further pre-processing, Python code was written in a Jupyter Notebook, using the pandas library. Performance was coded as correct (1.0) and wrong (0.0). Metacognitive judgments were coded as sure (1.0), rather sure (0.66), unsure (0.33), and no clue (0.0).
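
To make the coding step concrete, the following is a minimal sketch of what such a mapping could look like in pandas. The `events` frame and its column names are illustrative assumptions; the paper does not publish the actual PearUp export schema or the pre-processing code.

```python
import pandas as pd

# Hypothetical event log; the real PearUp export schema is not published.
events = pd.DataFrame({
    "student_id":     [1, 1, 2],
    "task_id":        [1, 2, 1],
    "answer_correct": [True, False, True],
    "judgment":       ["sure", "rather sure", "no clue"],
})

# Performance: correct = 1.0, wrong = 0.0.
events["performance"] = events["answer_correct"].astype(float)

# Confidence: sure = 1.0, rather sure = 0.66, unsure = 0.33, no clue = 0.0.
confidence_map = {"sure": 1.0, "rather sure": 0.66, "unsure": 0.33, "no clue": 0.0}
events["confidence"] = events["judgment"].map(confidence_map)
```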

3.5 Measures for Metacognitive Accuracy

Absolute metacognitive accuracy measures how well students can judge their performance. Schraw (2009, p. 36) suggested the Absolute Accuracy Index:

\[ \text{Absolute accuracy index} = \frac{1}{N} \sum_{i=1}^{N} (c_i - p_i)^2 \tag{1} \]

Performance p_i is scored as either 0 or 1. The learners give a confidence rating c_i ranging between 0 and 1. The performance score is subtracted from the confidence score, resulting in a number between −1 and 1. This number is squared, so the result ranges from 0 to 1, with 0 indicating the highest accuracy because the deviation between confidence and performance is lowest. Dividing by the number of items N results in the mean absolute accuracy.

Relative metacognitive accuracy measures how confidence and performance correlate. Thus, one established measure of relative metacognitive accuracy is Pearson's r (Schraw, 2009).
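
Translated into pandas, the two measures could be computed per student roughly as follows. This is a sketch building on the hypothetical `events` frame from the previous listing, not the authors' published code.

```python
import pandas as pd

def absolute_accuracy(group: pd.DataFrame) -> float:
    # Equation (1): mean squared deviation between confidence and performance;
    # 0 indicates perfect calibration, 1 the largest possible deviation.
    return ((group["confidence"] - group["performance"]) ** 2).mean()

def relative_accuracy(group: pd.DataFrame) -> float:
    # Pearson's r between confidence and performance. Note that r is
    # undefined (NaN) for a student whose confidence has no variance,
    # e.g., one who always presses the same button.
    return group["confidence"].corr(group["performance"])

per_student = events.groupby("student_id").apply(
    lambda g: pd.Series({
        "abs_acc": absolute_accuracy(g),
        "rel_acc": relative_accuracy(g),
    })
)
```

The NaN case matters in practice: relative accuracy is simply not defined for students who use a single confidence level throughout, whereas absolute accuracy is.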

4 Results

4.1 Descriptive Statistics

4.1.1 Performance

In the control group (n = 150), students solved, on average, 52.3% of the tasks on their first attempt (SD = 16.7). In the group with metacognitive prompts (prompted group; n = 51), students solved 54.7% of the tasks on their first attempt (SD = 12.9). According to the Kolmogorov–Smirnov test for normality, performance is not normally distributed in the combined sample of N = 201 (p = 0.011).
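
The normality check could be reproduced along the following lines. This is a sketch with simulated stand-in data; the paper does not name the statistics library, and scipy's `kstest` against a normal distribution with sample-estimated parameters is only one plausible implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the per-student rates of correct first attempts.
control_rates  = rng.beta(5, 5, size=150)  # n = 150, control group
prompted_rates = rng.beta(5, 4, size=51)   # n = 51, prompted group

perf = np.concatenate([control_rates, prompted_rates])  # combined sample, N = 201
# KS test against a normal distribution with the sample's mean and SD;
# p < 0.05 speaks against normality and motivates a non-parametric test.
statistic, p_value = stats.kstest(perf, "norm", args=(perf.mean(), perf.std(ddof=1)))
```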

4.1.2 Metacognitive Judgments

The highest confidence judgment was given in 88.6% of the tasks. Table 1 gives an overview of the distribution of judgments across all tasks.

Table 1: Frequency distribution of confidence judgments for all tasks

              Correct   Incorrect   Total   Percentage
Sure              838         653   1,491         88.6
Rather sure        66          77     143          8.5
Unsure             15          26      41          2.4
No clue             1           7       8          0.5
Total             920         763   1,683

For reporting the results, the four confidence judgments are coded ranging from 3.0 = "sure" to 0.0 = "no clue." The average of all confidence judgments is 2.85 (SD = 0.27); 23 of the 51 students used only the button for the highest confidence judgment for all tasks. When these students are excluded, the average confidence judgment remains high at 2.73.

4.2 Testing the Hypotheses

4.2.1 RQ1: Does the use of metacognitive prompts lead to higher performance?

Because the performance values were not normally distributed and because the samples were of unequal size, the non-parametric Mann–Whitney U-test was used (Harwell, 1988; Zimmerman, 1987). The group with metacognitive prompts solved 54.7% of the tasks correctly on their first attempt, while the group without the prompts solved 52.3% correctly. The exact Mann–Whitney U-test yielded no significant difference between the groups (U = 3,591; p = 0.257). This is also the case when only the students who did not exclusively use the highest confidence button are considered (performance: 54.4%; U = 1,989; p = 0.33). Therefore, hypothesis 1, stating that the prompted group would outperform the control group, is rejected.
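
A sketch of how this comparison could be run with scipy, reusing the hypothetical per-student rates from the previous listing (the paper does not state which implementation of the exact test was used):

```python
from scipy import stats

# Exact two-sided Mann–Whitney U-test comparing the per-student
# performance rates of the prompted and the control group.
u_statistic, p_value = stats.mannwhitneyu(
    prompted_rates, control_rates,
    alternative="two-sided",
    method="exact",  # exact null distribution instead of the normal approximation
)
```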

4.2.2 RQ2: Is performance related to metacognitive accuracy?

The sample was split along the median into two groups. Students were ranked by their performance. The lower-performing half (n = 26) was categorized as low-performing students (low). The other half (n = 25) was categorized as high-performing students (high).

4.2.2.1 Absolute Accuracy

The lower-performing group chose slightly lower confidence levels than the higher-performing group (x̄(low) = 2.76, x̄(high) = 2.94), but they performed more poorly on average (x̄(low) = 0.44, x̄(high) = 0.65), and thus, the low-performing group had a lower absolute accuracy (index = 0.49) than the high-performing group (index = 0.33; recall that lower index values indicate higher accuracy). This difference is significant according to the exact Mann–Whitney U-test (U = 106, p < 0.001).

4.2.2.2 Relative Accuracy

The correlation between confidence and performance is positive, weak, and not significant for the higher-performing group (r(23) = 0.2, p = 0.33). It is negative, moderate, and significant for the low-performing group (r(24) = −0.46, p = 0.02). This means that in the low-performing group, high confidence is moderately correlated with low performance. The low-performing group gives higher confidence judgments when they perform poorly, while the high-performing group gives higher confidence judgments when they indeed perform better. Fisher's z transformation shows that the difference between the correlations is significant (z = 2.689, p = 0.004).

Hypothesis 2 is, therefore, accepted. Low-performing students show lower absolute as well as lower relative accuracy compared to the high-performing group.

4.2.3 RQ3: Does metacognitive accuracy improve with repeated responses to metacognitive prompts?

In order to test the hypothesis that metacognitive accuracy improves over time, it is necessary to define temporal segments of the task events. The students could choose the order of the tasks to some degree, so they did not solve the tasks in exactly the same order. The data were split into three temporal segments: For each student, an average was calculated for the first eleven tasks (t1), the second eleven tasks (t2), and the third eleven tasks (t3).

4.2.3.1 Absolute Accuracy

Absolute accuracy is sensitive to task difficulty, and task difficulty was not standardized here. Still, the results show that absolute accuracy did not improve over time. While confidence levels remained stable from t1 to t3, absolute accuracy varies in accordance with performance, as can be seen in Table 2. Thus, changes in absolute accuracy can be attributed to changes in performance, because confidence levels remain very stable.

Table 2: Mean performance, confidence, and absolute accuracy across the three time segments

                                                    Tasks 1–11   Tasks 12–22   Tasks 23–33
Mean performance (min = 0, max = 1)                      0.686         0.459         0.494
Mean confidence ("no clue" = 0, "sure" = 3)              2.85          2.84          2.86
Absolute accuracy index (0 = best, 1 = worst)            0.28          0.50          0.47

4.2.3.2 Relative Accuracy

For each segment, the performance values and confidence values were tested for correlation. The results are as follows:
▪ tasks 1–11 (t1): r(49) = 0.03 (p = 0.83)
▪ tasks 12–22 (t2): r(49) = 0.033 (p = 0.81)
▪ tasks 23–33 (t3): r(49) = 0.045 (p = 0.75)

The correlation is very low for all three time segments. There is a marginal increase in relative accuracy, but the difference between t1 and t3 is not significant (z = −0.072, p = 0.47). Thus, hypothesis 3, stating that metacognitive accuracy would increase, is rejected.
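
The temporal segmentation and the per-segment statistics could be expressed in pandas roughly as follows. The `events` frame is carried over from the earlier sketches and the `timestamp` column is a further assumption; the actual pre-processing code was not published.

```python
import pandas as pd

# Order each student's task events chronologically ("timestamp" is a
# hypothetical event-time column) and bin them into thirds:
# tasks 1-11 -> t1, tasks 12-22 -> t2, tasks 23-33 -> t3.
events = events.sort_values(["student_id", "timestamp"])
events["task_rank"] = events.groupby("student_id").cumcount()  # 0..32 per student
events["segment"] = pd.cut(
    events["task_rank"], bins=[-1, 10, 21, 32], labels=["t1", "t2", "t3"]
)

# Per-student means within each segment, then the per-segment correlation
# between performance and confidence (relative accuracy, Section 4.2.3.2).
per_student = (
    events.groupby(["student_id", "segment"], observed=True)
          [["performance", "confidence"]].mean().reset_index()
)
r_by_segment = per_student.groupby("segment", observed=True).apply(
    lambda g: g["performance"].corr(g["confidence"])
)
```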

5 Discussion

The aim of this study was to investigate the potential benefits and limits of metacognitive prompts in an interactive learning environment. For this purpose, a feature was implemented in a learning platform that required students to assess their level of confidence for each answer to a multiple-choice question in a self-paced course. When answering a multiple-choice question, students chose between four confidence levels: "sure," "rather sure," "unsure," and "no clue." Such prompts are used in existing learning platforms, and they are easy to implement, so if they showed benefits, it would be efficient to implement them in more learning environments. Since there is little research on this particular type of prompt in an applied setting, this research looked into its effectiveness in a K-12 setting.

The results show that almost half of the students exclusively chose the highest confidence judgment, "sure." This could be regarded as non-compliant use, because if students had actually engaged in monitoring and received feedback on their varying performance, some variation in confidence levels would be expected. The other half of the students did show some variation in their confidence judgments. Overall, the average self-assessment can be described as overconfident. This level of overconfidence is in line with findings from a study with a similar sample by Lingel et al. (2019): German middle school students took a math test and judged their results as correct or likely correct in 84% of the cases before the task and 73% after the task, while only answering 52% of the questions correctly.

With this high level of overconfidence and non-compliance, there is little leverage for a beneficial influence on performance and hardly any room for improvement. Consequently, no significant difference in performance between the group with metacognitive prompts and the control group could be found. As expected from previous research, higher-performing students showed better absolute and relative accuracy than lower-performing students in the prompted group (Hacker et al., 2000; Miller & Geraci, 2011). Finally, there was no improvement in relative or absolute metacognitive accuracy across time.

5.1 Empirical Contributions

There is mixed evidence on the effects of metacognitive prompts on students' performance. Metacognitive prompts were shown to lead to better performance in some cases (Renner & Renner, 2001; Veenman, Kok, & Blöte, 2005), though some studies found an effect only for transfer tasks (Bannert & Mengelkamp, 2013; Lin & Lehman, 1999). Along these lines, Stark and Krause (2009) found improved performance only for complex tasks, not simple ones. As the tasks used in the course for this study were less complex, the lack of performance improvement could be a further indication of this pattern.

There is less research on whether students improve their metacognitive accuracy with repeated self-assessment. Here, the students did not improve their metacognitive accuracy, which is in line with studies that saw no improvement in calibration after several training sets of quizzes or practice tests (Bol et al., 2005; Bol & Hacker, 2001). In a study by Hacker et al. (2000), students made predictions and postdictions for three exams, and throughout the course there was instruction on, and emphasis of, the benefits of self-assessment. There, the high-performing students, but not the low-performing students, showed an increase in accuracy. The present study provides evidence that without additional instruction, repeated exposure to metacognitive prompts does not increase metacognitive accuracy.

5.2 Practical Contributions

When designing interactive learning environments, prompts like the ones used in this study are relatively easy to implement and could be used as part of the default course design.3 However, as a limit, it should be critically examined whether such prompts by themselves have a beneficial effect. As Schwonke (2015) pointed out, metacognitive processes could be ineffective or even detrimental if they use cognitive resources in a way that does not support the learning process. Schwonke suggested categorizing metacognitive load as a kind of working memory load. It is plausible to assume that mismatched or overly complex metacognitive prompts hinder the learning process.

3 For example, as of January 2024, the platform "Area9 Rhapsode Learner" (https://area9lyceum.com/the-platform/rhapsode-learner/) uses such prompts for every multiple-choice question and reports these data back to the learners as a score for "meta learning."
And yet, metacognitive prompts – even in the simple form that was used in this study – might be beneficial if they are complemented with explicit instruction on metacognition and its role in learning (Kistner et al., 2010) and enough training time for students to engage in monitoring (Bannert & Mengelkamp, 2013). In order to avoid fatigue, these prompts could be used for a limited amount of course time, with instruction in the beginning and formative reflection throughout the course. Furthermore, the learning environment used for this study had a gamified meta-narrative, but students did not receive game benefits if they judged themselves correctly. In a future iteration of the prompting feature, it could be tested whether game benefits for accuracy could provide an incentive for students to invest the mental effort demanded by monitoring.

5.3 Limitations and Future Research

Above all, the lack of variation in confidence judgments indicates that the metacognitive prompts were not used as intended, especially regarding, but not limited to, the students who only used the highest confidence judgment. This non-compliant use presents an issue concerning validity and impedes testing the hypotheses to some degree. Conclusions about the effects of metacognitive prompts can only be made to the degree that the selection of the confidence button reflects an actual metacognitive judgment made by the student.

This student behavior might have been caused by some choices in the sampling and research design. The sample of the prompted group was primarily gathered in contexts of extracurricular, voluntary activities, where students might not have been motivated to invest the extra mental effort demanded by the prompts. On top of that, the gamified design of the learning platform might have contributed to this by evoking a playful sense of learning that does not call for the increased mental effort required by metacognitive monitoring and control. However, as described earlier, the participants in the study by Lingel et al. (2019), students of a similar demography, showed similar overconfidence in a more formal setting. Still, future studies could investigate whether students react to such prompts differently in more formal learning contexts.

6 Conclusion

In this study, we have shown that a simple form of metacognitive prompts without supplementary instruction confers no benefits to student learning and does not result in improved metacognitive accuracy over time. Students showed high levels of overconfidence and non-compliance. When implementing such a feature into an interactive learning environment, it should be critically examined whether the feature brings about the desired results. As such prompts can induce a higher (meta-)cognitive load and might negatively influence motivation and affect, it might be advisable not to use them extensively but intentionally, with supplementary instruction and formative reflection.

Acknowledgments: We thank PearUp for implementing the prompting feature and providing the raw data.

Funding information: No funding was received for conducting this study.

Conflict of interest: The authors state no conflict of interest.

Data availability statement: The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

Bannert, M., & Mengelkamp, C. (2008). Assessment of metacognitive skills by means of instruction to think aloud and reflect when prompted. Does the verbalisation method affect learning? Metacognition and Learning, 3(1), 39–58. doi: 10.1007/s11409-007-9009-6.
Bannert, M., & Mengelkamp, C. (2013). Scaffolding hypermedia learning through metacognitive prompts. In R. Azevedo & V. Aleven (Eds.), International handbook of metacognition and learning technologies (pp. 171–186). New York: Springer. doi: 10.1007/978-1-4419-5546-3_12.
Bol, L., & Hacker, D. J. (2001). A comparison of the effects of practice tests and traditional review on performance and calibration. The Journal of Experimental Education, 69(2), 133–151. doi: 10.1080/00220970109600653.
Bol, L., Hacker, D. J., O'Shea, P., & Allen, D. (2005). The influence of overt practice, achievement level, and explanatory style on calibration accuracy and performance. The Journal of Experimental Education, 73(4), 269–290. doi: 10.3200/JEXE.73.4.269-290.
DiFrancesca, D., Nietfeld, J. L., & Cao, L. (2016). A comparison of high and low achieving students on self-regulated learning variables. Learning and Individual Differences, 45, 228–236. doi: 10.1016/j.lindif.2015.11.010.
Dinsmore, D. L., Alexander, P. A., & Loughlin, S. M. (2008). Focusing the conceptual lens on metacognition, self-regulation, and self-regulated learning. Educational Psychology Review, 20(4), 391–409. doi: 10.1007/s10648-008-9083-6.
Dunlosky, J., Mueller, M. L., Morehead, K., Tauber, S. K., Thiede, K. W., & Metcalfe, J. (2021). Why does excellent monitoring accuracy not always produce gains in memory performance? Zeitschrift für Psychologie, 229(2), 104–119. doi: 10.1027/2151-2604/a000441.
Dunning, D. (2011). The Dunning–Kruger effect: On being ignorant of one's own ignorance. In J. M. Olson & M. P. Zanna (Eds.), Advances in experimental social psychology (Vol. 44, pp. 247–296). Cambridge, MA: Academic Press. doi: 10.1016/B978-0-12-385522-0.00005-6.
Feyzi-Behnagh, R., Azevedo, R., Legowski, E., Reitmeyer, K., Tseytlin, E., & Crowley, R. S. (2014). Metacognitive scaffolds improve self-judgments of accuracy in a medical intelligent tutoring system. Instructional Science, 42(2), 159–181. doi: 10.1007/s11251-013-9275-4.
Glenberg, A. M., & Epstein, W. (1985). Calibration of comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(4), 702–718. doi: 10.1037/0278-7393.11.1-4.702.
Hacker, D. J., Bol, L., Horgan, D. D., & Rakow, E. A. (2000). Test prediction and performance in a classroom context. Journal of Educational Psychology, 92(1), 160–170. doi: 10.1037/0022-0663.92.1.160.
Harwell, M. R. (1988). Choosing between parametric and nonparametric tests. Journal of Counseling & Development, 67(1), 35–38. doi: 10.1002/j.1556-6676.1988.tb02007.x.
Jang, Y., Lee, H., Kim, Y., & Min, K. (2020). The relationship between metacognitive ability and metacognitive accuracy. Metacognition and Learning, 15(3), 411–434. doi: 10.1007/s11409-020-09232-w.
Johnson, A. M., Azevedo, R., & D'Mello, S. K. (2011). The temporal and dynamic nature of self-regulatory processes during independent and externally assisted hypermedia learning. Cognition and Instruction, 29(4), 471–504. doi: 10.1080/07370008.2011.610244.
Kistner, S., Rakoczy, K., Otto, B., Dignath-van Ewijk, C., Büttner, G., & Klieme, E. (2010). Promotion of self-regulated learning in classrooms: Investigating frequency, quality, and consequences for student performance. Metacognition and Learning, 5(2), 157–171. doi: 10.1007/s11409-010-9055-3.
Koriat, A. (1997). Monitoring one's own knowledge during study: A cue-utilization approach to judgments of learning. Journal of Experimental Psychology: General, 126(4), 349–370. doi: 10.1037/0096-3445.126.4.349.
Lin, X., & Lehman, J. D. (1999). Supporting learning of variable control in a computer-based biology environment: Effects of prompting college students to reflect on their own thinking. Journal of Research in Science Teaching, 36(7), 837–858. doi: 10.1002/(SICI)1098-2736(199909)36:7<837::AID-TEA6>3.0.CO;2-U.
Lingel, K., Lenhart, J., & Schneider, W. (2019). Metacognition in mathematics: Do different metacognitive monitoring measures make a difference? ZDM, 51(4), 587–600. doi: 10.1007/s11858-019-01062-8.
Maki, R. H., & Berry, S. L. (1984). Metacomprehension of text material. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10(4), 663–679. doi: 10.1037/0278-7393.10.4.663.
Miller, T. M., & Geraci, L. (2011). Unskilled but aware: Reinterpreting overconfidence in low-performing students. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(2), 502–506. doi: 10.1037/a0021802.
Moser, S., Zumbach, J., & Deibl, I. (2017). The effect of metacognitive training and prompting on learning success in simulation-based physics learning. Science Education, 101(6), 944–967. doi: 10.1002/sce.21295.
Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and new findings. In Psychology of learning and motivation (Vol. 26, pp. 125–173). Amsterdam: Elsevier. doi: 10.1016/S0079-7421(08)60053-5.
Ohtani, K., & Hisasaka, T. (2018). Beyond intelligence: A meta-analytic review of the relationship among metacognition, intelligence, and academic performance. Metacognition and Learning, 13(2), 179–212. doi: 10.1007/s11409-018-9183-8.
Renner, C., & Renner, M. (2001). But I thought I knew that: Using confidence estimation as a debiasing technique to improve classroom performance. Applied Cognitive Psychology, 15, 23–32. doi: 10.1002/1099-0720(200101/02)15:1<23::AID-ACP681>3.0.CO;2-J.
Schraw, G. (2009). A conceptual analysis of five measures of metacognitive monitoring. Metacognition and Learning, 4(1), 33–45. doi: 10.1007/s11409-008-9031-3.
Schwartz, B. L., & Efklides, A. (2012). Metamemory and memory efficiency: Implications for student learning. Journal of Applied Research in Memory and Cognition, 1(3), 145–151. doi: 10.1016/j.jarmac.2012.06.002.
Schwonke, R. (2015). Metacognitive load – Useful, or extraneous concept? Metacognitive and self-regulatory demands in computer-based learning. Journal of Educational Technology & Society, 18(4), 172–184.
Son, L. K., & Metcalfe, J. (2000). Metacognitive and control strategies in study-time allocation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(1), 204–221. doi: 10.1037/0278-7393.26.1.204.
Sonnenberg, C., & Bannert, M. (2015). Discovering the effects of metacognitive prompts on the sequential structure of SRL-processes using process mining techniques. Journal of Learning Analytics, 2(1), Article 1. doi: 10.18608/jla.2015.21.5.
Stark, R., & Krause, U.-M. (2009). Effects of reflection prompts on learning outcomes and learning behaviour in statistics education. Learning Environments Research, 12(3), 209–223. doi: 10.1007/s10984-009-9063-x.
Veenman, M. V. J., Kok, R., & Blöte, A. W. (2005). The relation between intellectual and metacognitive skills in early adolescence. Instructional Science, 33(3), 193–211. doi: 10.1007/s11251-004-2274-8.
Winne, P. H. (1997). Experimenting to bootstrap self-regulated learning. Journal of Educational Psychology, 89(3), 397–410. doi: 10.1037/0022-0663.89.3.397.
Zimmerman, B. J., & Moylan, A. R. (2009). Self-regulation: Where metacognition and motivation intersect. In Handbook of metacognition in education (pp. 299–315). New York: Routledge/Taylor & Francis Group.
Zimmerman, D. W. (1987). Comparative power of Student t test and Mann–Whitney U test for unequal sample sizes and variances. The Journal of Experimental Education, 55(3), 171–174. doi: 10.1080/00220973.1987.10806451.
Zumbach, J., Rammerstorfer, L., & Deibl, I. (2020). Cognitive and metacognitive support in learning with a serious game about demographic change. Computers in Human Behavior, 103, 120–129. doi: 10.1016/j.chb.2019.09.026.
