[From Proceedings of the Human Factors Society 41st Annual Meeting, September 22-26, 1997. Copyright by the Human Factors and Ergonomics Society, P.O. Box 1369, Santa Monica, CA 90406-1369 USA; 310/394-1811, fax 310/394-2410, http://hfes.
Assessing Virtual Reality's Potential for Teaching Abstract Science
Marilyn C. Salzman, Chris Dede, R. Bowen Loftin, and Debra Sprague
Understanding how to leverage the features of immersive, three-dimensional (3-D) multisensory virtual reality to meet user needs presents a challenge for human factors researchers. This paper describes our approach to evaluating this medium's potential as a tool for teaching abstract science. It describes some of our early research outcomes and discusses an evaluation comparing a 3-D VR microworld to an alternative 2-D computer-based microworld. Both are simulations in which students learn about electrostatics. The outcomes of the comparison study suggest: 1) the immersive 3-D VR microworld facilitated conceptual and three-dimensional learning that the 2-D computer microworld did not, and 2) VR's multisensory information aided students who found the electrostatics concepts challenging. As a whole, our research suggests that VR's immersive representational abilities have promise for teaching and for visualization. It also demonstrates that characteristics of the learning experience such as usability, motivation, and simulator sickness are an important part of assessing this medium's potential.
As new media emerge, human factors engineers are challenged to leverage their strengths to meet user needs. We faced such a challenge: assessing the potential of virtual reality (VR) for teaching abstract science. To help you understand the nature of this challenge, we begin by discussing user needs and the features of VR that may be useful in supporting those needs. Then we describe what we have done to meet this challenge, focusing primarily on a research study designed to assess VR's representational abilities.
Consider the challenge students face when trying to grasp abstract science concepts. Mastery of such concepts requires students to build generic and runnable mental models (e.g., Larkin, 1983; Redish, 1993). These mental models need to incorporate invisible factors and abstractions. Unfortunately, real-life metaphors upon which to build these mental models may not exist, making it difficult for students to envision abstract phenomena (Frederiksen & White, 1992). For example, learning electrostatics or quantum mechanics involves understanding phenomena that behave in ways remote from direct experience. Additionally, students' real-life experiences are confounded with invisible factors that distort or contradict the principles they need to master. For example, friction unobtrusively distorts the behavior that objects would otherwise exhibit under Newton's laws of motion. Current pedagogical approaches and simulation technologies have not solved this problem; students continue to struggle with abstract, counterintuitive science concepts. Researchers in physics education agree that alternative tools and approaches are necessary (e.g., Hewitt, 1981; Redish, 1993; White, 1993).
Immersive virtual reality microworlds may support the type of learning environments students need. In fact, many researchers (e.g., Psotka, 1996; Winn, 1993) believe immersive VR has potential as a learning and training tool because of its representations. VR is three-dimensional (3-D) and physically immersive; it facilitates multiple frames of reference and supports multisensory interaction. By enabling students to manipulate and experience phenomena in new ways, immersive VR can 1) provide the experiential referents students lack, 2) enable them to perceive factors and relationships that might otherwise be invisible, and 3) support opportunities to challenge and refine their mental models. Unfortunately, despite widespread speculation about the value of VR's representational features, we understand little about how to leverage these features for learning on tasks beyond procedural training.
Here is what we do know. People can learn in VR and transfer that knowledge to tasks outside the VR environment (Bailey & Witmer, 1994; Regian, Shebilske, & Monk, 1992). However, transfer does not always occur, indicating that it is important to consider the nature of the learning task (Kozak, et al., 1993). Modest evidence also exists to suggest that students can learn physics better in a VR environment than through traditional lectures (Brelsford, 1993). Finally, some researchers report that students find learning in VR more motivating than learning through traditional lectures (Bricken & Byrne, 1993). These studies barely scratch the surface and reveal little about the representational features responsible for these outcomes.
Our research has been dedicated to determining which VR features have potential for learning, as well as how to leverage those features in the design of virtual microworlds. Our early research suggested that VR's representations (immersive 3-D and multisensory cues) can enhance learning. For example, we found that students' abilities to visualize electric fields were enhanced significantly when they were able to view and manipulate 3-D representations of the electric fields. We found that multisensory cues could be used to direct students' attention to important behaviors and relationships (Dede, Salzman, & Loftin, 1996). We also observed that factors such as usability and simulator sickness influenced individual learning experiences and might have affected learning outcomes. For example, individuals varied substantially in 1) their facility with the interface and 2) their susceptibility to simulator sickness. Both factors tended to distract students from the learning task (Salzman, Dede, & Loftin, 1995).
Purpose for the current study
Through our early studies, we demonstrated that students can learn in VR. We also identified several factors that might influence the learning process. However, we were not able to establish whether the learning was due to the method of instruction (the lessons) or to VR's representations. Therefore, we conducted a comparison study to determine whether VR's representations aid the development of runnable mental models for abstract science concepts beyond what the features of an alternative computer-based simulation provide when both are used with identical pedagogical techniques.
We selected electrostatics as our domain. Mastery of electrostatics requires students to understand abstract 3-D phenomena (e.g., the distribution of force and energy created by a set of charged particles) for which they have no real-life referents. By carefully examining different types of learning (discussed in the methods section) and characterizing the learning experience, we hoped to understand some of the strengths and weaknesses of VR's representations.
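To make the domain concrete, the following minimal sketch computes the quantities students explore in these microworlds: the electric field and the electric potential that a set of point charges creates at a location in 3-D space, using the standard superposition of Coulomb contributions. It is not code from MaxwellWorld or EM Field; the function name, the dipole layout, and the charge values are assumptions chosen only for illustration.

```python
import math

K = 8.9875517923e9  # Coulomb constant, N*m^2/C^2

def field_and_potential(charges, point):
    """Superpose point-charge contributions at `point`.

    charges: list of (q_in_coulombs, (x, y, z)) tuples (hypothetical layout).
    Returns ((Ex, Ey, Ez) in N/C, V in volts).
    """
    ex = ey = ez = 0.0
    v = 0.0
    for q, (x, y, z) in charges:
        dx, dy, dz = point[0] - x, point[1] - y, point[2] - z
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        if r == 0.0:
            raise ValueError("point coincides with a source charge")
        scale = K * q / r ** 3      # E = k q r_vec / r^3
        ex += scale * dx
        ey += scale * dy
        ez += scale * dz
        v += K * q / r              # V = k q / r
    return (ex, ey, ez), v

# Illustrative dipole: +1 nC and -1 nC placed 0.2 m apart on the x-axis.
dipole = [(1e-9, (-0.1, 0.0, 0.0)), (-1e-9, (0.1, 0.0, 0.0))]
e_origin, v_origin = field_and_potential(dipole, (0.0, 0.0, 0.0))
e_up, v_up = field_and_potential(dipole, (0.0, 0.05, 0.0))
e_out, v_out = field_and_potential(dipole, (0.0, 0.0, 0.05))
```

Midway between the two charges the potential cancels while the field points from the positive charge toward the negative one. The last two evaluations sit at equal distances from the axis in different directions and give the same potential and the same axial field component, which is the kind of 3-D symmetry a single 2-D slice cannot show directly.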
We compared two microworlds designed to teach electrostatics: an immersive 3-D VR microworld we designed, MaxwellWorld (MW), and a similar, exemplary 2-D Macintosh-based commercial microworld, EM Field (EMF) (Trowbridge & Sherwood, 1994).
MW's interface was typical of current high-end virtual reality. Hardware included a graphics workstation, magnetic tracking system, head-mounted display, 3-D mouse and menu, stereo sound, and haptic vest. EM Field ran on a Macintosh with a 14" color display.
Both microworlds supported all of the learning activities we used in our study and had similar menu organization and interaction styles. In both simulations, students used a mouse, menus, and direct manipulation to perform learning activities. The microworlds differed primarily in their representational capabilities: MW supported immersive 3-D visual representations while EMF supported 2-D visual representations. See Figure 1. Additionally, MW supported multisensory representations, while EMF did not.
Because the microworlds were supported by different technologies, differences also existed in how students physically interacted with the systems. In MW, the mouse, menu and direct manipulation were 3-D; in EMF, they were 2-D. MW's physical interface was a typical VR interface, while EMF's was a typical Macintosh interface. Figure 2 shows a person interacting with MW. It is important to note the following: although these physical interaction differences presented a potential confound, we expected MW's physical interaction to be more cumbersome than EMF's, placing MW at a disadvantage. Thus, finding a learning advantage for MW would mean that MW was a strong learning environment despite weaknesses in its physical interface.
We conducted the study in two stages. During stage 1, we examined whether visual aspects of the microworlds' representations influenced learning outcomes. Fourteen high school students participated in stage 1. Male and female students were assigned randomly to one of two groups: EMF and MW. Both groups were equivalent in terms of their science background. Students completed two lessons in the microworld to which they were assigned. In session 1, lessons covered concepts related to the distribution of force and energy in electric fields. They leveraged the visual representations used in EMF and MW. Lessons were constructed so that the content and learning activities were the same for both groups. Therefore, what differed from an educational perspective was the kind of representations the two groups of students worked with during the lessons.
During stage 2, we examined the "value added" by MW's multisensory representations. Seven EMF and MW students returned for stage 2 approximately 5 months after participating in stage 1. All students used MW. In MW, they completed a third lesson that relied on multisensory cues to supplement the visual representations. This lesson built upon the concepts taught in earlier lessons.
In both stages, the lessons consisted of a series of learning activities. Each successive learning activity built on the previous activities, increasing both in level of complexity and in the information integration necessary. Each learning activity asked the students to make verbal predictions about the activity, observe the actual outcomes of the activity, and verbally compare their predictions to their observations. We found this cycle of predict-observe-compare useful in gauging the students' understanding and progress, and for priming students for the upcoming activity. Lessons were administered verbally and student comments were videotaped and logged throughout the sessions. Lessons required approximately 1 hour 15 minutes to complete, and sessions lasted for approximately 2 hours. Students were given a break approximately halfway through each lesson.
At the beginning of each stage of the study, we assessed pre-lesson understanding. Following each stage, we assessed post-lesson understanding. We used pre- and post-lesson understanding to determine whether learning occurred and to compare learning outcomes for the two groups. We also assessed stage 1 retention at the beginning of stage 2. The retention test was an abbreviated version of the post-lesson test used in stage 1. This test enabled us to assess how well the groups retained their learning over the 5-month period.
Learning & retention. In the pre- and post-lesson tests, as well as in the retention test, we measured three types of understanding: conceptual (describing key concepts and relationships); 2-D (working with 2-D sketches); and 3-D (demonstrating phenomena in 3-D). We examined these aspects of learning because we expected MW to help students construct better conceptual and 3-D understandings of the phenomena than EMF. Because MW students would have to translate 3-D information into 2-D to work with 2-D sketches, we did not expect to see MW students outperform EMF students on the sketches, but we did not expect them to do any worse. At this point, it is important to note that conceptual and 3-D understanding of electric fields is what is critical to mastery. The abilities to thoroughly explain these phenomena and to visualize and manipulate key concepts in 3-D are what educators strive for when teaching electric fields.
Learning experience. In addition to measuring learning, we attempted to characterize student learning experiences in MW and EMF. To account for the potential physical interaction confound (due to differences in MW's and EMF's interfaces), we measured simulator sickness and usability using subjective rating scales. (We used Kennedy et al.'s (1993) Simulator Sickness Questionnaire for sickness ratings.) We expected simulator sickness to be greater and usability to be worse for MW than for EMF. To help us explain learning differences between the two groups, we also asked students to rate the meaningfulness of the representations and motivation. Finally, to provide insights into the nature of statistical outcomes, as well as to diagnose the strengths and weaknesses of the microworlds, we examined student comments during the post-lesson interviews.
RESULTS & DISCUSSION
Learning & retention. To assess whether students learned as a result of completing lessons in the microworlds, we compared pre- and post-lesson test performance. As Table 1 shows, both groups demonstrated significantly better conceptual, 2-D, and 3-D understanding post-lesson than pre-lesson. These outcomes suggest that lessons in both EMF and MW were meaningful.
Table 2 contains adjusted mean post-lesson and retention scores for both groups. Scores are adjusted for pre-lesson performance measured during stage 1 and are useful for comparing how well the students learned and retained their understanding of electric fields.
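As an illustration of what "adjusted for pre-lesson performance" means, the sketch below computes covariate-adjusted group means in the usual ANCOVA manner: each group's post-lesson mean is shifted along the pooled within-group regression slope to the grand pre-lesson mean. The scores are invented for the example and are not the study's data; the function name and group labels are likewise assumptions.

```python
from statistics import mean

def adjusted_means(groups):
    """Covariate-adjusted post-test means.

    groups: {name: [(pre, post), ...]} with hypothetical scores.
    adjusted = post_mean - slope * (pre_mean - grand_pre_mean),
    where slope is the pooled within-group regression slope.
    """
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)
    sxy = sxx = 0.0
    group_stats = {}
    for name, pairs in groups.items():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        group_stats[name] = (pre_mean, post_mean)
        for pre, post in pairs:
            sxy += (pre - pre_mean) * (post - post_mean)
            sxx += (pre - pre_mean) ** 2
    slope = sxy / sxx
    return {
        name: post_mean - slope * (pre_mean - grand_pre)
        for name, (pre_mean, post_mean) in group_stats.items()
    }

# Invented scores: the second group starts with higher pre-lesson scores.
scores = {
    "EMF": [(2.0, 5.0), (4.0, 7.0), (6.0, 9.0)],
    "MW": [(4.0, 9.0), (6.0, 11.0), (8.0, 13.0)],
}
adj = adjusted_means(scores)
```

In this invented case the raw post-lesson means differ by 4 points (7 versus 11), but half of that gap reflects the pre-lesson difference, so the adjusted means are 8 and 10.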
To compare the groups' stage 1 learning, we used 1-tailed ANCOVAs. The pattern of outcomes was consistent with our predictions. First, MW students were better able to define concepts than EMF students. Second, MW students were not any worse than the EMF students at sketching concepts in 2-D. While MW students performed better on the force sketches, they performed worse on the sketches relating to potential, resulting in total sketch scores that were similar for the two groups. An explanation for this outcome may be that representations of force (lines and arrows) are more easily translated from 3-D to 2-D than representations of potential (surfaces). Third, MW students were better able to demonstrate concepts in 3-D than EMF students. Despite the inherent three-dimensionality of the lessons and demonstration exercises, all but one EMF student restricted answers to a single plane, drew lines when describing equipotential surfaces, and used terms such as "oval" and "line." In contrast, MW students described phenomena using 3-D gestures and phrases such as "sphere" and "surface." Finally, MW students were better able to predict how changes to the source charges would affect the electric field, recognize symmetries in the field, and apply their knowledge to novel problems. Although not statistically significant, differences also occurred in the students' ability to describe electric fields in 3-D on the retention test.
Learning experience. We examined subjective ratings to characterize learning experiences in MW and EMF and to determine whether these factors affected learning. Table 3 shows these ratings. Motivation ratings indicated that students felt significantly more motivated by MW than EMF. However, motivation was not a significant predictor of learning outcomes. Additionally, the microworld students used (EMF vs. MW) predicted post-test scores beyond motivation (R² change = .28*), suggesting that it was not motivation alone that accounted for the differences in each group's learning. Overall simulator sickness scores in MW were significantly greater than scores in EMF, though symptoms were not severe. When we examined different aspects of simulator sickness (disorientation, oculomotor discomfort, and nausea), we found that MW students experienced significantly more disorientation (F(1,11) = 6.47, p < .05) and oculomotor discomfort (F(1,11) = 5.12, p < .05) than EMF students. Finally, usability ratings suggested that students found using MW significantly more difficult than using EMF. Neither sickness nor usability significantly predicted learning outcomes.
In addition to the experience ratings, student comments provide insight into the nature of the learning experience. Overall, students described MW as fairly easy to use, interesting, and informative. They especially liked the three-dimensional representations, the ability to see phenomena from different angles, and the interactivity of the system. MW students found using the 3-Ball and virtual hand somewhat challenging and found MW's responsiveness problematic at times. Students described EMF as very easy to use, but somewhat boring. They found the simplicity of its graphics both a strength and a weakness. Comments about MW's interface (e.g., using the virtual hand was somewhat challenging) help explain why students rated MW as harder to use than EMF. Comments describing reactions to the systems (e.g., MW was interesting, while EMF was boring) help explain why students found MW to be more motivating than EMF.
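The "R² change" statistic reported for the motivation analysis can be illustrated with a two-step (hierarchical) regression: fit post-test scores on motivation alone, then on motivation plus a group indicator, and take the difference in R². The sketch below is an assumption-laden illustration rather than the analysis actually run in the study; the data are contrived so that group membership explains exactly what motivation leaves over, and the helper name `r_squared` is hypothetical.

```python
def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y stored as an augmented k x (k+1) matrix.
    aug = [
        [sum(row[r] * row[c] for row in rows) for c in range(k)]
        + [sum(row[r] * yi for row, yi in zip(rows, y))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[r][k] / aug[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Contrived data: post = 2 + motivation + 3 * group (group 1 stands in for MW).
motivation = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0]
group = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
post = [2.0 + m + 3.0 * g for m, g in zip(motivation, group)]

r2_step1 = r_squared([motivation], post)         # motivation only
r2_step2 = r_squared([motivation, group], post)  # motivation + microworld
r2_change = r2_step2 - r2_step1
```

A positive `r2_change` means that knowing which microworld a student used explains variance in post-test scores that motivation alone does not, which is the form of the argument made in the paragraph above.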
Both learning and learning experience data for stage 2 yielded insights into the value of multisensory representations. In this stage, students demonstrated significantly better understanding of concepts, 2-D sketches, and 3-D demos during the post-lesson tests than during the pre-lesson tests. See Table 4. Stage 2 learning outcomes suggest that students benefited from the visual and multisensory representations used in the lesson. Mean motivation, simulator sickness, and usability ratings (1.80, 6.99, and 1.78 respectively) were similar to the ratings for MW in stage 1. Finally, ratings concerning multisensory representations (haptic and sound), post-lesson understanding, and student comments all suggest that students who experienced difficulty with the concepts benefited most from multisensory representations. The multisensory representations may have helped these students better understand visual representations.
Both stages of this study lend support to the thesis that immersive 3-D multisensory representations can help students develop better runnable mental models than 2-D representations. Learning and retention outcomes for stage 1 show that MW students were able to understand the space as a whole, recognize symmetries in the field, and relate individual visual representations (test charge traces, field lines, and equipotential surfaces) to the electric field and electric potential. EMF students had more difficulty with these tasks. Additionally, MW students appeared to visualize the phenomena in 3-D, while EMF students did not. The kind of understanding acquired by MW students is the kind of understanding critical to the mastery of electric fields. MW students were able to describe concepts and to envision these 3-D phenomena in a variety of novel, as well as familiar, situations. Subjective ratings for stage 1 yielded converging evidence that the microworlds' representational differences were responsible for differences in learning. First, motivation, though higher in MW than in EMF, was not a predictor of learning. Second, students learned more using MW than EMF despite MW's usability and simulator sickness problems. Student comments also reinforce this finding. For example, students cited MW's immersive three-dimensionality as one of its key strengths. Finally, further evidence implicating MW's representations in facilitating learning comes from stage 2. By enhancing stage 1's visual representations with multisensory cues in stage 2, we were able to support additional learning gains. Further, it was the students who had difficulty with the concepts who seemed to benefit the most from the added information.
From a design perspective, our research indicates that 3-D multisensory VR microworlds can aid the development of runnable mental models for abstract science concepts. We suspect that immersive VR experiences succeed in supporting students in this task because they provide experiential referents and increase information saliency. Factors that appear to play a role in this process are immersion, 3-D representations, and multisensory information. From a methodology perspective, we believe this study shows the importance of gathering information about both learning outcomes and the learning experience when assessing learning environments. Characteristics of the learning experience are useful in interpreting learning outcomes and vice versa. In fact, we believe that even more can be done to understand the interplay between these constructs. For example, careful analyses of the learning process, of usability, and of how individual characteristics influence the learning experience may also be useful assessment tools.
As we continue our research, we plan to explore further the potential of VR's features. We also plan to incorporate additional assessment tools into our methodologies. We hope that our work will substantially add to our understanding of how to leverage the features of this emerging technology for teaching abstract science.
This work is supported by NSF's AAT Program, Grant RED-9353320, and by NASA's grant NAG 9-713. We gratefully acknowledge the aid of Jeff Hoblit, Deirdre McGlynn, Joe Redish, Saba Rofchaei, Chen Shui, Dane Toler & his students, and Susan Trickett.
Bailey, J. and Witmer, B. (1994). Learning and transfer of spatial knowledge in a virtual environment. In Proceedings of the Human Factors and Ergonomics Society 38th Annual Meeting, pp. 1158-1162.
Brelsford, J. (1993). Physics education in a virtual environment. In Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting, pp. 1286-1290.
Bricken, M. and Byrne, C. M. (1993). Summer students in virtual reality. In Wexelblat, A. (Ed.), Virtual Reality: Applications and Exploration (pp. 199-218). New York: Academic Press, Inc.
Dede, C., Salzman, M., and Loftin, B. (1996). MaxwellWorld: Learning complex scientific concepts via immersion in virtual reality. In Proceedings of the 2nd International Conference on Learning Sciences, pp. 22-29.
Kennedy, R. S., Norman, E. L., Berbaum, K. S., and Lilienthal, M. G. (1993). Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness. The International Journal of Aviation Psychology, 3(3), 203-220.
Hewitt, P. (1982). The missing essential - a conceptual understanding of physics. American Journal of Physics, 51(4), 305-311.
Kozak, J. J., Hancock, P. A., Arthur, E. J., and Chrysler, S. T. (1993). Transfer of training from virtual reality. Ergonomics, 36(7), 777-784.
Larkin, J. (1983). The role of problem representation in physics. In Gentner, D., and Stevens, A. L. (Eds.), Mental Models (pp. 75-98). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Frederiksen, J., and White, B. (1992). Mental models and understanding: a problem for science education. In Scanlon, E., and O'Shea, T. (Eds.), New Directions in Educational Technology (pp. 211-226). New York: Springer Verlag.
Psotka, J. (1996). Immersive tutoring systems: Virtual reality and education and training. Instructional Science, 23 (5-6), 405-423.
Redish, E. (1993). The implications of cognitive studies for teaching physics. American Journal of Physics, 62(9), 796-803.
Regian, J. W., Shebilske, W., and Monk, J. (1992). A preliminary empirical evaluation of virtual reality as a training tool for visual-spatial tasks. Journal of Communication, 42(4), 136-149.
Salzman, M. C., Dede, C., and Loftin, R. B. (1995). Usability and learning in educational virtual realities. In Proceedings of the Human Factors and Ergonomics Society 39th Annual Meeting, San Diego, California, pp. 486-490.
Trowbridge, D., and Sherwood, B. (1994). EM Field. Raleigh, NC: Physics Academic Software.
White, B. (1993). Thinkertools: Causal models, conceptual change, and science education. Cognition and Instruction, 10, 1-100.
Winn, W. (1993). A conceptual basis for educational applications of virtual reality. (Tech. Rep. No. R-93-9). Washington: University of Washington, HITL. Available at: http://www.