We have developed a simple, customizable, and efficient method for recording quantitative process data from interactive spatial tasks and for mapping these rotation data onto eye-tracking data.
We present a method for real-time recording of human interaction with three-dimensional (3D) virtual objects. The approach consists of associating rotation data of the manipulated object with behavioral measures, such as eye tracking, to make better inferences about the underlying cognitive processes.
The task consists of displaying two identical models of the same 3D object (a molecule), presented on a computer screen: a rotating, interactive object (iObj) and a static, target object (tObj). Participants must rotate iObj using the mouse until they consider its orientation to be identical to that of tObj. The computer tracks all interaction data in real time. The participant's gaze data are also recorded using an eye tracker. The measurement frequency is 10 Hz on the computer and 60 Hz on the eye tracker.
The orientation data of iObj with respect to tObj are recorded as rotation quaternions. The gaze data are synchronized to the orientation of iObj and referenced in this same system. This method enables us to obtain the following visualizations of the human interaction process with iObj and tObj: (1) angular disparity synchronized with other time-dependent data; (2) the 3D rotation trajectory inside what we decided to call a "ball of rotations"; (3) a 3D fixation heatmap. All steps of the protocol use free software, such as GNU Octave and Jmol, and all scripts are available as supplementary material.
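The angular disparity between the two objects follows directly from the recorded quaternions: the absolute dot product of two unit quaternions equals the cosine of half the angle between the orientations. The protocol's own scripts use GNU Octave; purely as an illustration of the underlying computation, a minimal Python sketch:

```python
import math

def angular_disparity(q1, q2):
    """Angle (radians) of the rotation taking orientation q1 to q2.

    q1, q2: unit quaternions as (w, x, y, z) tuples.
    """
    # |q1 . q2| = cos(theta / 2); the absolute value accounts for the
    # double cover of rotations (q and -q encode the same orientation).
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against floating-point overshoot
    return 2.0 * math.acos(dot)

# Identity orientation vs. a 90-degree rotation about the z-axis.
q_identity = (1.0, 0.0, 0.0, 0.0)
q_z90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
print(round(math.degrees(angular_disparity(q_identity, q_z90)), 1))  # → 90.0
```

The same formula applies at every time step of the recording, yielding the angular-disparity curve described below.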
With this approach, we can conduct detailed quantitative studies of the task-solving process involving mental or physical rotations, rather than only of the outcome reached. It is possible to measure precisely how important each part of the 3D models is to the participant in solving the tasks, and thus relate these measures to relevant variables such as the characteristics of the objects, the cognitive abilities of the individuals, and the characteristics of the human-machine interface.
Mental rotation (MR) is a cognitive ability that enables individuals to mentally manipulate and rotate objects, facilitating a better understanding of their features and spatial relationships. It is one of the visuospatial abilities, a fundamental cognitive group that was studied as early as 18901. Visuospatial abilities are an important component of an individual's cognitive repertoire that is influenced by both inherited and environmental factors2,3,4,5. Interest in visuospatial abilities has grown throughout the twentieth century due to mounting evidence of their importance in key subjects such as aging6 and development7, performance in science, technology, engineering, and mathematics (STEM)8,9, creativity10, and evolutionary traits11.
The contemporary idea of MR derives from the pioneering work published by Shepard and Metzler (SM) in 197112. They devised a chronometric method using a series of "same or different" tasks, presenting two projections of abstract 3D objects displayed side by side. Participants had to mentally rotate the objects on some axis and decide whether those projections portrayed the same object rotated differently or distinct objects. The study revealed a positive linear correlation between response time (RT) and the angular disparity (AD) between representations of the same object. This correlation is known as the angular disparity effect (ADE). The ADE is regarded as a behavioral manifestation of MR and became ubiquitous in several influential subsequent studies in the field13,14,15,16,17,18,19,20,21,22,23,24,25. The 3D objects employed in the SM study were composed of 10 contiguous cubes generated by the computer graphics pioneer Michael Noll at Bell Laboratories26. They are referred to as SM figures and are widely employed in MR studies.
Shepard and Metzler's seminal work spurred two advancements of great importance. The first comprises the contributions to the field of MR assessment. In 1978, Vandenberg and Kuse27 developed a psychometric 20-item pencil-and-paper test based on SM "same or different" figures that became known as the mental rotation test (VKMRT). Each test item presents a target stimulus, and participants must select, from four candidate stimuli, those that represent the same object depicted in the target and those that do not. The VKMRT has been used to investigate the correlation between MR ability and various other factors, such as sex-related differences6,21,24,28,29,30, aging and development6,31,32, academic performance8,33, and skills in music and sports34. In 1995, Peters et al. published a study with redrawn figures for the VKMRT35,36. Similarly, following the "same or different" task design, a variety of other libraries of computer-generated stimuli have been employed to investigate MR processes and to assess MR abilities (3D versions of the original SM stimuli19,22,23,37,38, human bodies mimicking SM figures25,39,40, flat polygons for 2D rotation41,42, anatomy and organs43, organic shapes44, molecules45,46, among others21). The Purdue Spatial Visualization Test (PSVT) proposed by Guay in 197647 is also relevant. It entails a battery of tests, including one for MR (PSVT:R). Employing different stimuli from those in the VKMRT, the PSVT:R requires participants to identify a rotation operation in a model stimulus and mentally apply it to a different one. The PSVT:R is also widely used, particularly in studies investigating the role of MR in STEM achievement48,49,50.
The second advancement of great importance in Shepard and Metzler's seminal work comprises the contributions to the understanding of the MR process, in particular, with the use of eye-tracking devices. In 1976, Just and Carpenter14 used analog video-based eye-tracking equipment to conduct a study based on Shepard and Metzler's ADE experiment. From their results on saccadic eye movements and RTs, they proposed a model of MR processes consisting of three phases: 1) the search phase, in which similar parts of the figures are recognized; 2) the transformation and comparison phase, in which one of the identified parts is mentally rotated; 3) the confirmation phase, in which it is decided whether the figures are the same or not. The phases are repeated recursively until a decision can be made. Each step corresponds to specific saccadic and fixational eye movement patterns in close relation to observed ADEs. Thus, by correlating eye activity to chronometric data, Just and Carpenter provided a cognitive signature for the study of MR processes. To date, this model, albeit with adaptations, has been adopted in several studies15,42,46,51,52,53.
Following this track, several ensuing studies monitored behavioral18,19,22,23,25,34,40,54,55 and brain20,22,56,57 activity during stimuli rotation. Their findings point to a cooperative role between MR and motor processes. Moreover, there is growing interest in investigating problem-solving strategies involving MR in relation to individual differences15,41,46,51,58.
Overall, it can be considered that the design of studies aiming to understand MR processes is based on presenting a task with visual stimuli that requires participants to perform an MR operation that, in turn, entails a motor reaction. If this reaction allows rotation of the stimuli, it is often called physical rotation (PR). Depending on the specific objectives of each study, different strategies and devices have been employed for data acquisition and analysis of MR and PR. In the task stimulus presentation step, it is possible to vary the types of stimuli (e.g., the previously cited examples); the projection (computer-generated images in traditional displays22,23,25,29,40,41,59, as well as in stereoscopes19 and virtual60 and mixed43 reality environments); and the interactivity of the stimuli (static images12,27,36, animations61, and interactive virtual objects19,22,23,43,53,59).
MR is usually inferred from measures of RTs (ADE), as well as ocular and brain activity25,46,62. Ocular activity is measured using eye-tracking data consisting of saccadic movements and fixations14,15,42,51,52,54,58,60, as well as pupillometry40. RT data typically arise from motor response data recorded while operating various devices such as levers13, buttons and switches14,53, pedals53, rotary knobs19, joysticks37, keyboard61 and mouse29,58,60, drive wheels53, inertial sensors22,23, touch screens52,59, and microphones22. To measure PR, in addition to the RTs, the study design will also include recording manual rotations of interactive stimuli while participants perform the MR task22,23,52,53.
In 1998, Wohlschläger and Wohlschläger19 used "same or different" tasks with interactive virtual SM stimuli manipulated with a knob, with rotations limited to one axis per task. They measured RT and the cumulative record of physical rotations performed during the tasks. Comparing situations with and without actual rotation of the interactive stimuli, they concluded that MR and PR share a common process for both imagined and actually performed rotations.
In 2014, two studies were conducted employing the same type of tasks with virtual interactive stimuli22,23. However, the objects were manipulated with inertial sensors that captured motion in 3D space. In both cases, in addition to RTs, rotation trajectories were recorded, that is, the evolution of the rotation differences between the reference and interactive stimuli during the tasks. From these trajectories, it was possible to extract both cumulative information (i.e., the total amount of rotation, in quaternionic units) and detailed information about solution strategies. Adams et al.23 studied the cooperative effect between MR and PR. In addition to RTs, they used the integral of the rotation trajectories as a parameter of the accuracy and objectivity of the resolution. Curve profiles were interpreted according to a three-step model63 (planning, major rotation, fine adjustment). The results indicate that MR and PR do not necessarily share a single, common factor. Gardony et al.22 collected data on RT, accuracy, and real-time rotation. In addition to confirming the relationship between MR and PR, the analysis of rotation trajectories revealed that participants manipulated the figures until they could identify whether or not they were different. If they were the same, participants would rotate them until the orientations matched.
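The two cumulative measures used in these studies can be sketched from a recorded trajectory of orientation quaternions: total physical rotation is the sum of the angular steps between consecutive samples, and the trajectory integral is the area under the angular-disparity curve. The sketch below is illustrative only (the cited studies and this protocol use their own toolchains); the trapezoidal rule is one simple choice of integration:

```python
import math

def quat_angle(q1, q2):
    """Rotation angle (radians) between two unit quaternions (w, x, y, z)."""
    dot = min(1.0, abs(sum(a * b for a, b in zip(q1, q2))))
    return 2.0 * math.acos(dot)

def cumulative_rotation(trajectory):
    """Total physical rotation: sum of angular steps between samples."""
    return sum(quat_angle(a, b) for a, b in zip(trajectory, trajectory[1:]))

def disparity_integral(times, disparities):
    """Trapezoidal integral of the angular-disparity curve (rad*s)."""
    return sum(0.5 * (d0 + d1) * (t1 - t0)
               for (t0, d0), (t1, d1) in zip(zip(times, disparities),
                                             zip(times[1:], disparities[1:])))

def z_rot(deg):
    """Unit quaternion for a rotation of `deg` degrees about the z-axis."""
    h = math.radians(deg) / 2
    return (math.cos(h), 0.0, 0.0, math.sin(h))

# Three samples: identity -> 45 deg about z -> 90 deg about z.
traj = [z_rot(0), z_rot(45), z_rot(90)]
print(round(math.degrees(cumulative_rotation(traj)), 1))  # → 90.0
```

A trajectory with back-and-forth rotation would accumulate more total rotation than this direct path, which is what makes the cumulative measure informative about solution strategy.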
Continuing this strategy, in 2018, Wetzel and Bertel52 also used interactive SM figures in "same or different" tasks, with touchscreen tablets as the interface. In addition, they used an eye-tracking device to obtain cumulative data on fixation time and saccadic amplitude as parameters of the cognitive load involved in solving MR tasks. Their results confirmed the findings of the studies discussed above regarding the relationships between MR and PR and the task-solving processes. However, in this study, they did not map fixation and saccade data onto the stimuli.
Methodological approaches for mapping eye-tracking data onto virtual 3D objects have been proposed and continually improved, commonly by researchers interested in studying the factors related to visual attention in virtual environments64. Although these methods are affordable and use similar eye-tracking devices, they apparently have not been integrated into the experimental repertoire employed in mental rotation studies with interactive 3D objects such as those previously mentioned. Indeed, we found no studies in the literature reporting real-time mapping of fixation and saccade data onto interactive 3D objects. There seems to be no convenient method for easily integrating eye-activity data with rotation trajectories. In this research, we aim to help fill this gap. The procedure is presented in detail, from data acquisition to graphical output generation.
In this paper, we describe in detail a method for studying mental rotation processes with virtual interactive 3D objects. The following advances are highlighted. First, it integrates quantitative behavioral motor (hand-driven object rotation via a computer interface) and ocular (eye-tracking) data collection during interaction sessions with 3D virtual models. Second, it requires only conventional computer equipment and eye-tracking devices for visual task design, data acquisition, recording, and processing. Third, it easily generates graphical output to facilitate data analysis: angular disparity, physical rotation, quaternionic rotation trajectories, and hit mapping of eye-tracking data onto 3D virtual objects. Finally, the method requires only free software. All developed code and scripts are available free of charge (https://github.com/rodrigocnstest/rodrigocnstest.github.io).
1. Preparation of data collection tools
2. Data collection
3. Data processing and analysis
4. Task customization
NOTE: This entire section is optional and recommended only for readers who want to experiment with or customize the code. Below are some of the many customizable options available; more options will become available as we develop the method further.
Evolution of angular disparity and other variables
As depicted in step 3.3.1 in Supplemental File 2, two canvases are presented to the participant on the video monitor screen, displaying copies of the same 3D virtual object in different orientations. On the left canvas, the target object (tObj) remains static and serves as the target position, or tObj position. On the right canvas, the interactive object (iObj) is shown in a different position and allows the participant to rotate it using the mouse.
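Because the rotation data (10 Hz, computer clock) and the gaze data (60 Hz, eye tracker) are recorded at different rates against a shared system time, plotting them together requires aligning the two streams. One simple approach, shown here as an illustrative Python sketch with hypothetical field layouts (the protocol's own processing is done in GNU Octave), is to match each rotation sample with the gaze sample nearest in time:

```python
import bisect

def sync_gaze_to_rotation(rot_times, gaze_times, gaze_xy):
    """For each rotation sample, pick the gaze sample nearest in time.

    rot_times, gaze_times: sorted system-time stamps in seconds.
    gaze_xy: one (x, y) gaze coordinate per entry of gaze_times.
    Returns one (x, y) pair per rotation sample.
    """
    matched = []
    for t in rot_times:
        i = bisect.bisect_left(gaze_times, t)
        # Consider the neighbors straddling t and keep the closer one.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(gaze_times)]
        j = min(candidates, key=lambda k: abs(gaze_times[k] - t))
        matched.append(gaze_xy[j])
    return matched

# 10 Hz rotation clock vs. 60 Hz gaze clock (dummy data for illustration).
rot_times = [0.0, 0.1, 0.2]
gaze_times = [k / 60 for k in range(18)]   # 0, 1/60, ..., 17/60 s
gaze_xy = [(k, k) for k in range(18)]      # placeholder coordinates
print(sync_gaze_to_rotation(rot_times, gaze_times, gaze_xy))
```

Nearest-timestamp matching bounds the alignment error by half the gaze sampling period (about 8 ms at 60 Hz), which is small relative to the 100 ms rotation sampling period.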
As previously stated, this paper aims to present a detailed procedure for real-time mapping of fixation and saccade data onto interactive 3D objects that is easily customizable, uses only freely available software, and is documented with step-by-step instructions.
While this experimental setup involved a highly interactive task, namely moving a 3D object to match another object's orientation with PR in two of the three possible axes, we ensured thorough documentation of all steps.
The authors have no conflicts of interest to disclose.
The authors are thankful to the Coordination for the Improvement of Higher Education Personnel (CAPES) - Finance Code 001 and the Federal University of ABC (UFABC). João R. Sato received financial support from the São Paulo Research Foundation (FAPESP, Grants Nos. 2018/21934-5, 2018/04654-9, and 2023/02538-0).
Name | Company | Catalog Number | Comments
Firefox | Mozilla Foundation (Open Source) | | Any updated modern browser compatible with WebGL (https://caniuse.com/webgl), and in turn with Jmol, can be used
GNU Octave | Open Source | | https://octave.org/
Google Apps Script | Google LLC | | script.google.com
Google Sheets | Google LLC | | https://www.google.com/sheets/about/
Laptop | | | Any computer that can run the eye-tracking system software
Mangold Software Suite | Mangold | | Software interface used for the eye-tracking device. Any software that outputs the data with system time values can be used.
Mouse | | | Any mouse capable of clicking and dragging with simple movements should be compatible. Human interfaces analogous to a mouse with the same capabilities, such as a touchscreen or pointer, should be compatible but may behave differently.
Vt3mini | EyeTech Digital Systems | | 60 Hz. Any working eye-tracking device should be compatible.