Wearable eye trackers such as Tobii Pro Glasses 2 produce gaze data in a coordinate system that is relative to the eye tracker and the recorded video, not to static objects of interest in the environment around the participant. For most statistical or numerical analysis to be meaningful, the gaze data needs to be mapped onto objects of interest, in a new coordinate system whose origin is fixed in the environment around the participant.
Tobii Pro Lab addresses this challenge by allowing the user to map eye gaze data onto still images (snapshots) of environments and objects of interest. Data from a recording can be mapped onto one or several snapshot images. The snapshots are used for generating visualizations, such as heatmaps and gaze plots, and Areas Of Interest.
The mapping can be done either manually or with the Real-world Mapping function, Tobii Pro’s new software for automatic mapping.
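To make the idea of mapping gaze data onto a snapshot concrete, here is a minimal sketch. It is not Tobii's algorithm, which is not described here. It assumes that the relation between one video frame and the snapshot can be approximated by a 3x3 planar homography, and it only shows how a single gaze point would be carried from video-frame pixels to snapshot pixels. The function name `map_gaze_to_snapshot` and the matrix `H_example` are hypothetical.

```python
def map_gaze_to_snapshot(gaze_xy, H):
    """Map a gaze point from video-frame pixels to snapshot pixels.

    H is a 3x3 homography given as nested lists (an assumption of this
    sketch; in practice it would have to be estimated per video frame).
    """
    x, y = gaze_xy
    # Multiply the homogeneous point (x, y, 1) by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("gaze point maps to infinity under this homography")
    # Divide by w to return to ordinary pixel coordinates.
    return (u / w, v / w)


# Hypothetical homography: the snapshot shows the scene at twice the
# scale of the video frame, shifted by (100, 50) pixels.
H_example = [
    [2.0, 0.0, 100.0],
    [0.0, 2.0, 50.0],
    [0.0, 0.0, 1.0],
]

print(map_gaze_to_snapshot((320.0, 240.0), H_example))  # (740.0, 530.0)
```

Once every gaze point is expressed in snapshot pixels, data from different frames and different participants share one coordinate system, which is what heatmaps and Areas of Interest require.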
For Real-world Mapping to interpret the snapshot correctly, there are a few things to consider when you select the picture you want to use as the reference (the snapshot).
Real-world Mapping compares the snapshot with the video frames in the recording from Pro Glasses 2. For this to work as well as possible, the scene in the snapshot should be as ‘flat’ as possible. By ‘flat’ we mean that the scene should be close to two-dimensional: all objects in the image should be at more or less the same distance from your viewpoint and should remain visible from any viewing angle. Imagine a grocery store shelf with rows of cans and cereal boxes. Every item on the shelf stays visible even if you move a few meters to the left or right of your original position. The items only look a bit skewed, which is not a problem because Real-world Mapping can still interpret the image. There is no risk of one item ‘shadowing’ (hiding) another, and this makes for a good reference snapshot, since we never know where the participant will stand in front of the shelf.
In contrast, consider a scene that is much more three-dimensional: a store desk with a cash register on it. To the left of the cash register, and a little further back on the shelf, there is a can of pens. If we stand right in front of the desk, we see both the cash register and the can. From this point of view the scene looks two-dimensional, and a photograph (snapshot) taken here shows both items fully, so the snapshot itself is correct. As long as the frames in the participant’s recording are captured from more or less the same position, there is no problem. But if the participant stands a few meters to the right, the cash register shadows the can of pens. The can is then no longer visible from that position in the recording, and Real-world Mapping will not be able to map the data correctly.
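The difference between the shelf and the desk can be illustrated with a little pinhole-camera arithmetic. The sketch below is only an illustration of the geometry, not part of Real-world Mapping. It assumes an idealized camera that always looks straight ahead, and every position, the focal length `focal_px`, and the helper names `project_x` and `spacing_change` are hypothetical.

```python
def project_x(point_x, point_z, camera_x, focal_px=1000.0):
    """Horizontal pixel offset from the image center for a pinhole camera.

    The camera sits at (camera_x, 0) and looks along +z. Distances are in
    meters; focal_px is an assumed focal length in pixels.
    """
    if point_z <= 0:
        raise ValueError("point must be in front of the camera")
    return focal_px * (point_x - camera_x) / point_z


def spacing_change(a, b, cam_from, cam_to):
    """Change in horizontal pixel spacing between objects a and b
    when the camera moves sideways from cam_from to cam_to."""
    before = project_x(b[0], b[1], cam_from) - project_x(a[0], a[1], cam_from)
    after = project_x(b[0], b[1], cam_to) - project_x(a[0], a[1], cam_to)
    return after - before


# (x, z) positions in meters; all values are made up for illustration.
shelf_can, shelf_box = (-0.5, 2.0), (0.5, 2.0)  # same depth: a 'flat' scene
pen_can, register = (-0.3, 1.5), (0.0, 1.0)     # pen can is further back

# The participant steps one meter to the right.
flat_change = spacing_change(shelf_can, shelf_box, 0.0, 1.0)
deep_change = spacing_change(pen_can, register, 0.0, 1.0)

print("shelf spacing change (px):", flat_change)
print("desk spacing change (px):", round(deep_change, 1))
```

With these numbers the two shelf items keep exactly the same spacing in the image, because both are at the same depth and shift by the same amount. The pen can and the cash register shift by different amounts, so their spacing changes by several hundred pixels and changes sign: on the way, the register passes in front of the can. That changing layout is what a single flat snapshot cannot represent.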