Girish Mahajan (Editor)

3D user interaction


In 3D user interaction (3DUI), the human interacts with a computer or other device in a way that involves three-dimensional space. This interaction is made possible by interfaces, which act as intermediaries between human and machine.


The 3D space used for interaction can be the real physical space, a virtual space simulated in the computer, or a combination of both. When the real physical space is used for data input, the human interacts with the machine by performing actions with an input device that must know, among other things, the relative position and distance of the user's action. When it is used for data output, the simulated 3D virtual scene is projected onto the real environment through an output device.


Research in 3D interaction and 3D display began in the 1960s, pioneered by researchers like Ivan Sutherland, Fred Brooks, Bob Sproull, Andrew Ortony and Richard Feldman. In 1962, Morton Heilig invented the Sensorama simulator. It provided 3D video feedback, as well as motion, audio, and other feedback, to produce a virtual environment. The next stage of development was Dr. Ivan Sutherland’s completion of his pioneering work in 1968, the Sword of Damocles. He created a head-mounted display that produced a 3D virtual environment by presenting a left and a right still image of that environment.

Limited availability of technology and impractical costs held back the development and application of virtual environments until the 1980s. Since then, further research and technological advances have opened the door to applications in other areas such as education, entertainment, and manufacturing.

3D user interfaces

3D user interfaces are user interfaces in which 3D interaction takes place, meaning that the user’s tasks occur directly within a three-dimensional space. The user must communicate commands, requests, questions, intent, and goals to the system, and the system in turn has to provide feedback, requests for input, information about its status, and so on.

The user and the system do not share the same type of language, so to make communication possible the interfaces must serve as intermediaries or translators between them.

The way the user transforms perceptions into actions is called the human transfer function, and the way the system transforms signals into display information is called the system transfer function. 3D user interfaces are physical devices that connect the user and the system with minimal delay. There are two types: 3D user interface output hardware and 3D user interface input hardware.

3D user interface output hardware

These hardware devices are usually called display devices or output devices, and their aim is to present information to one or more users through the human perceptual system. Most of them focus on stimulating the visual, auditory, or haptic senses. In some unusual cases, however, they can also stimulate the user’s olfactory system.

3D visual displays

These devices are the most popular type, and their goal is to present the information produced by the system through the human visual system in a three-dimensional way. The main features that distinguish these devices are: field of regard and field of view, spatial resolution, screen geometry, light transfer mechanism, refresh rate and ergonomics.

Another way to characterize these devices is by the categories of depth perception cues they use to help the user understand the three-dimensional information. The main types of displays used in 3D UIs are: monitors, surround-screen displays, workbenches, hemispherical displays, head-mounted displays, arm-mounted displays and autostereoscopic displays.

3D audio displays

3D audio displays are devices that present information (in this case sound) through the human auditory system. Their objective is to generate and display spatialized 3D sound so that users can apply their psychoacoustic skills to determine the location and direction of the sound. There are different localization cues: binaural cues, spectral and dynamic cues, head-related transfer functions, reverberation, sound intensity, and vision and environment familiarity.
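As an illustration of a binaural cue, the interaural time difference (ITD) is the delay between a sound reaching one ear and the other. The sketch below uses the Woodworth spherical-head approximation, which is not mentioned in the text above and is only one common way to estimate it. The head radius, the speed of sound, and the function name are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius
HEAD_RADIUS_M = 0.0875      # assumption: a commonly used average head radius


def interaural_time_difference(azimuth_rad,
                               head_radius=HEAD_RADIUS_M,
                               speed_of_sound=SPEED_OF_SOUND_M_S):
    """Estimate the ITD in seconds with the Woodworth spherical-head model.

    azimuth_rad is the horizontal angle of the source:
    0 is straight ahead, pi/2 is directly to one side.
    """
    if not 0.0 <= azimuth_rad <= math.pi / 2:
        raise ValueError("azimuth must be between 0 and pi/2 radians")
    return (head_radius / speed_of_sound) * (azimuth_rad + math.sin(azimuth_rad))


if __name__ == "__main__":
    for degrees in (0, 30, 60, 90):
        itd = interaural_time_difference(math.radians(degrees))
        print(f"{degrees:3d} degrees -> {itd * 1e6:.0f} microseconds")
```

Under these assumptions a source directly ahead produces no delay, and the delay grows to well under a millisecond for a source directly to one side. A 3D audio display could apply such a delay between the left and right channels as one of several cues.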

3D haptic displays

These devices use the sense of touch to simulate the physical interaction between the user and a virtual object. There are three types of 3D haptic displays: those that provide the user with a sense of force, those that simulate the sense of touch, and those that do both. The main features that distinguish these devices are: haptic presentation capability, resolution and ergonomics. The human haptic system has two fundamental kinds of cues, tactile and kinesthetic. Tactile cues come from a wide variety of receptors located below the surface of the skin, which provide information about texture, temperature, pressure and damage. Kinesthetic cues come from the many receptors in the muscles, joints and tendons, which provide information about the angle of joints and the stress and length of muscles.

3D user interface input hardware

These hardware devices are called input devices, and their aim is to capture and interpret the actions performed by the user. The degrees of freedom (DOF) are one of the main features of these systems. They are also differentiated by how much physical interaction is needed to use the device: purely active devices need to be manipulated to produce information, while purely passive devices do not. The main categories of these devices are desktop input devices, tracking devices, 3D mice and brain-computer interfaces.

Desktop input devices

These devices are designed for 3D interaction on a desktop. Many of them were originally designed for traditional two-dimensional interaction, but with an appropriate mapping between the system and the device they can work perfectly well in three dimensions. There are different types: keyboards, 2D mice and trackballs, pen-based tablets and joysticks.

Tracking devices

3D user interaction systems are based primarily on tracking technologies, which obtain all the necessary information from users by analyzing their movements or gestures.

A fully developed 3D user interaction system needs access to a few basic parameters, which it should know at least partially: the relative position of the user, the absolute position, angular velocity, rotation data, orientation and height.

These data are collected by spatial tracking systems and sensors of many kinds, using a variety of techniques. The ideal system for this type of interaction tracks position with six degrees of freedom (6-DOF). Such systems can obtain the absolute 3D position of the user and, with it, information on all possible angles in three-dimensional space.

These systems can be implemented with various technologies, such as electromagnetic, optical, or ultrasonic tracking, but all share one main limitation: they need a fixed external reference, whether a base, an array of cameras, or a set of visible markers, so they can only be used in prepared areas.

Inertial tracking systems, like other movement-based systems, do not require an external reference. They collect data using accelerometers, gyroscopes, or video cameras, in most cases without needing a fixed reference. Their main problem is that they cannot obtain the absolute position: because they do not start from any pre-set external reference point, they only ever measure the relative position of the user, which causes errors to accumulate as the data are sampled.
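The accumulation of error in inertial tracking can be illustrated with a small simulation. The sketch below is a simplified one-dimensional model, not a description of any particular device: it assumes a stationary sensor whose accelerometer reports a small constant bias, and it integrates that reading twice to estimate position. The bias value, the sampling rate, and the function name are hypothetical.

```python
def integrate_position(accel_samples, dt):
    """Estimate 1D position by integrating acceleration twice (Euler steps)."""
    velocity = 0.0
    position = 0.0
    for accel in accel_samples:
        velocity += accel * dt     # first integration: acceleration -> velocity
        position += velocity * dt  # second integration: velocity -> position
    return position


# Assumptions for the illustration: the device never moves, but the
# accelerometer reports a constant bias of 0.01 m/s^2, sampled at 100 Hz.
BIAS_M_S2 = 0.01
DT_S = 0.01

drift_1s = integrate_position([BIAS_M_S2] * 100, DT_S)
drift_10s = integrate_position([BIAS_M_S2] * 1000, DT_S)

if __name__ == "__main__":
    print(f"estimated displacement after 1 s:  {drift_1s:.4f} m")
    print(f"estimated displacement after 10 s: {drift_10s:.4f} m")
```

Although the device is at rest, the estimated position moves away from the true one, and the error grows roughly with the square of the elapsed time (about half a meter after ten seconds under these assumptions). Without an external reference there is nothing to correct it against.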

The goal for a 3D tracking system is a 6-DOF system that provides absolute positioning together with precise movement and orientation data, with very high precision over a large, uninterrupted space. A rough example is a mobile phone, which has all the motion capture sensors as well as GPS positioning. Currently, however, such systems are not accurate enough to capture data with centimeter precision and are therefore unsuitable.

However, several systems come close to these objectives. Their determining factor is that they are self-contained, i.e., all-in-one, and do not require a prior fixed reference. These systems are as follows:

Nintendo Wii Remote ("Wiimote")

The Wii Remote does not offer 6-DOF technology since, again, it cannot provide absolute position. It is, however, equipped with a multitude of sensors, which turn a 2D device into a capable tool for interaction in 3D environments.

This device has gyroscopes to detect rotation of the user, ADXL330 accelerometers to obtain the speed and movement of the hands, optical sensors to determine orientation, and electronic compasses and infrared devices to capture position.

Note that this type of device can be affected by external infrared sources such as light bulbs or candles, which cause errors in the accuracy of the position.

Microsoft Kinect

The Microsoft Kinect offers a different motion capture technology for tracking.

Instead of relying on sensors held or worn by the user, it is based on a structured light scanner housed in a bar. It tracks the entire body by detecting about 20 spatial points, measuring 3 degrees of freedom for each point to obtain its position, velocity and rotation.

Its main advantages are ease of use and the fact that the user does not need to wear or hold an external device. Its main disadvantage is its inability to detect the orientation of the user, which limits certain spatial and guidance functions.

Leap Motion

The Leap Motion is a hand-tracking system designed for small spaces. It allows a new kind of interaction in 3D environments for desktop applications and offers great fluidity when browsing three-dimensional environments in a realistic way.

It is a small device that connects to a computer via USB and uses two cameras with infrared LEDs to analyze a roughly hemispherical area extending about 1 meter above its surface, recording up to 300 frames per second. This information is sent to the computer to be processed by the company's own software.

3D interaction techniques

3D interaction techniques are the different ways in which the user can interact with the 3D virtual environment to carry out different kinds of tasks. The quality of these techniques has a profound effect on the quality of the entire 3D user interface. They can be classified into three groups: navigation, selection and manipulation, and system control.

Navigation is the task users perform most in large 3D environments. It presents several challenges, such as supporting spatial awareness, providing efficient movement between distant places, and making navigation bearable so the user can focus on more important tasks. Navigation techniques can be divided into two components: travel and wayfinding.


Travel is a conceptual technique that consists of moving the viewpoint from one location to another. The orientation of the viewpoint is usually handled in immersive virtual environments by head tracking. There are five types of travel interaction techniques:

  • Physical movement: uses the user's body motion to move through the virtual environment. It is an appropriate technique when an enhanced feeling of being present is required or when physical effort is required from the user.
  • Manual viewpoint manipulation: the user's hand movements determine the displacement in the virtual environment. One example is when users move their hands as if grabbing a virtual rope and pulling themselves up. This technique can be easy to learn and efficient, but it can cause fatigue.
  • Steering: the user has to constantly indicate where to move. It is a common and efficient technique. One example is gaze-directed steering, where the head orientation determines the direction of travel.
  • Target-based travel: the user specifies a destination point and the system carries out the displacement. This travel can be executed by teleport, where the user is instantly moved to the destination point, or the system can execute some transition movement to the destination. These techniques are very simple from the user’s point of view because the user only has to indicate the destination.
  • Route planning: the user specifies the path that should be taken through the environment and the system executes the movement. The user may draw a path on a map of the virtual environment to plan a route. This technique lets users control travel while remaining free to do other tasks during motion.

Wayfinding

Wayfinding is the cognitive process of defining a route through the virtual environment, using and acquiring spatial knowledge to construct a cognitive map of the virtual environment.

For good wayfinding, users should receive wayfinding support during travel through the virtual environment, because of the constraints the virtual world imposes.

This support can be user-centered, such as a large field of view or non-visual support such as audio, or environment-centered, such as artificial cues and structural organization that clearly define different parts of the environment. Some of the most used artificial cues are maps, compasses and grids, along with architectural cues like lighting, color and texture.

Selection and manipulation

Selection and manipulation techniques for 3D environments must accomplish at least one of three basic tasks: object selection, object positioning and object rotation.

To accomplish these tasks, the system usually provides the user with a 3D cursor represented as a human hand whose movements correspond to the motion of the hand tracker. This virtual hand technique is rather intuitive because it simulates real-world interaction with objects, but it is limited to objects within the user's reach area.

Many techniques have been suggested to overcome this limit, such as the Go-Go technique. It lets the user extend the reach area using a non-linear mapping of the hand: when the user extends the hand beyond a fixed threshold distance, the mapping becomes non-linear and the virtual arm grows.
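A mapping of this kind can be sketched in a few lines. The version below assumes a quadratic growth term beyond the threshold, which is one commonly cited form of the Go-Go mapping. The threshold distance, the coefficient `k`, and the function name are illustrative assumptions, not values taken from the text above.

```python
def gogo_virtual_distance(real_distance, threshold=0.4, k=6.0):
    """Map the real hand distance (meters from the body) to the virtual one.

    Below the threshold the mapping is one-to-one. Beyond it, a quadratic
    term makes the virtual hand move away faster than the real hand.
    threshold and k are hypothetical tuning values.
    """
    if real_distance < 0:
        raise ValueError("distance must be non-negative")
    if real_distance < threshold:
        return real_distance
    overshoot = real_distance - threshold
    return real_distance + k * overshoot ** 2


if __name__ == "__main__":
    for real in (0.2, 0.4, 0.5, 0.6, 0.7):
        print(f"real {real:.1f} m -> virtual {gogo_virtual_distance(real):.2f} m")
```

With these example values the virtual hand follows the real hand exactly up to 0.4 m, and the two mappings meet at the threshold so there is no jump when the user crosses it. In practice the threshold would typically be tied to the user's arm length.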

Another way to select and manipulate objects in 3D virtual spaces is to point at them using a virtual ray emanating from the virtual hand. When the ray intersects an object, that object can be manipulated. Several variations of this technique have been developed, such as the aperture technique, which uses a conic pointer directed from the user's eyes, whose position is estimated from the head location, to select distant objects. This technique also uses a hand sensor to adjust the size of the conic pointer.
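The basic ray-based selection described here can be sketched as an intersection test. The example below makes a simplifying assumption that every selectable object is approximated by a bounding sphere, and it picks the closest object the ray hits. The function names and the scene are hypothetical, and the sketch covers the plain ray only, not the conic pointer of the aperture technique.

```python
import math


def ray_sphere_hit_distance(origin, direction, center, radius):
    """Return the distance along the ray to the sphere, or None if it misses."""
    length = math.sqrt(sum(d * d for d in direction))
    if length == 0:
        raise ValueError("direction must be a non-zero vector")
    unit = [d / length for d in direction]
    to_origin = [o - c for o, c in zip(origin, center)]
    # Solve |origin + t * unit - center|^2 = radius^2 for t.
    b = sum(u * v for u, v in zip(unit, to_origin))
    c = sum(v * v for v in to_origin) - radius * radius
    discriminant = b * b - c
    if discriminant < 0:
        return None  # the ray's line never touches the sphere
    root = math.sqrt(discriminant)
    near, far = -b - root, -b + root
    if near >= 0:
        return near
    if far >= 0:
        return far  # the ray starts inside the sphere
    return None     # the sphere is entirely behind the ray


def pick_object(origin, direction, objects):
    """Return the name of the closest object hit by the ray, or None.

    objects maps a name to a (center, radius) bounding sphere.
    """
    best_name, best_distance = None, math.inf
    for name, (center, radius) in objects.items():
        distance = ray_sphere_hit_distance(origin, direction, center, radius)
        if distance is not None and distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


if __name__ == "__main__":
    scene = {
        "near": ((0.0, 0.0, -2.0), 0.5),
        "far": ((0.0, 0.0, -5.0), 0.5),
        "side": ((3.0, 0.0, -2.0), 0.5),
    }
    # A ray from the virtual hand at the origin, pointing along -z.
    print(pick_object((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), scene))
```

Choosing the nearest hit means an object hidden behind another cannot be selected with the plain ray, which is one reason variations of the technique exist.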

System control

System control techniques allow the user to send commands to an application, change the interaction mode or modify a parameter. Issuing a command always involves selecting an element from a set. System control techniques can be categorized into four groups:

  • Graphical menus: visual representations of commands.
  • Voice commands: menus accessed via voice.
  • Gestural interaction: commands accessed via body gestures.
  • Tools: virtual objects with an implicit function or mode.

Different hybrid techniques that combine some of these types also exist.

