In a binary tree the *balance factor* of a node N is defined to be the height difference

BalanceFactor(N

) := –Height(LeftSubtree(N

)) + Height(RightSubtree(N

))
of its two child subtrees. A binary tree is defined to be an *AVL tree* if the invariant

BalanceFactor(N

) ∈ {–1,0,+1}

holds for every node N in the tree.

A node N with BalanceFactor(N) < 0 is called "left-heavy", one with BalanceFactor(N) > 0 is called "right-heavy", and one with BalanceFactor(N) = 0 is sometimes simply called "balanced".

Remark
In the sequel, because there is a one-to-one correspondence between nodes and the subtrees rooted by them, we sometimes leave it to the context whether the name of an object stands for the node or the subtree.

Balance factors can be kept up-to-date by knowing the previous balance factors and the change in height – it is not necessary to know the absolute height. For holding the AVL balance information, two bits per node are sufficient.

The height *h* of an AVL tree with *n* nodes lies in the interval:

log_{2}(*n*+1) ≤ *h* < *c* log_{2}(*n*+2)+*b*
with the golden ratio φ := (1+√5) ⁄_{2} ≈ 1.618, *c* := ^{1}⁄ log_{2} φ ≈ 1.44, and *b* := ^{c}⁄_{2} log_{2} 5 – 2 ≈ –0.328. This is because an AVL tree of height *h* contains at least *F*_{h+2} – 1 nodes where {*F*_{h}} is the Fibonacci sequence with the seed values *F*_{1} = 1, *F*_{2} = 1.

Read-only operations of an AVL tree involve carrying out the same actions as would be carried out on an unbalanced binary search tree, but modifications have to observe and restore the height balance of the subtrees.

Searching for a specific key in an AVL tree can be done the same way as that of a normal unbalanced binary search tree. In order for search to work effectively it has to employ a comparison function which establishes a total order (or at least a total preorder) on the set of keys. The number of comparisons required for successful search is limited by the height *h* and for unsuccessful search is very close to *h*, so both are in O(log *n*).

Once a node has been found in an AVL tree, the *next* or *previous* node can be accessed in amortized constant time. Some instances of exploring these "nearby" nodes require traversing up to *h* ∝ log(*n*) links (particularly when navigating from the rightmost leaf of the root’s left subtree to the root or from the root to the leftmost leaf of the root’s right subtree; in the AVL tree of figure 1, moving from node P to the *next but one* node Q takes 3 steps). However, exploring all *n* nodes of the tree in this manner would visit each link exactly twice: one downward visit to enter the subtree rooted by that node, another visit upward to leave that node’s subtree after having explored it. And since there are *n*−1 links in any tree, the amortized cost is found to be 2×(*n*−1)/*n*, or approximately 2.

When inserting an element into an AVL tree, you initially follow the same process as inserting into a Binary Search Tree. After inserting a node, it is necessary to check each of the node’s ancestors for consistency with the invariants of AVL trees: this is called "retracing". This is achieved by considering the balance factor of each node.

Since with a single insertion the height of an AVL subtree cannot increase by more than one, the temporary balance factor of a node after an insertion will be in the range [–2,+2]. For each node checked, if the temporary balance factor remains in the range from –1 to +1 then only an update of the balance factor and no rotation is necessary. However, if the temporary balance factor becomes less than –1 or greater than +1, the subtree rooted at this node is AVL unbalanced, and a rotation is needed. The various cases of rotations are described in section Rebalancing.

By inserting the new node Z as a child of node X the height of that subtree Z increases from 0 to 1.

Invariant of the retracing loop for an insertion
The height of the subtree rooted by Z has increased by 1. It is already in AVL shape.

In order to update the balance factors of all nodes, first observe that all nodes requiring correction lie from child to parent along the path of the inserted leaf. If the above procedure is applied to nodes along this path, starting from the leaf, then every node in the tree will again have a balance factor of −1, 0, or 1.

The retracing can stop if the balance factor becomes 0 implying that the height of that subtree remains unchanged.

If the balance factor becomes ±1 then the height of the subtree increases by one and the retracing needs to continue.

If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation after which the subtree has the same height as before (and its root the balance factor 0).

The time required is O(log *n*) for lookup, plus a maximum of O(log *n*) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log *n*) time.

When deleting an element from an AVL tree, swap the desired element with the minimum element in the right subtree, or maximum element in the left subtree. Once this has been completed delete the element from the new position (the process may need to be repeated). If the element is now a leaf node, remove it completely. Make sure to perform rotations to maintain the AVL property.

Steps to consider when deleting a node D in an AVL tree are the following:

- If D is a leaf or has only one child F, skip to step 6 with N:=D, G as the parent of D and
*dir* the child direction of D at G.

- Otherwise, determine node E by traversing to the leftmost (smallest) node in D’s right subtree which is the
*in-order successor* of D and does not have a left child (see figure 3).
- Remember node G, the parent of E, and
*dir*, the child direction of E at G (which is left if G≠D else right).
- Copy E’s key and possibly data to D, thus inserting a duplicate. The original E will be removed in step 7.
- Let N:=E and F be the right child of E (it may be null).
- If node N was the root (its parent G is null), update root.
- Otherwise, replace N at the child position
*dir* of G by F, which is null or a leaf. In the latter case set F’s parent to G.

(Now the last trace of D has been taken out of the tree, so D may be removed from memory as well.)

The height of the *dir* subtree of G has decreased by 1, either from 1 to 0 or from 2 to 1. So, let X:=G and N be the *dir* child of X in the code piece below, in order to retrace the path back up the tree to the root, thereby adjusting the balance factors (including possible rotations) as needed.

Since with a single deletion the height of an AVL subtree cannot decrease by more than one, the temporary balance factor of a node will be in the range from −2 to +2.

If the balance factor becomes ±2 then the subtree is unbalanced and needs to be rotated. The various cases of rotations are described in section Rebalancing.

By removing node N the height of the that subtree N of X has decreased by 1, either from 1 to 0 or from 2 to 1.

Invariant of the retracing loop for a deletion
The height of the subtree rooted by N has decreased by 1. It is already in AVL shape.

The retracing can stop if the balance factor becomes ±1 meaning that the height of that subtree remains unchanged.

If the balance factor becomes 0 then the height of the subtree decreases by one and the retracing needs to continue.

If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation. It depends on the balance factor of the sibling Z (the higher child tree) whether the height of the subtree decreases by one or does not change (the latter, if Z has the balance factor 0).

The time required is O(log *n*) for lookup, plus a maximum of O(log *n*) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log *n*) time.

In addition to the single-element insert, delete and lookup operations, several set operations have been defined on AVL trees: union, intersection and set difference. Then fast *bulk* operations on insertions or deletions can be implemented based on these set functions. These set operations rely on two helper operations, *Split* and *Join*. With the new operations, the implementation of AVL trees can be more efficient and highly-parallelizable.

*Join*: The function *Join* is on two AVL trees *t*_{1} and *t*_{2} and a key k and will return a tree containing all elements in *t*_{1}, *t*_{2} as well as k. It requires k to be greater than all keys in *t*_{1} and smaller than all keys in *t*_{2}. If the two trees differ by height at most one, *Join* simply create a new node with left subtree *t*_{1}, root k and right subtree *t*_{2}. Otherwise, suppose that *t*_{1} is higher than *t*_{2} for more than one (the other case is symmetric). *Join* follows the right spine of *t*_{1} until a node c which is balanced with *t*_{2}. At this point a new node with left child c, root k and right child *t*_{1} is created to replace c. The new node satisfies the AVL invariant, and its height is one greater than c. The increase in height can increase the height of its ancestors, possibly invalidating the AVL invariant of those nodes. This can be fixed either with a double rotation if invalid at the parent or a single left rotation if invalid higher in the tree, in both cases restoring the height for any further ancestor nodes. *Join* will therefore require at most two rotations. The cost of this function is the difference of the heights between the two input trees.
*Split*: To split an AVL tree into two smaller trees, those smaller than key *x*, and those larger than key *x*, first draw a path from the root by inserting *x* into the AVL. After this insertion, all values less than *x* will be found on the left of the path, and all values greater than *x* will be found on the right. By applying *Join*, all the subtrees on the left side are merged bottom-up using keys on the path as intermediate nodes from bottom to top to form the left tree, and the right part is asymmetric. The cost of *Split* is order of
O
(
n
)
, the height of the tree.
The union of two AVLs *t*_{1} and *t*_{2} representing sets A and B, is an AVL *t* that represents *A* ∪ *B*. The following recursive function computes this union:

**function** union(t

_{1}, t

_{2}):

**if** t

_{1} = nil:

**return** t

_{2}
**if** t

_{2} = nil:

**return** t

_{1}
t

_{<}, t

_{>} ← split t

_{2} on t

_{1}.root

**return** join(t

_{1}.root,union(left(t

_{1}), t

_{<}),union(right(t

_{1}), t

_{>}))

Here, *Split* is presumed to return two trees: one holding the keys less its input key, one holding the greater keys. (The algorithm is non-destructive, but an in-place destructive version exists as well.)

The algorithm for intersection or difference is similar, but requires the *Join2* helper routine that is the same as *Join* but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the AVL tree. Since *Split* calls *Join* but does not deal with the balancing criteria of AVL trees directly, such an implementation is usually called the "join-based" implementation.

The complexity of each of union, intersection and difference is
O
(
m
log
(
n
m
+
1
)
)
for AVLs of sizes
m
and
n
(
≥
m
)
. More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth
O
(
log
m
log
n
)
. When
m
=
1
, the join-based implementation has the same computational DAG as single-element insertion and deletion.

If during a modifying operation (e.g. insert, delete) a (temporary) height difference of more than 2 arises between two child subtrees, the parent subtree has to be "rebalanced". The given repair tools are the so-called tree rotations, because they move the keys only "vertically", so that the ("horizontal") in-order sequence of the keys is fully preserved (which is essential for a binary-search tree).

Let Z be the child higher by 2 (see figures 4 and 5). Two flavors of rotations are required: simple and double. Rebalancing can be accomplished by a simple rotation (see figure 4) if the inner child of Z, that is the child with a child direction opposite to that of Z, (t_{23} in figure 4, Y in figure 5) is *not higher* than its sibling, the outer child t_{4} in both figures. This situation is called "Right Right" or "Left Left" in the literature.

On the other hand, if the inner child (t_{23} in figure 4, Y in figure 5) of Z *is* higher than t_{4} then rebalancing can be accomplished by a double rotation (see figure 5). This situation is called "Right Left" because X is right- and Z left-heavy (or "Left Right" if X is left- and Z is right-heavy). From a mere graph-theoretic point of view, the two rotations of a double are just single rotations. But they encounter and have to maintain other configurations of balance factors. So, in effect, it is simpler – and more efficient – to specialize, just as in the original paper, where the double rotation is called Большое вращение (russ. for *big turn*) as opposed to the simple rotation which is called Малое вращение (russ. for *little turn*). But there are alternatives: one could e.g. update all the balance factors in a separate walk from leaf to root.

The cost of a rotation, both simple and double, is constant.

For both flavors of rotations a mirrored version, i.e. `rotate_Right`

or `rotate_LeftRight`

, respectively, is required as well.

Figure 4 shows a Right Right situation. In its upper half, node X has two child trees with a balance factor of +2. Moreover, the inner child t_{23} of Z is not higher than its sibling t_{4}. This can happen by a height increase of subtree t_{4} or by a height decrease of subtree t_{1}. In the latter case, also the pale situation where t_{23} has the same height as t_{4} may occur.

The result of the left rotation is shown in the lower half of the figure. Three links (thick edges in figure 4) and two balance factors are to be updated.

As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2, where it is again, when t_{23} and t_{4} were of same height. Otherwise the leaf layer reaches level h+1, so that the height of the rotated tree decreases.

Code snippet of a simple left rotation
Figure 5 shows a Right Left situation. In its upper third, node X has two child trees with a balance factor of +2. But unlike figure 4, the inner child Y of Z is higher than its sibling t_{4}. This can happen by a height increase of subtree t_{2} or t_{3} (with the consequence that they are of different height) or by a height decrease of subtree t_{1}. In the latter case, if may also occur that t_{2} and t_{3} are of same height.

The result of the first, the right, rotation is shown in the middle third of the figure. (With respect to the balance factors, this rotation is not of the same kind as the other AVL single rotations, because the height difference between Y and t_{4} is only 1.) The result of the final left rotation is shown in the lower third of the figure. Five links (thick edges in figure 5) and three balance factors are to be updated.

As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the double rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2 and after the double rotation it is at level h+1, so that the height of the rotated tree decreases.

Code snippet of a right-left double rotation
Both AVL trees and red–black trees are self-balancing binary search trees and they are related mathematically. Indeed, every AVL tree can be colored red–black. The operations to balance the trees are different; both AVL trees and red-black require O(1) rotations in the worst case, while both also require O(log *n*) other updates (to colors or heights) in the worst case (though only O(1) amortized). AVL trees require storing 2 bits (or one trit) of information in each node, while red-black trees require just one bit per node. The bigger difference between the two data structures is their height limit.

For a tree of size *n* ≥ 1

an AVL tree’s height is at most
where

φ
:=
1
+
5
2
≈
1.618
the golden ratio,

c
:=
1
log
2
φ
≈
1.44
,
b
:=
c
2
log
2
5
−
2
≈
−
0.328
,
and

d
:=
1
+
1
φ
4
5
≈
1.07
.

a red–black tree’s height is at most
AVL trees are more rigidly balanced than red–black trees, leading to faster retrieval but slower insertion and deletion.