In queueing theory, a discipline within the mathematical theory of probability, a **polling system** or **polling model** is a system where a single server visits a set of queues in some order. The model has applications in computer networks and telecommunications, manufacturing and road traffic management. The term polling system was coined at least as early as 1968 and the earliest study of such a system in 1957 where a single repairman servicing machines in the British cotton industry was modelled.

Typically it is assumed that the server visits the different queues in a cyclic manner. Exact results exist for waiting times, marginal queue lengths and joint queue lengths at polling epochs in certain models. Mean value analysis techniques can be applied to compute average quantities.

In a fluid limit, where a very large number of small jobs arrive the individual nodes can be viewed to behave similarly to fluid queues (with a two state process).

A group of *n* queues are served by a single server, typically in a cyclic order 1, 2, …, *n*, 1, …. New jobs arrive at queue *i* according to a Poisson process of rate *λ*_{i} and are served on a first-come, first-served basis with each job having a service time denoted by a independent and identically distributed random variables *S*_{i}.

The server chooses when to progress to the next node according to one of the following criteria:

exhaustive service, where a node continues to receive service until the buffer is empty.
gated service, where the node serves all traffic that was present at the instant that the server arrived and started serving, but subsequent arrivals during this service time must wait until the next server visit.
limited service, where a maximum fixed number of jobs can be served in each visit by the server.
If a queueing node is empty the server immediately moves to serve the next queueing node.

The time taken to switch from serving node *i* − 1 and node *i* is denoted by the random variable *d*_{i}.

Define *ρ*_{i} = *λ*_{i} E(*S*_{i}) and write *ρ* = *ρ*_{1} + *ρ*_{2} + … + *ρ*_{n}. Then *ρ* is the long-run fraction of time the server spends attending to customers.

For gated service, the expected waiting time at node *i* is

E
(
W
i
)
=
1
+
ρ
i
2
E
(
C
)
+
(
1
+
ρ
i
)
Var
(
C
i
)
2
E
(
C
)
and for exhaustive service

E
(
W
i
)
=
1
−
ρ
i
2
E
(
C
)
+
(
1
−
ρ
i
)
Var
(
C
i
+
1
)
2
E
(
C
)
where *C*_{i} is a random variable denoting the time between entries to node *i* and

E
(
C
)
=
∑
i
=
1
n
E
(
d
i
)
1
−
ρ
The variance of *C*_{i} is more complicated and a straightforward calculation requires solving *n*^{2} linear equations and *n*^{2} unknowns, however it is possible to compute from *n* equations.

The workload process can be approximated by a reflected Brownian motion in a heavily loaded and suitably scaled system if switching servers is immediate and a Bessel process when switching servers takes time.

Polling systems have been used to model token ring networks.