In graph theory, the metric k-center or metric facility location problem is a combinatorial optimization problem studied in theoretical computer science. Given n cities with specified distances, one wants to build k warehouses in different cities and minimize the maximum distance of a city to a warehouse. In graph theory this means finding a set of k vertices for which the largest distance of any point to its closest vertex in the k-set is minimum. The vertices must be in a metric space, or in other words a complete graph that satisfies the triangle inequality.
Contents
Formal definition
Let
A set
For a metric space (
That is, Every point in a cluster is in distance at most
The k-Center Clustering problem can also be defined on a complete undirected graph G = (V, E) as follows:
Given a complete undirected graph G = (V, E) with distances d(vi, vj) ∈ N satisfying the triangle inequality, find a subset C ⊆ V with |C| = k while minimizing:
Computational complexity
In a complete undirected graph G = (V, E), if we sort the edges in nondecreasing order of the distances: d(e1) ≤ d(e2) ≤ … ≤ d(em) and let Gi = (Vi, Ei), where Ei = {e1, e2, …, ei}. The k-center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k.
Although Dominating Set is NP-complete, the k-center problem remains NP-hard. This is clear, since the optimality of a given feasible solution for the k-center problem can be determined through the Dominating Set reduction only if we know in first place the size of the optimal solution (i.e. the smallest index i such that Gi has a dominating set of size at most k) , which is precisely the difficult core of the NP-Hard problems.
A simple greedy algorithm
A simple greedy approximation algorithm that achieves an approximation factor of 2 builds
Running time
Proving the approximation factor
The solution obtained using the simple greedy algorithm is a 2-approximation to the optimal solution. This section focuses on proving this approximation factor.
Given a set of n points
i.e.
This theorem can be proven using two cases as follows,
Case 1: Every cluster of
Case 2: There are two centers
Another 2-factor approximation algorithm
Another algorithm with the same approximation factor takes advantage of the fact that the k-center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k and computes a maximal independent set of Gi, looking for the smallest index i that has a maximal independent set with a size of at least k. It is not possible to find an approximation algorithm with an approximation factor of 2 − ε for any ε > 0, unless P = NP. Furthermore, the distances of all edges in G must satisfy the triangle inequality if the k-center problem is to be approximated within any constant factor, unless P = NP.