In the mathematical theory of decisions, decision-theoretic rough sets (DTRS) are a probabilistic extension of rough set classification. First introduced in 1990 by Yiyu Yao, the extension uses loss functions to derive the $\alpha$ and $\beta$ region parameters. As with rough sets, the lower and upper approximations of a set are used.
The following outlines the basic principles of decision-theoretic rough sets.
Using the Bayesian decision procedure, the decision-theoretic rough set (DTRS) approach allows for minimum-risk decision making based on observed evidence. Let $A = \{a_1, \ldots, a_m\}$ be a finite set of $m$ possible actions and let $\Omega = \{w_1, \ldots, w_s\}$ be a finite set of $s$ states. $P(w_j \mid [x])$ denotes the conditional probability of an object $x$ being in state $w_j$ given the object description $[x]$. $\lambda(a_i \mid w_j)$ denotes the loss, or cost, of taking action $a_i$ when the state is $w_j$. The expected loss (conditional risk) associated with taking action $a_i$ is given by:
$$R(a_i \mid [x]) = \sum_{j=1}^{s} \lambda(a_i \mid w_j)\, P(w_j \mid [x]).$$
Object classification with the approximation operators can be fitted into the Bayesian decision framework. The set of actions is given by $A = \{a_P, a_N, a_B\}$, where $a_P$, $a_N$, and $a_B$ represent the three actions of classifying an object into POS($A$), NEG($A$), and BND($A$), respectively. To indicate whether an element is in $A$ or not in $A$, the set of states is given by $\Omega = \{A, A^c\}$. Let $\lambda(a_\diamond \mid A)$ denote the loss incurred by taking action $a_\diamond$ when an object belongs to $A$, and let $\lambda(a_\diamond \mid A^c)$ denote the loss incurred by taking the same action when the object belongs to $A^c$.
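As a minimal sketch, the conditional risk can be computed directly from this definition; the function name and numeric values below are illustrative assumptions, not from the source:

```python
# Sketch: the expected loss (conditional risk) of an action a_i given [x],
# computed as sum_j lambda(a_i | w_j) * P(w_j | [x]).
def conditional_risk(loss_row, probs):
    """loss_row[j] plays the role of lambda(a_i | w_j); probs[j] of P(w_j | [x])."""
    return sum(l * p for l, p in zip(loss_row, probs))

# Illustrative case: two states with P(w_1 | [x]) = 0.7, P(w_2 | [x]) = 0.3,
# and losses lambda(a_i | w_1) = 0, lambda(a_i | w_2) = 4.
risk = conditional_risk([0.0, 4.0], [0.7, 0.3])  # 0.0*0.7 + 4.0*0.3 = 1.2
```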
Let $\lambda_{PP}$ denote the loss function for classifying an object in $A$ into the POS region, $\lambda_{BP}$ denote the loss function for classifying an object in $A$ into the BND region, and let $\lambda_{NP}$ denote the loss function for classifying an object in $A$ into the NEG region. A loss function $\lambda_{\diamond N}$ denotes the loss of classifying an object that does not belong to $A$ into the region specified by $\diamond$.
Each individual action can be associated with an expected loss $R(a_\diamond \mid [x])$, expressed as:
$$R(a_P \mid [x]) = \lambda_{PP} P(A \mid [x]) + \lambda_{PN} P(A^c \mid [x]),$$
$$R(a_N \mid [x]) = \lambda_{NP} P(A \mid [x]) + \lambda_{NN} P(A^c \mid [x]),$$
$$R(a_B \mid [x]) = \lambda_{BP} P(A \mid [x]) + \lambda_{BN} P(A^c \mid [x]),$$
where $\lambda_{\diamond P} = \lambda(a_\diamond \mid A)$, $\lambda_{\diamond N} = \lambda(a_\diamond \mid A^c)$, and $\diamond = P$, $N$, or $B$.
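These three risks can be sketched in a few lines; the loss values below ($\lambda_{PP}=0$, $\lambda_{PN}=4$, $\lambda_{NP}=4$, $\lambda_{NN}=0$, $\lambda_{BP}=1$, $\lambda_{BN}=1$) are assumed for illustration only:

```python
# Sketch: risks R(a_P | [x]), R(a_N | [x]), R(a_B | [x]) for one object,
# using P(A^c | [x]) = 1 - P(A | [x]) for the two-state case.
def region_risks(p_A, lam):
    """lam maps (region, true state) pairs like ('P', 'P') to a loss value."""
    p_Ac = 1.0 - p_A
    return {d: lam[(d, 'P')] * p_A + lam[(d, 'N')] * p_Ac for d in 'PNB'}

# Illustrative losses: lambda_PP=0, lambda_PN=4, lambda_NP=4,
# lambda_NN=0, lambda_BP=1, lambda_BN=1.
lam = {('P', 'P'): 0, ('P', 'N'): 4,
       ('N', 'P'): 4, ('N', 'N'): 0,
       ('B', 'P'): 1, ('B', 'N'): 1}
risks = region_risks(0.8, lam)
best = min(risks, key=risks.get)  # minimum-risk action for P(A | [x]) = 0.8
```

Picking the minimum-risk action for each object is the Bayesian decision step that the decision rules compress into threshold comparisons.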
If we consider the loss functions $\lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $\lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, the following decision rules are formulated (P, N, B):
P: If $P(A \mid [x]) \ge \gamma$ and $P(A \mid [x]) \ge \alpha$, decide POS($A$);
N: If $P(A \mid [x]) \le \beta$ and $P(A \mid [x]) \le \gamma$, decide NEG($A$);
B: If $\beta \le P(A \mid [x]) \le \alpha$, decide BND($A$);
where,
$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{BP} - \lambda_{BN}) - (\lambda_{PP} - \lambda_{PN})},$$
$$\gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{PP} - \lambda_{PN})},$$
$$\beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{BP} - \lambda_{BN})}.$$
The $\alpha$, $\beta$, and $\gamma$ values define the three different regions, giving us an associated risk for classifying an object. When $\alpha > \beta$, we get $\alpha > \gamma > \beta$ and can simplify (P, N, B) into (P1, N1, B1):
P1: If $P(A \mid [x]) \ge \alpha$, decide POS($A$);
N1: If $P(A \mid [x]) \le \beta$, decide NEG($A$);
B1: If $\beta < P(A \mid [x]) < \alpha$, decide BND($A$).
When $\alpha = \beta = \gamma$, we can simplify the rules (P-B) into (P2-B2), which divide the regions based solely on $\alpha$:
P2: If $P(A \mid [x]) > \alpha$, decide POS($A$);
N2: If $P(A \mid [x]) < \alpha$, decide NEG($A$);
B2: If $P(A \mid [x]) = \alpha$, decide BND($A$).
Data mining, feature selection, information retrieval, and classification are just some of the applications in which the DTRS approach has been successfully used.
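The whole procedure, from loss functions to thresholds to a three-way decision, can be sketched as follows. The loss values and probabilities are illustrative assumptions, not from the source:

```python
# Sketch of the three-way decision procedure: derive alpha, beta, gamma
# from the loss functions, then classify by the simplified rules (P1, N1, B1).
def thresholds(lPP, lBP, lNP, lNN, lBN, lPN):
    """Thresholds alpha, beta, gamma from the six loss values."""
    alpha = (lPN - lBN) / ((lBP - lBN) - (lPP - lPN))
    gamma = (lPN - lNN) / ((lNP - lNN) - (lPP - lPN))
    beta = (lBN - lNN) / ((lNP - lNN) - (lBP - lBN))
    return alpha, beta, gamma

def three_way_decide(p_A, alpha, beta):
    """Classify by P(A | [x]) against the thresholds (assumes alpha > beta)."""
    if p_A >= alpha:
        return 'POS'
    if p_A <= beta:
        return 'NEG'
    return 'BND'

# Illustrative losses satisfying lPP <= lBP < lNP and lNN <= lBN < lPN.
alpha, beta, gamma = thresholds(lPP=0, lBP=1, lNP=4, lNN=0, lBN=1, lPN=4)
# Here alpha = 0.75, gamma = 0.5, beta = 0.25, so alpha > gamma > beta
# and the simplified rules (P1, N1, B1) apply.
decisions = [three_way_decide(p, alpha, beta) for p in (0.9, 0.5, 0.1)]
# decisions == ['POS', 'BND', 'NEG']
```

Note that only $\alpha$ and $\beta$ are needed once $\alpha > \beta$ holds; $\gamma$ drops out of the simplified rules, matching the reduction from (P, N, B) to (P1, N1, B1).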