In the mathematical theory of decisions, decision-theoretic rough sets (DTRS) is a probabilistic extension of rough set classification. First introduced in 1990 by Yiyu Yao, the extension uses loss functions to derive the $\alpha$ and $\beta$ region parameters. As in rough sets, the lower and upper approximations of a set are used.
The following contains the basic principles of decision-theoretic rough sets.
Using the Bayesian decision procedure, the decision-theoretic rough set (DTRS) approach allows for minimum-risk decision making based on observed evidence. Let $A = \{a_1, \ldots, a_m\}$ be a finite set of $m$ possible actions and let $\Omega = \{w_1, \ldots, w_s\}$ be a finite set of $s$ states.
$P(w_j \mid [x])$ is calculated as the conditional probability of an object $x$ being in state $w_j$ given the object description $[x]$. $\lambda(a_i \mid w_j)$ denotes the loss, or cost, for performing action $a_i$ when the state is $w_j$. The expected loss (conditional risk) associated with taking action $a_i$ is given by:
$$R(a_i \mid [x]) = \sum_{j=1}^{s} \lambda(a_i \mid w_j) \, P(w_j \mid [x]).$$
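As a minimal sketch, the conditional risk above is just a probability-weighted sum of losses. The function name and the numeric values below are illustrative assumptions, not taken from the source:

```python
def conditional_risk(losses, probs):
    """Expected loss of one action: sum_j lambda(a_i | w_j) * P(w_j | [x])."""
    return sum(loss * p for loss, p in zip(losses, probs))

# Hypothetical example with two states w_1, w_2:
# P(w_1 | [x]) = 0.7, P(w_2 | [x]) = 0.3,
# lambda(a_i | w_1) = 0.0, lambda(a_i | w_2) = 4.0.
risk = conditional_risk([0.0, 4.0], [0.7, 0.3])  # 0.0*0.7 + 4.0*0.3 = 1.2
```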
Object classification with the approximation operators can be fitted into the Bayesian decision framework. The set of actions is given by $A = \{a_P, a_N, a_B\}$, where $a_P$, $a_N$, and $a_B$ represent the three actions in classifying an object into POS($A$), NEG($A$), and BND($A$) respectively. To indicate whether an element is in $A$ or not in $A$, the set of states is given by $\Omega = \{A, A^c\}$.
Let $\lambda(a_\diamond \mid A)$ denote the loss incurred by taking action $a_\diamond$ when an object belongs to $A$, and let $\lambda(a_\diamond \mid A^c)$ denote the loss incurred by taking the same action when the object belongs to $A^c$.
Let $\lambda_{PP}$ denote the loss function for classifying an object in $A$ into the POS region, $\lambda_{BP}$ denote the loss function for classifying an object in $A$ into the BND region, and let $\lambda_{NP}$ denote the loss function for classifying an object in $A$ into the NEG region. A loss function $\lambda_{\diamond N}$ denotes the loss of classifying an object that does not belong to $A$ into the region specified by $\diamond$.
Taking an individual action can be associated with an expected loss $R(a_\diamond \mid [x])$, expressed as:
$$R(a_P \mid [x]) = \lambda_{PP} P(A \mid [x]) + \lambda_{PN} P(A^c \mid [x]),$$
$$R(a_N \mid [x]) = \lambda_{NP} P(A \mid [x]) + \lambda_{NN} P(A^c \mid [x]),$$
$$R(a_B \mid [x]) = \lambda_{BP} P(A \mid [x]) + \lambda_{BN} P(A^c \mid [x]),$$

where $\lambda_{\diamond P} = \lambda(a_\diamond \mid A)$, $\lambda_{\diamond N} = \lambda(a_\diamond \mid A^c)$, and $\diamond = P$, $N$, or $B$.
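The three expected losses can be sketched directly from these equations. The loss values below are hypothetical and only chosen to satisfy the orderings assumed in the decision rules; the minimum-risk action then classifies the object:

```python
def expected_losses(lam, p_a):
    """Return R(a_P|[x]), R(a_N|[x]), R(a_B|[x]) given P(A|[x]) = p_a.

    lam maps (action, state) pairs to losses, e.g. ('P', 'N') -> lambda_PN.
    Since Omega = {A, A^c}, P(A^c | [x]) = 1 - P(A | [x]).
    """
    p_ac = 1.0 - p_a
    return {
        action: lam[(action, 'P')] * p_a + lam[(action, 'N')] * p_ac
        for action in ('P', 'N', 'B')
    }

# Hypothetical losses (not from the source): correct classification is free,
# misclassification costs more than deferring to the boundary region.
lam = {('P', 'P'): 0, ('P', 'N'): 8,
       ('N', 'P'): 10, ('N', 'N'): 0,
       ('B', 'P'): 2, ('B', 'N'): 2}
risks = expected_losses(lam, 0.8)
# R(a_P) = 1.6, R(a_N) = 8.0, R(a_B) = 2.0 -> a_P has minimum risk.
```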
If we consider the loss functions $\lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $\lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, the following decision rules are formulated (P, N, B):
P: If $P(A \mid [x]) \ge \gamma$ and $P(A \mid [x]) \ge \alpha$, decide POS($A$);

N: If $P(A \mid [x]) \le \beta$ and $P(A \mid [x]) \le \gamma$, decide NEG($A$);

B: If $\beta \le P(A \mid [x]) \le \alpha$, decide BND($A$);
where,

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{BP} - \lambda_{BN}) - (\lambda_{PP} - \lambda_{PN})},$$
$$\gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{PP} - \lambda_{PN})},$$
$$\beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{BP} - \lambda_{BN})}.$$
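These threshold formulas translate directly into code. The sketch below uses the same hypothetical loss values as assumed earlier (correct classification free, boundary deferral cheap, misclassification expensive); nothing here is prescribed by the source:

```python
def thresholds(lam):
    """Compute (alpha, gamma, beta) from loss values lambda_xy,
    using the three expressions given in the text."""
    alpha = (lam['PN'] - lam['BN']) / ((lam['BP'] - lam['BN']) - (lam['PP'] - lam['PN']))
    gamma = (lam['PN'] - lam['NN']) / ((lam['NP'] - lam['NN']) - (lam['PP'] - lam['PN']))
    beta  = (lam['BN'] - lam['NN']) / ((lam['NP'] - lam['NN']) - (lam['BP'] - lam['BN']))
    return alpha, gamma, beta

# Hypothetical losses satisfying lambda_PP <= lambda_BP < lambda_NP
# and lambda_NN <= lambda_BN < lambda_PN.
lam = {'PP': 0, 'PN': 8, 'NP': 10, 'NN': 0, 'BP': 2, 'BN': 2}
alpha, gamma, beta = thresholds(lam)  # alpha = 0.75, gamma = 8/18, beta = 0.2
```

With these losses $\alpha > \gamma > \beta$ holds, matching the simplification described next.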
The $\alpha$, $\beta$, and $\gamma$ values define the three different regions, giving us an associated risk for classifying an object. When $\alpha > \beta$, we get $\alpha > \gamma > \beta$ and can simplify (P, N, B) into (P1, N1, B1):
P1: If $P(A \mid [x]) \ge \alpha$, decide POS($A$);

N1: If $P(A \mid [x]) \le \beta$, decide NEG($A$);

B1: If $\beta < P(A \mid [x]) < \alpha$, decide BND($A$).
When $\alpha = \beta = \gamma$, we can simplify the rules (P-B) into (P2-B2), which divide the regions based solely on $\alpha$:
P2: If $P(A \mid [x]) > \alpha$, decide POS($A$);

N2: If $P(A \mid [x]) < \alpha$, decide NEG($A$);

B2: If $P(A \mid [x]) = \alpha$, decide BND($A$).
Data mining, feature selection, information retrieval, and classification are just some of the applications in which the DTRS approach has been successfully used.