Proof. Consider:

$$\mathrm{LRSE} : [0,1]^{d} \mapsto [0,1]^{d'} \quad \text{such that} \quad d' \geq d \tag{1}$$

and

$$T_i \in [0,1]^{d} \quad \text{and} \quad T_i' \in [0,1]^{d'}.$$
The entropy of the document vector $T_i = (f_{w_1}, \ldots, f_{w_d})$ is:

$$H(T_i) = -\sum_{j=1}^{d} f_{w_j} \times \log(f_{w_j})$$
and the entropy of $T_i' = (f'_{w_1}, \ldots, f'_{w_{d'}})$ is:

$$H(T_i') = -\sum_{j=1}^{d'} f'_{w_j} \times \log(f'_{w_j})$$
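For intuition (with hypothetical numbers and the natural logarithm, neither of which is fixed by the text above), a balanced two-keyword vector has higher entropy than a skewed one:

$$H((0.5, 0.5)) = -2 \times 0.5 \log(0.5) = \log 2 \approx 0.693, \qquad H((0.9, 0.1)) \approx 0.325.$$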
Recall $\phi = (l_1, \ldots, l_d)$, with $\sum_{\ell=1}^{d} l_\ell = d'$, and for any $f_{w_j} \in T_i$ we have:

$$f_{w_j} = \sum_{k=1}^{l_j} f'_{w_{(k+\alpha_{w_j})}}, \quad \text{where} \quad \alpha_{w_j} = \sum_{\ell=1}^{j-1} l_\ell.$$
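To make the role of $\phi$ concrete, here is a small illustrative instance (the numbers are hypothetical, not taken from the paper): let $d = 2$, $T_i = (f_{w_1}, f_{w_2}) = (0.7, 0.3)$, and $\phi = (2, 1)$, so that $d' = l_1 + l_2 = 3$. Then:

$$\alpha_{w_1} = 0, \quad \alpha_{w_2} = l_1 = 2, \qquad f_{w_1} = f'_{w_1} + f'_{w_2}, \quad f_{w_2} = f'_{w_3},$$

which is satisfied, for example, by $T_i' = (0.4, 0.3, 0.3)$.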
Hence, writing $f'_{w_1}, \ldots, f'_{w_m}$ for the $m = l_j$ components of $T_i'$ that sum to $f_{w_j}$:

$$-f_{w_j} \log(f_{w_j}) = -(f'_{w_1} + \cdots + f'_{w_m}) \log(f'_{w_1} + \cdots + f'_{w_m}), \quad \text{where} \quad f_{w_j} = \sum_{k=1}^{m} f'_{w_k}. \tag{2}$$
Moreover, note that $\log(x)$ is a monotonically increasing function and $T_i'$ possesses positive values (based on (1)), thus we have:

$$-(f'_{w_1}) \times \log(f'_{w_1} + \cdots + f'_{w_m}) \leq -(f'_{w_1}) \times \log(f'_{w_1}), \quad \text{where} \quad f_{w_j} = \sum_{k=1}^{m} f'_{w_k}. \tag{3}$$
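The step from (3) to the sums below is left implicit; spelled out, applying (3) to each of the $m$ pieces of $f_{w_j}$ and summing, together with (2), gives the per-keyword bound:

$$-f_{w_j} \log(f_{w_j}) = -\sum_{k=1}^{m} f'_{w_k} \log(f'_{w_1} + \cdots + f'_{w_m}) \leq -\sum_{k=1}^{m} f'_{w_k} \log(f'_{w_k}).$$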
By extending the above inequality to all of the keyword frequencies in $T_i$ and $T_i'$:

$$-\sum_{j=1}^{d} f_{w_j} \times \log(f_{w_j}) \leq -\sum_{m=1}^{d'} f'_{w_m} \times \log(f'_{w_m})$$
Thus we have:

$$H(T_i') \geq H(T_i)$$
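As a minimal numerical sanity check of this inequality (a sketch only: it assumes natural-log entropy and splits each frequency into random positive pieces, an illustrative split rule rather than the paper's LRSE construction):

```python
import math
import random

def entropy(vec):
    """Shannon entropy (natural log) of a vector of positive frequencies."""
    return -sum(f * math.log(f) for f in vec if f > 0)

def split_vector(freqs, phi):
    """Split each frequency f_{w_j} into phi[j] random positive pieces that sum to f_{w_j}."""
    assert len(freqs) == len(phi)
    out = []
    for f, l in zip(freqs, phi):
        # draw l random positive weights, then scale them so the pieces sum back to f
        weights = [random.random() + 1e-9 for _ in range(l)]
        total = sum(weights)
        out.extend(f * w / total for w in weights)
    return out

if __name__ == "__main__":
    random.seed(0)
    T_i = [0.5, 0.3, 0.2]   # toy document vector, d = 3
    phi = [3, 2, 1]         # split pattern, so d' = 6
    T_i_prime = split_vector(T_i, phi)
    print("H(T_i)  =", entropy(T_i))
    print("H(T_i') =", entropy(T_i_prime))
    # splitting never decreases entropy, mirroring H(T_i') >= H(T_i)
    assert entropy(T_i_prime) >= entropy(T_i) - 1e-12
```

Any positive split passes the assertion, which is exactly the content of inequality (3) summed over all keywords.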