Page 94 - Read Online
P. 94
Salmani et al. J Surveill Secur Saf 2020;1:79–101 I http://dx.doi.org/10.20517/jsss.2020.16 Page 87
Algorithm 2 Query
1: procedure Query(∆ q ,K, φ)
2: por = randomly choose a portion
3: for all w i in ∆ q do
4: neededVersions=dφ i ×pore
5: j = 1
6: while j ≤ neededVersions do
7: ˆ w = choose randomly one encrypted versions of w i
8: q + = ˆw
9: end while
10: end for
11: return q
12: end procedure
relevance score:
Definition5. (DocumentVector). Let ∆ d bethedictionaryofkeywordsfromcorpusD. Adocumentvector
) is a normalized vector that demonstrates the normalized frequency of each keyword
T i = ( f w 1 ,. . ., f w d
,1 ≤ j ≤ d in document D i. Let T = {T 1 ,. . .,T n } be a set of all document vectors.
f w j
Definition 6. (Query Vector). A query vector Q is a d-element boolean vector such that ∀ 1 ≤ i ≤ d,Q[i] =
1 if w i ∈ ∆ q, and 0 otherwise.
Definition 7. (Relevance Score). Let T i denote the normalized frequency vector of the document D i and Q
be the query vector from ∆ q. Their relevance score s i is inner product of document D i to query vector Q.
Recall that upon receiving the encrypted document collection C the cloud builds its own DTM matrix
ˆ
γ and based on that it generates a set of encrypted document vectors T = {T 1 ,. . .,T n }. However, the
ˆ
ˆ
0
cloud cannot make the real γ matrix and T (see Section 4.1). Upon receiving the encrypted query q, the
cloud server executes Search algorithm. The LRSE Search(q,T, k) algorithm takes the encrypted query q,
ˆ
encrypted document vectors T, and an optional k, and returns the top-k encrypted resultant documents
ˆ
corresponding to R(q) to the data user. Consider that the resultant documents are ranked according to
their relevance score s.
• Decryption. The data user executes Decryption(C i ,K,∆ d ) for each C i in the resultant documents. This
ˆ
0
ˆ
algorithm inputs an encrypted document C i, a secret key K, and an encrypted keyword dictionary ∆ d and
0,
outputs D i. Keywords in the resultant documents are encrypted randomly with a different encrypted piece
of the chain. Thus, before decrypting the results, Decryption algorithm normalizes the results by changing
each keyword’s ciphertext with the first ciphertext in the chain. It then decrypts the normalized encrypted
document using the secret key K.
4.2 Scheme II - Index
InSchemeIIweemployanindexstructuretoincreasethesearchspeed. InSchemeIweencrypteachdocument
keyword by keyword, i.e., each block cipher contains one keyword (we consider each block large enough to
store an encrypted keyword). In contrast, in Scheme II each block cipher contains 128 bits of a file (which may
vary depend on the protocol and employed encryption scheme). Hence, the cloud learns nothing from the
encrypted documents D and is not able to generate its own γ matrix, so must to rely on the index (provided
0
by the data owner) to find and rank the results.
Furthermore, thecloudisnot abletolearn the keywordpositions in the documents. Wealso employthesecure