Page 92 - Read Online
P. 92
Salmani et al. J Surveill Secur Saf 2020;1:79–101 I http://dx.doi.org/10.20517/jsss.2020.16 Page 85
Keyword Keyword
Keyword
...
+ +
Key Encryption Key Encryption Key Encryption
Ciphertext1 Ciphertext2 Ciphertext3
Figure 2. Chaining notion - generating multiple ciphertexts for each keyword.
We divide the maximum frequency of each keyword by the minimum average of the keywords (threshold
factor τ) to ensure each encrypted version will occur with the same frequency as the minimum value in
the average vector A. With this strategy, the cloud (or an adversary) sees all of the encrypted keywords in
the same frequency range. Thus, we gain more uncertainty and a higher level of entropy so identifying the
keywords becomes more difficult for the cloud.
• GenerateCiphertext. Before we explain how we encrypt the documents, we define the BuildChain algo-
rithm which will be used by the Encryption algorithm.
– BuildChain(K,∆ d , φ). This algorithm takes a secret key K, a document keyword dictionary ∆ d,
ˆ
and a required-ciphertext vector φ as input. It outputs an encrypted keyword dictionary ∆ d =
0
0
( ˆw 11 ,. . ., ˆw ij ) where ˆw ij is the j-th ciphertext of w 1≤i≤d and d is the number of encrypted keywords.
i
For each keyword w i ∈ ∆ d , BuildChain generates the first ciphertext ˆw i1 by encrypting keyword w i using
the secret key K. To generate the second ciphertext ˆw i2, BuildChain encrypts XOR of previous ciphertext
ˆ w i1 with keyword w i using the same secret key K (i.e., ˆw i1 ⊕ w i). Hence, each ciphertext is chained to the
previousone. Thealgorithmcontinuesgeneratingnewciphertextforkeyword w i untilwehavetheexpected
number of ciphertexts according to the φ vector. Figure 2 shows the chaining process.
[1]
Note that compared to other approaches that adopt multiple keys , in our chaining methodology, we
employ only one key for all of the required ciphertexts. This characteristic mitigates the key management
challengesandlessensystemcostsduringthekeygeneration. Moreover, itenablesthedatauserstogenerate
the whole chain on their own side which reduces the communication costs and increases security and
privacy. From the privacy viewpoint, keywords with a high frequency get encrypted multiple times (using
chaining notion). As a result, it becomes significantly more difficult for the cloud to track down a specific
keyword in a single document or across multiple documents.
Weexplainhow Encryptionalgorithm(seeDefinition1)worksinLRSE.Notethat,SchemeIisnotanindex-
basedmethod, soitdoesnotneedtotakeinanindex I (wewillemployanindexinSchemeII).Aftertaking
the inputs, Encryption algorithm call the BuildChain to generate the required-ciphertext vector φ. The
ˆ
BuildChain algorithm returns the encrypted keyword dictionary ∆ d to the Encryption algorithm. Then
0
Encryption algorithm starts to encrypt each document by fetching each word from the file and if the word
is one of the keywords in ∆ d, we randomly choose one of its encrypted versions from ∆ d otherwise we
0,
ˆ
1
encryptthewordwiththesecretkey . Thesubroutine“Add”inthisalgorithmaddstheencrypteddocument
(here C i) to the encrypted file collection (C). Algorithm 1 demonstrates how we encrypt each file.
1 We assume the random number generator is fair.