Page 92 - Read Online
P. 92

Salmani et al. J Surveill Secur Saf 2020;1:79–101  I http://dx.doi.org/10.20517/jsss.2020.16  Page 85


                                                      Keyword            Keyword
                                   Keyword
                                                                                             ...
                                                        +                  +

                          Key      Encryption  Key    Encryption  Key    Encryption




                                   Ciphertext1        Ciphertext2        Ciphertext3


                                   Figure 2. Chaining notion - generating multiple ciphertexts for each keyword.


                  We divide the maximum frequency of each keyword by the minimum average of the keywords (threshold
                  factor τ) to ensure each encrypted version will occur with the same frequency as the minimum value in
                  the average vector A. With this strategy, the cloud (or an adversary) sees all of the encrypted keywords in
                  the same frequency range. Thus, we gain more uncertainty and a higher level of entropy so identifying the
                  keywords becomes more difficult for the cloud.

                • GenerateCiphertext. Before we explain how we encrypt the documents, we define the BuildChain algo-
                  rithm which will be used by the Encryption algorithm.
                     – BuildChain(K,∆ d , φ). This algorithm takes a secret key K, a document keyword dictionary ∆ d,
                                                                                                     ˆ
                       and a required-ciphertext vector φ as input. It outputs an encrypted keyword dictionary ∆ d =
                                                                                                       0
                                                                          0
                       ( ˆw 11 ,. . ., ˆw ij ) where ˆw ij is the j-th ciphertext of w 1≤i≤d  and d is the number of encrypted keywords.
                                                                i
                  For each keyword w i ∈ ∆ d , BuildChain generates the first ciphertext ˆw i1 by encrypting keyword w i using
                  the secret key K. To generate the second ciphertext ˆw i2, BuildChain encrypts XOR of previous ciphertext
                  ˆ w i1 with keyword w i using the same secret key K (i.e., ˆw i1 ⊕ w i). Hence, each ciphertext is chained to the
                  previousone. Thealgorithmcontinuesgeneratingnewciphertextforkeyword w i untilwehavetheexpected
                  number of ciphertexts according to the φ vector. Figure 2 shows the chaining process.
                                                                          [1]
                  Note that compared to other approaches that adopt multiple keys , in our chaining methodology, we
                  employ only one key for all of the required ciphertexts. This characteristic mitigates the key management
                  challengesandlessensystemcostsduringthekeygeneration. Moreover, itenablesthedatauserstogenerate
                  the whole chain on their own side which reduces the communication costs and increases security and
                  privacy. From the privacy viewpoint, keywords with a high frequency get encrypted multiple times (using
                  chaining notion). As a result, it becomes significantly more difficult for the cloud to track down a specific
                  keyword in a single document or across multiple documents.

                  Weexplainhow Encryptionalgorithm(seeDefinition1)worksinLRSE.Notethat,SchemeIisnotanindex-
                  basedmethod, soitdoesnotneedtotakeinanindex I (wewillemployanindexinSchemeII).Aftertaking
                  the inputs, Encryption algorithm call the BuildChain to generate the required-ciphertext vector φ. The
                                                                         ˆ
                  BuildChain algorithm returns the encrypted keyword dictionary ∆ d to the Encryption algorithm. Then
                                                                           0
                  Encryption algorithm starts to encrypt each document by fetching each word from the file and if the word
                  is one of the keywords in ∆ d, we randomly choose one of its encrypted versions from ∆ d otherwise we
                                                                                             0,
                                                                                           ˆ
                                               1
                  encryptthewordwiththesecretkey . Thesubroutine“Add”inthisalgorithmaddstheencrypteddocument
                  (here C i) to the encrypted file collection (C). Algorithm 1 demonstrates how we encrypt each file.

                  1 We assume the random number generator is fair.
   87   88   89   90   91   92   93   94   95   96   97