Page 101 - Read Online
P. 101

Page 94                 Salmani et al. J Surveill Secur Saf 2020;1:79–101  I http://dx.doi.org/10.20517/jsss.2020.16


               Theorem 3. LRSE is an indistinguishable secure scheme against known ciphertext attack.



               Proof. Let (G,ξ,D) be an indistinguishable symmetric encryption scheme. Let the C = {ξ(D i ) ,1 ≤ i ≤
               n} be the encrypted document collection. Let SI be the encrypted searchable index, and let the NCN(.)
               be our secure chaining notion. Since (G,ξ,D) is indistinguishable secure the encrypted documents ξ(D i )
               ,1 ≤ i ≤ n are indistinguishable secure from a random string {0,1} . Hence, the encrypted documents are
                                                                         ∗
               indistinguishable secure. Further, the encrypted index SI and encrypted query vectors are secured using
               asymmetric inner product [14] . In Theorem 2 we proved that the NCN is indistinguishable secure, so chaining
               the ciphertexts does not weaken the symmetric encryption. Recall that in Scheme I we encrypt each file word
               by word using our novel chaining notion. In Scheme II we take advantage of the same idea in generating the
               document vectors and encrypt the documents block by block (instead of word by word). Thus, the encrypted
               documents ξ(D i ), 1 ≤ i ≤ n, the index SI and the encrypted query vectors, and the novel chaining notion
               (NCN) are indistinguishable secure, so LRSE is indistinguishable secure.


               5.5  Efficiency and System Costs
               Although two(multi)-party computation can address our designed goals, they suffer from low efficiency. They
               usually employ a n-path (in best case two-path) algorithm which means the retrieval phase needs two rounds
               of communication between the cloud server and data user [25] . Further, one of the biggest drawbacks is the
               complexity of the system and even for basic operations requires significantly more complicated computations.

               Thus,scholarsproposenewmethodologiestomakeatrade-offbetweenprivacy/securityandefficiency. Table1
               compares LRSE with the previous work. In comparison with SSE schemes, LRSE preserves both the search
               pattern and co-occurring terms which are not supported in the multi-keyword SSE schemes.


               Specifically, in comparison with Curtmola et al. [4]  LRSE needs more sever computation, but LRSE supports
               multi-keyword search queries and also preserves the search pattern and co-occurring terms privacy which
                                                                               [9]
               are not among Curtmola et al. [4]  achievements. In compare with Cao et al. , LRSE preserves co-occurring
                                                                                [9]
               terms. Both methods are at the same level of system costs, because in Cao et al. work, to increase the privacy,
               length of the document vectors are extended with dummy keywords. In LRSE, we expand the length of the
               document vectors to increase the uncertainty and at the same time hide the search pattern and co-occurring
               terms (two birds with one stone). Moreover, in Section 6.1 we show that LRSE reaches a higher level result
               accuracy compared to Cao’s approach.

               In comparison with a recent work [18] , LRSE preserves search pattern and co-occurring terms privacy. More-
               over, in Guo et al.’s [18]  approach, a trusted proxy is considered in the system model which increases the system
               cost.



               6   IMPLEMENTATION AND ANALYSIS
               We conducted a comprehensive experimental evaluation of LRSE on 203 English books [26]  with more than five
               million words (excluding stop-words) and more than 2200 keywords. Our experiment includes a user and a
               server. Both entities are implemented using Java (JDK 1.8.0_111) and are executed on Windows 7 machines
               with Core2 Duo CPU at 3.17 GHz and 8 GBs of RAM. The user acts as the data owner and data user, and the
               server acts as the cloud server. We ran the experiments 10 times and difference between the maximum and
               minimum output of the same experiment was lees than 0.5%. For example, the result accuracy experiment
               showed less than a 0.2% difference in 10 runs.

               Recall that in Section 4 we explained that the user’s query is a set of keywords or a sentence in natural language.
   96   97   98   99   100   101   102   103   104   105   106