..
.
Onion router (TOR)
Data collector
Respondents
ℛ 1
ℛ 2
ℛi
풞
(ℐi,pki)
풮i=∑
n
j=1(s
i
j)
1
2
3
4
Outcome table
풰 1
풰 2
풰n
Sum
..
.
..
.
..
.
1 2 ··· n
···
···
···
···
⋱
Encpk 2 (s 11 ) Encpk 2 (s 12 )
Encpk 2 (s 21 ) Encpk 2 (s 22 )
Encpk 2 (sn^1 ) Encpk 2 (sn^2 )
Encpk 2 (풮 1 ) Encpk 2 (풮 2 ) Encpk푛(풮n)
Encpk푛(snn)
Encpk푛(sn 2 )
훼i={Encpk푗(s Encpk푛(s 1 n)
j
i)|j=1,2,...,n}
mi={
(ℐi, 1), if풮i≥k
(ℐi, 0), if풮i<k
Figure 1: Overview of the proposed solution.
Table 2: Outcome table released by the data collector.
V 1 V 2 ⋅⋅⋅ V푛
u 1 Encpk 1 (푠^11 ) Encpk 2 (푠^21 )⋅⋅⋅Encpk푛(푠푛 1 )
u 2 Encpk 1 (푠^12 ) Encpk 2 (푠^22 )⋅⋅⋅Encpk푛(푠푛 2 )
..
.
..
.
..
. d
..
.
u푛 Encpk 1 (푠^1 푛) Encpk 2 (푠^2 푛)⋅⋅⋅Encpk푛(푠푛푛)
SUM Encpk 1 (S 1 ) Encpk 2 (S 2 )⋅⋅⋅Encpk푛(S푛)
To initiate the protocol, the data collector first randomly
assigns a public key pk푖for each QI푗∈QID. If|QID|>푛,
thesamepublickeycanbeassignedtomorethanonequasi-
identifier. Otherwise, the data collector selects푚/푛of the
public keys for the assignment. For simplicity, we will assume
that the size for both quasi-identifier and public key is equal
(i.e.,푚=푛)andℓ={(pk 1 ,QI 1 ),(pk 2 ,QI 2 ),...,(pk푛,QI푛)}.
Next, the data collector publishes(I,ℓ)to a shared location
(e.g., a webpage):
(I,ℓ)={(I 1 ,(pk 1 ,QI 1 )),(I 2 ,(pk 2 ,QI 2 )),...,
(I푛,(pk푛,QI푛))}.
(1)
Based on the information from ( 1 ), each respondent푖
retrievesℓto examine if his records inD푖match any of the
quasi-identifiers QI푗∈QID. At this phase, each respondent
푖maintains a scores list for QID,{푠^1 푖,푠^2 푖,...,푠푛푖}. We denote
푠푗푖 as the score determined by the respondent푖for QI푗.
The respondent raises each score by 1 when a record in
D푖matches the quasi-identifier. Upon the completion, the
respondent푖encrypts each푠
푗
푖by using the public key pk푗
assigned to the quasi-identifier QI푗.Theencryptedscoreslist
computed by each respondent푖can be represented as훼푖 =
{Encpk 1 (푠^1 푖),Encpk 2 (푠^2 푖),...,Encpk푛(푠푛푖)}.Then,alltherespon-
dents send훼푖tothedatacollectorandasharedlocation.Note
that this location can be a separate space that is not shared
with the data collector.
Upon receiving훼푖from all the respondents, the data
collector performs the following tasks.
(1)Aggregates the scores determined by all respondents
for eachQI푖. The data collector performs this com-
putation in an encrypted form by using the additive
property of the Paillier cryptosystem. The output of
the aggregation can be represented as
Encpk푗(S푖)=Encpk푗(푠
푗
1 )+ℎEncpk푗(푠
푗
2 )
+ℎ⋅⋅⋅+ℎEncpk푗(푠푗푛).
(2)
(2)Publishes an outcome table.Thedatacollectorpub-
lishes the scores for each QI푗in an outcome table as
shown inTable 2.InTable 2,eachrow(u푖)represents
the encrypted scores received from each respondent
푖while the column(V푖)shows the encrypted scores
for each quasi-identifier QI푗.Notethatallthedata
inV푗are encrypted by using the same public key pk푗.
Therefore, only the respondent who has been assigned
the QI푗can decrypt Encpk푗(S푖)to learn the number of
matched records(S푖)for QI푗.
After the data collector releases the outcome table, the
respondents need to verify that the data released are genuine.
For instance, each respondent푖verifies that the encrypted
scores list훼푖submitted to the data collector appears as one of
the rows inTable 2. If the respondent fails to verify the data,
heorshethenissuesadecisionmessage푚푖with a random
value.
Let us assume all the respondents successfully verify
the data inTable 2. Next, each respondent푖retrievesV푖