In considering the problem of automated negotiation on the
Internet, we have the specter of millions of people seeking negotiated
agreements with millions of web sites. The
combinatory explosion of possibly legally binding agreements is only one
problem. The other, perhaps even
more daunting, problem is the complexity of deciding what to agree to, and what
not to agree to. Imagine trying to
review a hundred web sites as to the agreements that you struck with them!
If this seems not to be a problem to you, then answer what the privacy
policy on Yahoo is, since you must have visited that one.
Rest assured that, if you went past the first page, Yahoo thinks that it
has an agreement with you regarding your privacy.
Again, if you still don’t see the problem: now try to remember the
privacy policy you agreed to with Amazon when you bought a book or CD there. Of course, this kind of analysis is not actually helpful
except perhaps in illustrating there is a problem.
In order to gain a bit of clarity on this problem, and the kinds of
solutions that we might expect to it, I had fun with the following theoretical
study.
The negotiations of interest in this note are Privacy
Negotiations. These may be seen as
negotiating a non-disclosure agreement between a User Agent (wishing to permit
only limited disclosure) and a Server Agent (desiring to gain disclosure).
The semantics of such agreements are well treated in www.w3.org/p3p.
For example, there are any of an enumerated list of types of information
garnished, purposes of the disclosure, and uses made of the information so
disclosed. If we take the P3P
specification as a specification of a universe of possible agreements, and
further limit the universe in obvious ways to make it countable, the number of
possible agreements will number in the many hundreds of thousands.
The main thrust of the theoretical work was to appreciate the
significance of having potentially hundreds of thousands, if not millions, of
different privacy agreements.
Today there is no such thing as a privacy negotiation on
the Internet. The P3P specification
actually is a one-way statement of privacy policy which assumes agreement by the
User Agent. However, over time
there will be an effective “negotiation” as the market determines how
privacy policies will operate. In
the present context, we anticipate this by using P3P semantics for asserting
privacy agreements that can be negotiated.
Although the P3P semantics are incomplete with respect to negotiated
agreements, they represent a good starting point (see http://yuan.ecom.cmu.edu/psp
).
The model for automated negotiation between a User Agent
and a Server Agent is as follows: A privacy policy is written by one Agent, and
modified by the other. The process
continues, passing the agreement back and forth until an agreed policy is
determined, or the negotiation process is abandoned. In this scenario, there is, theoretically, a set of
agreements acceptable to the User Agent, and a set of agreements acceptable to
the Server Agent. The objective of
the negotiation is to find at least one matching agreement among the two sets
shared by both the User Agent and the Server Agent.
This agreement is the mutually acceptable privacy agreement.
For the sake of studying the structure of the problem, let
us now take this situation out theoretically.
We can require that:
Condition 1. Every User Agent can find at least one
agreement with every Server Agent in the set of candidate agreements held
respectively by the User and Server Agent,
We want to know the minimum and maximum number of
agreements in the universe of agreements so specified.
Let's say we have one agreement held by one User Agent and
one Server Agent. Then there is one
agreement in the universe and this one, agreeable to the User Agent and the
Server Agent satisfies Condition 1.
Now, let’s say there are two User Agents and two Server
Agents. Each has two agreements.
Under this circumstance, there is a minimum of two agreements in the
universe and both agree to both. It
is impossible to have only one because each User Agent and each Server Agent
must hold, at minimum two distinct agreements. There is a maximum of five
agreements, since this is the number required to give each User Agent a unique
‘disagreement’ agreement with each server plus one matched agreement.
To see this clearly, let,
(1) U = number of User Agents
S = number of Server Agents
U. = number of User agreements agreeable to any User
S. = number of Server agreements agreeable to any Server
M = maximum number of agreements in the currently defined universe
ai = a distinct agreement i
where 0<=i<=M
The maximum is always the case where exactly one
agreement, ai, is commonly matched to all, and the rest of the
agreements, aj, are all unique.
So the maximum possible is given by:
(2)
M = 1 + U * (U.-1) + S * (S.-1)
With two user and two Server Agents and two agreements
each, but only one commonly matched we have a total possible of five:
(3)
5 = 1 + 2 * (2-1) + 2 * (2-1)
To better see this, here are two ways to get 4 mutual
agreements total. In (4), U1 and S1
agree with a1, U1 and S2 agree with a3.
(4) U1(a1,a3) S1(a1,a2)
U2(a2,a4) S2(a3,a4)
In (5), the situation is somewhat simpler in that both
Users agree with both Servers by holding either a1 or a2.
(5) U1(a1,a3) S1(a1,a2)
U2(a2,a4) S2(a1,a2)
Here is the one way, from (3), to get the maximum 5
agreements. In this, all the Users
and all the Servers hold one agreement that is mutually acceptable, a1:
(6) U(a1,a2) S(a1,a3)
U(a1,a4) S(a1,a5)
So, now we can
handle any situation of the Internet. For
example, there are, then,
(7)
M = 1 + 1000 * (10-1) + 1000 * (10-1)
= 18,001
Maximum possible agreements in the universe of a thousand
User Agents and a thousand Server Agents who hold ten agreements each.
This maximum is fairly uninteresting because it involves
only one universally agreed-to agreement, while all the other agreements in the
universe are not agreed-to by anyone except the individual agent holding the
agreement. What happens if two
agreements are agreed-to by some number? What
is the maximum then?
We know there must be at least two, the two agreed-to by
some number of User-Server Agent pairs. Furthermore,
we can decide that half the User Agents have one of these two, and the other
half have the other of these two. If this is the case, then all the Server Agents must have
both, in order to meet Condition 1 and reach agreement with all the User Agents.
So
(8)
M = 2 + U * (U.-1) + S * (S. –2)
For U = S = U. = S. = 2, this gives
(9)
M = 2 + 2 (1) + 2 ( 0) = 4
Which is the correct number by demonstration (5):
Now let us say that there are four agreements that are
common as in (4). We can decide
that each of two User Agents hold two of the four:
(10)
M = 4 + 2(0) + 2(0) = 4
Which is the correct number by demonstration (4).
Extrapolating and letting A. be the number of agreements in
common, the maximum possible in the universe of User Agents and Server Agents is
given by,
(11)
M = A. + U * (U.-1) + S * (S.-1)
Now, for the sake of study, let
(12) A. = number of agreements in common
U.m = number of User agreements in each U. matching some Server agreement
S.m = number of Server agreements
in each S. matching some User agreement
A. has a value of A. = C(U.m) = C(S.m). The
function C is a grouping function that counts the number of unique matching
agreements, ai, across User or Server Agents.
By definition, the number of distinct matching agreements is of course
the same for User Agents and Server Agents.
More generally then, there is the precise relation,
(13)
M = A. + U * (U. – U.m) + S * (S. – S.m)
Where U.m <= U. and S.m <= S.
So, if in violation of Condition 1, the matching agreements
(actual agreement) is zero, then the total number of agreements (all not agreed
to) is U * U. + S * S. which is
also correct.
Using the more general (13), we now verify that if we have
4 User matching agreements or 4 Server matching agreements, in sets of two, and
2 Users and 2 Servers,
(14) M = 4 + 2(0) + 2(0) = 4
Which is the correct formula (10) and correct by
demonstration (4)
Now we can ask more generally, if we have 1000 User Agents
and 100 Servers and each is required 10 mutual agreements in a set of 10, to be
maximally efficient, the maximum number of agreements is 10, as in:
(15) M = 10 + 1000(0) + 100(0) = 10
If we allow 10 mutual agreements in a set of 12, we find
(16) M = 10 + 1000(12-10) + 100(12-10) = 2,210
agreements can exist maximally. If we allow 10 mutual agreements but distribute them out
among the Users but not thinly among the Servers we get:
(17) M = 10 + 1000 (12-1) + 100 (12-10) = 11,210.
So this gives a method for counting the maximum number of
distinct agreements possible among a number of User Agents and Server Agents.
What is the minimum number of agreements possible?
The absolute minimum is given when the match is most
complete. Since a condition is
that the agreements held by a User Agent or Server Agent must all be distinct,
then the minimum is given by
(18)Min = max(U., S.)
So, if the User Agents have 10 agreements each and the
Server Agents have twelve agreements each, the minimum number of agreements in
the universe is 12.
To take another example, in the situation described for (7)
where the maximum number of agreements is 18,001, the minimum is 10 since
max(10,10) is 10.
We can now count the minimum and maximum number of
agreements possible in any universe of User Agents and Server Agents.
In other cases, as in (17), where the number of agreements that match are explicitly counted, the minimum and the maximum are the same, perfect value. So the general equation, (13), M, is both a maximum and minimum for a specific number of A., U.m, and S.m.
The problem of determining the maximum and minimum
boundaries in the number of agreements between U User Agents and S Server Agents
is thereby solved.
Now let us move a bit more toward the “the real world.”
What about distributions? For
example, what if the number of agreements held by the User Agents is variable
across User Agents and distributed around a mean value?
The same goes for Server Agents. Surely,
different Server Agents will have different numbers of agreements acceptable to
themselves. Such variability will
create “real-world” dynamic ranges between the theoretical minimum and
maximum values even if we continue to hold tightly onto Condition 1 that
stipulates that every User Agent must be able to agree with every Server Agent.
And, of course, in the “real world” Condition 1 will not hold.
So what can we observe?
Clearly some of the variables, for example, A., S. and S.m
can have complicated dependencies depending on the specific combinatorics of the
situation. For example, if one User
Agent has exactly one agreement, then by Condition 1, this one agreement must be
present for all Server Agents. If
100 User Agents have one agreement each (U. = U.m = 1), and they are all
different (A. >= 100), then every Server Agent must have at least 100
agreements (S. >= 100) that it must manage in order to guarantee Condition 1.
This situation becomes intolerable if there are hundreds of thousands, if
not millions, of unique User Agents or hundreds of thousands, if not millions,
of unique Server Agents.
There is a question as to what value we may derive by
theoretically exploring the variable distributions. An important virtue would be an understanding of minimum
system demands. These are demands
that the system places on the individual. For
example, in the case of the bricks-and-mortar highway system, the system
requires that only a finite countable number of routes be followed from one
location to the other and that routes only approximately cross specific
locations. These are demands that
the highway system places on the user of the highway system.
If you want to locate your house so that it may be easy to travel to,
then you must consider the routes available through the highway system.
So, we can say with some hopeful assurance that the system
of privacy agreements should come close to satisfying Condition 1: There should
always be an agreement possible (viz., a matching agreement) between every User
Agent and every Server Agent. In
practice this may not happen, but it should, hopefully, happen.
Furthermore, we can now conclude, from the analysis already
completed, that, with some assurance, both User Agents and Server
Agents should expect to manage more than one privacy agreement.
If not, there will be unacceptable explosions of the number of agreements
that either User Agents or Server Agents would have to manage in order to
maintain Condition 1. The way to
control the explosion is to allow for reasonable S. and U., that count the
number of agreements held, and also to minimize A., that counts the number of
matching agreements in the entire universe of User Agents and Server Agents.
This analysis would suggest an appropriate solution
in privacy or other agreement negotiation is to create a number of
prototype agreements that are generally agreed as a comprehensive set of
prototypes and widely published. Every
owner of a User and Server Agent would be very wise to strive to use one of
these prototypes in lieu of constructing a special case agreement. This
will maximize the likelihood of a User-Server agreement while minimizing the
loading on the overall system.
A prototype need not be a propositional statement of fact.
Certainly, we would expect to count two agreements that differ only with
a variable instantiation as the same. For
example, a “Name” field in a Server Agreement, if filled in with “John
Smith” by a User Agent agreement is still, presumably, the same agreement, and
is to be counted as a matched agreement. Furthermore,
two User Agreements that differ only by a filled in “Name” are the same
agreement for the purpose of counting the number of distinct agreements.
Variable instantiation would routinely provide such a means
for providing prototype reduction and this is certainly anticipated in the www.w3.org/p3p specification and is anticipated in any XML (www.w3.org/xml)
specification as well for other types of negotiated settlement between User
Agents and Server Agents or between Peers.
In so far as the P3P specification provides a subset of the semantics for
negotiated privacy, the work in variable instantiation is already well treated.
For example, there is already an explicit notion of optional and
mandatory variable instantiation which is probably important in the discovery of
a few prototypes that can be widely accepted.
However, there is another form of agreement specification
that may become very important in the search for a few acceptable prototypes
which is not directly addressed in P3P. This
is a conditional: If X is true then
Y must be true, but if X is false then Y may be true or false.
As a specific example in privacy: “if you type in your name in an order
form, I will record your name for the following purposes and uses”.
Since this is a conditional: we would surmise ‘If you do not type in
your name in an order form then it is neither true nor untrue that I will record
your name for the following purposes and uses.’
I have argued elsewhere (XML 2000 Conference, Paris) that this
limitation, also called programmability and “Turing Power,” which is not
present in XML, is a severe and undue limitation in the XML standard that
invites many problems in automating activity on the Web.
In the case of P3P, however, this limitation is being dealt with partly
in the “use semantics” and partly in the APPEL Rules (a subgroup of P3P)
which provide for the precise specification of such conditional statements.
Conditional statements are important in forming agreements.
I may let you have my name if we both agree to the use you put my name to
and the purpose in so using it. However,
I may wish to require that you agree explicitly to that conditional use of my
name, and you may find that perfectly acceptable where you not find it
acceptable for me simply to refuse to give my name under any circumstances.
However, the use of APPEL rules, like the use of P3P is not
being provided for negotiation or as a statement of agreement.
To show how a Server Agent may readily employ an agreement that contains
explicit conditionals we can consider the notion of “opt-in”:
In an “opt-in” the User Agent must take some explicit action to
permit the transfer of information to the Server Agent.
The “Name” example, above, is good.
Suppose, then, that I have a prototype agreement that says you can take
the certain information pending an opt-in from the User Agent. Coding the Server Agent would involve recognizing that this
User Agent requires Opt-in on taking certain information (for example, accepting
a cookie with a unique ID, accepting a name, etc.) and the Server Agent would
provide the appropriate notifications. As
a User Agent I may specify the uses and purposes that I will accept such Opt-in.
The Server Agent can now decide whether, in its conditional agreement
terms, to accept the information for the restricted uses and purposes demanded
by the User Agent.
The point of this brief foray into conditional statements,
is that they can be employed to potentially greatly reduce the number of
agreements in the universe of agreements and the number of agreements that must
be agreed to by User Agents and Server Agents alike. We have now seen that reducing this number is important and
that reducing the number to one is probably not useful.
An idea presented recently presented by Dr. Robert Baldwin
(www.plusfive.com ) provides another
potentially important method of prototypes.
This is the method of persona. Baldwin
rightly notes that people assume various persona in their behavior on the
Internet, and expect to be treated by the persona.
So, for example, when they visit Yahoo they are “information
seeking,” but when they visit a medical site they may be wished to be treated
“as a patient” (with doctor-patient confidentiality).
When you buy a book from Amazon, you are a “retail product consumer.”
Imagine that your web browser organized your privacy agreements by
persona and that Server Agents would recognize and agree to the agreements based
on these persona. This is akin to
the “purpose” of taking information in P3P, but is a much more direct method
of thinking about, and organizing, various privacy agreements.
While I may not remember who I let have my name and address, I can
remember that I only let out my name and address when I am in the persona of
retail consumer, when I am an investor checking my financial accounts.
In the spirit of fully negotiated agreements, we might also
notice that there are corresponding persona for Server Agents.
Some web sites are “business presence” sites. Other sites are “informational sites,”
“ezine sites,” “shopping sites,” and the like.
It may be possible to identify prototype privacy agreements that are
largely agreed to by all “shopping sites” or by all “business presence”
sites, in order to reduce the overall pool of agreements that must be managed
while maintaining as close an approximation to meeting Condition 1 as possible.
A business, such as a hospital, may have multiple ‘persona sites’
that are managed by the hospital Server Agent, automatically, to insure that
information is properly presented and compartmentalized for use. The present analysis suggests that the ongoing discovery of
the minimal set of such prototypes is essential to automated rights negotiation
on the Internet.
This short note on automatically negotiated agreements on
the Internet has not addressed the technical details of negotiating agreements,
but rather has focused on understanding something of how the entire system can
be envisioned to function. In other
papers on the “Privacy Server Protocol” ( http://yuan.ecom.cmu.edu/psp
) I have been investigating some of the technical details.
The conclusion of this paper is that we need to directly
address the issue of minimizing the number of agreements that must be managed.
Without explicitly recognizing that we need to minimize the number of
generally agreed-on agreements, then we will have fractionalization with
widespread failure of User Agents and Server Agents to communicate with one
another. We have also seen
that simply having one general agreement is probably unwise for theoretical
reasons. A method of
prototypes appears to be promising in this regard. The method seems to have three potentially important
attributes:
Beyond this, the mechanism by which hundreds of millions of participants on the Internet can come to formulate a handful of prototype agreements remains an intriguing mystery worth as much exploration as might be possible.