John T. Langton : artificial intelligence : Utility Theory
Utility theory is used in decision analysis to determine the EU (expected utility) of an action based on the U (utility) of its possible result(s).
A utility function U(W) maps from states (W being a world state) to real numbers, e.g. U(W) = 5.
A utility function applies when there is a lottery (uncertainty involved); a value function applies when you are evaluating the worth of known states. A value or fitness function for chess scores actual positions (e.g. a winning game), with no uncertainty involved. The corresponding function for backgammon, however, is a utility function, since the rolling of the dice brings in uncertainty, probability, and chance.
An action can have more than one possible result:
In the simple case an action's result is one deterministic
state (e.g. the lights are on or off). In this case the EU (expected
utility) of an action is equal to the U (utility) of its result.
In reality there are usually a number of possible results for
each possible action:
Ak represents possible actions to take
Wi represents possible resulting worlds

  EU(Ak) = Σi p(Wi|Ak) * U(Wi)

where U(Wi) is the utility of Wi and p(Wi|Ak) is the probability of Wi
resulting from taking action Ak. Thus, the EU of an action Ak is the
sum, over all its possible results, of each result's utility times the
probability of that result happening.
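The expected-utility sum can be sketched in a few lines of Python (the action names, probabilities, and utilities here are invented for illustration); the MEU choice is then just the argmax over actions:

```python
# Each action maps to a list of (p(Wi|Ak), U(Wi)) pairs, one per
# possible resulting world Wi. Numbers below are made up.

def expected_utility(outcomes):
    """EU(Ak) = sum over i of p(Wi|Ak) * U(Wi)."""
    return sum(p * u for p, u in outcomes)

actions = {
    "take_umbrella":  [(0.3, 6), (0.7, 8)],     # mild nuisance either way
    "leave_umbrella": [(0.3, -10), (0.7, 10)],  # great if dry, bad if it rains
}

eus = {a: expected_utility(o) for a, o in actions.items()}
best = max(eus, key=eus.get)  # the maximum-expected-utility action
print(eus, best)
```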
An action could also have one result with a continuous measure of degrees of intensity and a continuous probability distribution. To get the expected utility of such a result one would integrate the utility function against that probability distribution. This is covered in terms of attributes below.
It is presumed that the action with MEU (maximum expected utility) should be chosen.
There is also the issue of sequential actions, where each choice affects the options and outcomes available later.
Each Wi or action result can have any number of attributes. Wij might be owning a car with j attributes: gas mileage, possible service costs, color, speed, etc.
In some cases strict dominance of Wx over Wy can be asserted: the utility values of the j attributes are readily known and U(Wxj) > U(Wyj) for every attribute j.
In most cases stochastic dominance must suffice, since we usually don't know the exact values of all attributes before the actions take place. For example, the possible service costs of owning a car depend on the probability of the car needing service at each possible cost. If we compare the actions of buying car 1 and buying car 2, where
X represents the possible service costs
p1(x) is the probability distribution of one car's possible service costs
p2(x) is the probability distribution of the other car's possible service costs
a is the least amount of money you can spend on service
b is the most you can spend on service, and

  ∫[a..t] p1(x) dx  ≥  ∫[a..t] p2(x) dx   for every t in [a, b]

then buying car 1 stochastically dominates buying car 2. The integral condition says that at every cost level t, car 1 is at least as likely as car 2 to have its total service cost come in at or below t; since lower cost is preferred, car 1 is the better bet on this attribute whatever the exact utility of money may be.
If an action A stochastically dominates all other actions on all attributes, then for any monotonically nondecreasing utility function the expected utility of A is at least as high as the expected utility of all other actions.
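For discrete distributions, a first-order stochastic dominance check on a cost attribute amounts to comparing cumulative distributions. A sketch (the two service-cost distributions are made up):

```python
import itertools

def cdf(pmf):
    # running totals of a discrete probability mass function
    return list(itertools.accumulate(pmf))

def dominates_on_cost(p1, p2):
    """Car 1 stochastically dominates car 2 on a cost attribute if, at
    every cost level, car 1 is at least as likely to have stayed at or
    below that cost."""
    return all(c1 >= c2 for c1, c2 in zip(cdf(p1), cdf(p2)))

# service costs 0..3 (say, in $1000s); hypothetical distributions
p1 = [0.5, 0.3, 0.1, 0.1]   # car 1: mass concentrated at low costs
p2 = [0.2, 0.2, 0.3, 0.3]   # car 2: likelier to need expensive service
print(dominates_on_cost(p1, p2))
```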
Multiattribute utility functions depend on preference patterns per attribute: U(x1,...,xn) = F(f(x1),...,f(xn)), where f(x) is some per-attribute value function. This gets tricky, however, as these functions can be interdependent; e.g. the value of getting wet is higher when it's hot and dry out and I'm wearing a bathing suit.
Preferences without uncertainty translate into a value function, not a utility function. (see above)
If the preference between outcomes (x, y, z) and (x', y', z) does not depend on the value of z, then x and y are preferentially independent of z.
A set of attributes is mutually preferentially independent if each of its subsets is preferentially independent of the remaining attributes, i.e., for (x, y, z) we would need that the preference between
(x, y, z) and (x', y', z) does not depend on the value of z
(x, y, z) and (x, y', z') does not depend on the value of x
(x, y, z) and (x', y, z') does not depend on the value of y
Debreu (1960) showed that for a set of mutually preferentially
independent attributes, the value of the set can be found by summing the
value of each attribute. This is applicable in many real-world cases and
is a good approximation even in cases where it fails to hold for
extreme values of the attributes.
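Debreu's additive decomposition can be sketched as follows; the attributes and per-attribute value functions here are invented for illustration:

```python
# Additive value function for mutually preferentially independent
# attributes: V(x1,...,xn) = f1(x1) + ... + fn(xn).
# These attribute names and weights are hypothetical.

value_fns = {
    "mpg":   lambda x: 2.0 * x,      # higher mileage is better
    "price": lambda x: -0.001 * x,   # higher price is worse
    "speed": lambda x: 0.5 * x,
}

def additive_value(car):
    return sum(f(car[attr]) for attr, f in value_fns.items())

car = {"mpg": 30, "price": 20_000, "speed": 120}
print(additive_value(car))  # 60 - 20 + 60
```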
Preferences with uncertainty: this extends the preferential
independence and mutual preferential independence concepts to lotteries,
where outcomes are not known and you have a utility function. The
additive value function is replaced by a multiplicative utility
function. See Keeney and Raiffa for more info.
Utility functions are only guaranteed to give an ordering of
states. Their actual values may not say anything meaningful about the
"worth" of a state. Thus the utility function only needs to
follow the axioms of preference given below.
Orderability: For every two states an agent must prefer one to the other or
view both as having equal preference (i.e. the agent can't refuse to decide;
it must assign preferences so we can order options).
Transitivity: If A is preferred over B and B is preferred over C, then A is
preferred over C.
Continuity: Assume A is preferred over B and B is preferred over C. There
should be some probability P that A will happen and some probability
1-P that C will happen, such that the agent is indifferent between
accepting this lottery and being sure of getting B.
Substitutability: If the agent is indifferent between results A and B, then we
should be able to replace one with the other in any lottery.
Monotonicity: If two lotteries have the same two outcomes A and B, and the agent
prefers A to B, then the agent must prefer the lottery with the higher
probability of A happening.
Decomposability: Compound lotteries can be reduced to an equivalent simple lottery and vice versa.
The utility principle is derived from the axioms above and states that for an agent
following these axioms, there exists a real-valued function U such that
U(A) > U(B) if and only if the agent prefers A to B, and U(A) = U(B)
if and only if the agent is indifferent between A and B. I'll leave
it to you to figure out the rest of the numeric consequences.
As stated before, numeric Utility values only have to provide an
ordering of actions according to preference. They can be subjective in
any other way.
EMV (expected monetary value) is the "strictly objective utility" where utility is
measured in dollar amounts. For example, when
gambling for $1000 on the toss of a coin, the EMV would be
50% * $1000 = $500.
Grayson (1960) proved Bernoulli (1738) right by showing that the
utility of money for most people is proportional to the logarithm of
the amount:
U(current_$ + $n_gained) = -263.31 + 22.09 log(n + 150,000)
for -$150,000 < n < $800,000
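A sketch of this utility-of-money curve, assuming the logarithm in Grayson's fit is the natural log (with that reading, U comes out to roughly 0 at n = 0, i.e. gaining nothing changes nothing):

```python
import math

def utility_of_money(n):
    # Grayson's empirical fit, natural log assumed:
    #   U(current_$ + $n) = -263.31 + 22.09 * log(n + 150000)
    # only valid for -150000 < n < 800000
    assert -150_000 < n < 800_000
    return -263.31 + 22.09 * math.log(n + 150_000)

# The curve is concave: each additional dollar is worth slightly less,
# which is what makes most agents risk-averse about monetary lotteries.
print(utility_of_money(0), utility_of_money(100_000), utility_of_money(700_000))
```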
This means people become more willing to gamble as their money, or their debt, grows beyond some threshold.
Risk aversion is where an agent values a lottery at
less than the EMV of the lottery.
Risk seeking would be the opposite.
The certainty equivalent is the amount an agent would rather walk away with than
gamble at winning some increased amount at a given probability.
Studies show that in the $1000 coin toss most people would rather
walk away with $400. Thus $400 is the certainty equivalent of a
lottery consisting of a $1000 payoff at 50% odds.
The insurance premium is the difference between a lottery's EMV and its certainty equivalent. For the $1000 coin toss it would be $500 - $400 = $100.
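The arithmetic of the coin-toss example, spelled out:

```python
# Numbers from the $1000 coin-toss example above.
emv = 0.5 * 1000            # expected monetary value: $500
certainty_equivalent = 400  # what studies say most people accept instead
insurance_premium = emv - certainty_equivalent
print(insurance_premium)    # the margin insurance companies live on
```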
There are various methods for determining what an agent's utility values are. One is to establish a scale with a best possible prize and a worst possible catastrophe.
Normalized Utilities use a scale where the best prize has a utility value of 1 and the worst catastrophe has a value of 0
Intermediate values (between best and worst) are determined by questioning the agent and determining its certainty equivalent for many hypothetical lotteries over various outcomes. You can then plot the answers as points on a graph, draw a line, and derive a utility function. However, often these values are only valid locally, for small incremental changes.
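The elicitation step can be sketched as follows: on a normalized scale, if the agent is indifferent between an outcome S for sure and the lottery [p, best prize; 1-p, worst catastrophe], then U(S) = p * 1 + (1 - p) * 0 = p.

```python
# On a normalized scale the best prize has utility 1 and the worst
# catastrophe has utility 0, so an outcome's utility is exactly the
# indifference probability the agent reports for the standard lottery.

def normalized_utility(indifference_probability):
    p = indifference_probability
    assert 0.0 <= p <= 1.0
    return p  # U(S) = p * 1 + (1 - p) * 0

# e.g. an agent indifferent at p = 0.7 assigns the outcome U(S) = 0.7
print(normalized_utility(0.7))
```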