## Utility Theory

This is a page of notes summarized from various sources including Russell and Norvig's classic AI book, Artificial Intelligence: A Modern Approach. My intuitions and mistakes (partly from quick note taking) are mixed in.
• Utility theory is used in decision analysis to determine the EU (expected utility) of some action based on the U (utility) of its possible result(s).

• A utility function U(W) maps states (W being a world state) to real numbers, e.g. U(W) = 5

• A utility function applies when there is a lottery (uncertainty involved); a value function applies when you're evaluating the worth of actual, known states. A value or fitness function for chess scores a winning game, with no uncertainty involved. A value for backgammon, however, is a utility function, since the rolling of the dice brings in uncertainty, probability, and chance.

• An action can have more than one possible result:

• In the simple case an action's result is one deterministic state (i.e. lights are on or off). In this case the EU (expected utility) of an action is equal to the U (utility) of its result.

• In reality there are usually a number of possible results for each possible action. Let A1,...,An represent the possible actions to take and W1,...,Wm the possible resulting worlds, where U(Wi) is the utility of Wi and p(Wi|Ak) is the probability of Wi resulting given Ak (from taking action Ak). Then

EU(Ak) = Σi p(Wi|Ak) · U(Wi)

Thus, the EU of an action Ak is the sum of the utilities of all its possible results, each weighted by the probability of that result happening.
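As a rough sketch of the discrete case (the actions, worlds, and numbers below are all made up for illustration):

```python
# Sketch (hypothetical actions and numbers): the expected utility of an
# action is the probability-weighted sum of the utilities of its results.
def expected_utility(outcomes):
    """outcomes: list of (p(Wi|Ak), U(Wi)) pairs for one action Ak."""
    return sum(p * u for p, u in outcomes)

actions = {
    "A1": [(0.7, 10), (0.3, -5)],   # EU = 0.7*10 + 0.3*(-5) = 5.5
    "A2": [(1.0, 4)],               # deterministic result: EU = 4.0
}
best = max(actions, key=lambda a: expected_utility(actions[a]))   # "A1"
```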

• An action could also have one result with a continuous measure of degrees of intensity and a continuous probability distribution. To get the expected utility of such a result one takes the integral of the utility over the outcome, weighted by the outcome's probability density. This is covered in terms of attributes below.
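The continuous case can be approximated numerically; a minimal sketch using the trapezoid rule, with an invented uniform density on [0, 1] and linear utility so the answer is easy to check by hand:

```python
# Sketch: expected utility of a continuous outcome, EU = ∫ U(x) p(x) dx,
# approximated with the trapezoid rule. U and p below are invented: a
# uniform density on [0, 1] and linear utility U(x) = x, so EU = 0.5.
def expected_utility_cont(U, p, a, b, n=10_000):
    h = (b - a) / n
    ys = [U(a + i * h) * p(a + i * h) for i in range(n + 1)]
    return h * (ys[0] / 2 + sum(ys[1:-1]) + ys[-1] / 2)

eu = expected_utility_cont(lambda x: x, lambda x: 1.0, 0.0, 1.0)   # ≈ 0.5
```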

• It is presumed that the action with the MEU (maximum expected utility) should be chosen.

• There is an issue of sequential actions: when actions are taken in sequence, the utility of each action depends on the actions that follow it, which single-shot expected utility does not capture.

• Each Wi or action result can have any number of attributes. Wij might be owning a car with j attributes: gas mileage, possible service costs, color, speed, etc.

• In some cases strict dominance of Wx over Wy can be asserted: if the values of all j attributes are readily known and Wx is at least as good as Wy on every attribute (and better on at least one), then Wx strictly dominates Wy.

• In most cases stochastic dominance must suffice, since we usually don't know the exact values of all attributes before the actions take place. For example, the possible service costs of owning a car depend on the probability of the car needing service at each possible cost. If we compare the actions of buying car 1 and buying car 2 where

• X represents the possible service costs

• p1(x) is the probability distribution of one car's possible service costs

• p2(x) is the probability distribution of the other car's possible service costs

• a is the least amount of money you can spend on service

• b is the most you can spend on service, and

• for every x in [a, b],

∫ₐˣ p1(x′) dx′ ≥ ∫ₐˣ p2(x′) dx′

• then buying car 1 stochastically dominates buying car 2. The inequality tells us that at every possible cost level x, car 1 is at least as likely as car 2 to keep the total service cost at or below x.
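A discretized sketch of this check, assuming service costs are binned into a shared ascending grid of cost levels (both distributions below are invented):

```python
# Sketch: first-order stochastic dominance over a discretized cost grid.
# With costs (lower is better), car 1 dominates car 2 if at every cost
# level x, P(cost1 <= x) >= P(cost2 <= x). Both probability lists are
# invented and must share the same ascending grid of cost levels.
def cdf(probs):
    total, out = 0.0, []
    for p in probs:
        total += p
        out.append(total)
    return out

def dominates_on_cost(p1, p2):
    return all(c1 >= c2 - 1e-12 for c1, c2 in zip(cdf(p1), cdf(p2)))

car1 = [0.5, 0.3, 0.2]   # P(service cost = $0, $500, $1000)
car2 = [0.2, 0.3, 0.5]   # car 2 skews toward higher costs
dominates_on_cost(car1, car2)   # True: car 1 dominates
```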

• If an action A stochastically dominates all other actions on all attributes, then for any monotonically nondecreasing utility function the expected utility of A is at least as high as the expected utility of all other actions.

• Multi-attribute utility functions depend on preference patterns per attribute: U(x1,...,xn) = F(f(x1),...,f(xn)), where f(x) is some per-attribute value function. This gets tricky, however, as these functions can be inter-dependent, e.g. the value of getting wet is higher when it's hot and dry out and I'm wearing a bathing suit.

• Preferences without uncertainty translate into a value function, not a utility function. (see above)

• preference independence

If the preference between outcomes (x, y, z) and (x', y', z) is the same for any value of z, then x and y are preferentially independent of z

• mutual preferential independence

A set of attributes is mutually preferentially independent if each of its subsets is preferentially independent of the remaining attributes, i.e., for (x, y, z) we would have that the preferences between

1. (x, y, z) and (x', y', z) are the same for any values of z

2. (x, y, z) and (x, y', z') are the same for any values of x

3. (x, y, z) and (x', y, z') are the same for any values of y

Debreu (1960) derived that for a set of mutually preferentially independent attributes, the value for the set can be found by summing a value function for each attribute. This is applicable in many real-world cases, and is a good approximation in cases where it fails only for extreme values of the attributes.

• Preferences with uncertainty: this extends the preference independence and mutual preferential independence concepts to lotteries, where outcomes are not known and you have a utility function. The additive value function is replaced by a multiplicative utility function. See Keeney and Raiffa for more info.
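A minimal sketch of such an additive value function; the attribute names, scales, and per-attribute value functions are all invented for illustration:

```python
# Sketch of an additive value function for mutually preferentially
# independent attributes: the value of a state is the sum of per-attribute
# values. All names and scales here are made up.
value_fns = {
    "mileage": lambda mpg: mpg / 10.0,            # higher mileage is better
    "cost": lambda dollars: -dollars / 10_000.0,  # higher cost is worse
    "speed": lambda mph: mph / 50.0,              # higher speed is better
}

def additive_value(state):
    return sum(f(state[attr]) for attr, f in value_fns.items())

car = {"mileage": 30, "cost": 20_000, "speed": 120}
additive_value(car)   # 3.0 - 2.0 + 2.4 = 3.4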

• #### Axioms of preference in Utility Theory:

Utility functions are only guaranteed to give an ordering of states. Their actual value may not say anything meaningful about the "worth" of a state. Thus the utility function only needs to follow some rules of preference given here.

• #### Orderability:

For every two states an agent must prefer one to the other or view both as having equal preference (i.e. you can't not decide, must assign value so we can order options)

• #### Transitivity:

If A is preferred over B and B is preferred over C then A is preferred over C

• #### Continuity:

Assume A is preferred over B and B is preferred over C. There should be some probability P that A will happen and some probability 1-P that C will happen, so that the agent is indifferent about accepting this probability or being sure of getting B.

• #### Substitutability:

If the agent is indifferent between result A and B then we should be able to replace one with the other in various utility equations.

• #### Monotonicity:

If 2 lotteries have the same 2 outcomes A and B, and the agent prefers A to B, then the agent must prefer the lottery with a higher probability of A happening.

• #### Decomposability:

compound lotteries can be defined as one lottery and vice versa

• #### The Utility Principle:

is derived from the axioms above and states that for an agent following these axioms, there exists a real valued function U such that U(A)>U(B) if and only if the agent prefers A to B, and U(A) = U(B) if and only if the agent does not prefer either A or B. I'll leave it to you to figure out the rest of the numeric consequences.

• #### Utility Values:

As stated before, numeric Utility values only have to provide an ordering of actions according to preference. They can be subjective in any other way.

• #### Expected Monetary Value

is the "strict objective utility" where utility is in dollar amounts

• gambling a \$1000 on the toss of a coin, the EMV would be %50*\$1000 = \$500

• Grayson (1960) proved Bernoulli (1738) right showing that the utility of money for most people is proportional to the logarithm of the amount.
U(current_\$ + \$_gained) = -263.31 + 22.09log(n + 150,000)
for the range -\$150,000-\$800,000
This means people are more willing to gamble if they have more money, or debt, beyond some threshold.

• #### Risk averse

is where an agent's utility measurement for a lottery is less than the EMV of the lottery

• #### Risk seeking

would be the opposite

• #### Certainty Equivalent

is the amount an agent would rather walk away with rather than gamble at winning some increased amount at a given probability. Studies show in the \$1000 coin toss that most people would rather walk away with \$400. Thus \$400 is the certainty equivalent of a lottery consisting of a \$1000 pay off at 50% odds.

• #### Insurance premium = EMV - (the certainty equivalent)

For the \$1000 coin toss it would be \$500-\$400 = \$100

• #### Deriving Utility Values

There are various methods for determining what an agent's utility values are. One is to establish a scale with a best possible prize and a worst possible catastrophe.

• Normalized Utilities use a scale where the best prize has a utility value of 1 and the worst catastrophe has a value of 0

Intermediate values (between best and worst) are determined by questioning the agent and determining its certainty equivalent concerning many hypothetical lotteries for various outcomes. You can then plot the answers as points on a graph, draw a line, and derive a utility function. However, often these values or only valid for linearly local, incremental values.