Intuitive Bayesian methods for portfolio selection – Part II Bayes and Jeffrey

April 7, 2009

Bayes’ theorem in its common form describes how one’s belief in A is updated by having observed B. It relates the conditional and marginal probabilities of events A and B, where B has a non-vanishing probability:

P(A|B) = P(B|A) P(A) / P(B)


Each term in Bayes’ theorem has a conventional name:

P(A) is the prior probability or marginal probability of A. It is “prior” in the sense that it does not take into account any information about B.

P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.

P(B|A) is the conditional probability of B given A.

P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
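As a rough illustration of how the four terms fit together, here is a minimal Python sketch. The function name and all of the numbers are hypothetical and chosen only for illustration; P(B) is obtained from the rule of total probability over A and not-A.

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from the prior P(A) and the two likelihoods of B."""
    # P(B) is the normalizing constant: total probability over A and not-A.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0.0:
        raise ValueError("B must have a non-vanishing probability")
    return p_b_given_a * prior_a / p_b


# Hypothetical numbers: a fairly confident prior in A, and an observation B
# that is much more likely when A is false than when A is true.
posterior = bayes_posterior(prior_a=0.8, p_b_given_a=0.2, p_b_given_not_a=0.9)
print(f"P(A|B) = {posterior:.3f}")
```

Note that this form assumes B has been observed with certainty.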

Bayesian belief updating is the model we use for learning. In effect we already use it when we sit in meetings discussing the best options, as each of us will have modified our beliefs over time as we received new information. The problem is that it is difficult for others to see what evidence corroborates a belief, which opens the door to our cognitive biases and simple heuristics.

Jeffrey’s Rule

During product selection and development we acquire new information, which allows us to update our belief about how to make future investments. However, we know that some information is of higher quality than other information. Suppose two people make exactly the same statement: one is a lead customer and the other is a stranger on the street. We know which statement has the higher quality and information content. Bayes’ rule relies on learning a definitive new truth to revise our belief. Most new knowledge we acquire during product development cannot be classed as definitively true, e.g. one customer may say one thing and another may say something totally different. Jeffrey’s rule allows us to deal with opinion, rumor and weakly supporting evidence.

We can formulate a partition consisting of a hypothesis Ho and its negation ~Ho.

Ho = We will sell 10 products to customer x this year

We are at a trade show talking to a distributor who tells us he has heard that customer x is currently trialing our competitor’s products. We will call this new piece of evidence E.

E = Customer x is currently trialing our competitor’s products

Before we had heard this we may have been quite bullish about the prospects of selling to customer x because we have had several meetings where they expressed interest and have been talking about using some demo equipment.

Pr(Ho)=0.8

However, if it is true that customer x is currently trialing the competitor’s products then I figure that is bad news, as they need to commit resources to testing and are further down the line with our competitor.

Pr(Ho/E)=0.1

If what I’ve heard is not true then I have no other reason to revise my prior belief.

Pr(Ho/~E)=0.8

I represent my belief in light of the new rumor as Pr*, so that Pr*(Ho) stands for my belief in Ho in light of the new information E.

The distributor can’t remember who he heard it from but is pretty sure that he is right. I might assign a probability of 0.75 that the information is right.

Pr*(E)=0.75    Pr*(~E)=0.25

Jeffrey’s revision of Bayes’ rule is reminiscent of the rule for total probability:

Pr*(Ho) =Pr(Ho/E)Pr*(E)+Pr(Ho/~E)Pr*(~E)

Jeffrey tells us to conclude that Pr*(Ho)=0.275. Before we heard the rumor we thought it was quite probable that we would sell to customer x, but things are looking a bit more bleak.
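The calculation above can be reproduced with a few lines of Python. This is only a sketch for a two-cell partition {E, ~E}; the function and argument names are my own and do not come from the dashboard.

```python
def jeffrey_update(p_h_given_e, p_h_given_not_e, p_e_new):
    """Jeffrey's rule for a two-cell partition {E, ~E}.

    p_h_given_e      -- Pr(Ho/E), belief in Ho if the evidence is true
    p_h_given_not_e  -- Pr(Ho/~E), belief in Ho if the evidence is false
    p_e_new          -- Pr*(E), how much we trust the evidence
    """
    if not 0.0 <= p_e_new <= 1.0:
        raise ValueError("Pr*(E) must lie between 0 and 1")
    return p_h_given_e * p_e_new + p_h_given_not_e * (1.0 - p_e_new)


# Trade show example: Pr(Ho/E) = 0.1, Pr(Ho/~E) = 0.8, Pr*(E) = 0.75
updated = jeffrey_update(0.1, 0.8, 0.75)
print(round(updated, 3))  # 0.275
```

Two limiting cases are worth noting: with Pr*(E) = 1 the result is simply Pr(Ho/E), which is ordinary Bayesian conditioning, and with Pr*(E) = 0 the belief stays at Pr(Ho/~E).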

Dashboard representation

We can put together a dashboard that allows a user to start with a prior belief and update using Jeffrey’s rule. Two sliders are used to input Pr(Ho/E) and Pr*(E). The numeric inputs are augmented with descriptive labels.

Examples

If we receive a new piece of information that definitively refutes our hypothesis, but we know the source is completely unreliable then we would have no reason to update our belief e.g. if a stranger in the street says he wouldn’t buy our chemical detection equipment, this has no relevance or impact on my belief that the US Army will.

If we receive a new piece of information that we know is definitely true but it doesn’t add much to support our hypothesis, then our posterior belief will be largely unchanged. For example, two people from one company tell me the same piece of information separately. When I hear it from the first person I update my belief accordingly; when I hear it for the second time it gives me no new knowledge, even though I believe the source completely.

Potential problems with the application of Jeffrey’s rule

Prior Belief

We can look at what happens if we start out with very different prior beliefs. If we are rationally updating with new evidence and agree on the impact and quality we should eventually converge on a common belief.

Evidence   Pr(Ho/~E)   Pr(Ho/E)   Pr*(E)   Updated
0          1.00        0.16       0.23     0.81
1          0.81        0.11       0.44     0.50
2          0.50        0.64       0.51     0.57
3          0.57        0.90       0.69     0.80
4          0.80        0.16       0.04     0.78
5          0.78        0.74       0.38     0.76
6          0.76        0.62       0.40     0.71

Table 1 Change in belief from a starting belief of 1

Evidence   Pr(Ho/~E)   Pr(Ho/E)   Pr*(E)   Updated
0          0.00        0.16       0.23     0.04
1          0.04        0.11       0.44     0.07
2          0.07        0.64       0.51     0.36
3          0.36        0.90       0.69     0.74
4          0.74        0.16       0.04     0.72
5          0.72        0.74       0.38     0.73
6          0.73        0.62       0.40     0.68

Table 2 Change in belief from a starting belief of 0

The tables above and graph below illustrate the sequential application of Jeffrey’s rule. We start with differing prior beliefs and update our belief as new evidence arrives. The values for Pr(Ho/E) and Pr*(E) are randomly generated numbers between 0 and 1, and Pr(Ho/~E) at each step is the belief carried over from the previous step. We can see that after 3-4 pieces of evidence we are starting to converge on a common belief. While not rigorous, inspection of simulated cases supports the idea that beliefs will converge irrespective of the starting belief.
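The sequential updating in the two tables can be sketched in Python as follows. The evidence values are the rounded Pr(Ho/E) and Pr*(E) figures from the tables, so the results may differ from the tabulated ones in the second decimal place. The function names are my own.

```python
def jeffrey_update(p_h_given_e, p_h_given_not_e, p_e_new):
    """Jeffrey's rule for a two-cell partition {E, ~E}."""
    return p_h_given_e * p_e_new + p_h_given_not_e * (1.0 - p_e_new)


# (Pr(Ho/E), Pr*(E)) pairs taken from the tables (rounded to two decimals).
EVIDENCE = [(0.16, 0.23), (0.11, 0.44), (0.64, 0.51), (0.90, 0.69),
            (0.16, 0.04), (0.74, 0.38), (0.62, 0.40)]


def belief_trajectory(prior, evidence):
    """Apply Jeffrey's rule once per piece of evidence, in order."""
    beliefs = [prior]
    for p_h_given_e, p_e_new in evidence:
        # As in the tables, Pr(Ho/~E) is the belief carried over so far.
        beliefs.append(jeffrey_update(p_h_given_e, beliefs[-1], p_e_new))
    return beliefs


from_one = belief_trajectory(1.0, EVIDENCE)
from_zero = belief_trajectory(0.0, EVIDENCE)
for step, (high, low) in enumerate(zip(from_one, from_zero)):
    print(f"{step}: {high:.2f}  {low:.2f}  gap = {high - low:.3f}")
```

Under this particular setup, where Pr(Ho/~E) is always the current belief, the term Pr(Ho/E)Pr*(E) is common to both trajectories. The gap between them is therefore multiplied by 1 - Pr*(E) at every step, which is one way of seeing why the two beliefs draw together.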

Applying the principle of insufficient reason to prior belief

What happens if we start with no evidence at all for a hypothesis? We may be inclined to say that there is nothing to choose between the alternatives, true or false, so they should be treated as equally probable. This is the principle of insufficient reason, or the principle of indifference. However, consider a simple example: I state a hypothesis, “your car is red”. Even without any evidence, it doesn’t seem that the two cells of the partition, “your car is red” and “your car is not red”, would have equal probability.

In most business examples I can think of it is usually more likely for a specific hypothesis to be false; “this product will be successful” vs “this product will fail”. There are usually many more ways to fail than to be successful. We may be happy to assign a personal probability to the prior belief as opposed to assuming indifference. However, this may allow certain hypotheses an ‘easy ride’ without forcing us to find evidence to corroborate or falsify them. I prefer to operate by the maxim ‘guilty until proven innocent’: assume the hypothesis is false until proven otherwise. This forces me to find evidence so I can justify my belief position – just because I think it is obvious that something is true doesn’t mean that others do. If I already have a high prior belief it should be easy for me to find the supporting evidence. This also means that I will be operating conservatively in the early stages, as my belief is ‘dragged down’ by the memory of the initial belief up to the point of convergence.

Order of discovery

It would also seem intuitively obvious that the order in which we uncover new evidence should make no difference to our eventual beliefs. To test this we generated 20 discrete pieces of evidence and updated belief at each stage. We then reordered the evidence (re-sampling without replacement) and calculated the new belief trajectory. Interestingly, we can have marked differences in belief at the end of the process. The results are presented without further discussion, but this may pose a significant problem in the application of this belief updating methodology.
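The order dependence can be seen with just two pieces of evidence. The sketch below uses made-up values (not the 20 simulated pieces of evidence) and assumes, as in the tables above, that Pr(Ho/~E) is the belief held before each update.

```python
def jeffrey_update(p_h_given_e, p_h_given_not_e, p_e_new):
    """Jeffrey's rule for a two-cell partition {E, ~E}."""
    return p_h_given_e * p_e_new + p_h_given_not_e * (1.0 - p_e_new)


def final_belief(prior, evidence):
    """Update sequentially; Pr(Ho/~E) is the belief held before each step."""
    belief = prior
    for p_h_given_e, p_e_new in evidence:
        belief = jeffrey_update(p_h_given_e, belief, p_e_new)
    return belief


# Two hypothetical pieces of evidence as (Pr(Ho/E), Pr*(E)) pairs.
supportive = (0.9, 0.8)  # strongly supports Ho, fairly reliable source
damaging = (0.1, 0.5)    # strongly against Ho, coin-flip reliability

good_news_first = final_belief(0.5, [supportive, damaging])
bad_news_first = final_belief(0.5, [damaging, supportive])
print(f"good news first: {good_news_first:.2f}")  # 0.46
print(f"bad news first:  {bad_news_first:.2f}")   # 0.78
```

Each update pulls the current belief towards Pr(Ho/E) with weight Pr*(E), so whichever piece of evidence arrives last has the largest say in the final value.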

The above re-sampling example assumes that we would actually assign the same ‘marginal belief change’ irrespective of the order of discovery. This may not be a valid assumption and we can look at an example from history. In 1818 Siméon Poisson deduced from Augustin Fresnel’s theory the necessity of a bright spot at the centre of the shadow of a circular opaque obstacle. With his counterintuitive result Poisson hoped to disprove the wave theory; however Dominique Arago experimentally verified the prediction and today the demonstration goes by the name “Poisson’s (or Arago’s) spot.” Since the spot occurs within the geometrical shadow, no particle theory of light could account for it, and its discovery in fact provided weighty evidence for the wave nature of light, much to Poisson’s chagrin. If I believed in the corpuscular theory of light I would be extremely surprised to see a Poisson spot. However once I have seen it and adjusted my belief accordingly, seeing it again would only have a very small impact on my belief; the new experiment contains very little information. This is the same as saying that the marginal belief change for a particular piece of evidence depends on my current belief and the history of how I arrived here. It doesn’t therefore seem valid to resample, as we deal with marginal change in belief, not absolute values as new evidence arrives.


Intuitive Bayesian methods for portfolio selection – Part I Background

April 7, 2009

Introduction

Disruptive platform technologies usually have a broad base of application. During early stage development, before there is a developed market, the selection of a particular product is usually a ‘high risk, low data’ decision. There are a large number of unknowns, both the known unknowns and the unknown unknowns; we seek to resolve these over time. In this type of situation it is difficult to make the initial portfolio selection decision, to effectively monitor the resolution of uncertainty, and to determine the ultimate ‘chance of success’ for the product.

Problems in portfolio selection and project monitoring

The portfolio selection process, even when highly structured, often reduces to persuasion by advocates and champions. When a lot of data is being presented it is easy to forget ‘how we arrived’ at a particular position, assigning a higher importance to things that we heard recently (or long ago, depending on how your mind works). Soaring rhetoric can outweigh sober analysis and dispassionate appraisal of risk. It can be difficult to judge the ‘quality’ of a piece of information, which may find itself as a lynchpin in an argument to take a particular course of action. With a lot of unknowns it can be difficult to formulate go/no-go metrics and not relax the criteria when you get to the decision point.

Cognitive biases

The field of behavioral economics examines some of the less rational beliefs of Homo economicus. Work by Tversky and Kahneman illustrates cases of overconfidence in our abilities, the desire to go with the herd and a propensity for rolling rationalization. Here is a list of cognitive biases that you can easily imagine arising in portfolio selection processes.

Objectives

Develop a simple methodology and toolset that allows us to:

  1. Reduce complex business decisions to specific and testable hypotheses, which can be definitively refuted.
  2. Systematically revise our ‘belief’ in a hypothesis as we receive new information.
  3. Integrate new information of many types and forms, of varying degrees of ‘quality’.
  4. Maintain a history of how we arrived at a particular belief to provide an ‘audit trail’ or ‘memory’ to support future decisions and actions.
  5. Integrate and logically connect hypotheses to create a ‘belief network’ that supports complex decision making.
  6. Avoid cognitive biases and increase objectivity.

Logic and Probability

There are three main modes of argument: deduction, induction and abduction (inference to the best explanation, IBE). Inductive logic analyses risky arguments using ideas from probability. There are, however, different interpretations of what ‘a probability is’.

Frequentists talk about probabilities only when dealing with experiments that are random and well-defined. The probability of a random event denotes the relative frequency of occurrence of an experiment’s outcome, when repeating the experiment. Frequentists consider probability to be the relative frequency “in the long run” of outcomes.

Bayesians, however, assign probabilities to any statement whatsoever, even when no random process is involved. Probability, for a Bayesian, is a way to represent an individual’s degree of belief in a statement, given the evidence.

Logical probability is thought of as a logical relation between a hypothesis and the evidence for it. J.M. Keynes and Rudolf Carnap both favored a logical theory of probability. Personal probabilities are a private matter; they are up to the individual and anything goes so long as the basic rules of coherency are obeyed. Logical probability, by contrast, maintains that there are uniquely correct, uniquely rational judgments of the probability of a hypothesis in the light of evidence.

For the purposes of decision making in a business context there are very few cases where a frequentist approach can be used. We tend to use the Bayesian notion of probability, where belief allows us to make investment decisions.

It is plausible to connect personal degrees of belief and personal betting rates.

You would not pay more than $1 to win $2 on the flip of a coin. If you have some domain specific business knowledge that allows you to exploit an opportunity, your betting rate would be markedly different from someone without that knowledge. During product development as uncertainty is resolved our beliefs are updated and we revise the level of investment we are willing to make. People have always used this ‘managerial flexibility’ and there is now a move to formalize this type of ‘real option’ thinking in investment and portfolio selection.
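The connection between degree of belief and betting rate can be made concrete with a small sketch. It assumes that “win $2” means a total payout of $2 and that the bettor is risk-neutral. The 0.7 figure for the informed bettor is purely hypothetical.

```python
def max_stake(belief, payout):
    """Largest stake a risk-neutral bettor should pay for a bet that
    returns `payout` with probability `belief` and nothing otherwise."""
    if not 0.0 <= belief <= 1.0:
        raise ValueError("belief must lie between 0 and 1")
    return belief * payout


# Fair coin: a belief of 0.5 in heads supports a stake of at most $1.
print(max_stake(0.5, 2.0))  # 1.0

# Hypothetical informed bettor who believes the outcome is 70% likely.
print(f"{max_stake(0.7, 2.0):.2f}")  # 1.40
```

Read the other way round, the most someone is willing to stake on a $2 payout reveals their personal probability: stake divided by payout.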

Verificationism and Falsifiability

There are two common problems in portfolio decision making: how do we extrapolate experience to the future, and how can we provide definitive go/no-go criteria when we do not know the problem well? The former is the problem of induction, and is the question of whether inductive reasoning leads to truth. That is, what is the justification for presupposing that a sequence of events in the future will occur as it always has in the past (for example, that the laws of physics will hold as they have always been observed to hold)? If we cannot assume uniformity of nature for physical laws we definitely cannot do so in a business context, where we know that the landscape changes very quickly.

Often a go/no-go criterion is framed in a way that allows it to get out of jail down the line. A criterion such as “show interest from a customer” is quite broad. If in a month’s time we hear a statement such as “Fred and Jeff seem quite interested”, this adds practically no new useful knowledge upon which to base a decision – “A difference that makes no difference is no difference”. It also allows us to introduce ad hoc revisions to ‘pass’ the criterion. If we set a criterion such as “one sale made by the end of the quarter”, then we have something that is definitively testable. This is a criterion that puts itself at risk, which can be refuted or falsified – falsification adds new knowledge as it allows us to eliminate options and make definite investment decisions, i.e. don’t invest. Falsifiability was put forward as a solution to the problem of induction by Karl Popper.

This is related to the Logical Positivist view of the verifiability theory of meaning: the meaning of a sentence consists in its method of verification. In other words, if a sentence or statement has no possible method of verification, it has no meaning. It is pointless to make a go/no-go goal such as, “demonstrate our value proposition and facilitate end to end knowledge transfer”, as there is no possible way to test this and it therefore falls into the category of a nonsensical statement (also known as bullshit bingo).


Intuitive Bass Diffusion

July 4, 2008

There is an excellent post by Mathias, which rephrases the Bass diffusion curve in more accessible terms and language. You can view his posting for the full explanation. I’ve taken the Excel file and created an interactive dashboard from it.

Click here to launch the Dashboard


Interactive Gompertz Model

June 28, 2008

In high tech start-ups the development cycle can last for several years. Where the pre-revenue start-up phase is anticipated to be long, new product introduction can be modeled with a Gompertz curve. There is a full description on Wikipedia.

Sales Function


Cumulative Sales Function


Parameter   Value   Description
m           500     Ultimate market potential
b           0.4     Scale parameter
η           30      Shape parameter

Click here to launch Interactive Gompertz Model


The Xcelsius dashboard allows users to interactively vary the parameters of the model. This is useful when doing ‘what if’ analysis during the product portfolio planning stage. The sales profile is more realistic and can be embedded into the interactive portfolio, or we can create a Monte Carlo income statement with distributions to describe uncertainty.


Interactive Net Present Value (NPV)

June 28, 2008

Net present value (NPV) is defined as the present value of net cash flows. It is a standard method for using the time value of money to appraise long-term projects. See full description at Wikipedia.

Each cash inflow/outflow is discounted back to its present value (PV), and the discounted values are then summed. Therefore

NPV = Σ (t = 0 to N) Ct / (1 + r)^t


Where

t – the time of the cash flow

N – the total time of the project

r – the discount rate (the rate of return that could be earned on an investment in the financial markets with similar risk)

Ct – the net cash flow (the amount of cash) at time t
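Using the definitions above, the NPV sum can be sketched in Python. The cash flows here are invented for illustration (an initial outlay followed by an increasing income stream) and are not the figures behind the dashboard. The sketch assumes the first cash flow occurs at t = 0 and is therefore undiscounted.

```python
def npv(rate, cash_flows):
    """Net present value of cash_flows[t] received at times t = 0..N."""
    if rate <= -1.0:
        raise ValueError("discount rate must be greater than -100%")
    return sum(c_t / (1.0 + rate) ** t for t, c_t in enumerate(cash_flows))


# Hypothetical project: outlay of 100 at t = 0, then rising yearly income.
project = [-100.0, 30.0, 40.0, 50.0, 60.0]

for r in (0.10, 0.50):
    print(f"r = {r:.0%}: NPV = {npv(r, project):.2f}")
```

With these made-up numbers the project has a positive NPV at a 10% discount rate and a negative one at 50%, because the later, larger cash flows are discounted most heavily.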

Interactive NPV Calculation – Click here to launch NPV chart


The following dashboard allows you to change the discount rate and see the time value of a dollar over a period of 5 years.


At a 10% discount rate a dollar in year 5 is worth 62 cents in today’s money


At a 50% discount rate a dollar in year 5 is worth 13 cents in today’s money
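The two figures quoted above can be checked directly from the discount factor 1 / (1 + r)^t. This is a minimal sketch and the function name is my own.

```python
def present_value(amount, rate, years):
    """Value today of `amount` received `years` from now."""
    return amount / (1.0 + rate) ** years


for rate in (0.10, 0.50):
    cents = 100 * present_value(1.0, rate, 5)
    print(f"{rate:.0%}: {cents:.0f} cents")
# 10%: 62 cents
# 50%: 13 cents
```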


We can also demonstrate an arbitrary project with an initial cash outlay and an increasing yearly income stream


At a higher discount rate of 50% the future income is heavily discounted and the overall project NPV is significantly reduced.


Index of Interactive Dashboards

June 25, 2008

Interactive Portfolio Model – Posting / Dashboard

Product Requirements Capture and Competitive Benchmarking – Posting / Dashboard

Economic Value Model – Posting / Dashboard

Bass Diffusion Model – Posting / Dashboard

Value at Risk – Posting / Dashboard

Net Present Value – Posting / Dashboard

Gompertz Model – Posting / Dashboard

Intuitive Bass Diffusion – Posting / Dashboard

 


Interactive Bass Diffusion Model

June 25, 2008

Click here to launch Interactive Bass Diffusion Model

The Bass diffusion model was developed by Frank Bass and describes how new products get adopted as an interaction between users and potential users. The model is widely used in forecasting, especially product forecasting and technology forecasting. Click here for the full description on Wikipedia.

The function describing sales S(t) is given by

S(t) = m ((p + q)^2 / p) e^(-(p + q) t) / (1 + (q / p) e^(-(p + q) t))^2


Parameter   Value   Description
m           500     Ultimate market potential
p           0.02    Coefficient of innovation
q           0.4     Coefficient of imitation
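For readers without access to the dashboard, here is a rough Python sketch of the sales curve. It assumes the standard closed-form Bass model, in which p is the coefficient of innovation and q the coefficient of imitation. The function names are my own, and the dashboard itself may be implemented differently.

```python
import math


def bass_sales(t, m, p, q):
    """Sales rate S(t) under the standard closed-form Bass model."""
    decay = math.exp(-(p + q) * t)
    return m * ((p + q) ** 2 / p) * decay / (1.0 + (q / p) * decay) ** 2


def bass_cumulative(t, m, p, q):
    """Cumulative sales up to time t under the same model."""
    decay = math.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)


M, P, Q = 500, 0.02, 0.4  # parameter values from the table above

# Time of peak sales follows from setting dS/dt = 0 (requires q > p).
t_peak = math.log(Q / P) / (P + Q)
print(f"peak at t = {t_peak:.2f}, sales rate = {bass_sales(t_peak, M, P, Q):.2f}")

for year in range(0, 21, 5):
    print(year, round(bass_sales(year, M, P, Q), 1),
          round(bass_cumulative(year, M, P, Q), 1))
```

At t = 0 the sales rate is m × p, i.e. only the innovators are buying. Imitation then takes over and drives the familiar bell-shaped sales curve and S-shaped cumulative curve.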


The Xcelsius dashboard allows users to interactively vary the parameters of the diffusion model. This is useful when doing ‘what if’ analysis during the product portfolio planning stage. The sales profile is more realistic and can be embedded into the interactive portfolio, or we can create a Monte Carlo income statement with distributions to describe uncertainty in ultimate market potential (m), coefficient of innovation (p) and coefficient of imitation (q).