An Algorithm for Business Success?

November 29, 2009

It seems that testing is the flavour of the month in business these days. All the presentations I go to talk about A/B split testing and multivariate Taguchi methods. Of course the guiding principle of testing is a good one; but I think it gives some people the misguided notion that business is a purely deterministic process and that persistent testing provides an algorithm for success (or quick, cheap failure, which is also good). There are some useful parallels to draw with empiricism and its critics in the philosophy of science.

What am I actually testing? The process seems pretty simple: do A/B tests on your Google ads, your landing pages, your email blasts, your automated workflows, and so on. Eke out success one word change at a time. But how do you know you are isolating the one thing you want to test? How do you know you are not just optimising locally in entirely the wrong place?

The empiricists and positivists thought the only source of knowledge is experience. It is a fundamental part of the scientific method that all hypotheses and theories must be tested against observations of the natural world, rather than resting solely on a priori reasoning, intuition, or revelation. Sounds reasonable. Quine illustrated problems with this view in ‘Two Dogmas of Empiricism’. Quine argued for a holistic theory of testing: you cannot understand a particular claim without looking at its place in a larger whole. Holism about testing says that we cannot test a single hypothesis in isolation; instead we can only test complex networks of claims and assumptions. To test one claim you need to make assumptions about many other things, e.g. about the measurement equipment, the data quality and so on. So whenever you think you are testing a single idea, what you are really testing is a long, complicated conjunction of statements. If a test has an unexpected result, then something in that conjunction is false, but the failure of the test itself does not tell you where the error is.

Take an example like ‘test the business model over a period of one year’; the background assumptions and the conjunction of interdependencies are legion. Two things can happen. You can say the model doesn’t work when the real cause of failure is a simple element in the web of dependencies that could easily have been changed (a wrong pricing decision, for example), i.e. you get a false negative. You can also ‘forgive’ a fundamental problem by blaming something else in the chain, i.e. a false positive. For any complex business decision the theory is always underdetermined by the available evidence: there will always be a range of alternative theories compatible with the same set of evidence. So what good is my test if it doesn’t tell me something definitive?

‘It didn’t work this time’ is different from ‘it doesn’t work’. People are also very keen on the notion of failing fast and failing cheap. Once again admirable, but how do you know when you have failed? Karl Popper thought science progressed by a process of falsification: because of the problem of induction you can never say a general statement is true from a handful of observations, but you can say it is false if an observation contradicts it. The issue of underdetermination rears its head again; you can never force someone to logically conclude that a theory is false, because it may be a background assumption that is at fault. Falsification also struggles with probabilistic statements. Take the example of proton decay: some grand unified theories predict that a proton should decay into new X bosons. During the 1980s there were many experiments and they never saw a proton decay. They were able to put a lower limit on the proton half-life of 6.6×10^33 years, but they were not able to say that it doesn’t decay. Most people may conclude that it doesn’t decay, but the key thing is that they have to choose to believe so; it does not follow logically from observation. Doing a split test on a low-volume search term feels a bit like waiting for proton decay.

Now take an example like James Dyson: he made 5,126 prototypes of his vacuum cleaner before hitting the big time. Why did he not declare that he had failed quickly and cheaply after the first 10 tries? Often it is difficult to know whether you have the admirable quality of persistence or whether you are just a nutter.

Putting things to the test is a good idea, but it only really works in a well-bounded context; most of the success stories come from web-based businesses with a large enough user base to derive useful conclusions. For the majority of businesses there will be other things that matter a great deal more. A business has a huge number of knobs you can turn; the only problem is that you can’t turn them all independently of each other. Basically I don’t think people should spend a lot of their time obsessing over analytics. Doing things intuitively has served a lot of people well for a very long time. If anyone can figure out how to do an A/B split test on the ‘cut of your jib’, please let me know.


Intuitive Bayesian methods for portfolio selection – Part II Bayes and Jeffrey

April 7, 2009

Bayes’ theorem in its common form describes the way in which one’s beliefs about observing ‘A’ are updated by having observed ‘B’. It relates the conditional and marginal probabilities of events A and B, where B has a non-vanishing probability:

P(A|B) = P(B|A) P(A) / P(B)

Each term in Bayes’ theorem has a conventional name:

P(A) is the prior probability or marginal probability of A. It is “prior” in the sense that it does not take into account any information about B.

P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.

P(B|A) is the conditional probability of B given A.

P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
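
As a concrete illustration, here is a minimal Python sketch of the update; the function name and the example numbers are hypothetical, chosen purely to show the arithmetic:

```python
def bayes_posterior(prior_a, likelihood_b_given_a, prob_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B); B must have non-vanishing probability."""
    if prob_b <= 0:
        raise ValueError("P(B) must be greater than zero")
    return likelihood_b_given_a * prior_a / prob_b

# Hypothetical numbers: prior P(A) = 0.3, likelihood P(B|A) = 0.9, marginal P(B) = 0.5
print(bayes_posterior(0.3, 0.9, 0.5))  # 0.54
```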

Bayesian belief updating is the model we use for learning. We in effect already use it when we sit in meetings discussing the best options, as we will each have modified our beliefs over time as we received new information. The problem is that it is difficult for others to see what evidence corroborates a belief, which opens the door to our cognitive biases and simple heuristics.

Jeffrey’s Rule

During product selection and development we acquire and learn new information, which allows us to update our beliefs about how to make future investments. However, we know that some information is of higher quality than others: if two people make exactly the same statement and one is a lead customer while the other is a stranger on the street, we know which has the higher information content. Bayes’ rule relies on learning a definitive new truth to revise our belief. Most new knowledge we acquire during product development cannot be classed as definitively true; one customer may say one thing and another may say something totally different. Jeffrey’s rule allows us to deal with opinion, rumour and weakly supporting evidence.

We can formulate a partition of hypotheses Ho and ~Ho:

Ho = We will sell 10 products to customer x this year

We are at a trade show talking to a distributor who tells us he has heard that customer x is currently trialing our competitor’s products. We will call this new piece of evidence E:

E = Customer x is currently trialing our competitor’s products

Before we heard this we may have been quite bullish about the prospects of selling to customer x, because we have had several meetings where they expressed interest and have been talking about using some demo equipment:

Pr(Ho)=0.8

However, if it is true that customer x is currently trialing the competitor’s products, then I figure that is bad news: they have committed resource to testing and are further down the line with our competitors.

Pr(Ho/E)=0.1

If what I’ve heard is not true then I have no other reason to revise my prior belief

Pr(Ho/~E)=0.8

I represent my belief in light of the new rumor as Pr*, so that Pr*(Ho) stands for my belief in Ho in light of the new information E.

When talking to the distributor he can’t remember who he heard it from, but he is pretty sure that he is right. I might assign a probability of 0.75 that the information is correct.

Pr*(E)=0.75    Pr*(~E)=0.25

Jeffrey’s revision of Bayes’ rule is reminiscent of the rule for total probability

Pr*(Ho) = Pr(Ho/E)Pr*(E) + Pr(Ho/~E)Pr*(~E)

Jeffrey tells us to conclude that Pr*(Ho) = 0.1×0.75 + 0.8×0.25 = 0.275. Before we heard the rumour we thought it was quite probable that we would sell to customer x, but things are now looking bleaker.
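
A minimal Python sketch of this calculation, using the numbers from the example above (the function name is mine and not part of any tool described here):

```python
def jeffrey_update(p_h_given_e, p_h_given_not_e, p_star_e):
    """Jeffrey's rule: Pr*(Ho) = Pr(Ho/E)Pr*(E) + Pr(Ho/~E)Pr*(~E)."""
    return p_h_given_e * p_star_e + p_h_given_not_e * (1.0 - p_star_e)

# Rumour from the distributor: Pr(Ho/E) = 0.1, Pr(Ho/~E) = 0.8, Pr*(E) = 0.75
print(jeffrey_update(p_h_given_e=0.1, p_h_given_not_e=0.8, p_star_e=0.75))  # 0.275
```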

Dashboard representation

We can put together a dashboard that allows a user to start with a prior belief and update it using Jeffrey’s rule. Two sliders are used to input Pr(Ho/E) and Pr*(E). The numeric inputs are augmented with descriptive labels.

Examples

If we receive a new piece of information that definitively refutes our hypothesis, but we know the source is completely unreliable, then we have no reason to update our belief. For example, if a stranger in the street says he wouldn’t buy our chemical detection equipment, this has no relevance to my belief that the US Army will.

If we receive a new piece of information that we know is definitely true but that doesn’t add much support to our hypothesis, then our posterior belief will be largely unchanged. For example, two people from one company tell me the same piece of information separately. When I hear it from the first person I update my belief accordingly; when I hear it the second time it gives me no new knowledge, even though I believe the source completely.

Potential problems with the application of Jeffrey’s rule

Prior Belief

We can look at what happens if we start out with very different prior beliefs. If we are rationally updating with new evidence, and we agree on its impact and quality, we should eventually converge on a common belief.

Evidence   Pr(Ho/~E)   Pr(Ho/E)   Pr*(E)   Updated
   0         1.00        0.16      0.23      0.81
   1         0.81        0.11      0.44      0.50
   2         0.50        0.64      0.51      0.57
   3         0.57        0.90      0.69      0.80
   4         0.80        0.16      0.04      0.78
   5         0.78        0.74      0.38      0.76
   6         0.76        0.62      0.40      0.71

Table 1: Change in belief from a starting belief of 1

Evidence   Pr(Ho/~E)   Pr(Ho/E)   Pr*(E)   Updated
   0         0.00        0.16      0.23      0.04
   1         0.04        0.11      0.44      0.07
   2         0.07        0.64      0.51      0.36
   3         0.36        0.90      0.69      0.74
   4         0.74        0.16      0.04      0.72
   5         0.72        0.74      0.38      0.73
   6         0.73        0.62      0.40      0.68

Table 2: Change in belief from a starting belief of 0

The tables above illustrate the sequential application of Jeffrey’s rule. We start with differing prior beliefs and, as new evidence arrives, we update our belief. The values for Pr(Ho/E) and Pr*(E) are randomly generated numbers between 0 and 1. We can see that after three or four pieces of evidence we start to converge on a common belief. While not rigorous, inspection of simulated cases supports the idea that beliefs converge irrespective of the starting belief.
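
A rough Python sketch of this kind of simulation is shown below; the randomly generated evidence stream is hypothetical and is only intended to show the convergence behaviour, not to reproduce Tables 1 and 2:

```python
import random

def jeffrey_update(p_h_given_e, p_h_given_not_e, p_star_e):
    # Pr*(Ho) = Pr(Ho/E)Pr*(E) + Pr(Ho/~E)Pr*(~E)
    return p_h_given_e * p_star_e + p_h_given_not_e * (1.0 - p_star_e)

random.seed(0)  # arbitrary seed; the evidence values are illustrative only
evidence = [(random.random(), random.random()) for _ in range(10)]  # (Pr(Ho/E), Pr*(E)) pairs

belief_from_one, belief_from_zero = 1.0, 0.0
for p_h_given_e, p_star_e in evidence:
    # As in the tables above, Pr(Ho/~E) is taken to be the current belief
    belief_from_one = jeffrey_update(p_h_given_e, belief_from_one, p_star_e)
    belief_from_zero = jeffrey_update(p_h_given_e, belief_from_zero, p_star_e)
    print(f"{belief_from_one:.2f}  {belief_from_zero:.2f}")
```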

Applying the principle of insufficient reason to prior belief

What happens if we start with no evidence at all for a hypothesis? We may be inclined to say that there is nothing to choose between the alternatives, true or false, so they should be treated as equally probable; this is the principle of insufficient reason, or the principle of indifference. However, consider a simple example: I state the hypothesis ‘your car is red’. Without any evidence it does not seem that the partition ‘your car is red’ and ‘your car is not red’ should be assigned equal probability.

In most business examples I can think of it is usually more likely for a specific hypothesis to be false: ‘this product will be successful’ vs ‘this product will fail’. There are usually many more ways to fail than to succeed. We may be happy to assign a personal probability to the prior belief, as opposed to assuming indifference. However, this may give certain hypotheses an ‘easy ride’ without forcing us to find evidence to corroborate or falsify them. I prefer to operate under the maxim ‘guilty until proven innocent’: assume the hypothesis is false until proven otherwise. This forces me to find evidence so I can justify my belief position; just because I think something is obviously true doesn’t mean that others do. If I already have a high prior belief it should be easy for me to find the supporting evidence. It also means that I will be operating conservatively in the early stages, as my belief is ‘dragged down’ by the memory of the initial belief up to the point of convergence.

Order of discovery

It would also seem intuitively obvious that the order in which we uncover new evidence should make no difference to our eventual beliefs. We generated 20 discrete pieces of evidence and updated belief at each stage. We then reordered the evidence (re-sampling without replacement) and calculated the new belief trajectory. Interestingly, there can be marked differences in belief at the end of the process. The results are presented without further discussion, but this may pose a significant problem for the application of this belief-updating methodology.
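
A sketch of the reordering experiment in Python; it assumes (as discussed below) that the same Pr(Ho/E) and Pr*(E) values would be assigned regardless of the order of discovery, and the prior of 0.5 and the random evidence values are illustrative only:

```python
import random

def jeffrey_update(p_h_given_e, p_h_given_not_e, p_star_e):
    # Pr*(Ho) = Pr(Ho/E)Pr*(E) + Pr(Ho/~E)Pr*(~E)
    return p_h_given_e * p_star_e + p_h_given_not_e * (1.0 - p_star_e)

def final_belief(prior, evidence):
    """Apply Jeffrey's rule sequentially, using the current belief as Pr(Ho/~E)."""
    belief = prior
    for p_h_given_e, p_star_e in evidence:
        belief = jeffrey_update(p_h_given_e, belief, p_star_e)
    return belief

random.seed(1)  # arbitrary seed; evidence values are hypothetical
evidence = [(random.random(), random.random()) for _ in range(20)]
reordered = random.sample(evidence, len(evidence))  # re-sampled without replacement

print(f"original order: {final_belief(0.5, evidence):.2f}")
print(f"reordered:      {final_belief(0.5, reordered):.2f}")
```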

The above re-sampling example assumes that we would actually assign the same ‘marginal belief change’ irrespective of the order of discovery. This may not be a valid assumption and we can look at an example from history. In 1818 Siméon Poisson deduced from Augustin Fresnel’s theory the necessity of a bright spot at the centre of the shadow of a circular opaque obstacle. With his counterintuitive result Poisson hoped to disprove the wave theory; however Dominique Arago experimentally verified the prediction and today the demonstration goes by the name “Poisson’s (or Arago’s) spot.” Since the spot occurs within the geometrical shadow, no particle theory of light could account for it, and its discovery in fact provided weighty evidence for the wave nature of light, much to Poisson’s chagrin. If I believed in the corpuscular theory of light I would be extremely surprised to see a Poisson spot. However once I have seen it and adjusted my belief accordingly, seeing it again would only have a very small impact on my belief; the new experiment contains very little information. This is the same as saying that the marginal belief change for a particular piece of evidence depends on my current belief and the history of how I arrived here. It doesn’t therefore seem valid to resample, as we deal with marginal change in belief, not absolute values as new evidence arrives.


Intuitive Bayesian methods for portfolio selection – Part I Background

April 7, 2009

Introduction

Disruptive platform technologies usually have a broad base of application. During early-stage development, before there is a developed market, the selection of a particular product is usually a ‘high risk, low data’ decision. There are a large number of unknowns, both known unknowns and unknown unknowns, which we seek to resolve over time. In this type of situation it is difficult to make the initial portfolio selection decision, to effectively monitor the resolution of uncertainty, and to determine the ultimate ‘chance of success’ for the product.

Problems in portfolio selection and project monitoring

The portfolio selection process, even when highly structured, often reduces to persuasion by advocates and champions. When a lot of data is being presented it is easy to forget ‘how we arrived’ at a particular position, assigning a higher importance to things that we heard recently (or long ago, depending on how your mind works). Soaring rhetoric can outweigh sober analysis and dispassionate appraisal of risk. It can be difficult to judge the ‘quality’ of a piece of information, which may find itself as a lynchpin in an argument to take a particular course of action. With a lot of unknowns it can be difficult to formulate go/no-go metrics and not relax the criteria when you get to the decision point.

Cognitive biases

The field of behavioural economics examines some of the less rational beliefs of Homo economicus. Work by Tversky and Kahneman illustrates cases of overconfidence in our abilities, the desire to go with the herd, and a propensity for rolling rationalisation. There is a long list of cognitive biases that you can easily imagine arising in portfolio selection processes.

Objectives

Develop a simple methodology and toolset that allows us to:

  1. Reduce complex business decisions to specific and testable hypotheses, which can be definitively refuted.
  2. Systematically revise our ‘belief’ in a hypothesis as we receive new information.
  3. Integrate new information of many types and forms, of varying degrees of ‘quality’.
  4. Maintain a history of how we arrived at a particular belief to provide an ‘audit trail’ or ‘memory’ to support future decisions and actions.
  5. Integrate and logically connect hypotheses to create a ‘belief network’ that supports complex decision making.
  6. Avoid cognitive biases and increase objectivity.

Logic and Probability

There are three main modes of argument: deduction, induction and abduction (inference to the best explanation, IBE). Inductive logic analyses risky arguments using probability ideas. There are, however, different interpretations of what a ‘probability’ is.

Frequentists talk about probabilities only when dealing with experiments that are random and well-defined. The probability of a random event denotes the relative frequency of occurrence of an experiment’s outcome, when repeating the experiment. Frequentists consider probability to be the relative frequency “in the long run” of outcomes.

Bayesians, however, assign probabilities to any statement whatsoever, even when no random process is involved. Probability, for a Bayesian, is a way to represent an individual’s degree of belief in a statement, given the evidence.

Logical probability is thought of as a logical relation between a hypothesis and the evidence for it. J.M. Keynes and Rudolf Carnap both favoured a logical theory of probability. Personal probabilities are a private matter; they are up to the individual, and anything goes so long as the basic rules of coherence are obeyed. Logical probability maintains that there are uniquely correct, uniquely rational judgements of the probability of a hypothesis in the light of evidence.

For the purposes of decision making in a business context there are very few cases where a frequentist approach can be used. We tend to use the Bayesian notion of probability, where belief allows us to make investment decisions.

It is plausible to connect personal degrees of belief and personal betting rates

You would not pay more than $1 to win $2 on the flip of a coin. If you have some domain specific business knowledge that allows you to exploit an opportunity, your betting rate would be markedly different from someone without that knowledge. During product development as uncertainty is resolved our beliefs are updated and we revise the level of investment we are willing to make. People have always used this ‘managerial flexibility’ and there is now a move to formalize this type of ‘real option’ thinking in investment and portfolio selection.

Verificationism and Falsifiability

There are two common problems in portfolio decision making: how do we extrapolate experience to the future, and how can we provide definitive go/no-go criteria when we do not know the problem well? The former is the problem of induction: what justification is there for presupposing that a sequence of events in the future will occur as it always has in the past (for example, that the laws of physics will hold as they have always been observed to hold)? If we cannot assume uniformity of nature for physical laws, we definitely cannot do so in a business context, where we know that the landscape changes very quickly.

Often a go/no-go criterion is framed in a way that allows it to get out of jail down the line. A criterion such as ‘show interest from a customer’ is quite broad. If in a month’s time we hear the statement ‘Fred and Jeff seem quite interested’, this adds practically no new useful knowledge upon which to base a decision: ‘a difference that makes no difference is no difference’. It also allows us to introduce ad hoc revisions to ‘pass’ the criterion. If we set a criterion such as ‘one sale made by the end of the quarter’, then we have something that is definitively testable. This is a criterion that puts itself at risk, which can be refuted or falsified; falsification adds new knowledge, as it allows us to eliminate options and make definite investment decisions, i.e. don’t invest. Falsifiability was put forward as a solution to the problem of induction by Karl Popper.

This is related to the logical positivist view of the verifiability theory of meaning: the meaning of a sentence consists in its method of verification. In other words, if a sentence or statement has no possible method of verification, it has no meaning. It is pointless to set a go/no-go goal such as ‘demonstrate our value proposition and facilitate end-to-end knowledge transfer’, as there is no possible way to test it, and it therefore falls into the category of a nonsensical statement (also known as bullshit bingo).


Who said I was rational

September 7, 2008

Studies in behavioural finance often highlight the ‘irrationality’ of an economic agent in making decisions. This may just boil down to the definition we use for rationality, which may not necessarily in itself be rationally justified.

We can start by looking at other uses of the word ‘rational’ in the philosophy of science. The Scottish empiricist David Hume [1739] asked “What reason do we have for thinking that the future will resemble the past?” There is no contradiction in supposing that the future could be totally unlike the past. It is possible that the world could change radically at any point, rendering previous experience useless. This is known as the problem of induction; we have no rational basis or reason to expect the past to resemble the future. A famous example attributed to Bertrand Russell is that of a turkey, which is fed every day at 9am for 364 days of the year; the next day the turkey walks out assuming a bowl of food will be waiting and instead the farmer wrings its neck and cooks it for Christmas dinner. There have been a number of attempts to tackle the problem of induction. Karl Popper claimed to solve the problem of induction using the principle of falsification, which can be summarised with a quote attributed to Einstein, “No amount of experimentation can ever prove me right; a single experiment can prove me wrong.” This sentiment is strongly highlighted by the example of Newton’s laws; the most tested and confirmed laws in the history of mankind yet they were proven to be ‘incorrect’ in certain situations and were eventually displaced by Einstein’s theory of relativity.

The definition of rationality in this context relates to deductive reasoning, where if the premisses of an argument are true then the conclusion is guaranteed to be true. How can inductive inferences be rational when the conclusions are not guaranteed by the premisses? Normal usage of the word ‘rational’ is not strictly limited to deductive reasoning; people would class deduction as one species of rational argument, and nearly everybody applies the term to other types of reasoning, in particular to inductive reasoning. If two people designed a bridge, one of whom had already successfully built ten previous bridges out of steel while the other wants to build his bridge from butter, which would people rather walk on? There is no logical guarantee that the steel bridge will not collapse, and equally there is no logical guarantee that the bridge made from butter will collapse. Most people would, by definition, say the builder making his bridge from steel is rational because he has designed it using tried and tested techniques derived inductively. This is the ‘paradigm case argument’: what more is needed to show that inductive reasoning is rational than everyone agreeing that it is rational?

Another analogy used to defend the rationality of inductive reasoning is put forward by Peter Strawson. If someone is concerned whether a particular action is legal they can consult the law books; but what if they were to ask if the law itself was legal. The law is a standard against which the legality of other things is judged, and it makes little sense to enquire whether the standard itself is legal. Strawson argues that the same applies to induction. Induction is one of the standards we use to decide whether claims about the world are justified; so it makes little sense to ask whether induction itself is rationally justified.

The logical positivists acknowledged the problem of induction but they took a pragmatic approach and did not see it as a problem in practice: “if an observation to which a given proposition is relevant conforms to our expectations, the truth of that proposition is confirmed. One cannot say that a proposition has been proved absolutely valid, because it is still possible that a future observation will discredit it”. In response to Popper’s solution by falsification Ayer states, “If a proposition is [discredited as a] consequence of an unfavourable observation, one cannot say that it has been invalidated absolutely. For it is still possible that future observations will lead us to reinstate it”. Ayer rephrases the definition of rationality in relation to induction: “There is no absolute standard of rationality, just as there is no method of constructing hypotheses which is guaranteed to be reliable. We trust the methods of contemporary science because they have been successful in practice. If in the future we were to adopt different methods, then beliefs which are now rational might become irrational from the standpoint of these new methods. But the fact that this is possible has no bearing on the fact that these beliefs are rational now”.

The classical definition of ‘rationality’ for economic man, homo economicus: in economics, sociology, and political science, a decision or situation is often called rational if it is in some sense optimal, and individuals or organizations are often called rational if they tend to act somehow optimally in pursuit of their goals. Thus one speaks, for example, of a rational allocation of resources, or of a rational corporate strategy. In this concept of ‘rationality’ the individual’s goals or motives are taken for granted and not made subject to criticism, ethical or otherwise. Thus rationality simply refers to the success of goal attainment, whatever those goals may be.

There is extensive literature in behavioural finance relating to economic irrationality and cognitive biases; the assertion of irrationality is typically framed within the narrow definition of homo economicus. A classic example of conventional irrationality is found in the Ultimatum Game, an experimental economics game in which two players interact to decide how to divide a sum of money that is given to them. The first player proposes how to divide the sum between them, and the second player can either accept or reject this proposal. If the second player rejects, neither player receives anything. If the second player accepts, the money is split according to the proposal. In principle, if an unfair proposal (e.g. 95% for me, 5% for you) is made by the first player, it is strictly speaking still rational for the second player to accept the offer. In practice this rarely happens, and the second player will reject the offer as unfair. However, if player two left the experiment and told his friends about the decision, they would accept that he had done the ‘right thing’. Within the context of a paradigm case argument, the majority consensus on the decision defines it as a rational act.

The definition of rationality as related to deductive reasoning omits other valid modes of inference. It is equally the case that the narrow definition of rational economic man excludes other valid decision-making protocols; we live deep within a societal structure where notions of reciprocity are strongly embedded. We use decision-making heuristics that serve us well in the complexity of real life; we do not jettison these in the artificial abstraction of a one-shot economic game designed to flag contrived irrationality. If everyone agrees that I’m acting rationally, then how can the economists say I’m not?