Thursday, May 9, 2013

Some Math for Kevin D. Williamson

I just read this article by my favorite columnist Kevin D. Williamson. The article is primarily about economics, but Mr. Williamson makes an analogy between capitalism and biological evolution that intrigued me. My objective here is not to criticize him for making a bad analogy. As analogies go it's not that bad, and any analogy taken too far will eventually disintegrate. I rather wish to make an important point about the theory of biological evolution that I believe someone of Mr. Williamson's intellectual caliber has the ability to grasp as well as the political leanings to take seriously. I think the analogy he uses is useful for explaining the point, and I hope he will hear me out.

Mr. Williamson uses the pencil as his example of something which no individual person knows how to make but gets made anyway through a complicated network of cooperating entities because each of these cooperating entities knows how to do one particular thing. I'll make this as simple as possible for purposes of discussion. Let's say one factory can make the eraser, one can make the wooden piece with the graphite in it, and a third factory can make the metal connector. Then a fourth factory knows how to assemble all three pieces into a finished pencil. The point Mr. Williamson makes about this bears repeating. Because no single person or even single organization possesses all the knowledge, expertise and resources to make a pencil from scratch, it would be a major mistake to put one person or organization in charge of the process. The pencil gets made because several different organizations know how to do one part of the process and cooperate with each other to do it. There is no hierarchy or authority over this process. It's just capitalism in action. There is an important point to be made here about how knowledge justifies authority and ignorance doesn't, but that will be in a forthcoming post.

Mr. Williamson then compares this process to biological evolution, presumably because biological evolution also has no central authority guiding everything to a desired result. He explains that like evolution, the market automatically weeds out products that people don't want. The analogy to biological evolution is twofold. The Darwinian model posits random variation and natural selection as working together to produce all of biology from a common ancestor. In this analogy, the random variation would be produced by the factories and the selection provided by the market, or consumers, the end result being a market full of products that are valuable to people and tend to get better over time as bad products are weeded out. Williamson uses this to argue that "Failure works", meaning that producers are basically engaged in a trial and error process that weeds out the bad leaving only the good behind. The producers produce variation and the consumers select what they want and what they don't want from that variation. That is all well and good, but what if the producers actually made things randomly with no objective in mind? What if there really were no intelligence trying to connect the products to the needs and wants of consumers? This is the situation with biological evolution, and we can think about it in terms of Mr. Williamson's analogy.

Suppose we have one hundred factories in the world, each of them responsible for producing random objects the size of a pencil eraser. What is the likelihood that one of these factories will produce a pencil eraser? Let's assume the available resources are one thousand compounds and further stipulate that within this group of compounds exist the ones that could produce a pencil eraser. Let's say that a pencil eraser requires five different compounds and the factories always make parts that are a mix of five compounds. Let's say these factories switch to a new random part once a year and that they have ten years to get it right. A functional eraser can be created by one thousand different combinations of compounds. What is the probability these factories will randomly produce a workable eraser in the time allotted?

The probability calculation has to weigh the probabilistic resources available against the odds that any single random product turns out to be a workable eraser. The total number of possible combinations is easy to calculate:

1000^5 or 10^15 (for simplicity's sake I assume that the same compound can be used any number of times in the part)

Out of the one quadrillion possibilities, one thousand will produce a workable eraser, so the total probability of getting a workable eraser is:

10^3 / 10^15 = 1/10^12 or one in a trillion

The probabilistic resources available are the number of factories multiplied by the rate multiplied by the time:

100 factories * 1 random product per year per factory * 10 years = 10^3 or one thousand total different products

To get the probability the factories will hit on the right combination to make a workable eraser, simply multiply the probabilistic resources and the probability of finding a workable eraser:

10^3 * 1/10^12 = 1/10^9
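For anyone who wants to check the arithmetic, here is the whole calculation as a minimal Python sketch; every number in it is just one of the assumptions stated above.

total_combinations = 1000 ** 5            # 10^15 possible five-compound parts
workable = 1000                           # combinations that yield a working eraser
p_single = workable / total_combinations  # 1/10^12 chance for any single random product

trials = 100 * 1 * 10                     # 100 factories * 1 product/year * 10 years = 10^3 products
print(trials * p_single)                  # 1e-09, the one-in-a-billion figure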

For comparison, the odds of winning the Powerball jackpot are about 1 in 175 million, or roughly 1/10^8 rounded to the nearest power of ten. So these factories are about ten times less likely to produce a workable eraser than the single ticket you buy tonight is to win the Powerball jackpot. Obviously we wouldn't call anyone crazy for believing that will probably not happen. But what if there were more factories, or more time for them to reach their goal? Let's use a million factories instead of a hundred, and give them a thousand years as well. And what if each factory switched products ten times a year instead of only once? What then?

10^6 factories * 10 random products per year per factory * 10^3 years = 10^10 total different products

10^10 * 1/10^12 = 1/100

Now the odds are only one in a hundred. Still not good, but perhaps one could be forgiven for believing such a thing could happen. Perhaps a million factories making ten different random products a year for one thousand years could beat the odds and produce a workable eraser. Better would be ten million factories making one hundred different random products a year for a thousand years, because then the expected number of successes is one (strictly speaking, the chance of hitting the right combination at least once is about 63 percent, as the sketch just below shows, but the point stands). But the important point is that the probability of a product being made randomly depends on both the probability of the event itself and the probabilistic resources available to the producers. Another important thing to note is that all this must occur before the marketplace can do its work. This random search must succeed before the consumers in the market can select for the eraser, because if no eraser exists then the marketplace is simply rejecting a long line of equally useless products. Selection cannot act in favor of a positive innovation until the innovation actually exists in the marketplace. Then there is the further question of how the eraser, once it is selected, gets paired with the other parts of the pencil and, even more difficult, how all the parts get randomly assembled in the right way.
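One technical aside, offered as a sketch rather than a correction: multiplying the number of trials by the per-trial probability gives the expected number of workable erasers, which closely approximates the probability of getting at least one only when that expected number is small. The exact probability of at least one success is 1 - (1 - p)^N, which these Python lines compute for the three resource levels considered above (a thousand products, ten billion, and a trillion):

import math

p = 1e-12                          # chance a single random product is a workable eraser
for trials in (1e3, 1e10, 1e12):   # the three resource levels considered above
    expected = trials * p
    at_least_one = -math.expm1(trials * math.log1p(-p))   # exact P(at least one success)
    print(f"{trials:.0e} products: expected {expected:.1e}, P(at least one) = {at_least_one:.3g}")

With a trillion products the expected count is one, but the chance of at least one success comes out to about 63 percent, which is where the figure in the paragraph above comes from.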

Of course we all know that there is a great deal of intelligent guidance involved in the making of a pencil. Just because no individual person knows the whole process does not obligate us to believe that nobody knows anything and the whole process is random and unguided, like evolution. There is also plenty of central organizational authority in the small organizations which produce various pieces of the pencil. Though no single company does the whole thing, each company does do its part of the job, and within each company there are bosses to organize the labor as well as engineers who design not only the parts themselves, to specifications from both customers and vendors, but also the processes that make those parts. The market does indeed favor cooperation between smaller units, but organizational authority is required within even small units, and a great deal of intelligent design is involved in every part of the process, within the units or between them. Decentralization of knowledge, skills and decision-making does not eliminate the need for knowledge, skills and decision-making. Just because a large number of independent intelligent people have a hand in the process doesn't mean the process is not intelligently guided. In fact Williamson makes the case that because more people with more brainpower are involved, the process is more intelligent than a socialist-style centralized process, not less intelligent. At any rate we cannot get away with believing that decentralization is equivalent to randomness, or that the process lacks intelligent guidance.

Biological evolution, however, is claimed to be exactly that: a process with no intelligent guidance at all, only random variation filtered by natural selection. And it runs into the same type of probabilistic limits as our fictional random economy. It must also succeed in a completely random and unguided search before natural selection can act. The only thing natural selection can do is take a mutation randomly produced in a single individual and spread it through the entire population of its species. So the mutation must come first, and it must come randomly. Natural selection cannot help there. All that's left is the math.

Applying all of this to biological evolution is much less fanciful than the pencil analogy. Calculating the total number of possible amino acid combinations for a given length of protein is straightforward and does not require as many limiting assumptions. Nature works with a standard set of twenty amino acids. Yes, there are a few more that show up in extremely rare cases, but 99% of the time nature is working with twenty. Protein lengths vary from the shortest known functional proteins, around twenty amino acids, to the longest known, at well over 30,000 amino acids, but a typical functional protein is about three hundred amino acids long. The number of possible amino acid combinations is literally more than astronomical:

20^300 or about 10^390
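If you want to verify these magnitudes yourself, a couple of lines of Python will do it; the second line checks the seconds-since-the-big-bang figure used for comparison just below.

import math

print(300 * math.log10(20))                        # ~390.3, so 20^300 is about 10^390
print(math.log10(13.77e9 * 365.25 * 24 * 3600))    # ~17.6, so about 10^18 seconds since the big bang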

For comparison, the number of atoms in the observable universe is estimated to be around 10^80. The number of seconds that have passed since the big bang, 13.77 billion years ago, is about 10^18. But what about the probabilistic resources? How many chances does evolution have to win the jackpot?

You can read about Bill Dembski's Universal Probability Bound here. (Dembski, by the way, appears to be a fan of Friedrich Hayek.) I prefer to use a more practical probability bound based on the total number of cells in all of history multiplied by the mutation rate of E. coli, a common prokaryotic cell with a relatively high mutation rate, though not the highest; that distinction belongs to HIV, which is not a cell but a virus.

Total number of cells produced per year: 10^30
Total number of years: 10^10
Mutation rate of E. coli:  1*10^-3 per genome per generation
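For anyone checking along, here is that multiplication as a quick Python sketch, using the three figures just listed:

import math

cells_per_year = 1e30   # total cells produced per year
years = 1e10            # total years
mutation_rate = 1e-3    # mutations per genome per generation (E. coli)

print(math.log10(cells_per_year * years * mutation_rate))   # ~37, i.e. 10^37 mutations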

Multiply these all up and you get 10^37 mutations. But hey, let's give evolution the benefit of the doubt and say the mutation rate is one per every new cell, about the mutation rate of HIV. That means we get a nice round number of 10^40. This is the number Michael Behe uses in The Edge of Evolution, a must-read for anyone interested in this topic. The reference for the 10^30 number is a peer-reviewed paper and is given by Behe in the book. I'm lazy and don't want to look it up. Go buy the book yourself or use PubMed. Let's also use a shorter amino acid chain of a hundred and fifty residues, about half the length of the average protein. I use this number because of Douglas Axe's research on a protein of that length. Again, I don't feel like looking up the references, but they are all peer-reviewed articles published in scientific journals and are referenced in Stephen Meyer's book Signature in the Cell. In Axe's case his papers were published in the 90s before he "came out" as an intelligent design theorist, otherwise he probably would have been blacklisted and the papers never published or later retracted. (Behe survived at Lehigh University only because he had tenure before he "came out". Dembski wasn't so lucky and was expelled from, of all places, Baylor University.)

Anyway, a few years ago I realized you could do a simple experiment, at least by today's standards, to estimate the number of functional proteins within the search space for a given protein length. Shortly thereafter I read Meyer's book and found out it had already been done by Axe, as well as some others. This is the sort of research I would have been doing had I not become discouraged with the government monopoly on scientific research, which prevents anyone with politically incorrect points of view from getting government funding and competing on a level playing field, but I digress...or not. I would love to hear Mr. Williamson's views on the government monopoly on scientific research and whether he believes that only scientific research which conforms to political guidelines, like the separation of church and state, should be subsidized. In his article he argues that politics gets in the way of a working free market. What about the marketplace of ideas? Does anyone really believe that young earth creationists would not get funded in a working free market, when upwards of forty percent of the American people are young earth creationists? I got into politics because it has thrust itself into my own life in a very personal way, but also because I believe, as Mr. Williamson does, that politics, not science, not evidence, not reason, is the only thing preventing certain points of view from competing on a level playing field in the marketplace of ideas.

To make a long story short, the number Axe calculated from his experimental results was 1/10^74.

Total number of possible amino acid sequences of 150 residues: 20^150 or about 10^195
Probability of finding another functional protein within that space starting with a functional protein as a template: 1/10^74
Probabilistic resources available in all of life's history: 10^40
Probability of finding a novel functional protein of 150 residues via a random search: 10^40*1/10^74 = 1/10^34
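The same final multiplication in Python, as a quick sketch, along with the four-Powerball-jackpots comparison made in the next paragraph (using the 1-in-175-million jackpot odds from earlier in the post):

resources = 1e40                  # probabilistic resources in all of life's history
p_functional = 1e-74              # Axe's estimate for a 150-residue protein
print(resources * p_functional)   # 1e-34, the 1 in 10^34 figure

powerball = 1 / 175e6             # single-ticket jackpot odds
print(powerball ** 4)             # ~1.1e-33, four jackpots in a row, still better than 1e-34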

That is worse than the odds of you winning the Powerball jackpot four times in a row, and that's just one short enzyme. Enzymes work alone. Most proteins work in precisely engineered complexes of six or more that must also be assembled correctly. Mr. Williamson likes math. I wonder if he likes those odds.

Now that's whack.

UPDATE: I just trotted over to UD for the first time in a while and noticed a new post bearing directly on my topic here. The post is about a paper published by an ID biochemist in Croatia showing mathematically how the distribution of protein families does not show the same type of complexity as self-organizing complexity networks like those found in a free market.

UPDATE: The paper has several of the references I referred to in its bibliography. If you read it (this section is also pasted into the post) you'll see Axe's numbers are 10^-53 to 10^-77. I'm not sure where the range comes from, but if I remember right the 10^-77 number is for beta-lactamase only. Axe calculated 10^-74 for all proteins of 150 residues.