Thursday, February 28, 2008

I've Got Game

If my wife and I are in a room with a hundred or so people, chances are she is the best looking woman and smartest person in the room. And half the people in the room looking at us are probably wondering "why is that beautiful woman hanging with that big oaf". The answer, which most who know me will find hard to believe, is that I've got game.

For example, on the day I met my wife Roxy I was downtown and I saw three tiny Vietnamese women running for the bus. At the same time they were yelling to a fourth woman who was about a block down the street. The three women boarded the bus and the fourth was running to catch it. Unfortunately some big pedestrians were in her way and the bus driver didn't see her waving her hands and just drove off.

The woman was more than disappointed. She looked scared. I could also tell that she felt small and insignificant. Instantly I knew that this was an opportunity for my game. Like a lot of people my game begins by offering to do a favor. But it goes deeper. To be really effective the favor must be offered in a time of need. And this was such a situation.

I walked up to her and asked her if I could give her a ride. She said something back to me in Vietnamese. As I suspected, she didn't speak English and had no idea what I just said to her. So I got down on my hands and knees and motioned for her to get on my shoulders, and she did. You might think this is kind of weird but in Vietnam human-powered taxis are common and I knew this. As I stood up I looked in a store window and saw a sense of relief come over her face as she wiped tears from her eyes.

The next task was to figure out where to take her. Since she couldn't speak English it wasn't easy for her to tell me which way to go but eventually we found a way to communicate. When she wanted me to turn right she pulled one ear and when she wanted me to turn left she pulled the other ear. She pulled both ears when she wanted me to stop. Luckily she knew how to say "go" and she said it with authority.

Seven miles and two and a half hours later we arrived at her doorstep where her mother and two sisters were waiting. I lifted her over my head and put her down on the ground. When she turned around and looked up at me I saw more than gratitude in her eyes -- I saw love.

So what happened? How did this woman fall in love with me in just two and a half hours? Well it turns out that I did more than offer to do a favor in a time of need. I reached inside and satisfied her deepest psychological need. You see, like a lot of small people, what Roxy really needs is someone to boss around. And the bigger that someone is the more that need is satisfied. My game was to give her what she really needed and I did this by allowing her to boss me around for two and a half hours while she rode me home.

Now some of you reading this probably think that I made up this story -- and maybe I did. But you may have also noticed that every once in a while Roxy reaches over and pulls my ear. She does this both in fond remembrance of the day we met and to remind me that she is the boss.

Friday, February 01, 2008

Most DNA is not Junk

One of the things that software developers hate most is providing estimates for how much it will cost to develop a piece of software. It’s hard to answer a question like that with any accuracy, but we have to do it so that bean counters can make business decisions. The question is hard to answer because there are a lot of unknowns and we have to make guesses about those unknowns based on previous experience. Books have been written on estimating software development costs but a simplified description of the process goes like this:

1) Break the development tasks up into subtasks.
2) Estimate how much it will cost to do these subtasks based on the costs of similar tasks completed previously.
3) Add the estimates of the subtasks to get an estimate of the total.

As you can see there is a lot of guesswork involved. It’s not uncommon for such estimates to be off by more than a factor of two. Perhaps they should be called guesstimates.
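The three steps above can be sketched in a few lines of Python. This is only an illustration: the subtask names and hour figures are made up, and the factor-of-two range is just the rule of thumb mentioned above, not a formal estimating method.

```python
# Step 1: break the work into subtasks (hypothetical names).
# Step 2: estimate each one from similar past work (hypothetical hours).
subtask_hours = {
    "design": 40,
    "coding": 120,
    "testing": 60,
}


def estimate_total(estimates):
    """Step 3: add the subtask estimates to get the total."""
    return sum(estimates.values())


total_hours = estimate_total(subtask_hours)

# Estimates are often off by a factor of two, so report a range
# instead of a single number.
low_hours = total_hours / 2
high_hours = total_hours * 2

print(f"guesstimate: {total_hours} hours (range {low_hours:.0f} to {high_hours:.0f})")
```

The arithmetic is trivial; all of the uncertainty lives in the numbers fed into step 2.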

In my blog entry “Junk DNA is a Myth” I spouted off about how it was ridiculous to think that 97% of our DNA is junk. I could believe 5% junk due to entropy but not 97%. This blog entry came under criticism by Professor Scott Page. In his criticism, he never provided any proof that the vast majority of DNA is junk, just ridicule. This ridicule may have been a knee-jerk reaction to my blogging alias “Intelligent Designer”. Scott Page believes that anyone who is a creationist is an idiot.

In my defense I am going to make a stab at guesstimating a plausible amount of non-junk DNA in the human genome. I can already hear Scott laughing away in his office now. I’ll have to make several assumptions to come up with this estimate. I’ll italicize these assumptions as I go along. I also plan to make revisions to this estimate as I think through it so please don’t quote me until I am done. I am publishing this draft because I am hoping someone with expert knowledge will stumble on it and chime in with useful information.

So let’s begin. In this estimate I will be using the word “information” to denote DNA that is not junk and “data” to denote DNA which may or may not be junk. I will also be talking about the data in terms of bytes and MBs [megabytes]. A nucleotide can be represented with two bits of data, a string of 4 nucleotides by a byte of data, and 4 million nucleotides by a MB of data. Thus the 3.2 billion base pairs of the human genome are equivalent to 800 MB of data. Professor Page believes the human genome has only 24MB of information and that the rest is junk – that makes me laugh.
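The conversion above is easy to check. Here is a minimal sketch, assuming a decimal megabyte (1,000,000 bytes), which is what "4 million nucleotides by a MB" implies. The function and constant names are made up for illustration.

```python
BITS_PER_NUCLEOTIDE = 2                           # A, C, G, T fit in two bits
NUCLEOTIDES_PER_BYTE = 8 // BITS_PER_NUCLEOTIDE   # 4 nucleotides per byte
BYTES_PER_MB = 1_000_000                          # assumption: decimal megabyte


def nucleotides_to_mb(nucleotide_count):
    """Convert a nucleotide (base pair) count to MB of data at 2 bits each."""
    return nucleotide_count / NUCLEOTIDES_PER_BYTE / BYTES_PER_MB


genome_mb = nucleotides_to_mb(3_200_000_000)

# If 97% of the data is junk, 3% is left over as information.
non_junk_mb = genome_mb * 3 / 100

print(genome_mb, non_junk_mb)
```

The 24MB figure comes out as 3% of the 800 MB total, which matches the 97% junk claim.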

According to Professor Larry Moran "a bacterial genome is about 4 million base pairs and there's no junk". So I think it is safe to say that there is at least 1MB of information in the human genome.

Now there are 210 known cell types in the human body. I’ll assume that each cell type requires at least 1MB of information. These cell types share a lot of common features so I’ll assume there is a lot of common information. Just how much of the information is shared between these cell types is a guess. I am going to assume that 90% of the information in each cell type is shared and 10% is unique. This means that 210 cell types require 1MB + 209 * 0.1MB = 21.9MB of information. Rounding this implies that there is at least 22MB of information in the human genome.
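Since the shared fraction is a guess, it helps to see how much the answer moves when the guess changes. The sketch below just restates the formula above with the assumptions as parameters. The function name and the alternative fractions tried are made up for illustration.

```python
def cell_type_information_mb(cell_types=210, per_type_mb=1.0, unique_fraction=0.1):
    """One full cell type plus the unique portion of every other type.

    All three defaults are the assumptions made in the text, not measured values.
    """
    return per_type_mb + (cell_types - 1) * unique_fraction * per_type_mb


baseline_mb = cell_type_information_mb()
print(f"10% unique: about {baseline_mb:.1f} MB")

# Try other guesses for the unique fraction to see how sensitive the estimate is.
for fraction in (0.05, 0.2, 0.5):
    mb = cell_type_information_mb(unique_fraction=fraction)
    print(f"{fraction:.0%} unique: about {mb:.1f} MB")
```

The estimate scales almost linearly with the unique fraction, so that one guess dominates the result.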

But this is just the information needed to construct the different cell types. More information is needed for spatial orientation and to coordinate activity among cells to perform complex functions like vision, motor control, digestion and tissue repair. Since the most efficient algorithms to just sort n objects by comparison take on the order of n log(n) steps, I am tempted to guesstimate by multiplying 22MB by log(210) to get a lower bound. But that would be bad applied math and just plain lazy. But then again I am not exactly getting paid to do this (wink).

I can think of two other approaches that could be taken. For one of them I need some data points. In particular I need size data about genomes of the simplest multicellular life forms that are well studied and believed not to have junk.