Wednesday, March 05, 2008

Repetitive Sequences Don't Imply Junk DNA

It seems like part of the argument in favor of junk DNA is the various types of repetitive sequences found in DNA. Shouldn’t repetition imply design instead? I don’t know anything about electronics but I do know what I see when I pop the cover off my computer and look at the circuit boards – the same old widgets used over and over again.

Consider software. If you look at software executables (like .exe and .dll files on Windows computers) they are full of repeated sequences. You may have written a program yourself. If so, you would certainly be familiar with the concept of a subroutine. At the assembly level, whenever a subroutine is called registers are pushed onto the stack, and when it returns they are popped off the stack. The code to push and pop registers is automatically generated by the compiler and is therefore not apparent at the source code level. This translates into a massive amount of simplistic repetition at the binary level. These kinds of repetitive sequences would probably be classified as SINEs by geneticists trying to understand the binary code. While this kind of code doesn’t map to any particular program function, it is essential.
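To make the push-and-pop point concrete, here is a minimal sketch in Python of a toy "compiler" that wraps every subroutine body in the same entry and exit boilerplate and then counts which emitted lines repeat. Everything in it is an assumption made for illustration: the instruction strings only loosely imitate x86-64 assembly, and the names `PROLOGUE`, `EPILOGUE`, `emit_subroutine`, and the three sample subroutines are hypothetical. It is not how any real compiler works.

```python
from collections import Counter

# Hypothetical boilerplate emitted around every subroutine body.
# The instruction text loosely imitates x86-64 but is only illustrative.
PROLOGUE = ["push rbp", "mov rbp, rsp", "push rbx"]
EPILOGUE = ["pop rbx", "pop rbp", "ret"]


def emit_subroutine(name, body):
    """Return toy assembly lines: label, prologue, body, epilogue."""
    return [f"{name}:"] + PROLOGUE + list(body) + EPILOGUE


def emit_program(subroutines):
    """Concatenate the toy assembly for every subroutine in order."""
    lines = []
    for name, body in subroutines.items():
        lines.extend(emit_subroutine(name, body))
    return lines


def repeated_lines(lines):
    """Map each line that occurs more than once to its count."""
    counts = Counter(lines)
    return {line: n for line, n in counts.items() if n > 1}


# Three made-up subroutines; only their bodies differ.
SOURCE = {
    "add_one": ["mov rax, rdi", "add rax, 1"],
    "double": ["mov rax, rdi", "add rax, rax"],
    "negate": ["mov rax, rdi", "neg rax"],
}

program = emit_program(SOURCE)
repeats = repeated_lines(program)

for line, n in sorted(repeats.items()):
    print(f"{n} x {line}")
```

None of the repeated lines appear in the "source" dictionary except the shared first body instruction; the rest come entirely from the boilerplate wrapped around each subroutine.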

You may also know that most software developers these days work with object-oriented languages where inheritance and polymorphism are used to develop hierarchies of classes. At the source code level inheritance enables developers to reuse source code without retyping it. However, when source code is compiled into binary form the result is a massive amount of repetition, but of a more sophisticated nature than that of just pushing and popping registers. These kinds of repetitive sequences would probably be classified as LINEs.

You could say that repetition is a sign of information. Aflkjkjnejiadudfmoqe. The characters preceding this sentence were randomly typed and contain no information. This sentence repeats character sequences that have been used before and you can recognize them. The repetitive nature of DNA implies information content – not the other way around.
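One rough way to put a number on "sequences that have been used before" is to check how many of a string's three-character runs already occurred in earlier text. The sketch below does that in Python. The function names `ngrams` and `seen_before_fraction` are made up for this illustration, the choice of three characters is arbitrary, and the samples are far too small to prove anything; treat it as a crude proxy for recognizability, not a measure of information.

```python
def ngrams(text, n=3):
    """Return every run of n consecutive characters in the lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def seen_before_fraction(text, earlier, n=3):
    """Fraction of text's n-grams that also occur somewhere in earlier."""
    known = set(ngrams(earlier, n))
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return sum(g in known for g in grams) / len(grams)


# Earlier text: the sentence that opens the paragraph above.
EARLIER = "You could say that repetition is a sign of information."
RANDOM = "Aflkjkjnejiadudfmoqe"
SENTENCE = ("This sentence repeats character sequences that have been "
            "used before and you can recognize them.")

print("random  :", seen_before_fraction(RANDOM, EARLIER))
print("sentence:", seen_before_fraction(SENTENCE, EARLIER))
```

With these particular strings, none of the randomly typed three-character runs occur in the earlier sentence, while the English sentence shares several with it (for example the runs in "that").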

13 Comments:

At 10:49 PM, March 07, 2008, Anonymous Anonymous said...

If there is one thing I'd like you to take from all the thinking you've been doing about junk DNA, it is this: The Genome is not a computer program or an electronic circuit.

The genome is a set of instructions for making proteins. It's the proteins and other "gene products" that actually go about making and running bodies. Why would you need 1 million copies of the same ~300 base sequence if the whole genome is in every cell and available to be used at any time?

I'm no programmer, but if I write a Python script that I know I will use repeatedly then I'll define it up front so I can call on it any time I want it.

 
At 10:15 PM, March 08, 2008, Blogger Randy Stimpson said...

Let me put your mind at ease ... I know that a genome is not a computer program. Anyone who suggests I think otherwise is constructing a strawman argument.

I could speculate that you think nucleotide sequences are Python scripts. You have just argued that since repetition doesn't make sense for Python scripts it doesn't make sense for nucleotide sequences either. Therefore nucleotide sequence repeats must be junk DNA.

I could also speculate that you think there is only one genome. You did say "The Genome". Instead I'll assume that you didn't proofread your comment very well.

 
At 7:02 PM, March 09, 2008, Anonymous Anonymous said...

Hmm,

I think what I argued was that because a gene is available to be used millions of times in each cell of the body there is little need for each cell to carry a million copies of a particular gene. The gene that I'm talking about, Alu, has almost certainly got its wide distribution by piggybacking on other genes in the genome. The Python bit was just an example since you're so keen on using programming and electronics as a metaphor for the way biology works (an astoundingly poor metaphor in this biologist's opinion...)

I guess I was using "The Genome" in the slightly tongue-in-cheek Aristotelian sense; some of your other posts have suggested you are mainly concerned with a near junkless human genome. I don't know why humans should be more junkless than onions but that's up to you.

 
At 11:35 AM, March 10, 2008, Blogger Randy Stimpson said...

I wasn't aware that you are a biologist. Does that mean you are a biology major in college or did you graduate with a degree in biology and are now working in a profession that uses your educational background in biology?

By the way, an Alu sequence is not a gene.

 
At 4:56 PM, March 10, 2008, Anonymous Anonymous said...

Randy,

I'm a PhD student in evolutionary biology.

Ask three biologists and you'll get three definitions of gene. If we mean a distinct DNA sequence that does something (and getting copied must be something) then Alu is a gene.

 
At 8:28 PM, March 10, 2008, Blogger Randy Stimpson said...

That's interesting in a lot of ways. It would be cool if you had a blogger profile even if you don't have a blog so that others reading your comments can know something about you. You can make one here. I myself am wondering what university you are attending and how far you are in your Ph.D. program.

By your definition of a gene it would appear that all DNA sequences are genes. Alu sequences are the most common type of transposon. Transposons were once called "jumping genes" but I don't think that is considered accurate phraseology. What I have just said is a bit off the topic of this blog entry so we will just have to agree to disagree about whether Alu is considered a gene.

 
At 7:16 AM, June 24, 2008, Blogger Jens Hegg said...

I think what is missing from your argument is that software doesn't change. The bits and bytes stay exactly as they were compiled. DNA on the other hand has been proven to change through mutation, cell division accidents, transposition, etc...

Yes, there are long strings of repeated info in software, but just because they appear to be similar in their repetitiveness to DNA does not imply causation or design. We know why software has repetition; we also understand a possible mechanism for the repetition of junk DNA.

Proposing a designer based on a pattern would be fine if there were no alternate explanation, assuming you then tested it as a hypothesis. What you are saying is basically, "We have a pattern, somebody must have made it since we don't fully understand it" and then leaving it at that and incorporating that into your beliefs.

Think of ripples seen in rock. There are several kinds. In one whole layers can look wavy, often in a very even, repetitive way and even on small scales within and between layers. In another, ancient seas created sand ripples that were preserved in the rock on a small scale and between other layers.

Just as in your analogy there are two similar patterns. In the case of geology one would never state that layer ripples were caused by waves. There are differences between the two that point to a different mechanism. One mechanism is extant and the other is too slow to wrap your mind around easily. There are big differences between software and DNA as well which more than likely point to a different cause for each.

Discounting the difficult one, just because it isn't easy to wrap your head around, is not the best way out of a problem. There are viable theories based on known genetic processes that can create junk DNA. Discounting these theories without testing and evidence simply because you see a similar pattern in an unrelated field doesn't cut the mustard. You might as well be saying that ancient oceans caused all the rock ripples we can see around us.

 
At 11:04 PM, June 24, 2008, Blogger Randy Stimpson said...

Hi Jens,

You are wrong about software staying the same once it is compiled. Once a program is loaded into computer memory and executed it becomes dynamic and all kinds of things can happen to it. But that's irrelevant because I am not arguing that DNA is exactly like software. However, I have read arguments by people with only a superficial understanding of software who have asserted that since repetitive sequences don't make sense in software they must be junk if found in DNA. That argument begins with a misunderstanding about the nature of software and the intent of this blog entry is to refute that false assumption.

Also, I am not arguing that "We have a pattern, somebody must have made it since we don't fully understand it." On the contrary, I think that some have argued, "We don't understand how DNA that has all these repetitive sequences could possibly have a function -- it must be junk," and I think that is wrong.

You are obviously very educated (I read your profile) so you should have no problem answering this question. If you think that DNA is around 97% junk what is the most compelling reason for that conclusion?

 
At 8:32 AM, June 25, 2008, Blogger Doppelganger said...

Hi Randy,

I never told you how much of a genome I think is 'junk' because I do not know. I did, if I remember correctly, link to the blog of Larry Moran who was doing a series of posts on what is really 'junk' and his last tally was on the order of 50%.

Consider the genome of A. proteus. It contains about 290 billion bp, or about 96 times as much as we do.

Do you think that A. proteus (a type of amoeba) is 96 times more complex than a human, or that its genome has lots of redundancy and junk in it?

 
At 6:52 PM, June 25, 2008, Blogger Randy Stimpson said...

Hi Scott,

I am not surprised that the size of genomes is not proportional to their host’s apparent complexity. I am a little surprised that an amoeba genome is 96 times the size of a human genome; however, I wouldn’t jump to the conclusion that a lot of it must be junk. That reasoning certainly wouldn’t apply to the world of software.

When designing software programs tradeoffs often have to be made between execution speed and space. A very fast algorithm that does the same job as a slower one may require much more space. In some cases this could take the form of redundant objects doing the same task in parallel. I wouldn’t classify these redundant objects as junk. If such redundant objects exist in a genome, because entropy tends to increase, you would expect some of them to become defective; and if not lethal they would then be junk.
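As a minimal sketch of the speed-versus-space tradeoff, here are two Python functions that both count the 1-bits in a non-negative integer. The names `popcount_slow`, `popcount_fast`, and `TABLE` are made up for this example. The second version spends extra memory on a precomputed 256-entry table so that it can handle eight bits per step instead of one; whether that is actually faster on a given machine is an assumption that would need to be measured.

```python
def popcount_slow(x):
    """Count 1-bits one bit at a time; needs no extra storage."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


# Extra space: the bit count of every possible byte, computed once.
TABLE = [popcount_slow(i) for i in range(256)]


def popcount_fast(x):
    """Count 1-bits one byte at a time using the precomputed table."""
    count = 0
    while x:
        count += TABLE[x & 0xFF]
        x >>= 8
    return count


print(popcount_slow(0b1011), popcount_fast(0b1011))
```

The table is redundant in the sense that everything in it could be recomputed on demand, yet removing it would not be removing junk; it would be trading space back for time.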

Larry Moran counts LINEs and SINEs as junk.

 
At 9:10 AM, June 29, 2008, Blogger Doppelganger said...

Hi Randy,

"That reasoning certainly wouldn’t apply to the world of software."


Yet I thought you acknowledged that genomes are not just like software?

You write that when designing software you would do this and that, and note that you are surprised at the size of the A. proteus genome.

Had you considered that a genome may not be the product of design in the first place and so software developing considerations might be irrelevant?

 
At 3:07 PM, November 05, 2008, Anonymous Anonymous said...

Original:
The question running through my mind is why are there six teeth in this sculptured replica instead of three? I know it's a small thing but why did they add more teeth? And why did they add more bones (not such a small thing)? If they want to connect the teeth to the skullcap I think it would have been more informative to have the replicated bones a slightly different color from the fabricated bones and to say so. Instead they present a considerable amount of fabrication as actual.

Reply:
There were more fossils of the Java Man that were found after 1891, so the additional replicas probably included those subsequent discoveries. In order to reassemble a more complete organism, they don't use only the bones that belong to that one individual. It is rare to find a fossil of such an ancient creature with a lot of bones so they'll use "Bob's", "Bill's", and maybe even "Sue's" bones to reconstruct just one.

Original:
So then the question arises: how much of the museum is fabricated and how much is actual? I am not a paleontologist but I noticed quite a bit more fabrication -- like the Peking Man fossil replica which was right next to Java Man. There was no mention of the history of Peking Man fossils. (If you click the link to the Wikipedia article on Peking Man note that they show a picture of a complete skull fabricated from a skullcap which was allegedly lost at sea). I was beginning to wonder if I would find the fraudulent Piltdown Man on display but it wasn't there.

Reply:
So that non-scientist adults and children can learn and understand evolution better, models of what earlier "Homo" species are thought to have looked like are often reconstructed in plaster, based on fossil, anthropological, and medical knowledge. These models are meant to illustrate the way we think these organisms may have looked. It is not a fabrication; it is conjecture, since such conclusions are extrapolated from physical evidence instead of being created entirely from one's imagination. It is also unfairly biased to call the picture of a plaster skull that was reconstructed from a fossil lost at sea a fabrication. A fabrication is a mythical centaur or dragon. Clearly, the deceased that once occupied that skull existed, since we found the fossilized skull later.

Original:
I find it hypocritical that evolutionists refer to the theory of intelligent design as pseudoscience. Maybe it is -- but no more so than the theory of evolution.

Reply:
Actually, intelligent design is not science. It is philosophy. So is Creationism. Evolution is science. It is based on research that is peer reviewed.

Original:
You are obviously very educated (I read your profile) so you should have no problem answering this question. If you think that DNA is around 97% junk what is the most compelling reason for that conclusion?

Reply:
I think you are wrong about the 97%. LINEs and SINEs are a large proportion of our DNA; however, it isn't 97%.

The most commonly accepted theory, at least as of the mid-90s is that they protect us from mutations, replication errors, and cancer (since many forms of cancer are caused by the carcinogen disrupting gene sequences rendering them nonsensical). In the language of geneticists, "junk" sequences are simply sequences that are not known to encode any gene.

Original:
By the way, an Alu sequence is not a gene.

Reply:
We often use generic terms even when they don't match the technical definition. Don't you ever refer to a computer as multitasking? Haven't you ever called a function a subroutine when trying to explain something? Haven't you ever called a track pad a mouse? David calling a transposon a gene is the equivalent of a botanist calling a tomato a vegetable--it's insignificant nit-picking to make a person look stupid.

Original:
You are wrong about software staying the same once it is compiled.

Reply:
Although code can become corrupt over time, the mechanism by which this happens isn't the same as the way mutations arise. A defensive adaptation that would evolve to minimize mutations, consequently, wouldn't be comparable to anything that we humans can do to keep our code from getting corrupted.

Original:
However, I have read arguments by people with only a superficial understanding of software who have asserted that since repetitive sequences don't make sense in software they must be junk if found in DNA.

Reply:
OK, first: people who say that repetitive sequences do not make sense in software are wrong, aren't they? What we see when we look at source code is completely different from the compiled code, right? The computer reads a different language than we do. One example that comes to mind is when you look at a COBOL program (or some other business application language). Each time you include or copy a file into your source code, that entire file is "inserted" into your code as if you had typed it yourself. There are only a finite number of i/o commands. If you are reading two input files, you are coding two read subroutines that do almost the same thing, typed with almost the same sequence of characters (the difference being the actual name of the file being read). So this argument holds no water, right?

Second, DNA and computers are very different (like, no duh, you aren't an idiot), so if they are two different things, an inorganic, man-made invention versus an organic, carbon-based organism, why would this comparison be valid anyway? For argument's sake, let's say that there is definitely a sensible reason for software to have repetitive sequences. Given that organisms and machines operate completely differently beyond the atomic level, why would it follow that genetic nonsensical sequences have no purpose either?

 
At 8:25 PM, November 10, 2008, Blogger Randy Stimpson said...

Hi Socraticsuicide,

Thanks for your comments. However, it looks like you are commenting on more than one blog entry. It would be nice if you could break up your post and put the comments on the blog entries they relate to. That way you will also get multiple links back to your blog (which is great) instead of just one.

 
