How many of your ancestors are you related to?

David Aldous asked me that question over lunch one day. I didn’t have an answer, so he explained: In the simplest model of human genetics, you get half your genes from each parent, a fourth from each grandparent, and so on. Thus the fraction of your genes contributed by each member of the nth generation is 1/2^n. But there must be some value of n for which 2^n exceeds the total number of genes in the human genome. Suppose you have 50,000 genes. Well, 16 generations ago you had 2^16 = 65,536 ancestors, so roughly 15,000 of those family members were left out of the lottery. They’re your ancestors, but you inherited no genes from them.
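The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch of the simplest model (every ancestor distinct, each gene handed down as an indivisible unit); the function name is made up for illustration, and the 50,000 figure is the supposition from the text rather than a measured gene count.

```python
def first_generation_exceeding(n_units):
    """Smallest n for which 2**n ancestors outnumber n_units heritable units."""
    n = 0
    while 2 ** n <= n_units:
        n += 1
    return n


n_genes = 50_000  # supposed gene count from the text
n = first_generation_exceeding(n_genes)
ancestors = 2 ** n
# Generation, ancestors in that generation, ancestors who cannot all get a gene
print(n, ancestors, ancestors - n_genes)  # prints: 16 65536 15536
```

The last number is a lower bound on the ancestors left out, since it assumes every gene traces back to a different person.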

There’s also a value of n for which 2^n is greater than the entire human population, so if you look back far enough, you have more ancestors than there were people on the planet. This has got to be a sign of something awry in the model; these calculations are not to be taken as a quantitatively accurate guide to the human family tree. Nevertheless, the idea of counting genes and counting ancestors is basically sound.

These issues have come to the fore lately with news coverage of the discovery that Barack Obama is a distant cousin of Dick Cheney. According to Lynne Cheney, both are descended from Mareen Duvall, a 17th-century Huguenot. In today’s New York Times, Nicholas Wade comments on the significance (or otherwise) of this genealogical connection:

Mr. Obama probably inherited a minute fraction — one divided by two to the 11th power — of Mareen Duvall’s genome, which would amount to less than one gene, assuming the Y chromosome was not inherited.

Alas, though the concept is right, the numbers don’t quite add up. Two to the 11th power is only 2,048, and we surely have more genes than that. Under simple assumptions of random assortment, the expected number of genes passed down to the eleventh generation would be ten or so.
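As a rough check on "ten or so", here is a small sketch of the expected-value calculation under random assortment: each gene independently has a 1-in-2^n chance of tracing back to a given nth-generation ancestor. The function name is hypothetical, and the gene counts are assumptions; 50,000 is the supposition used earlier, while 20,000 is an illustrative lower figure that happens to reproduce the estimate above.

```python
def expected_genes_from_ancestor(n_genes, generations):
    """Expected number of genes inherited from one ancestor `generations` back."""
    return n_genes / 2 ** generations


# Assumed gene counts, for illustration only
for n_genes in (20_000, 50_000):
    print(n_genes, round(expected_genes_from_ancestor(n_genes, 11), 1))
```

Either way the expectation is well above Wade's "less than one gene".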

Wade correctly notes that the candidate and the vice president are very unlikely to have inherited any of the same genes from their common ancestor. Not that I would change my vote just because they had a few snippets of DNA in common.

9 Responses to How many of your ancestors are you related to?

  1. randomwalker says:

    A better way to play the numbers to bring out the absurdity of the issue is that you have 2048 ancestors at the 11th generation, each of whom has 2048 descendants on average, so assuming relatively little inbreeding and emigration in this population sample, you are related to 4 million other Americans. If you’re a particularly prolific individual, chances are you’ve slept with someone you’re related to. Eww.

    The popular perception of genetics is complete bullshit anyway; genes matter in our makeup far less than most people think they do.

  2. Barry Cipra says:

    “Suppose you have 50,000 genes. Well, 16 generations ago you had 2^16 = 65,536 ancestors, so roughly 15,000 of those family members were left out of the lottery.”

    I’d say “at least” instead of “roughly.” Modulo all the usual caveats about in-breeding, X-chromosomes, etc., a good way to look at how many ancestors passed genes along to you is by working backwards, progressively (or is it regressively?) divvying up your genes between your parents, grandparents, great-grandparents, and so on, but keeping in mind there is a bit of a crapshoot at each step. That is, while it’s true (by assumption) that *exactly* half your genes come from your mother and half from your father, it’s not true that *exactly* a quarter come from each grandparent. The quarter each is only approximate, with a bit of binomial drift. Let me explain by simplifying the genome to consist of exactly 8 genes, which we want to trace back three generations.

    By assumption, your 8 genes come 4 from your father and 4 from your mother. But to go the next generation back, you need to consider the genes your parents *didn’t* pass along to you from their 8-gene genome. Of the 70 (=8-choose-4) ways each of your parents can divvy his/her genome, only 36 ways give 2 of your genes to each grandparent. The other 34 ways are a 3-1 split or (rarely) a 4-0 split. So it’s reasonably likely that your 8 genes come in something like a (2,2,3,1) split among your grandparents. The 1 is significant because it means that, in this scenario, at least one great-grandparent gets left out of the “you” lottery. It’s also reasonably likely that at least one of the 2’s will fail to split 1-1 at the great-grandparent level, so you may only be related to 6 of your 8 great-grandparents. Thus your 8-gene family tree may look something like this (pardon the crude typesetting):

                  8                 you
                /   \
             /         \
          4               4         parents
        /   \           /   \
      2       2       3       1     grandparents
     / \     / \     / \     / \
    1   1   2   0   1   2   1   0   great-grandparents

    A more careful analysis and/or simulation, which I don’t have time to do correctly, could give a precise answer for the expected number of ancestors who “exherit” your genes. In the 8-gene case, my guess is it’ll be between 6 and 7 when going back 3 generations. For 50,000 genes going back 16 generations, I would expect the expected number to be at least a few thousand shy of 50,000. That’s just a semi-educated guess though.
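The 8-gene case is small enough to work out exactly rather than simulate. The sketch below is one reading of the model in this comment, not necessarily the analysis its author had in mind: every person carries 8 genes, 4 from each of their own parents, and the k genes you hold from a given ancestor are a uniformly random k-subset of that ancestor's 8, so the split between that ancestor's parents is hypergeometric. All names are made up for illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

GENES_PER_PERSON = 8          # toy genome size from the comment
HALF = GENES_PER_PERSON // 2  # genes each person received from each parent


@lru_cache(maxsize=None)
def expected_contributors(k, generations_left):
    """Expected number of ancestors, generations_left levels further back,
    who supplied at least one of the k genes you hold from this ancestor."""
    if k == 0:
        return Fraction(0)
    if generations_left == 0:
        return Fraction(1)
    ways = comb(GENES_PER_PERSON, k)
    total = Fraction(0)
    for a in range(k + 1):
        # a of the k genes came via one parent of this ancestor, k - a via the other
        p = Fraction(comb(HALF, a) * comb(HALF, k - a), ways)
        total += p * (expected_contributors(a, generations_left - 1)
                      + expected_contributors(k - a, generations_left - 1))
    return total


def expected_great_grandparents():
    # The split between your two parents is exactly 4/4 by assumption.
    return 2 * expected_contributors(HALF, 2)


result = expected_great_grandparents()
print(result, float(result))
```

Under these assumptions the expectation comes out at roughly 5.96 of the 8 great-grandparents, a shade below the guessed range of 6 to 7. Raising `GENES_PER_PERSON` and the generation count extends the same recursion, though the 50,000-gene, 16-generation case would need a more efficient formulation.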

  3. Lord says:

    If related means having a gene in common, I would assume we are all related since so many genes deal with basic cellular processes and would be highly conserved, and we would be related not only to humans but many other species as well. Not related would seem to be the remarkable exception, meaning genetic drift and mutation have made every gene different. Nor is related necessarily limited to genes or epigenetics, but could also embrace experience, customs, and other transmissals.

  4. Isabel says:

    I have no real comment on this post, because I’m not familiar enough with genetics, but it seems that if you want to calculate the probability that two people share a gene, you have to take into account the probability of “crossing over” on a chromosome. So the probability that two people share at least one gene (in the sense that they share the particular ancestor they inherited them from) is greater than the chance they’d share a gene if everybody had 23 genes (the number of chromosomes) but less than the probability that they’d share a gene if all genes were independent.

    But the main reason I wrote this comment was just to point out that your spam screening requires a certain fairly high level of intelligence; it’s not only spam-screening but uninformed-person-screening.

  5. brian says:

    @Isabel:

    Crossing over certainly complicates the situation, but even if you take it to the limit and suppose that all three billion base pairs in the human genome are inherited independently, you can still go only about 31 generations back before your ancestors outnumber your base pairs.

    As for the spam screening: Unfortunately, it hasn’t been quite stringent enough to stop all the spammers.

  6. Rob says:

    It would seem we inherit genes from all of our ancestors, just that the fragments get twice as small for each generation back. There is always that chance, however slight, of a trait such as eye color showing up in a newborn that has not been seen for several generations.

  7. I am interested in the math, but I also had to say that I am related to the Aldous family. I wonder if it is the same one as your friend’s?

  8. Mike Williamson says:

    Taking the case as posed, this is mathematically not too difficult to calculate. I am going on one important assumption folks here have implied:

    ** The question is not how many of the genes are SHARED between you & an ancestor, but how many you ACTUALLY INHERITED. ** The difference might seem subtle, but we cannot assume that our parents have entirely complementary gene sets.

    OK, moving from that point, it is a probability issue:

    chance to pass on a gene: 50%

    # of genes: 50,000 (just taken from story, I don’t know)

    “reasonable chance” you have no genes from an ancestor: 50% (I am choosing this for simplicity, after seeing the calculation it should make sense)

    solve: “number of generations till reasonable chance no genes from ancestor”: call it “x”

    ——————

    “chance a gene does NOT make it through all ancestors”: (1 - [chance to pass on gene]^[number of generations]) = (1 - 0.5^x)

    “chance NO GENES make it through all ancestors”: ([chance a gene does not make it]^[number of genes]) = (1 - 0.5^x)^50,000, so the chance that AT LEAST ONE gene makes it is (1 - (1 - 0.5^x)^50,000)

    Solving for x at the 50% level gives ~ 16 generations. So it takes ~ 16 generations before there is an even chance that you inherited NOT EVEN ONE GENE from a given ancestor.
    Note that it is POSSIBLE for this to occur much sooner. E.g., if you look back “only” 13 generations, the chance is about 0.2%.

    Mike

  9. Mike Williamson says:

    Sorry, just realized that I forgot to mention:

    for the last equation:

    (1 - (1 - 0.5^x)^50,000) = 0.5 (I didn’t put this 0.5 on the right side of the equation). That 0.5 is the “reasonable chance” that I mention. If there is a 50/50 chance you have no genes from your ancestor, then that is a “reasonable chance”, I figure.
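The figures in the last two comments can be reproduced numerically. This is a sketch under the same assumptions stated there (50,000 genes, each passed on independently with probability 1/2 per generation); the function names are invented for illustration. It treats (1 - 0.5^x)^50,000 as the chance that no gene from a given ancestor reaches you, which is the reading consistent with the 0.2% figure quoted for 13 generations.

```python
import math


def prob_no_genes(generations, n_genes=50_000):
    """Chance that none of an ancestor's genes survives `generations` hand-offs."""
    return (1 - 0.5 ** generations) ** n_genes


def generations_for_probability(target=0.5, n_genes=50_000):
    """Solve (1 - 0.5**x) ** n_genes == target for x."""
    # 1 - target**(1/n_genes), computed with expm1 to limit rounding error
    per_gene_survival = -math.expm1(math.log(target) / n_genes)
    return -math.log2(per_gene_survival)


print(round(generations_for_probability(), 2))  # a little over 16 generations
print(round(prob_no_genes(13), 4))              # about 0.002, i.e. 0.2%
```

At the 50% threshold it makes no difference whether the equation is written for "no genes" or for "at least one gene", since both probabilities equal one half there.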