More than 7,000 Americans named John Smith have gone missing.
Smith is unchallenged as the most numerous surname in the U.S., some 28 percent ahead of second-place Johnson, according to the U.S. Census Bureau and WhitePages.com. And, based on the most recent available data from these sources, John heads the list of the most frequent first names. And yet, John Smith doesn’t even rank in the top 10 combinations of first and last name in the country. What happened? Where did all the John Smiths go?
The nation’s most frequent surname combines with the most popular first name at a rate some 24 percent below what it would be if the top first names were handed out in the same proportions in all families regardless of last name. Smith families name their sons John less often than other families.
There are 400 possible combinations of the country’s 20 most frequent first names and surnames. According to WhitePages.com, 3,153,792 Americans have both their first and last names from these lists.
Of this group, 8.62 percent are named John, and 12.15 percent are named Smith, so it might be expected that the product of these two percentages (namely 1.04733 percent, or 33,031 people) would be named John Smith. But the real number of John Smiths, 25,255, falls short by 7,776: the “lost” John Smiths. Using this mathematical maneuver, which we can call the “percentage-product formula,” or PPF, each combination of first and last name can be assigned a percentage by which its occurrence in real life exceeds or falls short of its idealized frequency (for more about the data, the method, and other names, see “John Smith et al.”). John Smith has a PPF rating of negative 23.5 percent.
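The arithmetic behind the PPF can be reproduced in a few lines. What follows is only a minimal sketch, assuming the rounded percentages quoted above; the function names expected_count and ppf_rating are hypothetical labels chosen for illustration, not anything published with the data, and results computed from unrounded shares could differ slightly.

```python
def expected_count(first_share: float, last_share: float, total: int) -> float:
    """Idealized head count if first names were spread evenly across surnames."""
    return first_share * last_share * total


def ppf_rating(actual: float, expected: float) -> float:
    """Percentage by which the actual count exceeds (+) or falls short of (-) the expected count."""
    return (actual - expected) / expected * 100


# Figures quoted in the article (shares are rounded, so results are approximate).
TOTAL_PEOPLE = 3_153_792   # people with both names drawn from the top-20 lists
JOHN_SHARE = 0.0862        # 8.62 percent of the group are named John
SMITH_SHARE = 0.1215       # 12.15 percent of the group are named Smith
ACTUAL_JOHN_SMITHS = 25_255

expected = expected_count(JOHN_SHARE, SMITH_SHARE, TOTAL_PEOPLE)
shortfall = expected - ACTUAL_JOHN_SMITHS
rating = ppf_rating(ACTUAL_JOHN_SMITHS, expected)

print(f"expected John Smiths: {round(expected):,}")   # 33,031
print(f"'lost' John Smiths:   {round(shortfall):,}")  # 7,776
print(f"PPF rating:           {rating:.1f} percent")  # -23.5 percent
```

The same two functions would apply to any other pairing, given that pairing's two shares and its actual count.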
Without a survey of parents named Smith, we can only speculate about what factors might be responsible for the shortage of Johns in their families. There are no notorious John Smiths in history who might cause parents to avoid the name, as they now avoid Adolf, for example. Creative baby-naming is a well-known feature of African-American culture, and it is worth investigating whether this practice has had a negative impact on the number of John Smiths in the U.S. However, only 22 percent of Smiths identified themselves as black in the U.S. census of 2000, while other common surnames showed much higher frequencies (blacks make up 53 percent of people with the surname Jackson, 47 percent of those named Williams, and 38 percent of those named Thomas). More importantly, Smith and these other common surnames show positive PPF values with other common first names (James and Charles, for example), so there’s evidently no general deficit of conventional first names in combination with surnames that are frequent among African-Americans. So it’s reasonable to conclude that no significant amount of the John Smith shortage can be ascribed to the use of unconventional first names by African-Americans.
But there are two more promising theories. First is the cultural status of John Smith as a “placeholder name.” John and Smith together form a name often used to refer to an archetypal “everyman.” (Another example, of course, is John Doe.) As such, John Smith’s restricted use in the real world may be the result, in part, of parents’ wish to avoid this implication of facelessness. Other combinations of common first and last names are not subject to this restriction: John Martin, James Smith, and Mary Williams are overrepresented by 13 to 20 percent.
The reputation of John Smith as a hastily chosen alias or, at best, an unimaginative name is illustrated by its two entries in the online Urban Dictionary:
 [P]seudonym most used to “shake” the FBI
FBI Agent: Sir, are you Crackhead Pete?
Crackhead Pete: Uh… no… my name is uh… John Smith.
 An English name… contains two clichés.
guy #1: I’m John Smith.
guy #2: Can’t your mom think of a better name than that?
Meanwhile, Thesaurus.com lists the following synonyms for John Smith:
average joe, average person, common man, everyman, jane doe, joe blow, joe doakes, joe sixpack, john q. public, man in the street, mr[.] brown, mr. nobody, ordinary joe, richard roe.
As a probable secondary check on the number of John Smiths, there is reason to believe that the ethnic origins of family names—not surprisingly—exert some influence on the choice of first names. The proportion of Smiths named Patrick, for example, is only 1.82 per 1,000; but among Murphys, Patrick comes in at 11.62 per 1,000. My attention was first drawn to this ethnic factor by an uncanny shortage of Josephs in combination with each of the top 20 surnames. Joseph is the 12th-most-popular name for individuals in the U.S., but the PPF rating for Joseph in combination with each of the top 20 surnames is negative (with the slight exception of Joseph Martin, a full name whose PPF rating musters a tiny one-tenth of a percent on the positive side).
Where have all the Josephs gone, and what can they tell us about the lost John Smiths? To begin to answer this question, we have to look at the ethnicities of the top 20 surnames. Garcia, Rodriguez, and Martinez, of course, are Spanish. Of the remaining 17 names, almost all are said (by Wikipedia) to have their origins either wholly or partly in England. Smith is described as “originating in England”; Johnson, as an “English, Scottish and Irish name of Norman origin”; Williams “originated in medieval England”; and so on. Only Davis, “originating in Wales,” is not attributed to England.
When we look at the combinations of the top 20 first names with surnames of Irish, Italian, Polish, German, Scandinavian, Scottish, and Jewish origins, we get a very different picture. Now Joseph surges ahead with huge positive PPF values in combination with Italian surnames such as Russo (340 percent) and Esposito (339 percent), or Polish ones such as Kowalski (113 percent) and Kaminski (109 percent). Positive, though more modest, figures also emerge for Joseph with Irish surnames (Kelly 26 percent, Sullivan 24 percent). These positive numbers help to explain Joseph’s overall popularity in spite of its poor showing with English surnames.
In looking at the nationalities of the surnames with which Joseph is most favored—Irish, Italian, and Polish—it can’t go unnoticed that what they have in common is the strong presence of Roman Catholicism. The obvious implication is that Americans with those surnames are often inspired to name their sons after one of the most popular saints of that church.
So how does all this bear on John Smith? Somewhat like Joseph—although to a lesser degree—John has neutral-to-negative PPF ratings not only with Smith but with most of the popular English surnames, and neutral to strongly negative ones with German, Scandinavian, Scottish, and Jewish names. The surnames with which John has the strongest positive affinity are—like those for Joseph—the Irish, Italian, and Polish ones: the stereotypically Catholic surnames.
The ethno-religious factor has probably contributed a little to the underuse of the name John Smith, simply because Smith is not a predominantly Catholic surname (notwithstanding Al Smith, who in 1928 became the first Roman Catholic nominated for U.S. president by a major party). But the lion’s share of the John Smith shortage must be chalked up to the name’s use as a designation for a nonentity.
As surnames are inherited, Smith will continue to be the most frequent surname in the U.S. for the foreseeable future. But as first names increasingly change with the whims of fashion, John is facing serious competition from Michael and James for first place in the present population. And to judge from the names given to newborn boys in the 21st century, the popularity of the name John will soon be a thing of the past. On the list of popular male baby names for 2012 maintained by the U.S. Social Security Administration, John didn’t even make its way into the top 20. The list is headed by Jacob, Mason, Ethan, and Noah. All indications are that more John Smiths will disappear in the future.