Tag Archives: baby names


Picking a Baby Name That’s Uniquely Popular

We’re letting the whole Internet vote on the name for our baby, so it might seem a little strange that I’m posting about picking a name that’s unique. Why bother with a poll if you don’t want a popular name? What does uniquely popular mean, anyway?

Looking at the data right now, Alexander is comfortably in the lead. But Alexander is also a very popular name in general – it was the 9th most popular name for boys in 2012 according to the U.S. SSA. So, I’d like to see if some names are getting more votes in our poll than you would expect from general popularity alone.

I did a similar analysis when we did this for my daughter five years ago. Back then I used SPSS and a little more statistical rigor, but I haven’t had a chance to play around with R to do something similar yet. For now I’ll stick to what I can do in Google Spreadsheets.

Here’s a plot of each name, showing the number of votes in our poll vs the number of babies with that name in 2012:

So what does this tell us? First off, the names seem to line up more or less on a line going up and to the right. This suggests there’s a positive correlation between votes in our poll and popularity in the U.S. Google Spreadsheets has a function called CORREL() that gives you the correlation coefficient. Right now it is 0.69, which is a fairly strong positive correlation.
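If you'd rather check the number outside a spreadsheet, here's a minimal Python sketch of the Pearson correlation coefficient, which is the statistic CORREL() reports. The function name `pearson_r` and the sample numbers are made up for illustration; they are not the real poll or SSA counts.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers for illustration only (NOT the real poll or SSA data).
poll_votes = [10, 20, 30, 40]
us_babies = [100, 250, 300, 450]
print(round(pearson_r(poll_votes, us_babies), 2))
```

A value near 1 means the two columns rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.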

Second, if we guesstimate where we would have to put a straight line to best fit these points, we can see which names are above the line – right now, it’s Nikola, Luka, and Finn, with Soren maybe just peeking over the top. If we want to pick names that are uniquely popular in our poll, those are good choices.
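Instead of eyeballing the line, one option is to fit it with ordinary least squares and list the names that land above it. The sketch below is only an illustration of that idea, not the method used for the chart: it assumes the fit is done against the log of the U.S. count (to match the log-scale axis), and the names, counts, and function names (`fit_line`, `names_above_line`) are all placeholders.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def names_above_line(votes_by_name, babies_by_name):
    """Return (name, residual) pairs for names with more votes than the
    fitted line predicts, largest residual first."""
    names = sorted(votes_by_name)
    # Assumption: fit votes against log10 of the U.S. count.
    xs = [math.log10(babies_by_name[name]) for name in names]
    ys = [votes_by_name[name] for name in names]
    slope, intercept = fit_line(xs, ys)
    above = []
    for name, x, y in zip(names, xs, ys):
        residual = y - (slope * x + intercept)  # actual votes minus predicted
        if residual > 0:
            above.append((name, residual))
    return sorted(above, key=lambda pair: pair[1], reverse=True)


# Placeholder names and made-up counts (NOT the real poll or SSA data).
votes = {"name_a": 50, "name_b": 30, "name_c": 45, "name_d": 10}
babies = {"name_a": 10000, "name_b": 1000, "name_c": 100, "name_d": 10}
for name, residual in names_above_line(votes, babies):
    print(name, round(residual, 1))
```

The residual is the gap between the votes a name actually got and the votes the line predicts, so the names with the biggest positive residuals are the "uniquely popular" ones.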

I’ve plotted the U.S. Babies in 2012 totals on a log scale for two reasons – first, it’s much easier to read this way, and second, it doesn’t look like baby names are distributed very evenly:

This is from U.S. SSA data again. In this chart you can see that a very small number of the most popular names (at the left side of the graph) are given to a very large number of babies. Looking toward the right of the graph, there’s a very long tail of many names given to much smaller numbers of babies.

This looks like it might be a Zipf distribution, which is a pretty common distribution for data like word counts and website popularity. If we shift that graph to a log scale it starts to look more like a straight line.
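A rough way to test the Zipf hunch is to rank the names by count, take the log of both rank and count, and fit a straight line. Under Zipf's law the count is roughly proportional to 1/rank^s, so the log-log slope comes out near -s (around -1 in the classic case). This is a minimal sketch with synthetic data, not the SSA file, and `loglog_slope` is a hypothetical helper name.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log10(count) against log10(rank), rank starting at 1."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log10(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log10(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic counts built to follow count = 1000 / rank exactly
# (illustration only, NOT real baby-name data).
synthetic = [1000 / rank for rank in range(1, 51)]
print(round(loglog_slope(synthetic), 2))  # close to -1 for this synthetic data
```

A straight-looking log-log plot is only a hint, since other heavy-tailed distributions can look similar, so treat the slope as a sanity check rather than proof.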

By the way, if you haven’t voted on our baby name poll yet, go ahead and vote now – this baby is coming soon!

See the Trending Baby Names in our Poll

One of the nice things about using Google Forms to name your baby is you can use the built-in charts and reports. Here’s a fun timeline chart showing how each name has collected votes over time:

The graphic is a little small here, so here’s a larger version: http://www.jasonmorrison.net/static/timeline.html.

For example, you can see that Finn just surpassed Isaac on the 15th. Also, Xavier, Nikola, and Luka have been battling each other throughout the whole poll.

If you haven’t voted, you still have time – but hurry, the baby is due in May. Click here to vote.

1000 Votes in our Baby Name Poll – Which Name is Winning?

We’ve passed 1,000 votes in our baby name poll!

Now we have enough data to start showing off the top names:

Looks like Alexander, Isaac and Finn have taken an early lead. I haven’t done any real data analysis yet, but we’re getting enough votes now that I won’t be able to resist for long.

If you haven’t voted, please follow this link to the form. For each vote, we (and some other folks at Google) have pledged to donate $1 to Save the Children. Just by voting, you ensure a child receives a polio vaccine.

See the original post and the FAQ for more info.