Tag Archives: popularity

Twitter user directories – Wefollow vs. Justtweetit vs. Twellow

I happened to notice Brian’s tweet that he is officially the sole authority on burritos in Twitter, at least according to Wefollow, a Twitter user directory.

Twitter is the only major service I can think of where third-party sites have popped up to provide a function as basic as a user directory. This is due in part to Twitter’s great API, but also because Twitter doesn’t provide any real directory functionality on its own site. You can search for users individually and import your contacts, but the suggested user page seems uselessly weighted toward the most popular people on the entire site.

Who should I follow if I’m interested in usability or cartography or legume horticulture? I took a look at a few directories, which follow slightly different models:

Justtweetit
– Has a list of predefined categories
– Each user can only be in one category
– Users are self-submitted

Wefollow
– Users can be tagged by any word or phrase, though the most popular tags show up as main categories on the home page
– Each user can use up to three tags
– Users are self-submitted

Twellow
– Has predefined categories, a large list that looks similar to Open Directory
– Each user can be in up to 10 categories
– Seems to pick up users automatically, but users can add themselves to additional categories

My guess is that the more specific the categories, the more useful the organization system will be. Wefollow gets points here for allowing open tagging, but its front page, with such broad categories, isn’t as useful as the search or drill-down pages. Twellow actually works pretty well, since the built-in category list is so extensive.

All three seem like they might be a bit open to abuse, since users can add themselves to the directory – with Twellow and Wefollow, at least they have to be logged in to their account to do so. But if I were a spammer and had found some way to use Twitter for spam, I could quickly add my spam accounts to these sites as well.

It would be really interesting to see a measure of quality other than just the number of followers. For example, if I say I’m in the haberdashery business, the system could check to see how often haberdashery shows up in my tweets – that could be a quality score for the classification, used in concert with number of followers, which is a proxy measure for the quality of my account.
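To make that idea concrete, here is a rough Python sketch of such a score. It is only an illustration under my own assumptions: the function names, the whole-word matching, and the log-scaled follower weighting are all hypothetical choices, not something any of these directories actually does, and the tweets are assumed to be already available as plain strings.

```python
import math
import re


def topic_mention_rate(tweets, topic):
    """Fraction of tweets mentioning the topic as a whole word or phrase."""
    if not tweets:
        return 0.0
    # Case-insensitive, whole-word match so "hat" does not match "that".
    pattern = re.compile(r"\b" + re.escape(topic) + r"\b", re.IGNORECASE)
    hits = sum(1 for text in tweets if pattern.search(text))
    return hits / len(tweets)


def classification_score(tweets, topic, followers):
    """Combine the mention rate with follower count (hypothetical weighting).

    The log scaling is an assumption: it keeps accounts with huge follower
    counts from drowning out how relevant the tweets actually are.
    """
    return topic_mention_rate(tweets, topic) * math.log10(1 + followers)


sample_tweets = [
    "New felt hats in the haberdashery today",
    "Lunch was great",
    "Haberdashery tip: measure twice",
    "Watching the game tonight",
]
print(topic_mention_rate(sample_tweets, "haberdashery"))
print(classification_score(sample_tweets, "haberdashery", 999))
```

An account that claims a category but never tweets about it scores zero no matter how many followers it has, which is the property that would make this harder for spam accounts to game.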

Has anyone else used these directories, or others? Would you follow someone just because they’re the most popular person listed in your area of interest? Let me know in the comments below.

Choosing a Unique Baby Name with Statistics

We’ve got well over 10,000 votes, and we know that the vote totals are significantly different from random. So do we have enough information to pick a name yet?

There’s another stats exercise I want to go through before we narrow down the list. We want to pick a name that people have voted for, but we’d also like to choose a name that’s not too popular. This is just a personal preference that Ann and I have: we think it’s a little more fun to have a more unique name.

Also, it would be pretty boring if the vote gave us the exact same information as a list of the most popular baby names. So, how do we choose a name that’s popular with friends and family (and in our case, random internet strangers) but still reasonably unique?

Based on the chart below, names that fit our criteria include Ada, Cassia, Athena, Erin, or Olivia for a girl and Nikolas, Levi, Isaac, Dylan or Alexander for a boy.  Follow along and I’ll explain where I got the data and how it helps me pick names.

[Figure: scatterplot of baby name votes versus baby name popularity]

Link to the full-sized graph at Flickr.

The graph you see above is a scatterplot of the names, showing the vote total versus the number of babies given that name in the U.S. in 2007. For example, Isaac has 1,220 votes as of this writing and 10,066 babies were named Isaac in 2007.
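One simple way to read a single point on a plot like this is as a ratio of votes to babies. The Python sketch below is only an illustration and not necessarily how the names listed earlier were picked: the function name and the per-1,000 scaling are assumptions made for the example. The Isaac figures are the ones quoted above.

```python
def votes_per_thousand_babies(votes, babies):
    """Votes received per 1,000 babies given the name in the same year."""
    if babies <= 0:
        raise ValueError("babies must be positive")
    return 1000 * votes / babies


# Figures quoted in the post: Isaac had 1,220 votes and 10,066 babies in 2007.
isaac = votes_per_thousand_babies(1220, 10066)
print(round(isaac, 1))  # 121.2
```

A higher value means a name drew more votes relative to how common it is, which is the trade-off between popular with voters and not too popular overall.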
