Posts Tagged ‘social software’

The 5 People Who Could Destroy Twitter

Friday, June 5th, 2009

I’m a fan of Twitter – it can be really useful. But status update services and microblogging are relatively young technologies. Twitter is the frontrunner now, but it’s still possible that everything could go south really fast. Here are five people (or more accurately, types of people) who could destroy Twitter and what can be done to stop them.

The list is in no order, except I’ve saved the most dangerous for last.

1. Spammers

Twitter spam is growing, and my guess is it’s a profitable business to be in. Spammers are getting crazy refollow rates with very little effort put into their fake profiles. Part of this is a technical problem, with Twitter playing catch-up to the collective innovative power of the greediest jerks on the internet. The more difficult part is social – users’ trust barriers are too low. Either Twitter finds ways to deal with this, or people will start treating reply tweets, direct messages, and invites the same way they do unsolicited emails now. One of the reasons I stopped logging in to MySpace was a flurry of fake friend requests that followed every session. Twitter runs that risk, in addition to the risk of service degradation.

What can be done? The good news is that no communication medium can be considered successful until someone has tried to send you unsolicited marketing and scams over it. But the Twitter team needs to redouble their efforts and head off potential problems proactively. For example, there are lots and lots of apps built on top of Twitter’s API – and almost all of them ask for your username and password. How long until one of those apps is compromised, or worse, scammers make password-phishing apps of their own? Twitter needs to implement strong API keys or something like OpenID.

2. Anyone who uses url shortening services.

It’s hard to fit both a witty observation and a url in 140 characters, especially given url inflation. Bit.ly, Tinyurl, and the like perform the valuable service of giving you more space. They also cloak the destination of almost all the links on Twitter and get everyone used to following links blindly. I’ve already had friends whose accounts were hacked in order to send out a tweet like: “Check out this hilarious video: http://tiny/innocuousgibberish”. The New York Times’ account has been hacked, among others. Twitter can work on improving security and removing spam, but the more everyone uses url shorteners the more we train our friends to click recklessly. I’m as guilty on this one as anyone.

What can be done? People post links to Twitter frequently enough that maybe it should be a separate field with its own character limit. If that’s too much complication for the brilliantly simple interface, maybe url previews should be enforced. Clients can do this now, but to be safe it should be done by Twitter.

3. Pirates, ninjas, zombies, and mafia thugs

Ah, I remember logging into Facebook the day I got my first “robots vs. hobos vs. Chuck Norris vs. etc.” request. “Ha,” I thought, “that’s a somewhat entertaining way to extend an internet meme into a social networking site.” Little did I know the horror that was about to unfold.

In all seriousness, the “tag, you’re it” games and gratuitous survey apps didn’t ruin Facebook, but they did make everything a bit more tedious. Those apps still fit within the umbrella of social networking – they don’t work at all in Twitter’s use model. When I log in, I want to see, very quickly, what the people I’m interested in are doing or reading. I don’t want to weed through their halves of various games I’m not interested in.

What can be done? This one is up to us – just don’t do it. Twittering with a hashtag for an event, a theme, etc. is fun and useful to others. Sending around vampire bites is not.

4. Chinese government officials

Think periodic fail whale sightings are bad for Twitter’s reliability? China can (and does) just block the whole site, most recently in advance of the Tiananmen Square anniversary. Why does this matter? China is a huge market, and growing. The days where being big in the U.S. meant major marketshare on the whole web are running short. What’s worse, countries with theoretically free speech like Australia are following the Chinese model, proposing national internet content control (i.e. censorship).

What can be done? Many American companies just give up. Even Google has had to bend to government pressure. This is not easy to remedy. Perhaps there’s a way to take advantage of the small byte size of tweets, decentralize serving, and wrap access with something like Tor to get it through the Great Firewall. Let’s hope there’s a grad student or genius hacker out there with the right idea and Twitter is smart enough to hire them.

And finally, the absolute worst, most pressing threat to Twitter’s survival is…

(drumroll….)

5. Your mom

Despite the allure of turning this into one big “your mom” joke, I am completely serious. What happens when your mom joins Twitter? Do you censor yourself? Take your tweets private? Delete off-color tweets from your recent past?

There’s no right answer. Just about any social software eventually runs into this dilemma where the very different ways you communicate personally, professionally, and publicly collide.

What can be done? Some of the problem might fade as the userbase of sites like MySpace, Facebook and Twitter ages. But that will take years, so what can Twitter do now? It might help to have better relationship management. You could at least put your friends in one group and family in another. But in general, this strikes me as the toughest problem of them all – I don’t think there are any real solutions for the general possibility of parental embarrassment, or if there are, the efforts of every teenager in the world have yet to discover them.

Disagree? Any threats I missed? Please post in the comments below.

Seeing more spammers on Twitter lately?

Tuesday, May 12th, 2009

It was inevitable. As Twitter has grown and started pushing into the mainstream, spammers have started ramping up abuse. At first glance, Twitter isn’t the most obvious target – you actually have to follow someone to get content from them, users don’t generally search it for high-cpc stuff like meds and lawyers, and how much spam can you really get into 140 character messages?

But I’m seeing more invites from users like the one below:

Seeing a lot more spammers on Twitter lately...

First: What is Twitterspam? How do I know this is a spammer?

When it comes to spam, most people “know it when they see it,” but it’s helpful to look at the specific signals that this user might not be worth talking to. First off, they have 180 followers and yet haven’t posted a single update. The photo is a dead giveaway. The bio is actually pretty well done: it’s in English and it’s not outlandish. But the homepage link (http://my-pictures.no.tp/tlow/) – she’s in Portuguese Timor?

Second: Why spam Twitter?

Spammers have two reasons to abuse Twitter: monetary payoff, and because it works.

How can they make money by tweeting a bunch of random people? Well, in this case they aren’t, at least not yet. The payoff has to be through the homepage link, which I’m not following and you shouldn’t either. You get a friend invite on a system that, so far, has been a medium of immediate, short, personal communication. Your trust barriers thus weakened, you at least want to see who it is. They don’t have any updates yet, so you click the homepage link and… Virus. Or a maze of PPC affiliate pages and redirections.

Above I said spammers are hitting Twitter because it’s working. How do I know? Look at the number of followers, and the ratio of people followed to followers. About 22 percent of the people spammed so far have responded. I don’t know how many click through to the home page link, but if half the people bother to go that far they’ve got an amazing success rate for spam.
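To make that arithmetic concrete, here is a rough Python sketch of the ratio I’m describing. Only the 180 followers comes from the profile above. The count of accounts followed (818) is a hypothetical number I picked so the result lands near 22 percent, and the function name is my own. It also assumes every follower is someone the spammer followed first, which is a simplification.

```python
def refollow_rate(followers: int, following: int) -> float:
    """Share of followed accounts that followed back, from 0.0 to 1.0.

    Assumes every follower is a refollow, which overstates the rate
    if some people found the account on their own.
    """
    if following <= 0:
        raise ValueError("following must be a positive count")
    return followers / following


# 180 followers is from the profile above; 818 accounts followed is a
# hypothetical figure chosen only to illustrate a rate near 22 percent.
rate = refollow_rate(followers=180, following=818)
print(f"{rate:.0%}")  # prints 22%
```

Even a crude number like this is useful for comparing profiles: an account that follows hundreds of people, gets a fifth of them back, and has never posted is worth a second look.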

I wish Twitter luck. I know a few people over there, and they’ve got their work cut out for them. This sort of thing isn’t easy to fight; it’s an ongoing process. They’ve already taken some visible steps, like using rel="nofollow" on the Bio link, which at least keeps away blackhat SEOs looking for sources of pagerank. They’ll probably have to do more, most of it on the backend where you and I will never be the wiser. Happy spamfighting!

Google Spreadsheets as Social Art Medium

Friday, November 21st, 2008

I used Google Docs spreadsheets to pick a name for my baby. Here’s another interesting social use of spreadsheets.

Check out this YouTube video:

It’s amazing what people will think of once you put something mundane up on the web and give people the freedom to work together.

What Happens When You Ask the Internet for Baby Name Suggestions

Monday, October 13th, 2008

At this point we’re well past 4,500 votes in our baby name poll. We had a huge surge in votes recently as stories appeared in the international press and blogs all over the world. This is becoming a pretty wild ride, and will make a great story for our little Morrison to tell years from now. Thanks to everyone who has participated so far.

So… what happens when you ask the World Wide Web to name your child?

I’ll share the literal results below. Beyond the raw data, though, what happens when you try to crowdsource your kid’s moniker? It’s a bit of a risk – we’ve opened ourselves up to the possibility of criticism, abuse, and pranksterism during a very emotional time in our lives.

This little project is still ongoing, and the baby isn’t due for another month, but at this point I can give you a little advice about using the web to involve family, friends, and even perfect strangers in your life’s – or your work’s – decisions:

  • Set the tone – We’re serious about using everyone’s votes and suggestions in our decision, but we realize this is a pretty goofy way to choose a name. So that’s how we presented it – fun, a bit geeky, but actually quite useful. If you’re wondering about the secret of Google’s success, you have my guess right there.
  • Expect abuse and embrace pranksterism – Our voting form has been spammed and we’ve been called some rather nasty names. Those are both unfortunate, but you know what? The vast majority of the people voting and commenting have been helpful, earnest, and encouraging. And funny suggestions, when they are actually funny, should be celebrated, not repressed or cast aside. Pompous decorum and solemnity are straight out – you’re not doing anyone any favors by letting them participate, you’re inviting them to join in the fun.
  • Make it interesting – I’m not sure we would have had the same reaction if we wanted the world to vote on what we should have for dinner tomorrow, but people really love coming up with baby names. They love making videos of Stephen Colbert. They love picking a new theme song for hockey night. And if you really do need advice on dinner tomorrow, involve a group of friends or local foodies, pick people who will be interested in adding their advice.

Another way to look at it is the framework presented in the Wisdom of Crowds:

  • Diversity of opinion – We have really lucked out on this one, since we have votes from all around the world (and feel free to give your home town / home country a shout out in the comments below).
  • Independence – There’s discussion on this site and others, and people can always check the leaderboards, but for the most part people have been giving us names with very personal, independent reasoning behind them.
  • Decentralization – We have input from family who have known us all our lives as well as strangers, and there’s no obviously complicated hierarchy or committee to act as a bottleneck.
  • Aggregation – You can see some of the ways we’re looking at the data already and in the coming days I’ll add even more.

Let me repeat one point, just because it’s so astonishing – we’ve really put ourselves, and our unborn child’s appellation, out there. Any abusive behavior has been vastly outweighed by good wishes and helpful contributions. So thanks again, unwashed masses of the interwebs. And now, the suggestions:

Baby name suggestions

You can see the earlier summary graphs and charts here and here. Below are the big lists of suggested names.

Suggestions for boys’ names:

Suggestions for girls’ names:

Comment Spam Article on the Google Webmaster Central Blog

Friday, October 3rd, 2008

I hate comment spam. I think it’s safe to say we all do. So how do you keep it off your blog or forum? Check out this article I wrote on the Google Webmaster Central Blog with some ways to prevent comment spam.

It’s interesting that one of the commenters brings up compliment spam – I just wrote about it on this blog a little while ago.

This was pretty cool for me, because I can’t really share much about my work at Google. It’s also fun to see my text translated into German.

Next up I’ll post an update on the baby name poll with more fun charts and graphs.

Quick Tip: Keeping Comment Compliment Spam off your Blog

Sunday, September 7th, 2008

Blogs are great because they give you a creative outlet and let your readers comment on your posts, making it a much more social experience.  But spammers take advantage of comment forms, using scripts and bots to fill the web with links back to their site.

What can you do about it?  Even with captchas, systems like Akismet, and other automatic techniques (you can read more about these here), some spam will slip through.  Specifically, compliment spam.

What is compliment spam? Spammers know you and I like to be told what great writers we are, how helpful our posts are, and that we are brilliant geniuses.  So they set their bots to spam you with complimentary comments that just so happen to link back to their crappy blog, online casino, or fake viagra store.  Here’s an example:

Typolight
http://www.typolight-blog.de | info@typolight-blog.de | 82.146.49.61

Thanks, you nice post that helped me alot.

From Keep your WordPress site from being hacked with automatic upgrades, 2008/09/06 at 9:27 AM

So, at first glance this looks like a legit comment.  The post in question was a “how-to”, so it would be nice to hear that someone found my instructions helpful.  But, do a Google search with the comment in quotes (an exact phrase search) and you’ll see the problem:

http://www.google.com/search?q=%22Thanks%2C+you+nice+post+that+helped+me+alot.%22

At the time of this writing, we see 168 instances of this exact comment.  By this same Typolight person.

So that’s my tip – if a comment seems a bit too randomly complimentary, throw it in quotes and do a Google search. Then, if it’s spam, make sure to spam it – systems like Akismet only work because we’re all reporting spam.
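If you’d rather not type the search by hand, here is a minimal Python sketch that builds the same kind of exact-phrase search URL shown above. The function name is my own invention, and this only constructs the link; it doesn’t fetch results or count matches.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(comment: str) -> str:
    """Build a Google search URL that looks for the comment as an exact phrase."""
    # Wrapping the text in double quotes asks for an exact phrase match.
    quoted = f'"{comment.strip()}"'
    # quote_plus turns spaces into "+" and escapes quotes and commas.
    return "http://www.google.com/search?q=" + quote_plus(quoted)


print(exact_phrase_search_url("Thanks, you nice post that helped me alot."))
# prints http://www.google.com/search?q=%22Thanks%2C+you+nice+post+that+helped+me+alot.%22
```

Paste the printed link into your browser and judge the results yourself. A pile of identical comments scattered across unrelated blogs is the giveaway.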

If you really want to go after the spam poster, you can also give their site a bad rating on Web of Trust, StumbleUpon, and other reporting systems.

Maybe if I get some time I’ll throw together a WordPress plugin to make this easy to do.  If you’d like a plugin like this (or have other tips), drop me a comment and it will help motivate me.

iPhone Apps – Pandora vs. Last.fm vs. iTunes

Monday, July 14th, 2008

Since the release of the iPhone 2.0 firmware and the App Store, I’ve been like a kid in a candy store. At some point I’ll get around to a list of recommended apps but for now I just want to compare two music listening / online radio applications: Last.fm and Pandora.

You do, of course, have many more options – the App Store Music category has about 30 apps listed, many of them designed to help you enjoy and discover new tunes. And you always have the built-in iPod functionality of the phone which syncs with iTunes on the desktop. But Last.fm and Pandora have been around for a while as very impressive web apps so those were the first two I decided to take a look at. They have very different approaches to recommending music with lots of data and cool algorithms.

Pandora

Pandora is based on the Music Genome Project – basically, their system breaks down each song into a series of attributes. For example, Queen’s Bohemian Rhapsody has “demanding vocal performances, mild rhythmic syncopation, heavy use of vocal harmonies, a prominent rhythm piano part,” among other features. Give Pandora a song or musician and it will create a radio station of similar music. It’s really that simple.

As each song comes up you can give it a thumbs up or thumbs down and you can skip a few songs per station per hour. The iPhone interface displays the album art front and center with a button in the upper-right corner to show you why the system chose the song.

I’ve played with Pandora off and on for a while and my experience is that it does much better with stations created around one or two bands or songs than stations built on large lists of music you enjoy. Add 10 rock bands to your “Road trip with Steve 2008” station and if one of them has folk influences you’re bound to get some sleepy folk in there now and again. Give it just one band and it can get some amazing results – check out my Gorillaz station, for example.

The drawback to Pandora is that it only has very rudimentary data collection and social features. You can find other people listening to the same song on the website but user profiles are pretty sparse, and there are no groups, message boards, etc. But if you just want to listen, and don’t want to bother with all that other stuff, Pandora provides a pretty great experience.

Last.fm

Last.fm builds radio stations for you and makes recommendations based on the listening data of thousands of other listeners, whether they’re using the Last.fm site, the mobile app, or a scrobbler plugin in their desktop MP3 player software. You can also listen to stations based around a single musician or band, but Last.fm gives you more options and better results the more you listen and participate in the social features of the site. For example, take a look at the listing for Bohemian Rhapsody – you can see top listeners, how users have tagged the song, similar songs, comments, message board posts, etc.

The user interface is actually quite similar to Pandora’s, with options to note that you love or hate a song, a skip button, album art, etc. You can see a bio of the band, similar artists, and upcoming events, which is cool in theory but not something I’ve really used.

I’m a longtime user of Last.fm from back in the Audioscrobbler days (check out the Geek Music group) and you definitely get more out of it the more you listen. You don’t really have to participate that much; just letting Last.fm know what you’re listening to improves recommendations and radio plays. My favorite thing about it is all the stats it collects. You can see which bands and songs you listen to most often and find out the most popular bands in Sri Lanka.

Compared to Pandora, though, the recommendations aren’t always as interesting… not bad, but I find myself pleasantly surprised more often while listening to Pandora. For comparison, listen to the Gorillaz similar artists radio station.

iPod + iTunes

You can, of course, skip online radio altogether and just use the built-in iPod functionality along with iTunes on the desktop.  There’s a lot to be said for going this route – the interface is nice and usable, the iPhone holds a decent amount of music, and iTunes collects some of the same listening data that makes Last.fm so cool.  Also, it will work no matter how congested the local network is and doesn’t drain the battery nearly as quickly.

But you miss out on all the social networking features and it’s a lot harder to discover new music.  So I think of it more as a back-up plan…  guaranteed access to some of my personal music library.

The Winner

Actually, there’s no need to pick one as the winner – they’re all available for use on your computer and your iPhone.

Have a favorite?  Share your experience in the comments section.

The Urge to Deletion: Is Wikipedia making molehills out of mountains?

Tuesday, June 17th, 2008

Wikipedia is great.  Even now, it’s still kind of amazing that such a huge body of knowledge has been organized ad-hoc by volunteers, most of whom have never met in person. Most social software systems would die for this level of collaboration.

That said, has anyone else gone to a random Wikipedia article from, say, search results and ended up a little depressed?  It seems like every other article I find lately has a big warning label at the top – this article contains too much trivia, this article has too many fictional references for an encyclopedic and academic approach of this topic, and the worst one of all: this article has been marked for deletion.

I understand that it must be very difficult to wrangle all the millions of contributions into a consistently high-quality encyclopedia.  Just dealing with all the spam and abuse must be an enormous undertaking, even when distributed among thousands of good samaritans.  But one of the things that was great about Wikipedia was the breadth of coverage and the depth on some particulars, even if it was excessive to the point of comedy.

But a brief look at the list of articles marked for deletion the last few days illustrates my point.

1. Horse Ranch Mountain. You know there’s something wrong when a mountain doesn’t meet the notability requirement.   Here’s the comment opening the deletion on the talk page:

In what way is Horse Ranch Mountain notable? I am quite familiar with the area, and I cannot think of any way in which it is notable. Please convince me otherwise.

I would think it’s notable because it is a mass of millions of tons of rock and earth sticking out of the ground.  On a less sarcastic note, I’m sure I’m not the only one who’s looked at a map, spotted a feature I’ve never heard of, then looked it up online.  Even if it’s not accessible it’s probably helpful to have a reference noting that it’s the highest point in Zion, measured at X meters tall, etc.

2.  List of redundant expressions. I understand the argument that an encyclopedia is not a trivia game or a book of lists, but these sorts of pages used to be one of my favorite features of Wikipedia.  Exhaustive lists of palindromes, English words of Polish origin, etc., give examples, context, and can help connect concepts in language.  Also, the use or omission of redundancy is an important stylistic consideration when writing – it can be used for everything from emphasis to characterization.

3.  Hindu literature. Delete the article on Hindu literature?  Granted, the article needs work.  But isn’t it worrying how the marked for deletion pages are filled with subject matter from outside the U.S. and maybe Europe?

I know the standard answer to complaints like these is that if you feel so strongly, you should participate in the debates and push for things not to be deleted.  Judging by the talk pages I wonder if I would be drowned out by all the “I’m a history major and this is a programming term, never heard of it, not notable” comments.  I’ll admit my contribution to Wikipedia is limited to random spelling and grammar corrections that were obvious enough that even I noticed them, so I could be wrong.  I just feel like some of what made Wikipedia so addictive is slowly being drained away.

Agree?  Think I’m wrong?  Leave me a comment below.  See, it’s kind of like a talk page, but even with consensus you can’t edit my article.   Until the next WordPress exploit comes out.

Why have a website, why create a blog, why Twitter?

Monday, June 9th, 2008

My esteemed colleague Beah just started blogging, and opened her blog with a very important question – Why Blog?  I remember people asking a similar question years ago when I registered this domain – why would you want to have a website with your name on it?  Almost the same question has come to my mind recently when playing around with Twitter.

So, why blog?  With all the hundreds of thousands of blogs on the web you might think there’s no need to ask this question.  One of the best things about social science is asking questions about things that everyone takes for granted.  Unfortunately the “science” part of social science is a bit too time-consuming to finish up on a Sunday-evening blog post, so instead we’ll look at a few sites of friends and colleagues and maybe collect some thoughts on what motivates people to blog.

First, why do I blog here?  I try to keep this blog relatively professional, posting mostly on topics that I encounter in my work, in my academic research, and in my side projects (the standard disclaimer, as always, applies).  One of my motivations was sharing some of the research done for classwork – it seemed a shame to write up a report, turn it in to a professor, and then let it gather dust in some corner of my hard drive.  My undergrad degree was in journalism and I do miss writing, so that’s another motive.  Also, having been through some rough patches in my career during the dot-com downturn, I thought blogging might help me establish a bit of a professional brand.  I have my URL on my resume and I would hope that any company looking to hire me would get an idea that I’m knowledgeable and interested in relevant areas.

But I’m not a very random sample, so let’s look at a few other blogs and try to appreciate why they write.  I think I can place them into a few rough categories:

Personal takes on professional / technical interests:

This is largely where my blog falls.  Common post topics will include things like “how to get around an annoying issue with some software/programming language,” “very excited about the new device from Apple,” “report from a conference,” and “very disappointed with the new device from Apple.”

Public journaling to keep in touch with friends and family:

I’ve done this in the past as well – blogs taking the place of those old-fashioned mass emails you used to send out freshman year of college.  If you went to college in the ancient days before blogs and Facebook.  This is a place for both epic travelogues and saved IM conversations filled with inside jokes.

Sharing interests and reviews:

This category runs the gamut from folks who just want to show their friends a funny YouTube video to blogging a season of a TV show to reviewers writing prolifically about a very obscure musical genre.

Artistic or literary expression:

Self-publishing has opened the doors for artists and writers, both amateur and professional, to share their work with whatever audience they find.  This can run from virtual serial gallery shows to community-driven commentary and learning.

Of course these all overlap, and some blogs cover all the bases.  See KooKoo for KokoPuffs for an example.

So do we answer our question with a plethora of distinct motives for blogging?  Not necessarily.  There’s one theme that runs throughout all of the above – these are all social activities.  Ultimately blogging is human interaction.

Oh, and that other question – why use Twitter?  No clue.

Got a reason why you use Twitter?  Are you a co-worker angry at me for misconstruing your blog?  Please let me know in the comments below.

Social software and the problem of trust

Friday, November 30th, 2007

Although you don’t hear about it much, trust is an extremely important issue in the software world.  A common example is eBay – how could eBay stay in business if millions of anonymous buyers and sellers didn’t have a certain level of trust?

Andy Brice, a software developer, gives a really interesting example of the problem of trust in his blog.  He became concerned that his software products were getting a ridiculous number of awards and 5-star ratings from shareware download sites.  He devised an experiment: if you create a text file, change the file extension to .exe, and submit it to 700 download sites, how many awards would you get?

It turns out you would get tons of awards.  A large percentage of these sites, which ostensibly provide users the service of evaluating shareware and freeware, are in reality just trying to skim AdWords revenue.

Social software, if applied correctly with enough participation, can help to solve this problem.  It is much harder to fake 1000 del.icio.us bookmarks than it is to make an authoritative-looking award banner.

Many of us work on projects internal to companies where we don’t confront these issues directly on a day-to-day basis.  Large companies can generate billions of pages of documents and code each year.  Add to that the billions of external web pages we use as reference material.  Tools such as social bookmarking can help build up this network of trust and sift through the less useful resources even on intranets.

So now that we have the tools available, all we need is participation.  You’re reading this, so I’m probably already preaching to the choir.  Trust is a really interesting issue, though, so I’ll be writing about it here and there in the future.