Tag Archives: trust


Quick Tip: Keeping Comment Compliment Spam off your Blog

Blogs are great because they give you a creative outlet and let your readers comment on your posts, making it a much more social experience.  But spammers take advantage of comment forms, using scripts and bots to fill the web with links back to their sites.

What can you do about it?  Even with captchas, systems like Akismet, and other automatic techniques (you can read more about these here), some spam will slip through.  Specifically, compliment spam.

What is compliment spam? Spammers know you and I like to be told what great writers we are, how helpful our posts are, and that we are brilliant geniuses.  So they set their bots to spam you with complimentary comments that just so happen to link back to their crappy blog, online casino, or fake viagra store.  Here’s an example:

Typolight
http://www.typolight-blog.de | info@typolight-blog.de | 82.146.49.61

Thanks, you nice post that helped me alot.

From Keep your WordPress site from being hacked with automatic upgrades, 2008/09/06 at 9:27 AM

So, at first glance this looks like a legit comment.  The post in question was a “how-to”, so it would be nice to hear that someone found my instructions helpful.  But, do a Google search with the comment in quotes (an exact phrase search) and you’ll see the problem:

http://www.google.com/search?q=%22Thanks%2C+you+nice+post+that+helped+me+alot.%22

At the time of this writing, that search turns up 168 instances of this exact comment, all posted by this same Typolight person.

So that’s my tip: if a comment seems a bit too randomly complimentary, throw it in quotes and do a Google search.  Then, if it’s spam, make sure to mark it as spam.  Systems like Akismet only work because we’re all reporting spam.
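If you check a lot of comments this way, building the search link by hand gets tedious.  Here is a minimal Python sketch of the same trick.  The function name exact_phrase_search_url is my own invention for illustration; the only assumption is the Google search URL pattern shown in the example above.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(comment: str) -> str:
    """Build a Google search URL that looks for the comment as an exact phrase.

    Hypothetical helper: it only mirrors the URL pattern used in this post.
    """
    # Wrapping the text in double quotes asks for an exact phrase match.
    phrase = '"' + comment.strip() + '"'
    # quote_plus percent-encodes the quotes and punctuation and turns spaces into '+'.
    return "http://www.google.com/search?q=" + quote_plus(phrase)


print(exact_phrase_search_url("Thanks, you nice post that helped me alot."))
```

Run against the sample comment, this prints the same search link as the one above.  Paste the result into your browser and see how many other blogs got the identical compliment.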

If you really want to go after the spam poster, you can also give their site a bad rating on Web of Trust, StumbleUpon, and other reporting systems.

Maybe if I get some time I’ll throw together a WordPress plugin to make this easy to do.  If you’d like a plugin like this (or have other tips), drop me a comment and it will help motivate me.

New social news site – NewsTrust.net

I happened across NewsTrust.net, a new social news aggregation site.  I’m a big fan of other sites in the category like Reddit, despite their flaws, and NewsTrust includes a tagging system so I feel obligated to investigate it like any other folksonomy.

So I created an account to give it a try.  The big difference between this site and others is the emphasis on quality journalism.  NewsTrust asks for your real name, and in addition to giving weight to users who write good reviews and get votes from other users, it adds factors like experience as a journalist to the mix.  It makes specific distinctions between mainstream media sources and alternative media sources.

It’s an interesting idea, and it’s good to see journalists working together with programmers and web developers to make use of some of the social software techniques that newspaper websites so often catch on the trailing edge.  The site’s features seem geared toward providing users with the best that professional journalism has to offer with a dash of brilliant amateur writing thrown in – even the page layout looks more like a newspaper site than a Digg or Del.icio.us clone.

But I’m not sure it will work, at least not without some tweaking.  I don’t know if they put a lot of weight into the “experience” of users, but it didn’t require any verification of my 5-9 years of journalism experience (for the record, that’s four years in college plus more than a year of stringing here and there).  Here’s the problem of trust again, though hopefully mitigated by fellow users’ reviews.

The other issue is interaction design.  The widgets and buttons all work just fine, but when you rate a story you’re asked to score on six dimensions: Recommendation, Trust, Information, Fairness, Sources, and Context.  Only the first is required, but give users options and they are bound to feel obligated to exercise them.  Give them too many tasks and they will tend to give up.  So the simple interaction model of Reddit, where users don’t even have to click through to rate a story, might be information-poor but participation-rich in comparison.

Still, I will play with the site more and I wish them luck; I think they have some promising ideas.  For example, in their blog they talk about gathering sources from other countries based on big world news events, specifically the Russian invasion of Georgia.  Reddit is only fleetingly so reflective, and few sites use temporary peaks in interest to get long-term data on source credibility.

The power of microformats

A few months ago I attended a really interesting talk by Eric Meyer where he touched on the use of microformats.  You might know Eric from his excellent O’Reilly Press CSS books.

What are microformats?  Before giving an example, I’ll give a little context.  When Tim Berners-Lee created the web, he tried to make HTML simple, flexible, and meaningful.  He succeeded on the first two counts but the third was quickly left by the wayside – many designers didn’t care what a particular tag meant, so long as it could be used for page layout.  The use of tables to arrange graphic elements instead of holding tabular data is a perfect example.

So Berners-Lee has been talking for years about the next step – the semantic web.  In the semantic web, tags are used to say what a particular piece of content is, with all styling done with stylesheets.  There is, of course, more to the semantic web than just separating content and presentation; after all, you can work that way with HTML and CSS now.  One other key component is the web of trust, where people and web sites are able to describe relationships to each other so that search engines can help you find trustworthy content automatically.

Unfortunately, the semantic web has not really taken off.  There have been lots of meetings and XML schemas but it’s all too complicated, the process is too bureaucratic, and everything is being designed from the top down.

This is where microformats come in.  Let’s say you have a blog and you’ve tagged all your articles.  You’d like to let search engines and aggregators like Technorati know what your tags are.  But HTML doesn’t have anything like this:

<tag>semantic web</tag>

So what do you do?  Simple, use the rel-tag microformat:

<a href="http://example.com/tag/semantic+web" rel="tag">semantic web</a>

The microformat makes use of existing HTML tags and attributes and just follows simple conventions.  But now that this little bit of meaning can be interpreted by spiders and other programs, we’ve actually added a pretty powerful bit of functionality to the web.
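To give a feel for how little work a spider has to do, here is a rough Python sketch that pulls rel="tag" links out of a page using only the standard library.  The names RelTagParser and extract_rel_tags are made up for this example, and it is not how Technorati or any real aggregator is implemented.

```python
from html.parser import HTMLParser


class RelTagParser(HTMLParser):
    """Collect (href, link text) pairs for every <a> whose rel includes "tag"."""

    def __init__(self):
        super().__init__()
        self.tags = []          # results, in document order
        self._in_tag_link = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, e.g. rel="tag nofollow".
        rel_values = (attr.get("rel") or "").lower().split()
        if "tag" in rel_values:
            self._in_tag_link = True
            self._href = attr.get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_tag_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_tag_link:
            self.tags.append((self._href, "".join(self._text).strip()))
            self._in_tag_link = False


def extract_rel_tags(html: str):
    """Hypothetical helper: return all rel-tag links found in an HTML string."""
    parser = RelTagParser()
    parser.feed(html)
    parser.close()
    return parser.tags


page = '<a href="http://example.com/tag/semantic+web" rel="tag">semantic web</a>'
print(extract_rel_tags(page))
```

The sketch keeps both the href and the link text.  As I understand the rel-tag convention, the tag itself is taken from the last part of the link's URL rather than from the visible text, so a real consumer would want the href.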

Most blog software, including WordPress, does microformatting for you.  If you install my tag cloud plugin Altocumulous and view source, you can see for yourself.

For intranet purposes, the hCard and hCalendar microformats look promising.  Take a look at microformats.org to see why I think so.  I’ll write more on it later.