Archive for May, 2009

Sick of compliment spam on your blog?

Sunday, May 31st, 2009

One of the great things about having a blog is getting comments on your posts. It’s particularly gratifying when someone takes the time to tell you that your post was helpful, entertaining, or well-written.

Spammers know this and exploit it by generating compliment spam. They’ll put together a few lines of general praise and slather them across the web, hoping that bloggers will fall for the trick and post their spammy links.

Abusive social engineering like this really annoys me, so when in doubt I always do a Google exact phrase search to see if the compliment is really for me and not from a bot. This is tedious, so I created a simple WordPress plugin: O RLY Comment Spam Search.
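For anyone who wants to do the manual check, here is a rough sketch of turning a comment into a Google exact-phrase search link. This is not the plugin's code (the plugin is a WordPress plugin, this is plain JavaScript), and the function name `exactPhraseSearchUrl` is made up for illustration.

```javascript
// Hypothetical helper: build a Google exact-phrase search URL for a comment.
function exactPhraseSearchUrl(commentText) {
  // Drop double quotes (they would end the phrase early) and collapse whitespace.
  var phrase = commentText.replace(/"/g, '').replace(/\s+/g, ' ').trim();
  // Wrapping the phrase in quotes asks for an exact match.
  return 'https://www.google.com/search?q=' + encodeURIComponent('"' + phrase + '"');
}

console.log(exactPhraseSearchUrl('Great post'));
```

If the search turns up the same sentence on lots of unrelated blogs, the compliment probably wasn't written for you.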

You can get the plugin directly from WordPress.org, where you can also give it a rating to tell other webmasters how great (or non-great) it is. By the way, the plugin browser/installer added in WordPress 2.7 is very cool, and makes it much easier to try out plugins.

Judging by the thousands of blogs my O RLY searches have found, this sort of spam works. But why do spammers do it? Since WordPress (and most major blog systems) nofollow links in comments by default, the spammers can’t expect to gain any PageRank from these links. My guess is most of this spam is either intended to get traffic via clickthroughs or is generated by naive site owners, SEOs and marketers who don’t really understand how things work.

Take a look and let me know if it’s useful in the comments below. Also, let me know if it’s breaking on certain comments or otherwise buggy.

TinyUrl Trouble: Greasemonkey drops the location header in GM_xmlhttpRequest

Thursday, May 21st, 2009

I get a lot of ideas. Most of them wander aimlessly in my head until they become obsolete, but once in a while I’ll get an idea that seems useful and simple enough to do in my free time.

If you’ve used Twitter, you’ve seen the myriad URL shortening services like TinyUrl and Bit.ly. URL shortening services are a kludge and they break one useful, built-in feature of the web, which is the ability to know where you’re going when you click a link.

So I thought, this is something that I could fix in an hour or so with a Greasemonkey script. If you have no idea what I’m talking about, Greasemonkey is a Firefox extension that lets you run your own JavaScript on pages you load. Greasemonkey comes with a handy-dandy AJAX function called GM_xmlhttpRequest.

I figured all I have to do is grab all the anchors on the page, see if they match a list of shortener urls, do an xmlhttpRequest for each one and grab the final location (after the service finishes with its redirecting) from the headers.

Something along these lines:

function getTargetUrl(short_url) {
  GM_log('Getting ' + short_url);

  // Request the short url and log whatever status and headers come back.
  GM_xmlhttpRequest({
      method: 'GET',
      url: short_url,
      headers: {
          'User-agent': 'Mozilla/4.0 (compatible) Greasemonkey',
          'Accept': 'text/html'
      },
      // Called once the request has completed.
      onload: function(responseDetails) {
          GM_log('Done.  Status ' + responseDetails.status +
                ' Text ' + responseDetails.statusText + '\n\n' +
                ' Headers:\n' + responseDetails.responseHeaders);
      }
  });
}
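The other half of the plan, deciding which anchors point at a shortener in the first place, could look roughly like this. It's only a sketch: the host list and the helper names (`SHORTENER_HOSTS`, `hostOf`, `isShortUrl`, `findShortUrls`) are my own placeholders, and it works on plain href strings so it can be tried outside the browser.

```javascript
// Assumed list of shortener hosts; extend it with whichever services you care about.
var SHORTENER_HOSTS = ['tinyurl.com', 'bit.ly'];

// Pull the host out of an absolute http(s) url, or return null if there isn't one.
function hostOf(href) {
  var match = /^https?:\/\/([^\/:?#]+)/i.exec(href);
  return match ? match[1].toLowerCase().replace(/^www\./, '') : null;
}

// True when the href's host is one of the known shorteners.
function isShortUrl(href) {
  var host = hostOf(href);
  return host !== null && SHORTENER_HOSTS.indexOf(host) !== -1;
}

// Keep only the hrefs that point at a shortener.
function findShortUrls(hrefs) {
  return hrefs.filter(isShortUrl);
}
```

In a user script the hrefs would come from the page's anchors (for example via document.getElementsByTagName('a')), and each match would be handed to getTargetUrl.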

(more…)

Touring Google’s solar panel installation

Sunday, May 17th, 2009

This past week I got a chance to take an up-close look at the solar panels up on the roof at work. My building doesn’t actually have solar panels yet, but many of the main campus buildings and carports do. I grabbed one of the campus bikes and headed over to meet up with the green committee and take a quick tour. Since I’m writing about my place of employment, the standard disclaimer applies.

Tour of the solar panels at Google

At the time the panels were installed, it was the largest corporate PV project in the country. Coming from Ohio to a sunny state like California, it’s a little surprising that Google’s 1.6MW project in 2007 was so groundbreaking. I think part of the problem has been that it takes a few years to see the return on the investment, and for the past decade or so everyone’s been so caught up in the short term. The company expected it would take 7.5 years for the system to pay for itself, but given the way utilities charge for peak load I hear we’re even ahead of that mark.

Here are a few closer shots of the panels:

(more…)

Seeing more spammers on Twitter lately?

Tuesday, May 12th, 2009

It was inevitable. As Twitter has grown and started pushing into the mainstream, spammers have started ramping up abuse. At first glance, Twitter isn’t the most obvious target: you actually have to follow someone to get content from them, users don’t generally search it for high-CPC stuff like meds and lawyers, and how much spam can you really get into 140 character messages?

But I’m seeing more invites from users like the one below:

Seeing a lot more spammers on Twitter lately...

First: What is Twitterspam? How do I know this is a spammer?

When it comes to spam, most people “know it when they see it,” but it’s helpful to look at the specific signals that this user might not be worth talking to. First off, they have 180 followers and yet haven’t posted a single update. The photo is a dead giveaway. The bio is actually pretty well-done: it’s in English and it’s not outlandish. But the homepage link (http://my-pictures.no.tp/tlow/) is on a .tp domain. She’s in Portuguese Timor?

Second: Why spam Twitter?

Spammers have two reasons to abuse Twitter: there’s a monetary payoff, and it works.

How can they make money by tweeting a bunch of random people? Well, in this case they aren’t, at least not yet. The payoff has to be through the homepage link, which I’m not following and you shouldn’t either. You get a friend invite on a system that, so far, has been a medium of immediate, short, personal communication. Your trust barriers thus weakened, you at least want to see who it is. They don’t have any updates yet, so you click the homepage link and… Virus. Or a maze of PPC affiliate pages and redirections.

Above I said spammers are hitting Twitter because it’s working. How do I know? Look at the number of followers, and the ratio of people followed to followers. About 22 percent of the people spammed so far have responded. I don’t know how many click through to the home page link, but if half the people bother to go that far they’ve got an amazing success rate for spam.
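The ratio above is just followers divided by people followed. Here's a small sketch of that arithmetic. The post only gives the 180 followers and the roughly 22 percent figure, so the following count below is backed out from those two numbers, not read off the profile, and the function names are made up.

```javascript
// Share of the people this account followed who followed back.
function followBackRate(followers, following) {
  if (following <= 0) return 0; // nothing followed, nothing to respond to
  return followers / following;
}

// Back out how many people were followed, given followers and a known rate.
function impliedFollowing(followers, rate) {
  return Math.round(followers / rate);
}

var followers = 180;                                  // from the profile
var following = impliedFollowing(followers, 0.22);    // implied, not stated in the post
console.log(following, followBackRate(followers, following).toFixed(2));
```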

I wish Twitter luck. I know a few people over there, and they’ve got their work cut out for them. This sort of thing isn’t easy to fight; it’s an ongoing process. They’ve already taken some visible steps, like using rel="nofollow" on the Bio link, which at least keeps away blackhat SEOs looking for sources of PageRank. They’ll probably have to do more, most of it on the backend where you and I will never be the wiser. Happy spamfighting!

How spam and malware botnets work – two papers

Tuesday, May 5th, 2009

I read two reports today about large-scale botnets that really pointed out that security is still an open problem on the web. Recently, researchers got access to a nasty botnet, Torpig (original paper: Your Botnet is My Botnet: Analysis of a Botnet Takeover). A few months earlier researchers hijacked the Storm Worm and looked at its profitability (original paper: Spamalytics: An Empirical Analysis of Spam Marketing Conversion). Both papers are fascinating but terrifying reads.

Some findings:

  • In 10 days, a botnet running on 160,000 machines stole credentials for over 8,000 bank accounts.
  • About 1 in 10 people who open a spam email click through to get infected by the malware.
  • 350 million spam emails resulted in only 28 sales, but the average purchase was $100.
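To put the last finding in perspective, here is the back-of-the-envelope math, using only the rounded figures from the list above (350 million emails, 28 sales, about $100 per sale). The function name is made up, and the revenue is a rough figure for that one measured campaign.

```javascript
// Rough spam economics from emails sent, sales made, and average purchase size.
function conversionStats(emailsSent, sales, avgPurchaseUsd) {
  return {
    conversionRate: sales / emailsSent,   // fraction of emails that led to a sale
    emailsPerSale: emailsSent / sales,    // emails needed for one sale
    revenueUsd: sales * avgPurchaseUsd    // approximate gross revenue
  };
}

var stats = conversionStats(350e6, 28, 100);
console.log(stats.emailsPerSale, stats.revenueUsd);
```

That works out to one sale for every 12.5 million emails, or roughly $2,800 in sales, which only pays off because sending through a botnet costs the spammer next to nothing.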

How do these botnets get control of machines? How do they make money? Whether it’s a spammer who needs to get someone to make a purchase on a website or a scammer stealing credit card numbers, passwords, and other information, ultimately you need to get someone to a bad website. Think about all the paths you might take to different sites during the day:

  • Via a web search
  • Clicking on a link in an email
  • Going directly to a favorite site
  • Clicking through an ad

Spammers and scammers try to take advantage of all of those methods, and given the huge volumes of machines at their disposal, it’s a wonder search engines, spam filters, and advertising systems protect users as well as they do now. Between the first and third bullet point above, there’s a huge motivation to hack otherwise good sites to inject drive-by download malware – it can happen to anyone.

So what can we do about it? I think it ultimately comes down to a combination of smarter automated methods, better ways to establish trustworthiness, and removing the economic incentives for spamming, identity theft, and hacking. I have a few posts in mind about some current tools that help with the trust issue and how we might be able to build a social web of trust.

This isn’t a new discussion. Tim Berners-Lee has been writing about the web of trust since the 1990s. But all the work done since then has yet to really solve these problems. And really, so long as a few people are willing to click on a malware link or buy drugs via a spam email, it will never stop.

LED Bulbs vs. Compact Fluorescent: Part II

Friday, May 1st, 2009

I wanted to revisit an earlier post comparing LEDs, CFLs and traditional incandescent bulbs. I found two different values for the power and light output of the Lemnis Lighting Parox II bulbs, and some folks at work were wondering the same thing.

I decided to bust out my trusty Kill-A-Watt and see how much power the bulb was really drawing.

I watched the meter for a bit and it never went above 4 Watts. So that’s a bit of a bonus. Out of curiosity I decided to plug my CFLs in and see how much power they actually drew.

The 15W CFL spiked to 18W for a second but then settled in at 12W. After a while it climbed up to 13W and would have presumably stayed there. The 7W CFL globe settled at 5W. The incandescent was the odd one of the bunch, measuring 63W instead of 60W. So when you replace those old lightbulbs, you may be saving a little more than you think.
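Here's the "saving a little more than you think" math as a quick sketch. The wattages are the Kill-A-Watt readings above; the 3 hours per day is purely an assumption for illustration, and the function names are made up.

```javascript
// Watts saved by swapping one bulb for another.
function wattsSaved(oldWatts, newWatts) {
  return oldWatts - newWatts;
}

// Energy over a year for a given draw, in kilowatt-hours.
function kWhPerYear(watts, hoursPerDay) {
  return watts * hoursPerDay * 365 / 1000;
}

var HOURS_PER_DAY = 3; // assumed usage, not a measurement

var ratedSaving = wattsSaved(60, 15);     // by the numbers on the box
var measuredSaving = wattsSaved(63, 13);  // by the meter
console.log(ratedSaving, measuredSaving, kWhPerYear(measuredSaving, HOURS_PER_DAY));
```

Going by the labels, swapping the incandescent for the 15W CFL saves 45W; going by the meter it's 50W. With the LED, which never read above 4W, the measured saving is at least 59W.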

Here’s the updated spreadsheet:

Again, the total lumen output might not be directly comparable because the LED bulbs really only emit light from a half globe, while the other bulbs cast light in almost all directions. Depending on the fixture this might make the LED seem brighter in comparison.