Getting the word out about spam profiles and other social network abuse

June 28th, 2009

Just a quick post to point out an article I wrote on the Google Webmaster Central Blog, Spam2.0: Fake user accounts and spam profiles. This is a large and growing problem but a lot of folks I’ve talked to didn’t realize they had fake user accounts on their own sites. Excerpt:

Spammers create fake profiles for a number of nefarious purposes. Sometimes they’re just a way to reach users internally on a social networking site. This is somewhat similar to the way email spam works – the point is to send your users messages or friend invites and trick them into following a link, making a purchase, or downloading malware by sending a fake or low-quality proposition.

Spammers are also using spam profiles as yet another avenue to generate webspam on otherwise good domains. They scour the web for opportunities to get their links, redirects, and malware to users. They use your site because it’s no cost to them and they hope to piggyback off your good reputation.

The article got a write-up in Information Week, which is pretty cool. Any way to let more people know about the issue is welcome.

Recommendations for an easy, automatic blogging system?

June 15th, 2009

I’m looking for some help and suggestions, but first a little background on my latest project.

I’m a bit of a map geek – I’m fascinated by maps and how data can be illustrated with maps. I periodically post things on this blog but I actually run across a lot more cool map apps than I can share in mid- to long-form blog posts here.

I use a number of different social bookmarking and social news sites – it’s a research interest of mine, so I probably have accounts on far too many of them. When I come across a blog post on a cool old map or some interesting new real estate geodata site I’ll save/share it in a number of places, including StumbleUpon, Delicious, Reddit, and sometimes others. I also share things via Google Reader.

This is far too diffuse, so I thought I might make a separate mini-blog just for map geekery. But I already spend more than enough time with the blogs and services I’m using now – I’m only able to support another blog if I can automate some part of this giant messy workflow.

This would be pretty similar to how I manage my microblogging / status updates now. I have my Google Reader items posted to FriendFeed, which updates Twitter, which updates Facebook via the Facebook Twitter app. Convoluted, but now that it’s set up I can post something once and have it seen by friends on different services.

I’ve played around with a few different services:

Tumblr – Tumblr makes it very easy to import feeds, which is great for what I’m looking for. The only drawbacks are that so far I can’t narrow down some feeds to really target map bookmarks and I don’t see any easy way to add geodata.

Vox – I’ve only played around with it a bit, but I’m not sure what sets Vox apart from other blog hosts.

WordPress.com – Actually, I thought this would be perfect given the right plugins, but WordPress.com doesn’t have plugins. Setting up and managing yet another WordPress instance doesn’t sound too appealing.

Blogger – Blogger is great, and I should probably use it a bit more considering it’s a Google product. Unfortunately everything I saw in a quick search about posting to Blogger from RSS showed up on somewhat questionable SEO blogs, so I’m wary.

So I’m still looking. Any recommendations on what would be the easiest tiny-blog system to use?

The 5 People Who Could Destroy Twitter

June 5th, 2009

I’m a fan of Twitter – it can be really useful. But status update services and microblogging are relatively young technologies. Twitter is the frontrunner now, but it’s still possible that everything could go south really fast. Here are five people (or more accurately, types of people) who could destroy Twitter and what can be done to stop them.

The list is in no order, except I’ve saved the most dangerous for last.

1. Spammers

Twitter spam is growing, and my guess is it’s a profitable business to be in. Spammers are getting crazy refollow-rates with very little effort put into their fake profiles. Part of this is a technical problem, with Twitter playing catchup to the collective innovative power of the greediest jerks on the internet. The more difficult part is social – users’ trust barriers are too low. Either Twitter finds ways to deal with this, or people will start treating reply tweets, direct messages, and invites the same way they do unsolicited emails now. One of the reasons I stopped logging in to MySpace was a flurry of fake friend requests that followed every session. Twitter runs that risk, in addition to the risk of service degradation.

What can be done? The good news is that no communication medium can be considered successful until someone has tried to send you unsolicited marketing and scams over it. But the Twitter team needs to redouble their efforts and head off potential problems proactively. For example, there are lots and lots of apps built on top of Twitter’s API – and almost all of them ask for your username and password. How long until one of those apps is compromised, or worse, scammers make password-phishing apps of their own? Twitter needs to implement strong API keys or something like OpenID.

2. Anyone who uses url shortening services.

It’s hard to fit both a witty observation and a url in 140 characters, especially given url inflation. Bit.ly, Tinyurl, and the like perform the valuable service of giving you more space. They also cloak the destination of almost all the links on Twitter and get everyone used to following links blindly. I’ve already had friends whose accounts were hacked in order to send out a tweet like: “Check out this hilarious video: http://tiny/innocuousgibberish”. The New York Times’ account has been hacked, among others. Twitter can work on improving security and removing spam, but the more everyone uses url shorteners the more we train our friends to click recklessly. I’m as guilty on this one as anyone.

What can be done? People post links to Twitter frequently enough that maybe the link should be a separate field with its own character limit. If that’s too much complication for the brilliantly simple interface, maybe url previews should be enforced. Clients can do this now, but to be safe it should be done by Twitter.

3. Pirates, ninjas, zombies, and mafia thugs

Ah, I remember logging into Facebook the day I got my first “robots vs. hobos vs. Chuck Norris vs. etc.” request. “Ha,” I thought, “that’s a somewhat entertaining way to extend an internet meme into a social networking site.” Little did I know the horror that was about to unfold.

In all seriousness, the “tag, you’re it” games and gratuitous survey apps didn’t ruin Facebook, but they did make everything a bit more tedious. Those apps still fit within the umbrella of social networking – they don’t work at all in Twitter’s use model. When I log in, I want to see, very quickly, what the people I’m interested in are doing or reading. I don’t want to weed through their halves of various games I’m not interested in.

What can be done? This one is up to us – just don’t do it. Twittering with a hashtag for an event, a theme, etc. is fun and useful to others. Sending around vampire bites is not.

4. Chinese government officials

Think periodic fail whale sightings are bad for Twitter’s reliability? China can (and does) just block the whole site, most recently in advance of the Tiananmen Square anniversary. Why does this matter? China is a huge market, and growing. The days where being big in the U.S. meant major marketshare on the whole web are running short. What’s worse, countries with theoretically free speech like Australia are following the Chinese model, proposing national internet content control (i.e. censorship).

What can be done? Many American companies just give up. Even Google has had to bend to government pressure. This is not easy to remedy. Perhaps there’s a way to take advantage of the small byte size of tweets, decentralize serving, and wrap access with something like Tor to get it through the Great Firewall. Let’s hope there’s a grad student or genius hacker out there with the right idea and Twitter is smart enough to hire them.

And finally, the absolute worst, most pressing threat to Twitter’s survival is…

(drumroll….)

5. Your mom

Despite the allure of turning this into one big “your mom” joke, I am completely serious. What happens when your mom joins Twitter? Do you censor yourself? Take your tweets private? Delete off-color tweets from your recent past?

There’s no right answer. Just about any social software eventually runs into this dilemma where the very different ways you communicate personally, professionally, and publicly collide.

What can be done? Some of the problem might fade as the userbase of sites like MySpace, Facebook and Twitter ages. But that will take years, so what can Twitter do now? It might help to have better relationship management. You could at least put your friends in one group and family in another. But in general, this strikes me as the toughest problem of them all – I don’t think there are any real solutions for the general possibility of parental embarrassment, or if there are, the combined efforts of every teenager in the world have yet to discover them.

Disagree? Any threats I missed? Please post in the comments below.

Sick of compliment spam on your blog?

May 31st, 2009

One of the great things about having a blog is getting comments on your posts. It’s particularly gratifying when someone takes the time to tell you that your post was helpful, entertaining, or well-written.

Spammers know this and exploit it by generating compliment spam. They’ll put together a few lines of general praise and slather them across the web, hoping that bloggers will fall for the trick and post their spammy links.

Abusive social engineering like this really annoys me, so when in doubt I always do a Google exact phrase search to see if the compliment is really for me and not from a bot. This is tedious, so I created a simple WordPress plugin: O RLY Comment Spam Search.
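The manual check boils down to wrapping a chunk of the comment in quotes and searching for it. Here is a minimal JavaScript sketch of building that search URL. It is not the plugin's actual code, which isn't shown in this post; the function name `exactPhraseSearchUrl`, the 12-word cutoff, and the `https://www.google.com/search?q=` endpoint are all assumptions for illustration.

```javascript
// Hypothetical helper: build a Google exact-phrase search URL for a comment.
// Assumption: the first few words of a comment are enough to find duplicates.
function exactPhraseSearchUrl(commentText, maxWords) {
  var limit = maxWords || 12; // illustrative cutoff, not a tuned value
  var phrase = commentText
    .replace(/\s+/g, ' ')     // collapse runs of whitespace and newlines
    .trim()
    .split(' ')
    .slice(0, limit)          // keep only the first `limit` words
    .join(' ')
    .replace(/"/g, '');       // drop quotes so the phrase stays one quoted unit
  // Wrapping the phrase in double quotes asks for an exact match.
  return 'https://www.google.com/search?q=' +
    encodeURIComponent('"' + phrase + '"');
}
```

If the same praise turns up word for word on lots of unrelated blogs, it probably wasn't written for you.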

You can get the plugin directly from WordPress.org, where you can also give it a rating to tell other webmasters how great (or non-great) it is. By the way, the plugin browser/installer added in WordPress 2.7 is very cool, and makes it much easier to try out plugins.

Judging by the thousands of blogs my O RLY searches have found, this sort of spam works. But why do spammers do it? Since WordPress (and most major blog systems) nofollow links in comments by default, the spammers can’t expect to gain any PageRank from these links. My guess is most of this spam is either intended to get traffic via clickthroughs or is generated by naive site owners, SEOs and marketers who don’t really understand how things work.

Take a look and let me know if it’s useful in the comments below. Also, let me know if it’s breaking on certain comments or otherwise buggy.

TinyUrl Trouble: Greasemonkey drops the location header in GM_xmlhttpRequest

May 21st, 2009

I get a lot of ideas. Most of them wander aimlessly in my head until they become obsolete, but once in a while I’ll get an idea that seems useful and simple enough to do in my free time.

If you’ve used Twitter, you’ve seen the myriad of url shortening services like TinyUrl and Bit.ly. Url shortening services are a kludge and they break one useful, built-in feature of the web, which is the ability to know where you’re going when you click a link.

So I thought, this is something that I could fix in an hour or so with a Greasemonkey script. If you have no idea what I’m talking about, Greasemonkey is a Firefox extension that runs in your browser and lets you run your own JavaScript on pages you load. Greasemonkey comes with a handy-dandy AJAX function called GM_xmlhttpRequest.

I figured all I have to do is grab all the anchors on the page, see if they match a list of shortener urls, do an xmlhttpRequest for each one and grab the final location (after the service finishes with its redirecting) from the headers.

Something along these lines:

// Request a shortened url and log the status and headers that come back.
function getTargetUrl(short_url) {
  GM_log('Getting ' + short_url);

  GM_xmlhttpRequest({
      method: 'GET',
      url: short_url,
      headers: {
          'User-agent': 'Mozilla/4.0 (compatible) Greasemonkey',
          'Accept': 'text/html'
      },
      // Called once the request has completed.
      onload: function(responseDetails) {
          GM_log('Done.  Status ' + responseDetails.status +
                ' Text ' + responseDetails.statusText + '\n\n' +
                ' Headers:\n' + responseDetails.responseHeaders);
      }
  });
}
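
The "see if they match a list of shortener urls" step could look something like the sketch below. The names `SHORTENER_HOSTS`, `isShortenedUrl`, and `findShortenedLinks` are hypothetical, the host list only covers the two services mentioned above, and the code assumes a browser with the standard `URL` constructor.

```javascript
// Hypothetical list of shortener hostnames; extend as needed.
var SHORTENER_HOSTS = ['tinyurl.com', 'bit.ly'];

// Return true if the link's hostname is a known shortener.
function isShortenedUrl(href) {
  var host;
  try {
    host = new URL(href).hostname.toLowerCase();
  } catch (e) {
    return false; // not an absolute url we can parse
  }
  host = host.replace(/^www\./, ''); // treat www.bit.ly like bit.ly
  return SHORTENER_HOSTS.indexOf(host) !== -1;
}

// Pick out the hrefs worth resolving from a list of anchor hrefs.
function findShortenedLinks(hrefs) {
  return hrefs.filter(isShortenedUrl);
}
```

In a user script, the hrefs would come from the page's anchors (for example by reading `href` from each entry in `document.links`), and each match would be passed to `getTargetUrl`. Matching on the parsed hostname rather than a substring avoids false hits on urls that merely contain "tinyurl.com" somewhere in the path.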

Read the rest of this entry »

Touring Google’s solar panel installation

May 17th, 2009

This past week I got a chance to take an up-close look at the solar panels up on the roof at work. My building doesn’t actually have solar panels yet, but many of the main campus buildings and carports do. I grabbed one of the campus bikes and headed over to meet up with the green committee and take a quick tour. Since I’m writing about my place of employment, the standard disclaimer applies.

Tour of the solar panels at Google

At the time the panels were installed, it was the largest corporate PV project in the country. Coming from Ohio to a sunny state like California, it’s a little surprising that Google’s 1.6MW project in 2007 was so groundbreaking. I think part of the problem has been that it takes a few years to see the return on the investment, and for the past decade or so everyone’s been so caught up in the short term. The company expected it would take 7.5 years for the system to pay for itself, but given the way utilities charge for peak load I hear we’re even ahead of that mark.

Here are a few closer shots of the panels:

Read the rest of this entry »

Seeing more spammers on Twitter lately?

May 12th, 2009

It was inevitable. As Twitter has grown and started pushing into the mainstream, spammers have started ramping up abuse. At first glance, Twitter isn’t the most obvious target – you actually have to follow someone to get content from them, users don’t generally search it for high-cpc stuff like meds and lawyers, and how much spam can you really get into 140 character messages?

But I’m seeing more invites from users like the one below:

Seeing a lot more spammers on Twitter lately...

First: What is Twitterspam? How do I know this is a spammer?

When it comes to spam, most people “know it when they see it,” but it’s helpful to look at the specific signals that this user might not be worth talking to. First off, they have 180 followers and yet haven’t posted a single update. The photo is a dead giveaway. The bio is actually pretty well-done, it’s in English and it’s not outlandish, but the homepage link (http://my-pictures.no.tp/tlow/) – she’s in Portuguese Timor?

Second: Why spam Twitter?

Spammers have two reasons to abuse Twitter: monetary payoff, and because it works.

How can they make money by tweeting a bunch of random people? Well in this case they aren’t, at least not yet. The payoff has to be through the homepage link, which I’m not following and you shouldn’t either. You get a friend invite on a system that, so far, has been a medium of immediate, short, personal communication. Your trust barriers thus weakened, you at least want to see who it is. They don’t have any updates yet, so you click the homepage link and… Virus. Or a maze of PPC affiliate pages and redirections.

Above I said spammers are hitting Twitter because it’s working. How do I know? Look at the number of followers, and the ratio of people followed to followers. About 22 percent of the people spammed so far have responded. I don’t know how many click through to the home page link, but if half the people bother to go that far they’ve got an amazing success rate for spam.
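
The arithmetic behind that estimate can be sketched as follows. The 180 followers come from the profile described earlier; the number of accounts the spammer follows is not quoted in the text, so the 820 below is a hypothetical figure chosen only because it is consistent with a response rate of about 22 percent.

```javascript
// Fraction of followed accounts that followed back.
function followBackRate(followers, following) {
  if (following <= 0) return 0; // avoid dividing by zero
  return followers / following;
}

var followers = 180; // from the profile described in the post
var following = 820; // hypothetical: not stated in the post

var rate = followBackRate(followers, following);

// The "if half the people bother to go that far" scenario.
var clickRate = rate * 0.5;

console.log((rate * 100).toFixed(0) + '% follow back');
console.log((clickRate * 100).toFixed(0) + '% click through');
```

Under those assumptions roughly one in nine of the people spammed would end up on the homepage link, which is enormous compared with email spam.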

I wish Twitter luck. I know a few people over there, and they’ve got their work cut out for them. This sort of thing isn’t easy to fight; it’s an ongoing process. They’ve already taken some visible steps, like using rel="nofollow" on the Bio link, which at least keeps away blackhat SEOs looking for sources of PageRank. They’ll probably have to do more, most of it on the backend where you and I will never be the wiser. Happy spamfighting!

How spam and malware botnets work – two papers

May 5th, 2009

I read two reports today about large-scale botnets that really pointed out that security is still an open problem on the web. Recently, researchers got access to a nasty botnet, Torpig (original paper: Your Botnet is My Botnet: Analysis of a Botnet Takeover). A few months earlier researchers hijacked the Storm Worm and looked at its profitability (original paper: Spamalytics: An Empirical Analysis of Spam Marketing Conversion). Both papers are fascinating, but terrifying reads.

Some findings:

  • In 10 days, a botnet running on 160,000 machines stole credentials for over 8,000 bank accounts.
  • About 1 in 10 people who open a spam email click through to get infected by the malware.
  • 350 million spam emails resulted in only 28 sales, but the average purchase was $100.
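
To put the last finding in perspective, here is the back-of-the-envelope arithmetic as a short sketch. The three inputs are the figures quoted above; the result is gross revenue for that one campaign, not profit, and says nothing about the spammers' costs.

```javascript
var emailsSent = 350e6;    // from the findings above
var sales = 28;            // from the findings above
var averagePurchase = 100; // dollars, from the findings above

var conversionRate = sales / emailsSent; // sales per email sent
var emailsPerSale = emailsSent / sales;  // emails needed for one sale
var revenue = sales * averagePurchase;   // gross revenue in dollars

console.log('Conversion rate: ' + (conversionRate * 100).toFixed(6) + '%');
console.log('Emails per sale: ' + emailsPerSale);
console.log('Gross revenue: $' + revenue);
```

That works out to one sale per 12.5 million emails. Numbers like that only make sense when sending is close to free, which is exactly what a botnet of hijacked machines provides.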

How do these botnets get control of machines? How do they make money? Whether it’s a spammer who needs to get someone to make a purchase on a website or a scammer stealing credit card numbers, passwords, and other information, ultimately you need to get someone to a bad website. Think about all the paths you might take to different sites during the day:

  • Via a web search
  • Clicking on a link in an email
  • Going directly to a favorite site
  • Clicking through an ad

Spammers and scammers try to take advantage of all of those methods, and given the huge volumes of machines at their disposal, it’s a wonder search engines, spam filters, and advertising systems protect users as well as they do now. Between the first and third bullet point above, there’s a huge motivation to hack otherwise good sites to inject drive-by download malware – it can happen to anyone.

So what can we do about it? I think it ultimately comes down to a combination of smarter automated methods, better ways to establish trustworthiness, and removing the economic incentives for spamming, identity theft, and hacking. I have a few posts in mind about some current tools that help with the trust issue and how we might be able to build a social web of trust.

This isn’t a new discussion; Tim Berners-Lee has been writing about the web of trust since the 1990s. But all the work done since then has yet to really solve these problems. And really, so long as a few people are willing to click on a malware link or buy drugs via a spam email, it will never stop.

LED Bulbs vs. Compact Fluorescent: Part II

May 1st, 2009

I wanted to revisit an earlier post comparing LEDs, CFLs and traditional incandescent bulbs. I found two different values for the power and light output of the Lemnis Lighting Parox II bulbs, and some folks at work were wondering the same thing.

I decided to bust out my trusty Kill-A-Watt and see how much power the bulb was really drawing.

I watched the meter for a bit and it never went above 4 Watts. So that’s a bit of a bonus. Out of curiosity I decided to plug my CFLs in and see how much power they actually drew.

The 15W CFL spiked to 18W for a second but then settled in at 12W. After a while it climbed up to 13W and would have presumably stayed there. The 7W CFL globe settled at 5W. The incandescent was the odd one of the bunch, measuring 63W instead of 60W. So when you replace those old lightbulbs, you may be saving a little more than you think.
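
For a rough sense of what those readings mean over a year, here is a small sketch. The wattages are the measured values above (4W for the LED is the highest reading seen, so treat it as an upper bound); the three hours of use per day and the $0.12 per kWh price are assumptions for illustration only, so plug in your own.

```javascript
// Measured draw in watts, from the Kill-A-Watt readings above.
var measuredWatts = { incandescent: 63, cfl15: 13, cflGlobe7: 5, led: 4 };

// Assumptions for illustration only: daily usage and electricity price.
var HOURS_PER_DAY = 3;
var DOLLARS_PER_KWH = 0.12;

// Energy used in a year, in kilowatt-hours.
function annualKwh(watts) {
  return watts * HOURS_PER_DAY * 365 / 1000;
}

// Dollars saved per year by swapping one bulb for another.
function annualSavings(oldWatts, newWatts) {
  return (annualKwh(oldWatts) - annualKwh(newWatts)) * DOLLARS_PER_KWH;
}

console.log('CFL saves $' +
  annualSavings(measuredWatts.incandescent, measuredWatts.cfl15).toFixed(2) + ' per year');
console.log('LED saves $' +
  annualSavings(measuredWatts.incandescent, measuredWatts.led).toFixed(2) + ' per year');
```

This only compares power draw; it doesn't account for bulb price, lifetime, or differences in light output.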

Here’s the updated spreadsheet:

Again, the total lumen output might not be directly comparable because the LED bulbs really only emit light from a half globe, while the other bulbs cast light in almost all directions. Depending on the fixture this might make the LED seem brighter in comparison.

Thoughts on Blog Usability

April 29th, 2009

I’ve been kicking around the idea of redesigning my homepage and blog, though I’m not sure I really have the free time to do it. To start, I thought I would put down a few thoughts about applying usability principles when designing blogs.

When you start thinking about usability it’s tempting to jump right into lists of principles and rules of thumb. It’s a little silly applying Fitts’s Law when you haven’t even established what you want your site to accomplish in the first place. So what, generally, do you want your blog to do?

Personal Goals

  • Share thoughts and work with others
  • Collect a body of work to represent myself (like a portfolio)
  • Collect information for later discovery (by myself and others)
  • Provide an outlet to keep practicing writing
  • Allow others to communicate with me and comment

If you’re creating or redesigning a blog for a company, the goal set may be very different. Below are some examples that don’t actually apply in my case.

Business goals

  • Communicate with customers
  • Build long term relationships with customers
  • Produce quality content to drive search traffic
  • Generate revenue through advertising
  • Etc.

Many projects don’t even get this far before the graphic designers and web developers are already making mock-ups, but we still have one more important step to do. We know why you’re building a blog, but why are users coming to it?

Read the rest of this entry »
