Tag Archives: Google

How my site disappeared from Google search

Seen my personal blog lately? Probably not, if you were searching via Google. Major sections of my site have been disappearing from the search index over the past three weeks. My homepage, my blog and many of the most recent articles on it no longer show up in result pages. I’m no Matt Cutts, but I get a fair number of people coming to my site when searching for info about Google search, avoiding scams, and how to name their baby. All that traffic has been slipping away.

You can probably imagine how you would feel if this were happening to you. Does Google hate me? Was my site hacked? What do I do, and how much will it cost to get this fixed?

I will answer all of those questions, starting with the first:

My site is falling out of the index, does Google hate me?

Probably not. My situation is actually a good illustration: I’m pretty sure Google doesn’t hate me and isn’t unfairly slapping my site down because, well, I work at Google.

That’s right, Google was kicking pages from one of its own employees out of search results. I’m sure I’m not the first. Google doesn’t treat my site any differently than anyone else’s. BTW, standard disclaimers apply to this post.

So I knew there was probably a logical reason for the dropped pages, which brings me to the next question:

Continue reading

Important post on the Google blog about Google’s future in China

If you haven’t heard, there’s big news on the Google Blog about malicious attacks and Google’s future in China. Please take a minute to read the post.

I wanted to add three things:

  1. I work for Google fighting abuse, but I’m not involved in this so I can’t tell you anything more than what you see on the blog. If I were involved, then I definitely couldn’t tell you more. Standard disclaimers apply.
  2. I am very proud to work for a company with such a commitment to openness and free speech.
  3. I’ve worked with some folks from our Beijing office, and in my experience they are smart, capable people committed to serving users and helping people get the information they are searching for. I hope everything works out for them.

 

How to get Google search results for academic research

A few years ago, before I was a Googler, I was a grad student doing research on information retrieval. I wanted to compare the results of Google and other search engines with folksonomies from social bookmarking sites. It sounds pretty simple – Google does lots of internal search quality studies, so it’s not too surprising that outside researchers would want to execute lots of queries and use the results in their data.

The way I did it was… not optimal, to say the least. I wrote a bunch of PHP code, spaced out participant sessions, etc. to make sure I could get results back. Google tries to make sure that spammers aren’t scraping search results to generate webspam, so any kind of scraping with cURL, Beautiful Soup, etc. can result in a big pile of failure.

The way I did it wasn’t the right way or the easy way, so when I got the job I made a mental note to ask around for the best way to get search results. Then I forgot all about it until an email exchange with Gary Warner of CyberCrime & Doing Time fame.

It turns out Google has a great University research program and API. You have to apply for registration and let us know who you are, what school you’re affiliated with, and what you plan to study. Assuming everything checks out you’ll get access to a pretty nice API. There’s some example Python code, but you could just as easily use PHP, Java, or whatever to consume the XML responses.
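
For a rough idea of what consuming an XML response looks like, here is a minimal Python sketch using only the standard library. To be clear, the sample payload and the element names (results, result, rank, title, url) are invented for illustration. They are not the API's actual schema, so treat this as a pattern to adapt once you have real responses in hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body. These element and attribute names are
# assumptions made up for this example, not the real API schema.
SAMPLE_XML = """<results query="social bookmarking">
  <result rank="1">
    <title>Example page one</title>
    <url>http://example.com/one</url>
  </result>
  <result rank="2">
    <title>Example page two</title>
    <url>http://example.org/two</url>
  </result>
</results>"""


def parse_results(xml_text):
    """Return a list of (rank, title, url) tuples from the hypothetical XML."""
    root = ET.fromstring(xml_text)
    rows = []
    # findall("result") matches direct children of the root element.
    for node in root.findall("result"):
        rank = int(node.get("rank"))
        title = node.findtext("title", default="")
        url = node.findtext("url", default="")
        rows.append((rank, title, url))
    return rows


if __name__ == "__main__":
    for rank, title, url in parse_results(SAMPLE_XML):
        print(rank, title, url)
```

Once the results are in plain tuples like this, it is easy to dump them to CSV or a database and compare them against whatever other data set you are studying.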

And that research I was doing? I recently noticed that my paper has been cited 7 or 8 times, according to Google Scholar. I used to joke that I had written the least influential paper in the history of academic publishing, but I guess I can’t claim the title anymore. Scopus only shows 4 citations so I will remain humble anyway.