Archive for June, 2008

Three Reasons to Go Get Firefox 3.

Wednesday, June 18th, 2008

Firefox 3 is officially out, so go and get it.  Wondering why you should be excited about a new web browser?  Here are three quick reasons why you should go get Firefox 3 now:

1.  It is much, much faster when it comes to complicated JavaScript, AJAX, and multiple iframes.  I don’t have any benchmarks on me, but I do some pretty intense stuff with Firefox and the improvement is immediately apparent.  This is very important because even normal web browsing is becoming pretty intensive, from Google Maps to Gmail to normal blogs with 100 widgets plastered on their sidebars.

2.  It’s even easier to manage add-ons and downloads.  The real power of Firefox is the ease of creating and installing extensions, and the interface has been improved, making it easier to find new add-ons.  The download manager has been polished as well, which should help end the old “where did that file go” blues.

3.  The smart address bar is very cool.  I almost never have to finish typing URLs anymore…  and it gives me immediate feedback on typos as well.  Hopefully this will put a damper on lame business models like typosquatting.

The Urge to Deletion: Is Wikipedia making molehills out of mountains?

Tuesday, June 17th, 2008

Wikipedia is great.  Even now, it’s still kind of amazing that such a huge body of knowledge has been organized ad-hoc by volunteers, most of whom have never met in person. Most social software systems would die for this level of collaboration.

That said, has anyone else gone to a random Wikipedia article from, say, search results and ended up a little depressed?  It seems like every other article I find lately has a big warning label at the top – this article contains too much trivia, this article has too many fictional references for an encyclopedic and academic approach of this topic, and the worst one of all: this article has been marked for deletion.

I understand that it must be very difficult to wrangle all the millions of contributions into a consistently high-quality encyclopedia.  Just dealing with all the spam and abuse must be an enormous undertaking, even when distributed among thousands of good samaritans.  But one of the things that was great about Wikipedia was the breadth of coverage and the depth on some particulars, even if it was excessive to the point of comedy.

A brief look at the list of articles marked for deletion over the last few days illustrates my point.

1. Horse Ranch Mountain. You know there’s something wrong when a mountain doesn’t meet the notability requirement.   Here’s the comment opening the deletion on the talk page:

In what way is Horse Ranch Mountain notable? I am quite familiar with the area, and I cannot think of any way in which it is notable. Please convince me otherwise.

I would think it’s notable because it is a mass of millions of tons of rock and earth sticking out of the ground.  On a less sarcastic note, I’m sure I’m not the only one who’s looked at a map, spotted a feature I’ve never heard of, then looked it up online.  Even if it’s not accessible it’s probably helpful to have a reference noting that it’s the highest point in Zion, measured at X meters tall, etc.

2.  List of redundant expressions. I understand the argument that an encyclopedia is not a trivia game or a book of lists, but these sorts of pages used to be one of my favorite features of Wikipedia.  Exhaustive lists of palindromes, English words of Polish origin, etc., give examples, context, and can help connect concepts in language.  Also, the use or omission of redundancy is an important stylistic consideration when writing – it can be used for everything from emphasis to characterization.

3.  Hindu literature. Delete the article on Hindu literature?  Granted, the article needs work.  But isn’t it worrying how the marked for deletion pages are filled with subject matter from outside the U.S. and maybe Europe?

I know the standard answer to complaints like these is that if you feel so strongly, you should participate in the debates and push for things not to be deleted.  Judging by the talk pages I wonder if I would be drowned out by all the “I’m a history major and this is a programming term, never heard of it, not notable” comments.  I’ll admit my contribution to Wikipedia is limited to random spelling and grammar corrections that were obvious enough that even I noticed them, so I could be wrong.  I just feel like some of what made Wikipedia so addictive is slowly being drained away.

Agree?  Think I’m wrong?  Leave me a comment below.  See, it’s kind of like a talk page, but even with consensus you can’t edit my article.   Until the next WordPress exploit comes out.

An interesting use of Greasemonkey – Troubleshooting other people’s sites

Wednesday, June 11th, 2008

I’ve played around with Firefox’s Greasemonkey add-on here and there but never really delved into it until recently.  I found most of the common uses for it to be either too specific to someone else’s use habits or already covered by other extensions.  For example, there are probably a million ad blocking scripts out there, but I already have Adblock.

I’ve grown to appreciate Greasemonkey a lot more since I learned that you can make AJAX calls in scripts – now we can do some real damage.  But this post is not about that, it’s about a totally different use case that I hadn’t thought of before.

If you’re a web developer with any friends or family you’ve probably heard this one before:

“Something’s wrong with my web site, can you take a look?”

Often, though, you won’t have access to a dev server, database, or even a copy of the server-side code.  All you can see is the HTML and JavaScript source and the HTTP transactions going back and forth.

Greasemonkey can’t rewrite PHP code on someone else’s server, but it does make it really, really easy for you to alter forms, delete and change cookie values, and patch and debug JavaScript on the site you’re looking at, without changing any other variables.

This can be really, really useful in some situations.  So now it’s officially added to my volunteer/web-developer/brother-in-law toolbelt.
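
To make the "patch and debug" part a bit more concrete, here is a minimal sketch in TypeScript (user scripts themselves are plain JavaScript, so the type annotations would be dropped in practice).  The helper names patchMethod and expiredCookie are made up for illustration and nothing here is tied to a particular site.  In a real user script the target would typically be an object on the page, reached through Greasemonkey's unsafeWindow, and the cookie string would be assigned to document.cookie.

```typescript
type AnyFn = (...args: any[]) => any;

// Hypothetical helper: wrap target[name] so each call and result is logged.
// Returns a function that restores the original method.
function patchMethod(
  target: Record<string, any>,
  name: string,
  log: (msg: string) => void = (msg) => console.log(msg),
): () => void {
  const original: unknown = target[name];
  if (typeof original !== "function") {
    throw new TypeError(`${name} is not a function`);
  }
  const fn = original as AnyFn;
  target[name] = function (this: unknown, ...args: any[]): any {
    log(`${name} called with ${JSON.stringify(args)}`);
    const result = fn.apply(this, args);
    log(`${name} returned ${JSON.stringify(result)}`);
    return result;
  };
  return () => {
    target[name] = original;
  };
}

// Hypothetical helper: build the string that expires a cookie when assigned
// to document.cookie (a cookie is deleted by setting an expiry in the past).
function expiredCookie(name: string, path: string = "/"): string {
  return `${encodeURIComponent(name)}=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=${path}`;
}
```

The point of returning an undo function is that you can switch the logging off again without reloading the page, which keeps everything else about the session unchanged.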

Why have a website, why create a blog, why Twitter?

Monday, June 9th, 2008

My esteemed colleague Beah just started blogging, and opened her blog with a very important question – Why Blog?  I remember people asking a similar question years ago when I registered this domain – why would you want to have a website with your name on it?  Almost the same question has come to my mind recently when playing around with Twitter.

So, why blog?  With all the hundreds of thousands of blogs on the web you might think there’s no need to ask this question.  One of the best things about social science is asking questions about things that everyone takes for granted.  Unfortunately the “science” part of social science is a bit too time-consuming to finish up on a Sunday-evening blog post, so instead we’ll look at a few sites of friends and colleagues and maybe collect some thoughts on what motivates people to blog.

First, why do I blog here?  I try to keep this blog relatively professional, posting mostly on topics that I encounter in my work, in my academic research, and in my side projects (the standard disclaimer, as always, applies).  One of my motivations was sharing some of the research done for classwork – it seemed a shame to write up a report, turn it in to a professor, and then let it gather dust in some corner of my hard drive.  My undergrad degree was in journalism and I do miss writing, so that’s another motive.  Also, having been through some rough patches in my career during the dot-com downturn, I thought blogging might help me establish a bit of a professional brand.  I have my URL on my resume and I would hope that any company looking to hire me would get an idea that I’m knowledgeable and interested in relevant areas.

But I’m not a very random sample, so let’s look at a few other blogs and try to appreciate why they write.  I think I can place them into a few rough categories:

Personal takes on professional / technical interests:

This is largely where my blog falls.  Common post topics will include things like “how to get around an annoying issue with some software/programming language,” “very excited about the new device from Apple,” “report from a conference,” and “very disappointed with the new device from Apple.”

Public journaling to keep in touch with friends and family:

I’ve done this in the past as well – blogs taking the place of those old-fashioned mass emails you used to send out freshman year of college.  If you went to college in the ancient days before blogs and Facebook.  This is a place for both epic travelogues and saved IM conversations filled with inside jokes.

Sharing interests and reviews:

This category runs the gamut from folks who just want to show their friends a funny YouTube video to blogging a season of a TV show to reviewers writing prolifically about a very obscure musical genre.

Artistic or literary expression:

Self-publishing has opened the doors for artists and writers, both amateur and professional, to share their work with whatever audience they find.  This can run from virtual serial gallery shows to community-driven commentary and learning.

Of course these all overlap, and some blogs cover all the bases.  See KooKoo for KokoPuffs for an example.

So do we answer our question with a plethora of distinct motives for blogging?  Not necessarily.  There’s one theme that runs throughout all of the above – these are all social activities.  Ultimately blogging is human interaction.

Oh, and that other question – why use Twitter?  No clue.

Got a reason why you use Twitter?  Are you a co-worker angry at me for misconstruing your blog?  Please let me know in the comments below.

Google Earth vs. Reality, Revisited

Friday, June 6th, 2008

Last week I compared some real-life photos with the same scene in Google Earth.  Since I’m a bit of a computer/mapping/photography geek, I couldn’t resist doing a few more.  That actually ended up being a pretty popular post, with thousands of pageviews, which just goes to show I’m not the only combination computer/mapping/photography geek out there.

Here’s a view of San Francisco from Coit Tower on Telegraph Hill.  Follow this link to see larger versions in Flickr.  This one is even better than the two from last week – look how well the streets, buildings, and Golden Gate Bridge match with the photo.

Google Earth vs. Reality - San Francisco from Coit Tower

Now I’ll go a little more international.  Here’s a photo from the site of ancient Mycenae in Greece.  This is above the famous Lion Gate looking out at the hills surrounding the Argolid plain.  See larger versions in Flickr.  The aerial photograph that Google Earth maps to the topography isn’t as detailed as the real life photo, but even the borders of the olive groves line up.

Google Earth vs. Reality - Mycenae, Greece

These next two are not as identical as the San Francisco cityscapes, but are still impressive because of how well they evoke the real life scenes without 3-d buildings.

The first is from the Acropolis in Athens, looking out over the surrounding neighborhood.  Larger versions in Flickr.

Google Earth vs. Reality - Athens from the Acropolis

Here’s another shot from the Acropolis showing the new Acropolis Museum.  Larger versions in Flickr.

Google Earth vs. Reality - Athens and the new Acropolis Museum

If you feel like making some comparisons of your own, please let me know in the comments below – I’d love to see what other people could come up with.

Scientific proof that Reddit should add a tagging system

Tuesday, June 3rd, 2008

First, a disclaimer: the title of this post is obviously exaggerated. Proof is an awfully big word to throw around, and although I employed pretty good experiment design practices and statistical checks, I can’t really prove that Reddit should do this or that. But I can show that what they are doing now is not working, at least when it comes to search.

So, I got an email the other day letting me know that my article, Tagging and Searching: Search Retrieval Effectiveness of Folksonomies on the World Wide Web, is being published in the July 2008 issue of Information Processing and Management (here’s the official DOI link to the article). In the study I compared search performance between traditional search engines (like Google), subject directories (like Open Directory), and social bookmarking systems (like Reddit) and their folksonomies.

What’s a folksonomy? The word is a play on the term taxonomy – a taxonomy is a system of organizing and categorizing things, like the Dewey Decimal System. Taxonomies usually follow very strict rules and are controlled by experts. A folksonomy is a system of organization built by large numbers of regular users, who add things to the collection, evaluate them, and usually tag them with keywords.

Figure: IR system precision at cut-offs 1-20

In my study, the social bookmarking systems with tagging systems did surprisingly well – Del.icio.us was more precise than Open Directory, and at a cut-off of 20 results its precision was fairly close to that of the search engines.

Reddit, however, did not fare so well. It consistently had the lowest precision, meaning that searches returned very few relevant results. There could be many reasons for this, but the biggest difference between Reddit and the others is the lack of tags.
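
For anyone who has not run into the term, precision is the fraction of returned results that are relevant, and a cut-off of 20 means only the first 20 results count.  Here is a minimal sketch in TypeScript, assuming each result has already been judged relevant or not.  The function name and the sample judgments are made up for illustration; they are not data from the study, and the handling of short result lists is an assumption rather than the study's method.

```typescript
// Precision at cut-off k: relevant results among the first k, divided by
// the number of results examined.
// Assumption: if a search returns fewer than k results, this sketch divides
// by the number actually returned; some definitions always divide by k.
function precisionAtK(judgments: boolean[], k: number): number {
  if (!Number.isInteger(k) || k <= 0) {
    throw new RangeError("k must be a positive integer");
  }
  const examined = judgments.slice(0, k);
  if (examined.length === 0) {
    return 0; // no results returned, so nothing relevant was found
  }
  const relevant = examined.filter((isRelevant) => isRelevant).length;
  return relevant / examined.length;
}

// Made-up judgments for the top five results of one query.
const sample = [true, false, true, true, false];
console.log(precisionAtK(sample, 5)); // 0.6
console.log(precisionAtK(sample, 1)); // 1
```

Averaging this number over many queries at each cut-off would give one curve per system, which is one way a precision comparison like the one above can be drawn.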

Now, it’s possible that the folks at Reddit have no interest in search, or information retrieval in general. I think Reddit is very effective at bringing out new and interesting links on a daily basis and encouraging commentary (just my opinion, no stats to back that up). But I think it’s a big missed opportunity not to add tagging and see where it leads.

(One last disclaimer: this post is my personal opinion as someone who enjoys using Reddit and does not reflect on my employer. This post refers to research done independently as a grad student.)

XHTML 2 vs HTML 5 and the href Attribute

Monday, June 2nd, 2008

I wrote a little earlier about what I was looking forward to in HTML 5.  I haven’t had a chance to really collect my thoughts about XHTML 2 vs HTML 5; to be honest, I’d be happy to see progress on both fronts.  I do have to say I lost interest in XHTML 2 early on, when it seemed they were throwing the baby out with the bathwater.  HTML is not the cleanest, most elegant language, but the ease of picking it up is part of why the web grew so quickly, even if that has forced browsers to cope with millions of pages of clunky, broken HTML.

Eric Meyer has at least one point in XHTML 2’s favor – the ability to add an href attribute to anything, making it a link.  In addition to making the <a> tag jealous, this would let you do some pretty cool stuff like turn an entire table row into a link in a dynamic data reporting web app without a lot of JavaScript or duplicated tags.
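
For comparison, here is roughly what the scripted workaround looks like in HTML as it stands.  This is a minimal sketch in TypeScript; the data-href attribute and the findRowHref name are made-up conventions for illustration, not part of either spec.

```typescript
// Minimal structural type so the helper works with real DOM elements
// and with plain objects.
interface ElementLike {
  getAttribute(name: string): string | null;
  parentElement: ElementLike | null;
}

// Walk up from the clicked element to the nearest ancestor (or itself)
// carrying a data-href attribute, and return that URL.
function findRowHref(start: ElementLike | null): string | null {
  for (let el = start; el !== null; el = el.parentElement) {
    const href = el.getAttribute("data-href");
    if (href !== null && href !== "") {
      return href;
    }
  }
  return null;
}
```

On a page, a single click listener on the table could call findRowHref on the event target and, if it returns a URL, assign it to window.location.href.  With href allowed directly on a table row, none of this would be needed.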

By the way, Eric is a fellow member of the Cleveland Web Standards Association and a great speaker.  If you get a chance to see a talk by him you should really check it out.