Tag Archives: semantic-web


The power of microformats

A few months ago I attended a really interesting talk by Eric Meyer where he touched on the use of microformats.  You might know Eric from his excellent O’Reilly Press CSS books.

What are microformats?  Before giving an example, I’ll give a little context.  When Tim Berners-Lee created the web, he tried to make HTML simple, flexible, and meaningful.  He succeeded on the first two counts but the third was quickly left by the wayside – many designers didn’t care what a particular tag meant, so long as it could be used for page layout.  The use of tables to arrange graphic elements instead of holding tabular data is a perfect example.

So Berners-Lee has been talking for years about the next step: the semantic web.  In the semantic web, tags are used to say what a particular piece of content is, with all styling done with stylesheets.  There is, of course, more to the semantic web than just separating content and presentation; after all, you can work that way with HTML and CSS now.  One other key component is the web of trust, where people and web sites are able to describe their relationships to each other so that search engines can help you find trustworthy content automatically.

Unfortunately, the semantic web has not really taken off.  There have been lots of meetings and XML schemas but it’s all too complicated, the process is too bureaucratic, and everything is being designed from the top down.

This is where microformats come in.  Let’s say you have a blog and you’ve tagged all your articles.  You’d like to let search engines and aggregators like Technorati know what your tags are.  But HTML doesn’t have anything like this:

<tag>semantic web</tag>

So what do you do?  Simple, use the rel-tag microformat:

<a href="http://example.com/tag/semantic+web" rel="tag">semantic web</a>

The microformat makes use of existing HTML tags and attributes and just follows simple conventions.  But now that this little bit of meaning can be interpreted by spiders and other programs, we’ve actually added a pretty powerful bit of functionality to the web.
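
To give a feel for how little a spider needs to do, here is a minimal sketch in Python that pulls the tags out of a page using only the standard library. The class and function names (RelTagParser, extract_tags) are my own inventions for illustration, and I'm assuming the rel-tag convention that the tag is the last path segment of the link's URL.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote_plus


class RelTagParser(HTMLParser):
    """Collects the tags named by rel="tag" links (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. rel="tag nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag is the last path segment of the href.
            last_segment = urlparse(href).path.rstrip("/").rsplit("/", 1)[-1]
            self.tags.append(unquote_plus(last_segment))


def extract_tags(html):
    """Return the list of rel-tag values found in an HTML string."""
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


page = '<a href="http://example.com/tag/semantic+web" rel="tag">semantic web</a>'
print(extract_tags(page))
```

Note that the sketch reads the tag from the URL rather than from the link text, so the visible label can say anything you like.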

Most blog software, including WordPress, does the microformatting for you.  If you install my tag cloud plugin, Altocumulous, and view source, you can see for yourself.

For intranet purposes, the hCard and hCalendar microformats look promising.  Take a look at microformats.org to see why I think so.  I’ll write more on it later.

Notes on “Ontologies Come of Age”

Ontologies Come of Age, Deborah L. McGuinness (2002)

One thing I noticed about this reading is the ample use of examples. If you look through all of the points below “Structured Ontologies and Their Uses” you can see what I mean. I find that to be a big problem with a lot of the things I’ve read about ontologies or the semantic web—there’s a lot of terminology and very little illustration. So in that regard, this was a good reading.

On the other hand, the more I play with Protege and read about ontologies, the more it seems to me that all the information science and library science people are moving closer and closer to the way relational databases work, without actually knowing it. For example, each class in an ontology could be thought of as a table. The class/subclass relationship is like a one-to-many foreign key relationship, and since you can have more than one parent for any particular class you can have many-to-one and many-to-many relationships as well. Each of the fields or “slots” is just like a field in a relational table. There are only a few ways in which ontologies and relational databases differ, and they’re really only cosmetic differences. Relational databases have no notion of inheritance, for example, so fields for a table called “Thing” are not passed down to other tables that have “thing_id” as a foreign key. But database applications and users create views which join the tables and do something similar. Also, Protege allows you to use a class or an instance as the type for a slot, whereas in relational databases it really only makes sense to use an instance.
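
Here is a rough sketch of what I mean, using Python's built-in sqlite3 module. The "thing" table and "thing_id" key come from the example above; the "document" subclass, its "page_count" slot, the view name, and the sample row are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Parent class: its slots should apply to every subclass.
    CREATE TABLE thing (
        thing_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- Hypothetical subclass: only its own slot plus a key to the parent.
    CREATE TABLE document (
        thing_id   INTEGER PRIMARY KEY REFERENCES thing(thing_id),
        page_count INTEGER
    );
    -- The view does by hand what inheritance does in an ontology.
    CREATE VIEW document_full AS
        SELECT t.thing_id, t.name, d.page_count
        FROM document AS d
        JOIN thing AS t ON t.thing_id = d.thing_id;
""")

# Sample data, invented for the example.
conn.execute("INSERT INTO thing VALUES (1, 'Sample report')")
conn.execute("INSERT INTO document VALUES (1, 12)")

rows = conn.execute("SELECT name, page_count FROM document_full").fetchall()
print(rows)
```

The "name" field never appears in the document table, yet querying the view returns it alongside "page_count", which is about as close to an inherited slot as plain SQL gets.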

There must be other people who have noticed this, and since a lot of web pages have relational database back-ends I have to assume semantic web pages will as well.

Notes on the UMLS Semantic Network

UMLS Semantic Network

This was a really interesting reading. Not interesting like a novel or movie, of course, but interesting because I keep hearing about semantic webs without seeing any worthwhile examples.

One thing I was a little surprised to see was the ASCII codes for creating a flat-file database. I would have thought they would have specified it in XML or something a little more modern. And I kind of cringe whenever I hear anyone call an ASCII file a database. Even though it’s technically true, to me ‘database’ means database management system (DBMS), with some mechanisms put into place to allow multiple users, referential integrity, etc. If everything is stored in ASCII files, then any idiot can ruin the whole system and it’s really, really easy to let data get corrupted. You have to do all sorts of extra work to make sure updates to one field cascade through the rest of the file.
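
To show the kind of work a DBMS does for you, here is a small sketch using Python's sqlite3 module. The table names, column names, and values are hypothetical and have nothing to do with the actual UMLS file layout; the point is only that the database cascades a key change and rejects a dangling reference on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless you ask for it.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE semantic_type (
        type_id TEXT PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        type_id    TEXT NOT NULL
                   REFERENCES semantic_type(type_id) ON UPDATE CASCADE
    );
    -- Made-up rows for the example.
    INSERT INTO semantic_type VALUES ('X1', 'Example Type');
    INSERT INTO concept VALUES (1, 'X1');
""")

# Changing the key in one place updates every row that points at it.
conn.execute("UPDATE semantic_type SET type_id = 'X2' WHERE type_id = 'X1'")
cascaded = conn.execute("SELECT type_id FROM concept").fetchone()[0]

# A row that points at a type that does not exist is refused outright.
try:
    conn.execute("INSERT INTO concept VALUES (2, 'missing')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(cascaded, rejected)
```

With a flat file, both of those checks would have to be written and run by whoever edits the file.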

Even though this was designed specifically for the medical field, it’s surprising how useful its relationships and semantic types could be for almost any semantic web. I could only think of a few relationships that were missing, the chief one being requires. This is a relationship that’s very common in computer science, but I think it might apply fairly often in other fields as well. When an entity requires another, it cannot exist without it. It’s kind of a mix between part_of and precedes.
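
Here is a toy sketch in Python of how a requires relationship might behave. Everything in it (the SemanticNetwork class, its methods, and the entity names) is hypothetical and not part of UMLS, and it only models the "cannot exist without" half of the idea, not the ordering implied by precedes.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy network of entities and named relationships (all hypothetical)."""

    def __init__(self):
        self.entities = set()
        # relation name -> set of (subject, object) pairs
        self.relations = defaultdict(set)

    def relate(self, subject, relation, obj):
        self.entities.update((subject, obj))
        self.relations[relation].add((subject, obj))

    def dependents_of(self, entity):
        """Entities that directly require the given entity."""
        return sorted(s for s, o in self.relations["requires"] if o == entity)

    def remove(self, entity):
        # The subject of "requires" cannot exist without the object, so
        # refuse to remove anything that something else still requires.
        dependents = self.dependents_of(entity)
        if dependents:
            raise ValueError(f"{entity} is still required by {dependents}")
        self.entities.discard(entity)
        for name in list(self.relations):
            self.relations[name] = {
                pair for pair in self.relations[name] if entity not in pair
            }


net = SemanticNetwork()
net.relate("web_application", "requires", "database_server")

try:
    net.remove("database_server")
    blocked = False
except ValueError:
    blocked = True

# Removing the dependent first frees up the thing it required.
net.remove("web_application")
net.remove("database_server")
print(blocked, sorted(net.entities))
```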