Posts Tagged ‘Papers’

Notes on “Ontologies Come of Age”

Friday, April 30th, 2004

Ontologies Come of Age, Deborah L. McGuinness (2002)

One thing I noticed about this reading is its ample use of examples. If you look through the points under “Structured Ontologies and Their Uses” you can see what I mean. The lack of examples is a big problem with a lot of what I’ve read about ontologies or the semantic web: there’s a lot of terminology and very little illustration. So in that regard, this was a good reading.

On the other hand, the more I play with Protege and read about ontologies, the more it seems to me that all the information science and library science people are moving closer and closer to the way relational databases work, without actually knowing it. For example, each class in an ontology could be thought of as a table. The class/subclass relationship is like a one-to-many foreign key relationship, and since you can have more than one parent for any particular class you can have many-to-one and many-to-many relationships as well. Each of the fields or “slots” is just like a field in a relational table. There are only a few ways in which ontologies and relational databases differ, and the differences are mostly cosmetic. Relational databases have no notion of inheritance, for example, so fields for a table called “Thing” are not passed down to other tables that have “thing_id” as a foreign key. But database applications and users create views which join the tables and do something similar. Also, Protege allows you to use a class or an instance as the type for a slot, whereas in relational databases it really only makes sense to use an instance.

There must be other people who have noticed this, and since a lot of web pages have relational database back-ends I have to assume semantic web pages will as well.
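The class-as-table analogy above can be sketched with Python's built-in sqlite3 module. This is only an illustration under my own assumptions: the “book” subclass, the “isbn” slot and the “book_full” view are hypothetical names, and only “Thing” and “thing_id” come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Parent "class": every Thing has a name.
    CREATE TABLE thing (
        thing_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- Hypothetical "subclass": a Book is a Thing with one extra slot.
    CREATE TABLE book (
        thing_id INTEGER PRIMARY KEY REFERENCES thing(thing_id),
        isbn     TEXT
    );
    -- The view joins the two tables, so each book row appears to
    -- "inherit" the name field that is stored in thing.
    CREATE VIEW book_full AS
        SELECT t.thing_id, t.name, b.isbn
        FROM book AS b
        JOIN thing AS t ON t.thing_id = b.thing_id;
""")

# Made-up sample data.
conn.execute("INSERT INTO thing (thing_id, name) VALUES (1, 'Example Title')")
conn.execute("INSERT INTO book (thing_id, isbn) VALUES (1, 'hypothetical-isbn')")

book_rows = conn.execute("SELECT thing_id, name, isbn FROM book_full").fetchall()
print(book_rows)
```

The view is doing by hand what an ontology editor does automatically when a subclass picks up its parent's slots.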

Notes on “Metadata Principles and Practicalities”

Friday, April 30th, 2004

Metadata Principles and Practicalities, Duval, Erik, Wayne Hodgins, Stuart Sutton, Stuart L. Weibel (2002).

This was a pretty straightforward reading. I did like the Lego metaphor for metadata, but it would be nice if it were elaborated on a little more. So, kids have no problem combining space ship Lego parts with medieval castle parts, but just because it is possible to do so, is it beneficial or useful in any way? I understand they are just talking about modularity as a quality at that point, but it also gives the impression that metadata schemas, properly constructed, can be mixed and matched willy-nilly.

One thing I have not seen discussed very much is where the line is drawn between metadata and regular data. For example, most schemas have some sort of author/creator field. I see how the author could be data describing an article, but if you look at articles in a journal or on a web page, the author is almost always presented along with the body text. A clearer example of what I’m trying to say is an article’s abstract. I’ve seen some schemas that have the abstract as a piece of metadata, but does that mean it is not part of the data (the article) itself? Or is it both?

It all depends on your point of view. From a database designer’s perspective, all of these metadata fields are just data, and metadata is what describes the structure of the database: field types, lengths, foreign key relationships, etc. I’m not saying that everything should be stored in one big lump. I guess I’m just concerned that, depending on the point of view, metadata could mean anything. Really, all of this is just a matter of properly separating different data elements from each other. Obviously author/creator should be a separate field from article body text. And dates, titles, etc. should be separate as well. But that doesn’t really separate them from the thing itself; they are still all aspects of the thing.
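To make the database designer's point of view concrete, here is a minimal sketch using Python's sqlite3 module. The “article” table, its columns and the sample values are all hypothetical. The author, title and abstract are fetched as ordinary data, while the “metadata” in the database sense is the description of the table's structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    " article_id INTEGER PRIMARY KEY,"
    " author TEXT, title TEXT, abstract TEXT, body TEXT)"
)
conn.execute(
    "INSERT INTO article (author, title, abstract, body) VALUES (?, ?, ?, ?)",
    ("A. Author", "Example Title", "A short summary.", "The full text."),
)

# To the database, author, title and abstract are just data in a row.
row = conn.execute("SELECT author, title, abstract FROM article").fetchone()

# Metadata, in the database designer's sense, describes the structure:
# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type)
    for _cid, name, col_type, *_rest in conn.execute("PRAGMA table_info(article)")
]

print(row)
print(columns)
```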

Notes on the UMLS Semantic Network

Thursday, April 8th, 2004

UMLS Semantic Network

This was a really interesting reading. Not interesting like a novel or movie, of course, but interesting because I keep hearing about semantic webs without seeing any worthwhile examples.

One thing I was a little surprised to see was the ASCII codes for creating a flat-file database. I would have thought they would have specified it in XML or something a little more modern. And I kind of cringe whenever I hear anyone call an ASCII file a database. Even though it’s technically true, to me ‘database’ means database management system (DBMS), with some mechanisms put into place to allow multiple users, referential integrity, etc. If everything is stored in ASCII files, then any idiot can ruin the whole system and it’s really, really easy to let data get corrupted. You have to do all sorts of extra work to make sure updates to one field cascade through the rest of the file.
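As a rough illustration of what a DBMS buys you over a flat file, here is a sketch using Python's sqlite3 module. The table layout and the codes “T1” and “T2” are made up for the example and are not taken from the UMLS files. With a foreign key declared, a change to a code cascades on its own and a row pointing at a nonexistent code is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE semantic_type (
        code TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        label      TEXT NOT NULL,
        type_code  TEXT NOT NULL
                   REFERENCES semantic_type(code) ON UPDATE CASCADE
    );
    INSERT INTO semantic_type VALUES ('T1', 'Example Type');
    INSERT INTO concept VALUES (1, 'example concept', 'T1');
""")

# Renaming the code in one place updates every row that refers to it.
conn.execute("UPDATE semantic_type SET code = 'T2' WHERE code = 'T1'")
cascaded = conn.execute("SELECT type_code FROM concept").fetchone()[0]

# A row that points at a code that does not exist is rejected outright.
try:
    conn.execute("INSERT INTO concept VALUES (2, 'orphan concept', 'NOPE')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(cascaded, rejected)
```

In a flat file, both of those checks would have to be written and run by hand.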

Even though this was designed specifically for the medical field, it’s surprising how many of its relationships and semantic types could be useful for almost any semantic web. I could only think of a few relationships that were missing, the chief one being requires. This is a relationship that’s very common in computer science, but I think it might apply fairly often in other fields as well. When an entity requires another, it cannot exist without it. It’s kind of a mix between part_of and precedes.
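Here is one way the requires relationship might be modeled, as a small Python sketch. The entities and the two helper functions are hypothetical, and I am assuming the relationship is transitive, which the Semantic Network does not say one way or the other.

```python
# Hypothetical "requires" links: each entity maps to what it directly requires.
requires = {
    "web_application": {"web_server", "database"},
    "web_server": {"operating_system"},
    "database": {"operating_system"},
}


def all_requirements(entity, requires):
    """Return everything `entity` requires, directly or indirectly."""
    seen = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        for needed in requires.get(current, ()):
            if needed not in seen:
                seen.add(needed)
                stack.append(needed)
    return seen


def can_exist(entity, present, requires):
    """An entity can exist only if everything it requires is present."""
    return all_requirements(entity, requires) <= set(present)


print(sorted(all_requirements("web_application", requires)))
print(can_exist("web_application", {"web_server", "database"}, requires))
```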

Notes on “Vocabulary as a central concept in Information Science” and additional readings

Thursday, March 18th, 2004

Vocabulary as a Central Concept in Information Science, Michael Buckland (1999)

The Role of Classification in Knowledge Representation and Discovery, Barbara H. Kwasnik, Library Trends (1999)

One good point in the Buckland article was that vocabulary can differ between the catalogers, the authors and the searchers, even if everyone is within the same field. I’ve read some about these differences before, but they almost always seem to take the form of novice searcher vocabulary vs. expert author vocabulary, or natural searcher vocabulary vs. structured system vocabulary. Those are probably the clearest ways to look at these distinctions. To tell you the truth, looking at subtle differences between five different vocabularies does not seem like that much fun to me.

This article gets back to some of the same points we’ve already discussed in class when talking about synonym rings and taxonomies. Even though the author comes at it from a vocabulary point of view, he’s saying the same things everyone else is. If your users want to search for “Vietnam War” but your system uses “Vietnam Conflict” without pointing the user in the right direction, no purpose has been served. You can be as correct and specific in your phrasing as you want, but that’s no guarantee you’ll have a usable system.
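The “Vietnam War” case can be sketched as a tiny lookup that points the searcher's term at the term the system actually uses. Everything here is a hypothetical illustration: the mapping, the document ids and the function names are mine, not from either reading.

```python
# Entry terms (what users type) mapped to the preferred term (what the
# system indexes under). Keys are lowercase so matching ignores case.
PREFERRED_TERM = {
    "vietnam war": "Vietnam Conflict",
    "vietnam conflict": "Vietnam Conflict",
}


def resolve(query):
    """Return the system's preferred term for a query, or None if unknown."""
    key = " ".join(query.lower().split())  # normalize case and spacing
    return PREFERRED_TERM.get(key)


def search(query, index):
    """Look up documents under the preferred term for the user's query."""
    term = resolve(query)
    if term is None:
        return []
    return index.get(term, [])


# Hypothetical index keyed by the controlled vocabulary term.
index = {"Vietnam Conflict": ["doc-1", "doc-2"]}

print(search("Vietnam War", index))
```

Without the first entry in the mapping, the same search would come back empty even though the documents exist.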

The Kwasnik reading was really good at pointing out the strengths and weaknesses of hierarchies, trees and other organization schemes. In doing the AG assignment I ran into the “Lack of complete and comprehensive knowledge” barrier quite often. That’s one of the biggest problems not just with hierarchies, but with any project like this where we have some knowledge of the domain (everyone has seen greeting cards) but not of the entire body of AG’s product line or even a representative subset. I wouldn’t want to construct a taxonomy of content objects before people started entering data. I would have it be built as the database grew, with specific people in charge of keeping it consistent.

Notes on “Creating a Controlled Vocabulary”

Thursday, February 19th, 2004

Creating a Controlled Vocabulary, Karl Fast, Fred Leise and Mike Steckel (2003)

This was a good rundown of the general process of creating a controlled vocabulary, but a lot of this seems pretty apparent to me. I guess I shouldn’t assume that this stuff is obvious, though, given how many companies make web sites or intranets without really bothering to find out how their users use vocabulary for their domain, or even establishing a vocabulary, for that matter.

The two most important points, to me, are number 5, “Establish a record of the rules you are using if you are creating a large thesaurus” and number 8, “Go back and refine. What can be improved?” In fact, I think the whole notion of controlled vocabulary is misguided if there’s no clear rationale for it and no ongoing effort to update and maintain the terms. Language in any field is constantly changing, and the pace of change is always accelerating. Anyone who was building a directory of Internet services would have left off the World Wide Web in 1989, and any list about self-publishing on the web would probably have left off the term “blog” in 1998. How useful would those pick lists be today?

Controlled vocabulary can be damaging if there’s no mechanism for change, or if that mechanism is left unused. I don’t know why, but humanity seems to have some undying urge to compile the things around us into grand lists and hierarchies that are supposed to encompass all of what is or ever has been, ignoring our complete ignorance of what the future will bring. It’s not that classification in and of itself is bad, it’s that there’s a tendency to get to the “end” and say, “there, it’s done, and set in stone forever.”

Software Comparison: ASP.NET vs PHP

Tuesday, February 17th, 2004

ASP.NET and PHP

Virtually every medium or large web site now uses some kind of server-side scripting to generate web pages and interactive features instead of static HTML. A number of technologies are used for this purpose, including PHP, ASP.NET, Perl, ColdFusion, and JSP. This paper will look at Microsoft’s ASP.NET and an open-source alternative, PHP, and compare them in terms of cost, performance, support, features and ease of use for web development.

Comparing ASP and PHP can be difficult because they are not exactly the same class of software. PHP is simply a server-side scripting language. The PHP homepage describes it as “a widely-used general-purpose scripting language that is especially suited for Web development and can be embedded into HTML.”1 ASP, more properly ASP.NET, is not a language per se; it allows users to program Microsoft Internet Information Services (IIS) in C#, Visual Basic .NET, and JScript .NET, among others. ASP.NET is a little harder to define than PHP. ASP stands for Active Server Pages, and .NET, according to Microsoft, “is a set of Microsoft software technologies for connecting information, people, systems, and devices. It enables a high level of software integration through the use of Web services: small, discrete, building-block applications that connect to each other as well as to other, larger applications over the Internet.”2

Despite major structural differences, the two can and should be compared because they can be used to create the same kinds of medium-to-large, dynamic, often database-driven web sites. Server-side scripting allows sites to easily edit and update information, offer interactive features like forums and personalization, and track user traffic.

Usability Study: Kent State School of Library Science Website

Wednesday, September 17th, 2003

Kent State University School of Library Science Web Site

Site Design

The most basic level of usability is accessibility. Although it is beyond the scope of this analysis to consider problems that disabled users may have, it is useful to look at the site through the eyes of the JavaScript-disabled or the DSL-disabled, those who do not have the latest, most up-to-date browsers with all the options turned on. One thing in the KSU SLIS site’s favor is the lack of any necessary plugins, like Flash or QuickTime VR, which some users might not have installed. The home page and the site’s navigation bar do use JavaScript, which some users may have turned off, but disabling JavaScript does not completely break the site’s navigation. It does, however, mean that users only have access to the first level of the navigation hierarchy from the homepage, which might make it a little more difficult to figure out which section is the appropriate one to go to.

On the plus side, the site is fairly slow-connection friendly. The entire homepage, including the JavaScript rollover images, is only about 163K. The site makes appropriate use of alt text for images, so anyone using a text-only browser like Lynx or surfing with images off will still be able to get around. Again, they will miss the descriptive second-tier categories for each section. The site is fully navigable in a text-only browser, but there are two problems: first, the homepage has no descriptive text, and second, there’s not always a link back to the homepage, probably because the image that links back has no alt text on most pages.
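A check for images without alt text, like the homepage link image mentioned above, can be sketched with Python's standard html.parser module. The sample markup and file names below are made up and are not taken from the KSU SLIS site. Treating an empty alt attribute as missing is my own assumption, since an empty alt is reasonable for purely decorative images but not for an image that serves as a link.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: a blank or absent alt counts as missing.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Hypothetical page fragment: a home link image with no alt text,
# followed by a navigation image that has one.
sample = (
    '<a href="/"><img src="logo.gif"></a>'
    '<img src="nav.gif" alt="Programs">'
)
print(images_missing_alt(sample))
```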
