Archive for April, 2004

Metadata Schema for Radiological Terrorism Research

Friday, April 30th, 2004

Note: this was a project for a graduate course in Knowledge Organization Systems

 

Metadata schema for radiological terrorism research (MSRTR)

Terrorism research is a complex field dealing with a number of entities, each with its own metadata requirements. This document introduces the kinds of schemas needed for proper cataloging, identification, and retrieval in the radiological terrorism subfield. Schemas for radioactive material sources and radiological terrorism responses are presented below, followed by sample records and a crosswalk between the two schemas and the Dublin Core. The schemas were kept as simple as possible (8 and 6 main fields respectively, with several qualifiers) so that they can be applied quickly, easily, and consistently.

Fields are described in the following format:

  • Field Name
  • Qualifiers: additional subfields
  • Definition: a description of the field and usage
  • Data Values: notes whether data is controlled or uncontrolled, and if controlled the terms allowed or the source of terms allowed.
  • Status: obligation level of the field (required, recommended, optional)
  • Number: number of entries allowed for each record for this field (single, multiple)
  • Dublin Core: corresponding field in the Dublin Core. This information is repeated in the crosswalk as well.
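
As a rough illustration, the field format above could be captured in a small data structure. This is only a sketch: the names `FieldDef`, `Status`, and `allowed_values` are hypothetical and not part of the schema itself. The example instance is the Type field from the RTR schema described later in this post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Status(Enum):
    """The three Status values named in the field format."""
    REQUIRED = "required"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


@dataclass(frozen=True)
class FieldDef:
    """One schema field, mirroring the bullet format (hypothetical class)."""
    name: str
    definition: str
    status: Status
    multiple: bool                      # Number: True = multiple, False = single
    dublin_core: str                    # corresponding Dublin Core element
    qualifiers: Tuple[str, ...] = ()    # empty tuple = "Qualifiers: None"
    # None means uncontrolled, or controlled by an external source such as an ISO standard
    allowed_values: Optional[Tuple[str, ...]] = None


# The RTR "Type" field, transcribed from its definition in this post.
rtr_type = FieldDef(
    name="Type",
    definition="Denotes the type of response and group primarily responsible for the response.",
    status=Status.REQUIRED,
    multiple=False,
    dublin_core="TYPE",
    allowed_values=("medical", "public", "military"),
)
```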

 

Metadata for radioactive material sources (RMS)

 

Identifier

  • Qualifiers: None
  • Definition: Code number used to identify source in database.
  • Data Values (controlled): Auto-generated database identity field values
  • Status: required
  • Number: single
  • Dublin Core: IDENTIFIER

 

Name

  • Qualifiers: None
  • Definition: Way in which a site is identified, for example the name of a hospital, power plant, mine, etc.
  • Data Values (uncontrolled):
  • Status: required
  • Number: multiple
  • Dublin Core: TITLE

 

Owner

  • Qualifiers: None
  • Definition: The name of the person, company, government, or other organization that owns the source.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: multiple
  • Dublin Core: CREATOR

 

Location

  • Qualifiers: Location.Coordinates, Location.Country
  • Definition: Physical location of source.
  • Data Values (controlled): ISO 6709:1983 for Coordinates, ISO 3166 Country Codes for Country
  • Status: recommended
  • Number: single
  • Dublin Core: COVERAGE

 

Level

  • Qualifiers: None
  • Definition:
  • Data Values (controlled): high-level, low-level
  • Status: recommended
  • Number: multiple
  • Dublin Core: TYPE

 

Type

  • Qualifiers: None
  • Definition: Radioactive waste is categorized according to its origin and not necessarily according to its level of radioactivity. For example, some low-level waste has the same level of radioactivity as some high-level waste. (http://www.epa.gov/radiation/docs/radwaste/index.html)
  • Data Values (controlled): spent nuclear fuel, transuranic waste mainly from defense programs, uranium mill tailings, low-level waste, naturally occurring and accelerator-produced radioactive materials.
  • Status: recommended
  • Number: multiple
  • Dublin Core: TYPE

 

Description

  • Qualifiers: Description.Facility, Description.Security, Description.Storage
  • Definition: Text description of the facility, security, and storage methods.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: single
  • Dublin Core: DESCRIPTION

 

Isotopes

  • Qualifiers: Isotopes.Present, Isotopes.Potential
  • Definition: Radioactive Isotopes either present or potentially present at the source.
  • Data Values (controlled): IUPAC isotope symbols
  • Status: recommended
  • Number: multiple
  • Dublin Core: DESCRIPTION

 

 

Metadata for radiological terrorism responses (RTR)

 

Identifier

  • Qualifiers: None
  • Definition: Code number used to identify response in database.
  • Data Values (controlled): Auto-generated database identity field values
  • Status: required
  • Number: single
  • Dublin Core: IDENTIFIER

 

Name

  • Qualifiers: None
  • Definition: Way in which a response is identified, a short description of the action taken
  • Data Values (uncontrolled):
  • Status: required
  • Number: single
  • Dublin Core: TITLE

 

Type

  • Qualifiers: None
  • Definition: Denotes the type of response and group primarily responsible for the response.
  • Data Values (controlled): medical, public, military
  • Status: required
  • Number: single
  • Dublin Core: TYPE

 

Description

  • Qualifiers: Description.Process, Description.Event, Description.Other
  • Definition: Text description of the response process, event this is a response to, and other details.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: multiple
  • Dublin Core: DESCRIPTION

 

Expertise

  • Qualifiers: Expertise.Required, Expertise.Other
  • Definition: Types of expertise either required for the response to occur, or possibly useful during course of response.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: multiple
  • Dublin Core: DESCRIPTION

 

Related

  • Qualifiers: Related.Prerequisite, Related.Required_for, Related.Concurrent
  • Definition: Other responses that are either required before this response, require this response to proceed, or may be enacted concurrently with this response in a complementary way.
  • Data Values (controlled): Identifier values
  • Status: recommended
  • Number: multiple
  • Dublin Core: RELATION

 

 

Sample Record 1, Radiological material source

Identifier 000050060891

Name Springfield Nuclear Power Plant

SNPP

Owner Montgomery Burns

Location.Country US

Level high-level

Type spent nuclear fuel

Description.Facility In 2001, Unit 1 had a capacity factor of 100.5 percent and supplied 7.68 billion kilowatthours of electricity. PWR (pressurized light water reactor).

Description.Security Security meets U.S. DOE standards for this type of power plant.

Description.Storage Storage conforms to U.S. DOE standards for this type of power plant, but there have been a number of accidental releases in recent years.

Isotopes.Present U-235
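
A minimal sketch of how a record like this might be checked against the RMS schema. The function name `validate_rms` and the dictionary layout are assumptions made for illustration. Only the Level vocabulary is checked here, and "single" is interpreted per qualified field (for example, one value each for Description.Facility and Description.Storage), which is one possible reading of the schema.

```python
# Rules transcribed from the RMS schema above; the layout itself is hypothetical.
RMS_REQUIRED = {"Identifier", "Name"}
RMS_SINGLE = {"Identifier", "Location", "Description"}
RMS_CONTROLLED = {"Level": {"high-level", "low-level"}}


def validate_rms(record):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    for field in sorted(RMS_REQUIRED):
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    for key, values in record.items():
        base = key.split(".")[0]  # "Location.Country" -> "Location"
        if base in RMS_SINGLE and len(values) > 1:
            problems.append(f"{key} allows a single value, got {len(values)}")
        allowed = RMS_CONTROLLED.get(base)
        if allowed is not None:
            for value in values:
                if value not in allowed:
                    problems.append(f"{key}: {value!r} is not a controlled term")
    return problems


# Sample Record 1, with the Description.* qualifiers left out to keep this short.
sample = {
    "Identifier": ["000050060891"],
    "Name": ["Springfield Nuclear Power Plant", "SNPP"],
    "Owner": ["Montgomery Burns"],
    "Location.Country": ["US"],
    "Level": ["high-level"],
    "Type": ["spent nuclear fuel"],
    "Isotopes.Present": ["U-235"],
}

print(validate_rms(sample))
```

The sample record passes these checks, so the printed list is empty. A record with no Identifier, or with a Level outside the two controlled terms, would come back with one message per problem.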

 

Sample Record 2, Response to radiological terrorism

Identifier 003450666894

Name Duck and cover

Type public

Description.Process If in a school building, elementary students will get up from their desks in a calm and orderly manner, drop to the floor and place themselves underneath the desk, with their hands covering the backs of their necks. If caught outdoors, students will drop their bicycles and/or jump ropes, find a wall, drop to the ground next to it and cover the backs of their necks with their hands.

Description.Event This is the appropriate response for any radiological or nuclear event occurring in the 1950s, including the nearby explosion of an atomic bomb.

Description.Other Students are also encouraged not to stare at the brilliant flash of an atomic bomb.

Expertise.Required Propaganda Film-making

Related.Prerequisite 003345345344 (Development of nuclear bomb proof desks)

003345345372 (Promotion of Cold War anxieties in populace)

Related.Concurrent 003345388843 (Prayer)
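
Because Related.Prerequisite holds Identifier values, the links between responses can be treated as a graph. The sketch below orders responses so that prerequisites come first. It assumes Python 3.9 or later for the standard library `graphlib` module, and the `response_order` helper is a hypothetical name, not part of the schema.

```python
from graphlib import TopologicalSorter


def response_order(prerequisites):
    """Return Identifiers ordered so every prerequisite precedes the response needing it."""
    return list(TopologicalSorter(prerequisites).static_order())


# Related.Prerequisite links from Sample Record 2: response -> responses required first.
prerequisites = {
    "003450666894": {"003345345344", "003345345372"},  # Duck and cover
}

order = response_order(prerequisites)
print(order[-1])  # the response itself comes last, after both prerequisites
```

If two responses listed each other as prerequisites, `static_order()` would raise `graphlib.CycleError`, which could serve as a basic consistency check on the Related field.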

 

 

Crosswalk

 

RMS: Identifier

RTR: Identifier

Dublin Core: IDENTIFIER

 

RMS: Name

RTR: Name

Dublin Core: TITLE

 

RMS: Owner

RTR: Description.Other

Dublin Core: CREATOR

 

RMS: Location

RTR: Description.Event

Dublin Core: COVERAGE

 

RMS: Level

RTR: Description.Event

Dublin Core: TYPE

 

RMS: Type

RTR: Description.Event

Dublin Core: TYPE

 

RMS: Description

RTR: Description.Other

Dublin Core: DESCRIPTION

 

RMS: Isotopes

RTR: Description.Event

Dublin Core: DESCRIPTION

 

RTR: Expertise

RMS: Description

Dublin Core: DESCRIPTION

 

RTR: Related

RMS: Description

Dublin Core: RELATION
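
The RMS side of the crosswalk can be sketched as a simple lookup table. The names `RMS_TO_DC` and `to_dublin_core` are hypothetical, and qualified fields such as Location.Country are assumed to map with their parent field.

```python
# RMS field -> Dublin Core element, transcribed from the crosswalk above.
RMS_TO_DC = {
    "Identifier": "IDENTIFIER",
    "Name": "TITLE",
    "Owner": "CREATOR",
    "Location": "COVERAGE",
    "Level": "TYPE",
    "Type": "TYPE",
    "Description": "DESCRIPTION",
    "Isotopes": "DESCRIPTION",
}


def to_dublin_core(record, crosswalk=RMS_TO_DC):
    """Regroup a record's values under Dublin Core element names."""
    dc = {}
    for key, values in record.items():
        base = key.split(".")[0]      # drop the qualifier, keep the parent field
        element = crosswalk[base]     # raises KeyError for fields not in the crosswalk
        dc.setdefault(element, []).extend(values)
    return dc


record = {
    "Name": ["Springfield Nuclear Power Plant", "SNPP"],
    "Level": ["high-level"],
    "Type": ["spent nuclear fuel"],
}
print(to_dublin_core(record))
```

Note that the mapping is lossy: Level and Type both land in TYPE, and Description and Isotopes both land in DESCRIPTION, so a Dublin Core record cannot be converted back to RMS without extra information.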

 


Ontology for Radiological Terrorism Research

Friday, April 30th, 2004

Domain

The ontology was created from the Radiological Terrorism Research Thesaurus, specifically constrained to the portions under the term “material sources” and “consequence management” (now called response). Other classes not found in these areas, but referenced by fields in these areas, are included, but not developed—this includes Organization, Event, Expertise, Person, and Material and their subclasses.

Background

Terrorism is an incredibly important issue, and agencies within the US and worldwide need to meet the challenge of compiling and organizing research in a number of fields in order to counter this very real threat. In addition, agencies have been criticized in the past for not sharing information, or maintaining knowledge organization systems (KOS) which are incompatible with each other. Work is often duplicated, and often vital information will be unavailable to some agencies even though it has already been archived by others.

Clearly, there is a need for a large-scale KOS that can be used to organize information efficiently and correctly, allow for complex analysis of information, and allow for easy knowledge sharing between agencies. The most flexible and powerful KOS, and therefore the most appropriate, is an ontology. Classes, subclasses and relationships are developed and then appropriate fields are created for each. This allows for faceted search and display, automated search, hierarchical organization of information, and interoperability with other systems.

Users

This is just a sample of the larger, more complete ontology. The complete ontology would be useful for virtually any person or agency dealing with anti-terrorism, counterterrorism, intelligence or consequence management. The ontology will allow risk assessment officers, for example, to see a list of every high-level material source in the United States and Canada and their coordinates. Medical first responders could use it to catalog and retrieve proper treatments for specific bioterrorism agents. And if widely adopted, it would greatly reduce the barriers to efficient knowledge-sharing. If the Department of Energy were to license a new uranium mine in Montana, the information would be immediately available to risk assessment officers, instead of requiring time for the paperwork to make its way over to the Department of Homeland Security.

 

View and navigate the ontology


Notes on “Ontologies Come of Age”

Friday, April 30th, 2004

Ontologies Come of Age, Deborah L. McGuinness (2002)

One thing I noticed about this reading is the ample use of examples. If you look through all of the points below “Structured Ontologies and Their Uses” you can see what I mean. I find that to be a big problem with a lot of the things I’ve read about ontologies or the semantic web—there’s a lot of terminology and very little illustration. So in that regard, this was a good reading.

On the other hand, the more I play with Protege and read about ontologies, the more it seems to me that all the information science and library science people are moving closer and closer to the way relational databases work, without actually knowing it. For example, each of the classes in an ontology could be thought of as a table. The class/subclass relationship is like a one-to-many foreign key relationship, and since you can have more than one parent for any particular class you can have many-to-one and many-to-many relationships as well. Each of the fields or “slots” is just like a field in a relational table. There are only a few ways in which ontologies and relational databases differ, and they’re only really cosmetic differences. Relational databases have no notion of inheritance, for example, so fields for a table called “Thing” are not passed down to other tables that have “thing_id” as a foreign key. But database applications and users create views which join the tables and do something similar. Also, Protege allows you to use a class or an instance as the type for a slot, whereas in relational databases it really only makes sense to use an instance.
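
To make the comparison concrete, here is a small sketch using Python's built-in `sqlite3` module. The `thing` table and `thing_id` key follow the example above, while `material_source`, `level`, and the view name are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Parent "class": slots shared by every Thing.
    CREATE TABLE thing (
        thing_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- "Subclass": its own slots plus a foreign key back to the parent row.
    CREATE TABLE material_source (
        thing_id INTEGER PRIMARY KEY REFERENCES thing(thing_id),
        level    TEXT
    );
    -- The view stands in for inheritance by joining parent and child slots.
    CREATE VIEW material_source_full AS
        SELECT t.thing_id, t.name, m.level
        FROM thing AS t
        JOIN material_source AS m ON t.thing_id = m.thing_id;
""")
conn.execute("INSERT INTO thing VALUES (1, 'Springfield Nuclear Power Plant')")
conn.execute("INSERT INTO material_source VALUES (1, 'high-level')")

# The "inherited" name slot shows up alongside the subclass's own level slot.
row = conn.execute("SELECT name, level FROM material_source_full").fetchone()
print(row)
```

The subclass table never stores `name` itself; it only becomes visible through the join, which is roughly the work an ontology editor does automatically when a subclass inherits a slot.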

There must be other people who have noticed this, and since a lot of web pages have relational database back-ends I have to assume semantic web pages will as well.


Notes on “Metadata Principles and Practicalities”

Friday, April 30th, 2004

Metadata Principles and Practicalities, Duval, Erik, Wayne Hodgins, Stuart Sutton, Stuart L. Weibel (2002).

This was a pretty straightforward reading. I did like the Lego metaphor for metadata, but it would be nice if it were elaborated on a little more. So, kids have no problem combining space ship Lego parts with medieval castle parts, but just because it is possible to do so, is it beneficial or useful in any way? I understand they are just talking about modularity as a quality at that point, but it also gives the impression that metadata schemas, properly constructed, can be mixed and matched willy-nilly.

One thing I have not seen discussed very much is where the line is drawn between metadata and regular data. For example, most schemas have some sort of author/creator field. I see how the author could be data describing an article, but if you look at articles in a journal or on a web page, the author is almost always presented along with the body text. A clearer example of what I’m trying to say is an article’s abstract. I’ve seen some schemas that have the abstract as a piece of metadata, but does that mean it is not part of the data (the article) itself? Or is it both?

It all depends on your point of view. From a database designer’s perspective, all of these metadata fields are just data, and metadata is what describes the structure of the database: field types, lengths, foreign key relationships, etc. I’m not saying that everything should be stored in one big lump. I guess I’m just concerned that depending on the point of view, metadata could mean anything. Really, all of this is just a matter of properly separating different data elements from each other. Obviously author/creator should be a separate field from article body text. And date, titles, etc. should be separate as well. But that doesn’t really separate them from the thing itself; they are still all aspects of the thing.


A Thesaurus for Radiological Terrorism Research

Thursday, April 15th, 2004

Changes in this Edition

A number of changes have been made in this revision. Changes to scope notes, terms, and related terms are highlighted throughout this document. These changes should clarify the precise meaning and use. Structural changes to broader and narrower term relationships are explained below.

One of the major structural changes is the removal of “radiological terrorism” as a root word for the entire thesaurus. Putting everything under one term was not my initial idea, but the use of the hierarchical display for both input and output led me to think that was the preferred structure. I have removed “combating radiological terrorism,” “environmental effects,” “radiation protection,” “radioactive isotopes,” “radioactive material sources,” and “radiological injuries” from under “radiological terrorism.”

Still, I think “radiological terrorism goals,” “radiological terrorism scenarios,” and “radiological terrorism requirements” are necessary parts of “radiological terrorism,” so I have kept the first two in the hierarchy and added the third. This leads to multiple inheritance for “radiological terrorism requirements,” which is both a necessary part of “radiological terrorism” and “intelligence.”

Introduction

The CTRS Radiological Terrorism Thesaurus contains descriptive terms used throughout radiological terrorism literature. The terms, their relationships, and their use were culled from several documents, including:

The thesaurus is presented in three forms: first, an alphabetical display of all included terms, including preferred terms and synonyms, broader, narrower and related terms, and any scope notes; second, a hierarchical display of preferred terms only; and third, a rotated display of all terms.

Several relationships may be defined for any term in the thesaurus. Scope Notes (SN) are more detailed descriptions of a term’s use when necessary. A preferred term (USE) is a synonym for the term that has been selected for most uses—non-preferred terms do not show up in the hierarchical view. A non-preferred term (UF) is a synonym that may be found in the literature but is not used in the hierarchy. Broader terms (BT) are terms that represent more general classes of the current term. Narrower terms (NT) represent more specific instances or parts of the current term. Finally, related terms (RT) are related to the current term but not in any of the ways already noted.
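
One way to picture these relationships is as fields on a term record. The sketch below is an illustration only: the `Term` and `Thesaurus` classes are hypothetical, and the example uses the multiple-inheritance case described under Changes in this Edition.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Term:
    """One thesaurus entry with the relationships defined above."""
    label: str
    scope_note: Optional[str] = None                  # SN
    use: Optional[str] = None                         # USE: set only on non-preferred terms
    used_for: Set[str] = field(default_factory=set)   # UF
    broader: Set[str] = field(default_factory=set)    # BT
    narrower: Set[str] = field(default_factory=set)   # NT
    related: Set[str] = field(default_factory=set)    # RT


class Thesaurus:
    def __init__(self):
        self.terms: Dict[str, Term] = {}

    def term(self, label):
        """Fetch a term, creating an empty entry the first time it is seen."""
        return self.terms.setdefault(label, Term(label))

    def add_broader(self, narrower, broader):
        """Record a BT link and its reciprocal NT link together."""
        self.term(narrower).broader.add(broader)
        self.term(broader).narrower.add(narrower)


t = Thesaurus()
t.add_broader("radiological terrorism requirements", "radiological terrorism")
t.add_broader("radiological terrorism requirements", "intelligence")
print(sorted(t.term("radiological terrorism requirements").broader))
```

Storing BT and NT as reciprocal sets keeps the two directions in step, and using sets rather than a single parent pointer is what allows a term to sit under more than one broader term.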

View the Thesaurus [pdf]


Notes on the UMLS Semantic Network

Thursday, April 8th, 2004

UMLS Semantic Network

This was a really interesting reading. Not interesting like a novel or movie, of course, but interesting because I keep hearing about semantic webs without seeing any worthwhile examples.

One thing I was a little surprised to see was the ASCII codes for creating a flat-file database. I would have thought they would have specified it in XML or something a little more modern. And I kind of cringe whenever I hear anyone call an ASCII file a database. Even though it’s technically true, to me ‘database’ means database management system (DBMS), with some mechanisms put into place to allow multiple users, referential integrity, etc. If everything is stored in ASCII files, then any idiot can ruin the whole system and it’s really, really easy to let data get corrupted. You have to do all sorts of extra work to make sure updates to one field cascade through the rest of the file.

Even though this was designed specifically for the medical field, it’s surprising how much their relationships and semantic types could be useful for almost any semantic web. I could only think of a few relationships that were missing, the chief one being requires. This is a relationship that’s very common in computer science, but I think it might apply fairly often in other fields as well. When an entity requires another, it cannot exist without it. It’s kind of a mix between part_of and precedes.
