Posts Tagged ‘metadata’

Metadata Schema for Radiological Terrorism Research

Friday, April 30th, 2004

Note: this was a project for a graduate course in Knowledge Organization Systems

 

Metadata schema for radiological terrorism research (MSRTR)

Terrorism research is a complex field dealing with a number of entities, each with its own metadata requirements. This document is an introduction to the kinds of schemas that will be necessary for proper cataloging, identification, and retrieval in the radiological terrorism subfield. Schemas for radioactive material sources and radiological terrorism responses are presented below, followed by sample records and a crosswalk between the two schemas and the Dublin Core. The schemas were made as simple as possible (8 and 6 main fields respectively, several with qualifiers) in order to make application quick, easy, and consistent.

Fields are described in the following format:

  • Field Name
  • Qualifiers: additional subfields
  • Definition: a description of the field and usage
  • Data Values: notes whether data is controlled or uncontrolled, and if controlled the terms allowed or the source of terms allowed.
  • Status: whether the field is required, recommended, or optional
  • Number: number of entries allowed for each record for this field (single, multiple)
  • Dublin Core: corresponding field in the Dublin Core. This information is repeated in the crosswalk as well.
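
The field description format above could be captured in a small record type so that field definitions can be checked by software. This is only a sketch: the class name FieldSpec and its attribute names are assumptions made for illustration and are not part of the schema itself. The example instance is transcribed from the Type field of the response schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ALLOWED_STATUS = ("required", "recommended", "optional")
ALLOWED_NUMBER = ("single", "multiple")


@dataclass(frozen=True)
class FieldSpec:
    """One field description (hypothetical structure, for illustration only)."""
    name: str
    definition: str
    status: str                      # required, recommended, or optional
    number: str                      # single or multiple
    dublin_core: str                 # corresponding Dublin Core element
    qualifiers: Tuple[str, ...] = ()
    # None means uncontrolled, or controlled by an external source such as an ISO list.
    controlled_values: Optional[Tuple[str, ...]] = None

    def __post_init__(self):
        # Reject descriptions that use a status or number outside the format above.
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"invalid status: {self.status}")
        if self.number not in ALLOWED_NUMBER:
            raise ValueError(f"invalid number: {self.number}")


rtr_type = FieldSpec(
    name="Type",
    definition="Denotes the type of response and group primarily responsible for the response.",
    status="required",
    number="single",
    dublin_core="TYPE",
    controlled_values=("medical", "public", "military"),
)
```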

 

Metadata for radioactive material sources (RMS)

 

Identifier

  • Qualifiers: None
  • Definition: Code number used to identify source in database.
  • Data Values (controlled): Auto-generated database identity field values
  • Status: required
  • Number: single
  • Dublin Core: IDENTIFIER

 

Name

  • Qualifiers: None
  • Definition: Way in which a site is identified, for example the name of a hospital, power plant, mine, etc.
  • Data Values (uncontrolled):
  • Status: required
  • Number: multiple
  • Dublin Core: TITLE

 

Owner

  • Qualifiers: None
  • Definition: The name of the person, company, government, or other organization that owns the source.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: multiple
  • Dublin Core: CREATOR

 

Location

  • Qualifiers: Location.Coordinates, Location.Country
  • Definition: Physical location of source.
  • Data Values (controlled): ISO 6709:1983 for Coordinates, ISO 3166 Country Codes for Country
  • Status: recommended
  • Number: single
  • Dublin Core: COVERAGE

 

Level

  • Qualifiers: None
  • Definition:
  • Data Values (controlled): high-level, low-level
  • Status: recommended
  • Number: multiple
  • Dublin Core: TYPE

 

Type

  • Qualifiers: None
  • Definition: Radioactive waste is categorized according to its origin and not necessarily according to its level of radioactivity. For example, some low-level waste has the same level of radioactivity as some high-level waste. (http://www.epa.gov/radiation/docs/radwaste/index.html)
  • Data Values (controlled): spent nuclear fuel, transuranic waste mainly from defense programs, uranium mill tailings, low-level waste, naturally occurring and accelerator-produced radioactive materials.
  • Status: recommended
  • Number: multiple
  • Dublin Core: TYPE

 

Description

  • Qualifiers: Description.Facility, Description.Security, Description.Storage
  • Definition: Text description of the facility, security, and storage methods.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: single
  • Dublin Core: DESCRIPTION

 

Isotopes

  • Qualifiers: Isotopes.Present, Isotopes.Potential
  • Definition: Radioactive Isotopes either present or potentially present at the source.
  • Data Values (controlled): IUPAC isotope symbols
  • Status: recommended
  • Number: multiple
  • Dublin Core: DESCRIPTION
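
As a rough illustration of how the Status and Number rules for the RMS fields might be enforced at data entry, here is a minimal sketch. The function name validate_rms and the record layout (a dict mapping each field or qualified field to a list of values) are assumptions. It also assumes that "single" applies to each qualified element separately, so that Description.Facility and Description.Security may each appear once in the same record.

```python
# (status, number) for each RMS field, transcribed from the descriptions above.
RMS_RULES = {
    "Identifier": ("required", "single"),
    "Name": ("required", "multiple"),
    "Owner": ("recommended", "multiple"),
    "Location": ("recommended", "single"),
    "Level": ("recommended", "multiple"),
    "Type": ("recommended", "multiple"),
    "Description": ("recommended", "single"),
    "Isotopes": ("recommended", "multiple"),
}


def validate_rms(record):
    """Return a list of problems found in a record (hypothetical helper).

    `record` maps a field name, optionally qualified (e.g. "Location.Country"),
    to a list of values.
    """
    problems = []
    seen = set()
    for key, values in record.items():
        base = key.split(".", 1)[0]  # strip the qualifier, if any
        if base not in RMS_RULES:
            problems.append(f"unknown field: {key}")
            continue
        if values:
            seen.add(base)
        if RMS_RULES[base][1] == "single" and len(values) > 1:
            problems.append(f"{key} allows only one value")
    for name, (status, _number) in RMS_RULES.items():
        if status == "required" and name not in seen:
            problems.append(f"missing required field: {name}")
    return problems
```

Recommended fields are deliberately not reported when absent; only required fields and cardinality violations are treated as problems in this sketch.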

 

 

Metadata for radiological terrorism responses (RTR)

 

Identifier

  • Qualifiers: None
  • Definition: Code number used to identify response in database.
  • Data Values (controlled): Auto-generated database identity field values
  • Status: required
  • Number: single
  • Dublin Core: IDENTIFIER

 

Name

  • Qualifiers: None
  • Definition: Way in which a response is identified, a short description of the action taken
  • Data Values (uncontrolled):
  • Status: required
  • Number: single
  • Dublin Core: TITLE

 

Type

  • Qualifiers: None
  • Definition: Denotes the type of response and group primarily responsible for the response.
  • Data Values (controlled): medical, public, military
  • Status: required
  • Number: single
  • Dublin Core: TYPE

 

Description

  • Qualifiers: Description.Process, Description.Event, Description.Other
  • Definition: Text description of the response process, event this is a response to, and other details.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: multiple
  • Dublin Core: DESCRIPTION

 

Expertise

  • Qualifiers: Expertise.Required, Expertise.Other
  • Definition: Types of expertise either required for the response to occur, or possibly useful during course of response.
  • Data Values (uncontrolled)
  • Status: recommended
  • Number: multiple
  • Dublin Core: DESCRIPTION

 

Related

  • Qualifiers: Related.Prerequisite, Related.Required_for, Related.Concurrent
  • Definition: Other responses that are either required before this response, require this response to proceed, or may be enacted concurrently with this response in a complementary way.
  • Data Values (controlled): Identifier values
  • Status: recommended
  • Number: multiple
  • Dublin Core: RELATION

 

 

Sample Record 1, Radiological material source

Identifier 000050060891

Name Springfield Nuclear Power Plant

SNPP

Owner Montgomery Burns

Location.Country US

Level high-level

Type spent nuclear fuel

Description.Facility In 2001, Unit 1 had a capacity factor of 100.5 percent and supplied 7.68 billion kilowatthours of electricity. PWR (pressurized light water reactor).

Description.Security Security meets U.S. DOE standards for this type of power plant.

Description.Storage Storage conforms to U.S. DOE standards for this type of power plant, but there have been a number of accidental releases in recent years.

Isotopes.Present U-235

 

Sample Record 2, Response to radiological terrorism

Identifier 003450666894

Name Duck and cover

Type public

Description.Process If in a school building, elementary students will get up from their desks in a calm and orderly manner, drop to the floor and place themselves underneath the desk, with their hands covering the backs of their necks. If caught outdoors, students will drop their bicycles and/or jump ropes, find a wall, drop to the ground next to it and cover the backs of their necks with their hands.

Description.Event This is the appropriate response for any radiological or nuclear event occurring in the 1950s, including the nearby explosion of an atomic bomb.

Description.Other Students are also encouraged not to stare at the brilliant flash of an atomic bomb.

Expertise.Required Propaganda Film-making

Related.Prerequisite 003345345344 (Development of nuclear bomb proof desks)

003345345372 (Promotion of Cold War anxieties in populace)

Related.Concurrent 003345388843 (Prayer)

 

 

Crosswalk

 

RMS: Identifier

RTR: Identifier

Dublin Core: IDENTIFIER

 

RMS: Name

RTR: Name

Dublin Core: TITLE

 

RMS: Owner

RTR: Description.Other

Dublin Core: CREATOR

 

RMS: Location

RTR: Description.Event

Dublin Core: COVERAGE

 

RMS: Level

RTR: Description.Event

Dublin Core: TYPE

 

RMS: Type

RTR: Description.Event

Dublin Core: TYPE

 

RMS: Description

RTR: Description.Other

Dublin Core: DESCRIPTION

 

RMS: Isotopes

RTR: Description.Event

Dublin Core: DESCRIPTION

 

RTR: Expertise

RMS: Description

Dublin Core: DESCRIPTION

 

RTR: Related

RMS: Description

Dublin Core: RELATION
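
The Dublin Core column of the crosswalk can be read as a lookup table. The sketch below shows one possible way to flatten a record from either schema into unqualified Dublin Core elements. The names RMS_TO_DC, RTR_TO_DC, and to_dublin_core are assumptions, and dropping the qualifiers during conversion is a simplification chosen for this example rather than something the schema specifies.

```python
# Dublin Core targets transcribed from the crosswalk above.
RMS_TO_DC = {
    "Identifier": "IDENTIFIER",
    "Name": "TITLE",
    "Owner": "CREATOR",
    "Location": "COVERAGE",
    "Level": "TYPE",
    "Type": "TYPE",
    "Description": "DESCRIPTION",
    "Isotopes": "DESCRIPTION",
}

RTR_TO_DC = {
    "Identifier": "IDENTIFIER",
    "Name": "TITLE",
    "Type": "TYPE",
    "Description": "DESCRIPTION",
    "Expertise": "DESCRIPTION",
    "Related": "RELATION",
}


def to_dublin_core(record, crosswalk):
    """Map a record (field -> list of values) onto Dublin Core elements.

    Qualifiers such as ".Present" are dropped, and fields that share a
    Dublin Core target are merged into one list, in record order.
    """
    dc = {}
    for key, values in record.items():
        base = key.split(".", 1)[0]
        dc.setdefault(crosswalk[base], []).extend(values)
    return dc
```

Because several fields share one Dublin Core element (Level and Type both map to TYPE, for example), the conversion loses information and cannot be reversed.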

 


Notes on “Metadata Principles and Practicalities”

Friday, April 30th, 2004

Metadata Principles and Practicalities, Duval, Erik, Wayne Hodgins, Stuart Sutton, Stuart L. Weibel (2002).

This was a pretty straightforward reading. I did like the Lego metaphor for metadata, but it would be nice if it were elaborated on a little more. So, kids have no problem combining spaceship Lego parts with medieval castle parts, but just because it is possible to do so, is it beneficial or useful in any way? I understand they are just talking about modularity as a quality at that point, but it also gives the impression that metadata schemas, properly constructed, can be mixed and matched willy-nilly.

One thing I have not seen discussed very much is where the line is drawn between metadata and regular data. For example, most schemas have some sort of author/creator field. I see how the author could be data describing an article, but if you look at articles in a journal or on a web page, the author is almost always presented along with the body text. A clearer example of what I’m trying to say is an article’s abstract. I’ve seen some schemas that have the abstract as a piece of metadata, but does that mean it is not part of the data (the article) itself? Or is it both?

It all depends on your point of view. From a database designer’s perspective, all of these metadata fields are just data, and metadata is what describes the structure of the database: field types, lengths, foreign key relationships, etc. I’m not saying that everything should be stored in one big lump. I guess I’m just concerned that, depending on the point of view, metadata could mean anything. Really, all of this is just a matter of properly separating different data elements from each other. Obviously author/creator should be a separate field from article body text. And date, titles, etc. should be separate as well. But that doesn’t really separate them from the thing itself; they are still all aspects of the thing.


Knowledge Organization System for a Greeting Card Company’s Design Studio Archives

Thursday, March 18th, 2004

Note: this was a project for a graduate course in Knowledge Organization Systems

Introduction

The goal of this project is to create a Knowledge Organization System (KOS) for a Greeting Card Company Studio archive so that designers are able to find source artwork and previous designs. This is no small task: Greeting Card Company has been in operation for nearly 100 years and has at least partial archives from the entire period, and today the company employs hundreds of designers and produces thousands of products. There is no question that without an inclusive, accurate, and easy-to-use archive, designers are unable to build on each other’s ideas and a great deal of work is being duplicated. Also, intellectual property needs to be properly managed, and licensed artwork needs to be tracked and protected from accidental misuse.

Currently, all archives are stored in protective containers in the Studio, shelved by year. In addition, a vast number of digital files have been compiled on the Studio’s servers and on CD and tape backups. This project does not address the physical process of collection and digitization, but instead offers a road map to how items will be classified as they are entered into the system. This KOS also provides a framework for the database and the ultimate user interface.

Below is an analysis of the users and groups, followed by a description of the overall structure of the KOS. After that is a description of each facet, followed by pick lists, synonym rings, and taxonomies for each where applicable.

 

Users

In this analysis, three distinct user groups were identified: Archivists, Designers, and Management/Administration. Archivists include the company’s current information professionals as well as the interns and temp workers who will be doing the digitization and data entry under their supervision. The KOS has been set up under the assumption that most data entry personnel will be able to properly classify perhaps 80 to 90 percent of all items within each facet, forwarding the rest to more skilled information professionals. The professionals include skilled librarians, art historians, and other researchers who should be adequately prepared to train data entry personnel and classify more difficult items.

The designer group includes artists and graphic designers of varying skill and experience. Nearly all, however, have completed at least a two-year program and the majority have completed a four-year college degree. Taxonomies were developed with this level of expertise in mind. Designers were surveyed, and a wide range of thinking about art objects and designs was found. The facets below were designed to cover virtually every way in which a designer might want to look for a piece.

Management and administration also have specific needs. It is for them primarily that the Designer entity described below as well as most facets dealing with licensing and sales have been created.

 

Organization

The archive needs to be broken down into four different logical entities: Art Elements (such as clip art, photographs, sculptures, etc.), Products (such as individual greeting cards, e-cards, etc.), Digital Files, and Designers. Each entity will have a number of associated facets which roughly correspond to the fields in the database and will allow multiple methods of search and organization.

The entity relationships will be defined in the database so that searches will cascade upward. For example, someone searching for art elements will be able to find those done by a specific AG department, because Art Elements are related to Products, which are related to Designers, who have the Department/Team facet. All of this is relatively simple to do with SQL and can be hidden in the interface to make searching easier.
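
To illustrate the cascading search, here is a minimal sketch using Python's built-in sqlite3 module. All table names, column names, and sample rows are hypothetical, since the actual database design is not given here. The sketch assumes that an Art Element can be used in many Products (hence a link table) and, as a simplification, that each Product has a single Designer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE designer (id INTEGER PRIMARY KEY, name TEXT, department TEXT);
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    title TEXT,
    designer_id INTEGER REFERENCES designer(id)
);
CREATE TABLE art_element (id INTEGER PRIMARY KEY, title TEXT);
-- Link table: one art element may appear in many products.
CREATE TABLE product_art_element (
    product_id INTEGER REFERENCES product(id),
    art_element_id INTEGER REFERENCES art_element(id)
);
INSERT INTO designer VALUES (1, 'Designer A', 'Humor'), (2, 'Designer B', 'Seasonal');
INSERT INTO product VALUES (10, 'Birthday card', 1), (11, 'Holiday card', 2);
INSERT INTO art_element VALUES (100, 'Balloon clip art'), (101, 'Snowflake photograph');
INSERT INTO product_art_element VALUES (10, 100), (11, 101);
""")


def art_elements_by_department(conn, department):
    """Find art elements used in products made by designers in a department."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.title
        FROM art_element AS a
        JOIN product_art_element AS pa ON pa.art_element_id = a.id
        JOIN product AS p ON p.id = pa.product_id
        JOIN designer AS d ON d.id = p.designer_id
        WHERE d.department = ?
        ORDER BY a.title
        """,
        (department,),
    ).fetchall()
    return [title for (title,) in rows]


print(art_elements_by_department(conn, "Humor"))
```

The interface would only need to expose the department as a search option; the joins from Art Elements up through Products to Designers stay hidden from the user.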

Each facet has an associated type, whether that be a simple constraint on an open text field, a pick list, or a taxonomy. Where lists and taxonomies have been developed the list’s page number is noted as well.

View the KOS, including the entities and their facets, pick lists, and taxonomies [pdf]
