The Adoption of UNIMARC as an Internal and Exchange format and the Process of Conversion
1. The adoption of UNIMARC
UNIMARC was adopted for the BNI (Italian National Bibliography) in 1985, coinciding with the start of the SBN (Italian Library Network) in the BNCF (National Library of Florence).
In order to prepare functional specifications and procedures for dealing with the data in the online SBN network, the BNCF has been involved in the project supported by the Ministero per i Beni Culturali e Ambientali (Ministry for Cultural and Environmental Assets) and coordinated by the ICCU (Central Institute for the Union Catalogue, a body of the Ministry).
The design of the format for SBN has taken into account the principles of UNIMARC:
- by laying the stress on the links between bibliographic entities;
- by trying to avoid data redundancies;
- by providing a structure that is sufficiently analytic and flexible.
For the BNI, the main reasons for deciding to adopt UNIMARC were:
- the convenience of using an international standard format
- and the format's very detailed structure which meets the needs of a national bibliography.
BNI bibliographic records are created in the SBN system through online procedures. They are transferred automatically, on a daily basis, into the UNIMARC database. The UNIMARC coding is generated automatically by computer programs.
It is interesting to note that the fields and the subfield values are generated by recognition of ISBD punctuation and by converting SBN code values into UNIMARC code values.
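The generation of subfields from ISBD punctuation can be illustrated with a minimal sketch. It is not the BNCF conversion program: the function name and the code table are hypothetical, and only the simplest case of the title and statement of responsibility area is handled (no parallel titles, no punctuation inside the data). It assumes the usual correspondence between ISBD punctuation and the UNIMARC 200 subfields ($a title proper, $e other title information, $f first statement of responsibility, $g subsequent statement of responsibility).

```python
# Hypothetical code table: the SBN codes M, S and N are those described in
# this paper; the pairing with UNIMARC record label position 7 is illustrative.
SBN_TO_UNIMARC_LEVEL = {"M": "m", "S": "s", "N": "a"}


def isbd_title_to_200(area: str) -> list[tuple[str, str]]:
    """Split an ISBD title area into (subfield code, value) pairs for field 200.

    Simplified assumption: " : " introduces other title information,
    " / " the first statement of responsibility, and " ; " each
    subsequent statement of responsibility.
    """
    subfields = []
    # Everything after the first " / " is statement-of-responsibility data.
    title_part, _, resp_part = area.partition(" / ")
    pieces = title_part.split(" : ")
    subfields.append(("a", pieces[0].strip()))
    for piece in pieces[1:]:
        subfields.append(("e", piece.strip()))
    if resp_part:
        statements = resp_part.split(" ; ")
        subfields.append(("f", statements[0].strip()))
        for statement in statements[1:]:
            subfields.append(("g", statement.strip()))
    return subfields


if __name__ == "__main__":
    example = "Title proper : other title information / first author ; second author"
    for code, value in isbd_title_to_200(example):
        print(f"${code} {value}")
```

A production converter would have to deal with the many cases this sketch ignores, which is why the recognition of punctuation is only reliable when the description follows ISBD consistently.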
BNI records are available, together with the rest of the National Library of Florence's records, in the SBN databases for shared cataloguing in the network that is going to be activated.
Furthermore, since the network has not yet been started, some libraries in Italy use BNI UNIMARC tapes to upload the records into their databases.
2. Formats and links between bibliographic entities
The purpose of this section is to compare different solutions in representing data elements of bibliographic records and their relationships. These simple observations have to be understood as an attempt to discuss one of the most important aspects or features of our bibliographic formats: that is, the fact that they can express relationships between bibliographic entities.
To clarify the use in this context of the term bibliographic entity I would like to start with a quote from the paper Bibliographic Entities and Their Uses by Elaine Svenonius:
“Data elements on bibliographic records might be classified into two categories: those that describe the entity in hand and those that relate the entities to other entities. Thus, in considering data elements to be included on bibliographic records, account needs to be taken not only of those that represent the format attributes of the entities described (descriptive elements) but also those whose purpose is to organize catalogues and by so doing to structure the bibliographic universe (organizing elements)” [Svenonius, 1992].
We all know how CCF and UNIMARC represent bibliographic records and their relationships.
The CCF format uses record label position 7 (Bibliographic level code) and, if the record contains more than one segment, the field 015 (Bibliographic level of secondary segment) to characterize what kind of entity (item) the record describes:
- s = serial,
- m = single volume monograph,
- c = multi-volume monograph,
- a = component part,
- e = made-up collection.
The segment linking fields 080, 081, 082, 083 and 085 define the kind of relationship. When hierarchical links occur, subfield A contains a code that allows the segment to be ranked. As Mirna Willer notes: “The use of repetition counter and segment indicator in CCF, not present in UNIMARC, shows the basic difference in the concept of two formats” [Bossmeyer-Willer, 1987].
In the UNIMARC format the record label position 7 shows four possible values for the Bibliographic level:
- a = analytic (component part),
- m = monographic,
- s = serial,
- c = collection.
Position 8 of the record label defines the hierarchical level. In the linking entry block 4XX “the tag of the linking field denotes the relationship of the item identified within it to the item for which the record is being made” [UNIMARC, 1987].
Both UNIMARC and CCF allow links between fields in the same record.
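The two sets of bibliographic level codes listed above can be set side by side in a small lookup table. The sketch below is only an illustration: the code values are those given in this paper, but the pairing of CCF "c" (multi-volume monograph) with UNIMARC "m" and of CCF "e" (made-up collection) with UNIMARC "c" is an assumption, and the names are hypothetical.

```python
# CCF record label position 7 -> UNIMARC record label position 7.
CCF_TO_UNIMARC_LEVEL = {
    "s": "s",  # serial -> serial
    "m": "m",  # single volume monograph -> monographic
    "c": "m",  # multi-volume monograph -> monographic (assumption)
    "a": "a",  # component part -> analytic
    "e": "c",  # made-up collection -> collection (assumption)
}

# CCF codes whose distinction is not carried by the UNIMARC level code alone.
LOSSY_CCF_LEVELS = {"c"}


def convert_level(ccf_code: str) -> str:
    """Return the UNIMARC bibliographic level for a CCF bibliographic level."""
    try:
        return CCF_TO_UNIMARC_LEVEL[ccf_code]
    except KeyError:
        raise ValueError(f"unknown CCF bibliographic level: {ccf_code!r}") from None


if __name__ == "__main__":
    for code in "smcae":
        note = " (distinction lost)" if code in LOSSY_CCF_LEVELS else ""
        print(f"CCF {code} -> UNIMARC {convert_level(code)}{note}")
```

Under this assumption the single and multi-volume distinction made by CCF would have to be carried elsewhere in the UNIMARC record, for example by the hierarchical level in position 8 of the record label.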
In the SBN structure the bibliographic record is a cluster composed of distinct records and their relationships.
It is important to note that the SBN structure has been implemented up to now, as a deliberate policy, in five different kinds of DBMS. To give some idea about the structure we can assume that to make up such a cluster the following files in the database are needed:
- the file of titles,
- the file of authors,
- the file of links title-title,
- the file of links author-author,
- the file of links title-author,
- and so on, for subjects, classification numbers, etc.
In the file of titles there are two basic (or main) kinds of record, coded as M = monograph and S = serial.
These records contain coded data, identification numbers and the bibliographic description (corresponding to data from the record label and the 0XX, 1XX, 2XX and 3XX blocks of UNIMARC).
Other types of records, such as N = component part, A = uniform title, are linked records and are, naturally, present in the database only if a basic linked record exists. These records contain appropriate coded data, identification numbers and other data elements that vary according to their nature.
Monograph and serial records can stand alone or be linked to each other.
In the file of authors there are two kinds of records:
- A = uniform heading,
- R = variant heading.
These records contain the heading and other data elements in coded form (e.g.: E = corporate body, R = meeting).
The file of links title-title contains, in a coded form, the kind of relationship and other data elements belonging to the link. The links are always bi-directional. For example, a link that denotes hierarchy is coded as "1" and means "upward link" (to be part of) and, when this link is made, an opposite "downward link" (to contain) is automatically created. This relationship is used in SBN to link a monograph to a series, a monograph to another monograph (levels), or a component part to a monograph or a serial. Data elements such as numbering within a series or details on the location of the component part within a host item are recorded as data elements belonging to the link.
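The bi-directional behaviour of the title-title links can be sketched as follows. This is not the SBN implementation: the class and attribute names are hypothetical, and because the paper gives only the code "1" for the hierarchical link, the direction is stored here as a separate "up" or "down" value, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TitleLink:
    source: str     # identification number of the record the link starts from
    target: str     # identification number of the record the link points to
    kind: str       # "1" = hierarchy, as described in the paper
    direction: str  # "up" (to be part of) or "down" (to contain); hypothetical
    note: str = ""  # data belonging to the link, e.g. numbering within a series


class TitleLinkFile:
    """A toy 'file of links title-title' that keeps both directions in step."""

    def __init__(self) -> None:
        self.links: list[TitleLink] = []

    def add_hierarchy(self, part_id: str, whole_id: str, note: str = "") -> None:
        # Creating the upward link automatically creates the downward one.
        self.links.append(TitleLink(part_id, whole_id, "1", "up", note))
        self.links.append(TitleLink(whole_id, part_id, "1", "down", note))

    def wholes_of(self, part_id: str) -> list[str]:
        return [l.target for l in self.links
                if l.source == part_id and l.direction == "up"]

    def parts_of(self, whole_id: str) -> list[str]:
        return [l.target for l in self.links
                if l.source == whole_id and l.direction == "down"]


if __name__ == "__main__":
    link_file = TitleLinkFile()
    link_file.add_hierarchy("M001", "S001", note="vol. 3")
    print(link_file.wholes_of("M001"))
    print(link_file.parts_of("S001"))
```

Storing the numbering on the link rather than on either record is what allows the same series record to be shared by every monograph that belongs to it.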
The file of links title-author contains in a coded form the kind of responsibility (1 = primary, 2 = alternative, 3 = secondary) and the file of links author-author contains in a coded form the kind of reference (8 = see reference, 4 = see also reference).
As a consequence authority data such as author and uniform title are recorded only once in the database and they are shared by different records.
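A minimal sketch may make the sharing of authority data concrete. The three dictionaries and lists below stand in for the file of titles, the file of authors and the file of links title-author; the identifiers, headings and function name are invented for illustration, and only the responsibility codes (1, 2, 3) are taken from the description above.

```python
RESPONSIBILITY = {"1": "primary", "2": "alternative", "3": "secondary"}

# File of authors: each heading is recorded once.
authors = {"A1": {"type": "A", "heading": "Example, Author"}}

# File of titles: basic records with their bibliographic description.
titles = {
    "T1": {"type": "M", "description": "First example title"},
    "T2": {"type": "M", "description": "Second example title"},
}

# File of links title-author: (title id, author id, responsibility code).
title_author_links = [("T1", "A1", "1"), ("T2", "A1", "3")]


def build_cluster(title_id: str) -> dict:
    """Assemble the cluster for one title from the separate files."""
    headings = [
        (authors[author_id]["heading"], RESPONSIBILITY[code])
        for linked_title, author_id, code in title_author_links
        if linked_title == title_id
    ]
    return {"description": titles[title_id]["description"], "authors": headings}


if __name__ == "__main__":
    print(build_cluster("T1"))
    # Correcting the heading once is reflected in every cluster that uses it.
    authors["A1"]["heading"] = "Example, Author, 1900-"
    print(build_cluster("T2"))
```

The second call shows the practical consequence: a correction made to the single author record appears in both clusters without either title record being touched.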
There are advantages and disadvantages in representing the records in this way. The advantages are:
- the bibliographic records are open. In the SBN it is possible to apply different levels of description and the record can be more easily enhanced by adding other links;
- data redundancy is avoided;
- the structure facilitates a higher level of control in creating new records (authority records especially);
- it will be possible in the network to share records between different libraries (with different levels of responsibility). Not only a complete bibliographic record (a cluster), but even authority data (e.g., a record of an author or a uniform title) can be shared.
The disadvantages are:
- the procedures of creating, updating and searching in the database have to be very well structured; otherwise they can be too demanding for the librarian or difficult for the user;
- SBN as a format is not suitable for data exchange at an international level, and at a national level it is not practicable for those libraries that are not connected to the network.
An examination of the different ways to conceive and to apply the linking technique brings us to the following remarks:
There needs to be an expansion of the categories of links in the UNIMARC context. As Christine Bossmeyer notes: “UNIMARC provides at present only a linking technique for bibliographic records. Links between bibliographic data and authority data are not yet considered” [Bossmeyer-Willer, 1987].
The need for “a map that shows how entities are clustered and where the pathways are between and among them” [Svenonius, 1992] points to the problem that exists regarding the relationship between formats and standards. I agree with Marcelle Beaudiquez that we have to separate “the case of formats - common machine-readable language - and the case of standards - communication and exchange tools which must meet the user's needs”: ( ... ) “formats are computing tools that evolve with technology. ( ... ) Standardization, on the other hand, is the logical result of an intellectual enquiry” [Library automation ..., 1991].
3. The use of UNIMARC as an "intermediate" format and the conversion from/to different formats
Libraries can use bibliographic records which they do not produce (or which are produced in a format other than the one that they are using) mainly in two ways:
- Libraries can provide their users with existing bibliographies and catalogues in a variety of media (CD-ROMs, online etc);
- They can upload the bibliographic records into their own catalogue.
In the first case the records do not require conversion, but differences between user interfaces can cause great difficulties in searching. Since user interfaces can differ even when the same format (e.g., UNIMARC) is used, it is extremely important to ensure a standard user interface.
In the second case we need both a conversion program - obviously if the formats are different - and an uploading program.
The BNCF is trying to cover these two requirements.
One of the aims of the UOL project (Users On-line), started last year in the BNCF, is to meet the first need above. It proposes to do so by using a common frame to search bibliographies and catalogues on CD-ROMs, on-line systems, etc. This would be regardless of the source format of the data.
In order to meet the second need (especially in regard to retrospective conversion) the BNCF decided to also use UNIMARC as an "intermediate" format to upload data from different formats into the SBN database.
We have chosen to use UNIMARC for the following reasons:
- Once more, UNIMARC is a very detailed format which allows data to be coded from different formats with suitable precision;
- Furthermore, UNIMARC is very close to SBN. Since SBN records are transferred automatically into the UNIMARC database it is reasonable to assume that the UNIMARC-SBN transfer will be effective.
Finally, the use of a single format as an intermediate format allows us to separate two distinct phases:
- the phase of conversion from different formats;
- the phase of data uploading to SBN that involves a well-known problem: the matching of data.
As a result the uploading program has to be developed only once.
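The separation of the two phases can be sketched as a small pipeline: one converter per source format, all producing the same intermediate record, and a single uploading routine behind them. Everything in the sketch is hypothetical (the record keys, the converter bodies and the very crude matching on the title); it illustrates the architecture only, not the BNCF programs.

```python
from typing import Callable

Record = dict[str, str]

# Phase 1: one converter per source format, each producing the same
# intermediate representation (a dict keyed by UNIMARC-like field names).
CONVERTERS: dict[str, Callable[[Record], Record]] = {}


def converter(source_format: str):
    """Register a conversion function for one source format."""
    def register(fn: Callable[[Record], Record]):
        CONVERTERS[source_format] = fn
        return fn
    return register


@converter("CCF")
def ccf_to_intermediate(rec: Record) -> Record:
    return {"200a": rec["title"]}   # "title" is a hypothetical source key


@converter("ANNAMARC")
def annamarc_to_intermediate(rec: Record) -> Record:
    return {"200a": rec["titolo"]}  # "titolo" is a hypothetical source key


# Phase 2: a single uploader, written once, which includes the matching step.
def upload(intermediate: Record, database: dict[str, Record]) -> bool:
    """Add the record unless a matching one exists; return True if added."""
    key = intermediate["200a"].casefold()  # deliberately naive matching
    if key in database:
        return False
    database[key] = intermediate
    return True


def load(source_format: str, rec: Record, database: dict[str, Record]) -> bool:
    return upload(CONVERTERS[source_format](rec), database)


if __name__ == "__main__":
    db: dict[str, Record] = {}
    print(load("CCF", {"title": "Example"}, db))
    print(load("ANNAMARC", {"titolo": "example"}, db))
```

Supporting a further source format then means writing one more converter, while the matching logic, which is the difficult part, stays in one place.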
In this field some activities are planned including conversion from/to CCF, conversion from publishers' databases, conversion from ANNAMARC (Italian MARC 1975-1984), and conversion from other local formats (CUBI, etc).
I would like, now, at the end of my paper to talk about the BNCF experience in CCF/UNIMARC conversion.
Here in Florence the CCF format is used by the Associazione Interaccademica, an association of important cultural institutions. We can consider this version of CCF implemented by Associazione Interaccademica as an extension of the CCF format: in particular the structure of the database on CDS/ISIS used to produce the Bibliografia italiana di storia della scienza (Italian Bibliography of the History of Science).
This structure emphasizes both links within bibliographic records and authority data. An author heading is a separate record and the appropriate field in the full bibliographic record contains only a reference number that points to the record of the author.
The reasons for this choice - as declared by those responsible for this project - are to be found in two needs that were previously mentioned: to make a higher level of control in creating authority headings possible and to avoid data redundancy.
Monographs, serials and, above all, component parts are recorded in this Bibliography. In order to avoid duplication of effort, those responsible have planned to use the UNIMARC records for monographs and serials produced by the Italian National Bibliography. The Associazione, in collaboration with BNCF, has developed an ad hoc conversion program from UNIMARC to CCF. It is now in the advanced test phase.
Furthermore, BNCF is considering the opportunity to enhance records in SBN by adding the records of the component parts produced by the Italian Bibliography of the History of Science. To make this possible, a conversion program from CCF to UNIMARC is needed. This reflects one of the general policies in the SBN environment, namely to use the specific capabilities available inside and outside the network.
- Bossmeyer-Willer, 1987
- Bossmeyer, Christine and Mirna Willer. UNIMARC Conversion Problems. In: The Library of the Future: 11th Library Systems Seminar (European Library Automation Group), Frankfurt 1-3 April 1987 / ed. by Christine Bossmeyer. - Frankfurt : Deutsche Bibliothek, 1987, pp. 205-213.
- Library automation ..., 1991
- Beaudiquez, Marcelle. Nouvelles techniques, nouvelle normalisation : une évolution pour de nouveaux besoins. In: Library Automation and Networking: New Tools for a New Identity : European Conference, 9-11 May 1990, Bruxelles / ed. by Hermann Liebaers and Marc Walckiers. München : Saur, 1991, pp. 196-205.
- Svenonius, 1992
- Svenonius, Elaine. Bibliographic Entities and Their Uses: Proceedings of the Seminar Held in Stockholm, 15-16 August 1990 and Sponsored by the IFLA UBCIM Programme and the IFLA Division of Bibliographic Control, ed. by Ross Bourne. München : Saur, 1992, pp. 3-17.
- UNIMARC, 1987
- UNIMARC Manual / ed. by Brian P. Holt with the assistance of Sally H. McCallum and A. B. Long. London: IFLA Universal Bibliographic Control and International MARC Programme, 1987.
[Originally published as:
The adoption of UNIMARC as an internal and exchange format and the process of conversion. In: UNIMARC/CCF : proceedings of the Workshop held in Florence, 5-7 June 1991 / IFLA-Unesco ; edited by Marie-France Plassard and Diana McLean Brooking. - München [etc.] : K. G. Saur, 1993, pp. 6-14
da OCR aggiornato al 2011 05 01 - from OCR last updated 1 May 2001