Making records from the SUNCAT database openly available: the experience with licensing

The background is explained in an earlier post (July 10 2012).  SUNCAT (Serials UNion CATalogue) aggregates the metadata (bibliographic and holdings information) for serials, whatever their physical format, held in (currently) 89 libraries.  It was planned, with the agreement of the Contributing Libraries (CLs), to make as much of this data openly available as possible.

It was decided to adopt an opt-in policy.  This approach was taken because it was felt that CLs needed to be fully aware of the commitment they were making and to have the opportunity to impose particular restrictions, such as limiting the data which could be made open or restricting the formats in which the data would be made available.  In the event, most of the participants took the opportunity to specify, unambiguously, which data they were agreeing to make open.

Legal advice was taken from the University solicitors, and the licence adopted was the Open Data Commons Public Domain Dedication and Licence with reference to the ODC Attribution Share Alike Community Norms.  Staff in quite a number of institutions expressed interest but, in the event, only six institutions proceeded as far as signing a licence with EDINA.  A copy of the standard agreement may be viewed here.

Since many libraries have acquired some of the metadata records they use in OPACs from one or more third party commercial suppliers, there were very understandable concerns about giving permission for EDINA to make records from these sources openly available.  Accordingly, it was necessary to add an Appendix to the individual Agreements, specifying what particular restrictions should be applied.

The situation applying to each of the libraries is as follows:

British Library: Permission was given to publish all serials records, but they are not to be made available in MARCXML or MARC21 formats.

National Library of Scotland: Permission was given to publish as open data any NLS record that has ‘StEdNL’ in the MARC field 040 $a, and to publish as open data the title, ISSN and holdings data for any serials record in their catalogue.

University of Bristol Library: Permission was given to publish as open data any Bristol record that has “UkBrU-I” in the MARC field 040 $a, e.g.,

040   L $$aUkBrU-I

University of Nottingham Library: Permission was given to publish as open data any Nottingham record that has “UkNtU” in the MARC field 040 $a and $c.

e.g., 040   L $$aUkNtU$$cUkNtU

However, if there is an 035 tag identifying a different library, then do not use this record.

e.g.,

035   L $$a(OCoLC)1754614
035   L $$a(SFX)954925250111
035   L $$a(CONSER)sc-84001881-
040   L $$aUkNtU$$cUkNtU

University of Glasgow Library: Permission was given to publish as open data any Glasgow record that is not derived from Serials Solutions, as indicated by (WaSeSS) in the MARC field 035 $$a.

Library of the Society of Antiquaries of London: Permission was given to publish as open data all serials records.
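
The per-library restrictions above amount to a small set of filter rules. Purely as an illustrative sketch (this is not EDINA's actual implementation), they might be expressed in Python as follows. The record shape, the library keys (BL, NLS, Bristol, Nottingham, Glasgow, SAL), the function names and the "JSON" format label are all hypothetical. The sketch also assumes that "an 035 tag identifying a different library" means any 035 $a value not prefixed with (UkNtU); the agreement itself may define this differently.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory record shape (an assumption, not a SUNCAT format):
# {"fields": [(tag, [(subfield_code, value), ...]), ...]}
Field = Tuple[str, List[Tuple[str, str]]]
Record = Dict[str, List[Field]]


def subfield_values(record: Record, tag: str, code: str) -> List[str]:
    """Return every value of subfield `code` found in any `tag` field."""
    return [
        value
        for field_tag, subfields in record["fields"]
        if field_tag == tag
        for sub_code, value in subfields
        if sub_code == code
    ]


def release_level(library: str, record: Record) -> str:
    """Return "full", "brief" (title, ISSN and holdings only) or "none"."""
    sources = subfield_values(record, "040", "a")
    transcribers = subfield_values(record, "040", "c")
    control_numbers = subfield_values(record, "035", "a")

    if library in ("BL", "SAL"):
        return "full"
    if library == "NLS":
        return "full" if "StEdNL" in sources else "brief"
    if library == "Bristol":
        return "full" if "UkBrU-I" in sources else "none"
    if library == "Nottingham":
        own = "UkNtU" in sources and "UkNtU" in transcribers
        # Assumption: any 035 $a not prefixed "(UkNtU)" identifies another library.
        foreign = any(not n.startswith("(UkNtU)") for n in control_numbers)
        return "full" if own and not foreign else "none"
    if library == "Glasgow":
        from_serials_solutions = any(
            n.startswith("(WaSeSS)") for n in control_numbers
        )
        return "none" if from_serials_solutions else "full"
    return "none"  # libraries that have not signed a licence


def allowed_formats(library: str, formats: List[str]) -> List[str]:
    """Drop MARC21 and MARCXML for the British Library; leave others unchanged."""
    if library == "BL":
        return [f for f in formats if f not in ("MARC21", "MARCXML")]
    return list(formats)


# The Nottingham example above: the 040 matches, but an 035 points elsewhere.
example = {
    "fields": [
        ("035", [("a", "(OCoLC)1754614")]),
        ("040", [("a", "UkNtU"), ("c", "UkNtU")]),
    ]
}
print(release_level("Nottingham", example))  # prints "none"
```

Keeping the record-level decision (release_level) separate from the format restriction (allowed_formats) mirrors the two kinds of condition the libraries set: which records may be released, and in which formats.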

As mentioned above, staff in quite a few other libraries expressed interest in becoming involved, but the short timescale of the project meant that effort had to be concentrated on those libraries able to sign the licence agreement quite quickly.

Subject to the availability of further funding it is planned to continue discussions with those libraries which have expressed interest but were not able to proceed to signing an agreement.

Negotiating the specific requirements for each of the libraries was a time-consuming, although necessary, process, and there are concerns about the resources which would be required to carry out the negotiation for a rather larger number of libraries than participated in this phase.

Taken together, the records which can be made openly available total in excess of 1,000,000: a considerable quantity of serials metadata.  Once the data has been released it will be most interesting to monitor the uses made of it.

Details about making the data openly available and the ways in which developers and others can access it are outlined in a separate blog entry.

It has been well recognised for some time that library staff have concerns about making available metadata obtained from third parties, but to date there has been very little progress on resolving these issues at either a national or international level.  In the earlier blog post it was stated that:

“A number of librarians said that it would be a good idea if JISC/EDINA could come to an agreement with organisations such as OCLC and RLUK rather than individual libraries needing to approach them; this is an idea certainly worth pursuing”.

JISC did commission work to be carried out in this area and there is a website available which provides guidance.  Whilst this is clearly very helpful, the onus is placed upon staff in individual libraries to look carefully at their licence agreements with third party suppliers.  Even where this is done, it is often found that the licence agreements are not clear and unambiguous about what is possible and what is not.

RLUK recently commissioned work to scope the parameters of making RLUK data openly available and the results of that work should make helpful reading even if the focus is just on material in the RLUK database.

It would certainly be of considerable benefit to the HE community as a whole if national bodies including the JISC, SCONUL and RLUK could accept responsibility for initiating discussions with third party suppliers of records, with a view to negotiating the removal of all restrictions on making metadata openly available.  Such an approach would remove the need for individual libraries to investigate their specific local circumstances and would be of enormous potential benefit to the user community.

Licensing SUNCAT serials’ records

The reasons for making bibliographic metadata openly available have been well put by JISC in the Open Bibliographic Data Guide and by the Open Knowledge Foundation. Whilst many librarians are keen to support making their institutional library metadata available, there are issues to be resolved. There can be copyright issues and contractual issues over records in library OPACs which inhibit the release of records. The records in many OPACs will have been obtained from one or more third party organisations (e.g. OCLC, British Library, Ex Libris, Serials Solutions). Even though the records received from these third parties will often have been modified, perhaps quite extensively, there are understandable concerns about the possible repercussions of making them available under an open licence.

SUNCAT is an aggregation of serials metadata from (currently) 86 libraries (referred to as Contributing Libraries (CLs)). Whilst much of the metadata will have been created by local library staff and will, therefore, be ‘owned’ by the library, some of it will have been purchased from a third party supplier. The metadata is essentially supplied to EDINA on the basis of goodwill and a common understanding about how the data is used and made available. EDINA reached agreement with third party record suppliers that records in MARC21 format could be made available for downloading, but only to staff in CLs.

In the initial project SUNCAT: exploring open metadata (funded under the JISC Capital funded RDTF participation) the decision was taken to adopt an ‘opt in’ approach and, accordingly, an invitation was sent to all the CLs inviting them to participate in making their SUNCAT contributed data openly available under an Open Data Commons Public Domain Dedication and Licence with reference to the ODC Attribution Share Alike Community Norms. Considerable interest was expressed by CLs in becoming involved but concerns, particularly to do with making third party records available, were raised. A number of librarians said that it would be a good idea if JISC/EDINA could come to an agreement with organisations such as OCLC and RLUK rather than individual libraries needing to approach them; this is an idea certainly worth pursuing.

Licences have now been signed by three organisations. They are the British Library (BL), the National Library of Scotland (NLS) and the Society of Antiquaries; discussions are well advanced with a number of additional organisations. After discussion with BL staff, it was agreed that it would be preferable to add an Appendix to an existing contract between EDINA and the BL, and this has been done. All the data supplied to EDINA by the BL can be made openly available, provided records are not made available in either MARC21 or MARCXML formats. In the case of the National Library of Scotland, permission has been given to make available all the fields of all records which have been created by NLS (identified by the presence of ‘StEdNL’ in the 040$a field) and to make title, ISSN and holdings information available for the whole of the contribution to SUNCAT. The Society of Antiquaries has placed no restrictions on the use of their contributed records.

Glasgow University has asked for records from a third party supplier to be excluded from the records made available for open usage and this will be done.

Work is now being carried out to make the records from the initial three organisations freely available on the basis described in the licences and as other licences are signed by additional organisations more data will be published for open usage.