Strategies for Open and Permanent Access to Scientific Information in Latin America: Focus on Health and Environmental Information for Sustainable Development

Legal Constraints and Opportunities in Providing Permanent Access to Scientific and Technical Data

Professor Harlan J. Onsrud

The ability to collect and preserve scientific and technical data gathered by others is affected by the intellectual property rights that those others may have in the data. It is well known that facts are not copyrightable. Thus one hears the argument that data in databases are typically not protected by copyright law and that one is free, absent a contract, to draw data from such databases without acquiring permission from the data gatherer. Yet are all data factual? Does the law assume that all empirical observations and measurements are facts? Are some data, such as cartographic generalizations, protected as expression, or is such expression typically "one with the facts" and therefore not protected? If databases are typically selections of only certain values, does the selection and coordination of those values give rise to copyright protection in that selection and coordination? And what about data contained in a file or dataset, as opposed to a database? Some software may interact with and copy an entire data file that may incorporate and express creativity, while other software may interact with a database from which only certain data elements are extracted or copied. Do the extent and nature of the data that the software copies make a difference in what may be copied legally without asking permission? Reasonable people may disagree about any of the questions above when confronted with a specific scenario involving scientific and technical data. This results in confusion for the typical scientist and science administrator in managing legal rights in data, datasets and databases. By default, the legal system addresses the confusion on a case-by-case basis. Is there a better way?

Under today's predominant economic model of scholarly publishing, publishers ask scientific authors to transfer their rights to the publisher, and the publisher then uses copyright to restrict access to the publication to those who pay. This approach is being extended to the affiliated datasets upon which the published works are based. Yet transferring copyright in datasets to private publishers does not resolve the above questions for scientists, since they are still confronted with them the next time they attempt to access or use the data of other scientists.

One approach that avoids even raising the above questions is for creators of collections of scientific and technical data to always convey to the world any rights they may have in a dataset or database through a public domain or open access license (e.g., Creative Commons licenses). Such an approach is likely to require a global networked scientific infrastructure that provides benefits to those who make their data openly available.

