When working with language resources, whether creating, distributing or using them, it is important to be aware of any issues pertaining to intellectual property, i.e. “the intangible rights protecting the products of human intelligence and creation”:

Intellectual property describes a wide variety of property created by musicians, authors, artists, and inventors. The law of intellectual property typically encompasses the areas of copyright, patents, and trademark law. It is intended largely to encourage the development of art, science, and information by granting certain property rights to all artists, which include inventors in the arts and the sciences. These rights allow artists to protect themselves from infringement, or the unauthorized use and misuse of their creations.
Apart from the legal aspects of intellectual property rights (IPR), there is also the ethical aspect.

In the CLARIN context, IPR can essentially be looked at from three different perspectives: that of the resource creator, the resource distributor, and the resource user.

Resource creator

As a resource creator, especially when data are involved, make sure that when you acquire the primary data (by collecting them in an experiment or obtaining them from a third party) you look into IPR and, where necessary, make the appropriate arrangements. IPR may already have been taken care of, for example when a GNU GPL or Creative Commons license applies, or by arrangement of law (the public’s right to information). In all other cases, a written contract, signed by the parties entering into the agreement, should do two things at once: prevent any infringement of the owner’s or contributor’s rights, including unauthorized use or misuse of the primary data you were given permission to use, and ensure that you as resource creator may indeed use, re-use, share and/or distribute the data. Arranging IPR can be (and often is) a headache, and it may be tempting to disregard the issue altogether, but you should be aware of the risks you run if you fail to settle IPR satisfactorily: there may be lawsuits, sharing or re-use of the data may be ruled out, or the data may even have to be destroyed.

Over the past decade or so, various research programmes and projects in the Netherlands have gathered ample experience in successfully negotiating and arranging IPR. The types of arrangement used with text providers in one such corpus project illustrate the options:

    • Standard IPR: this is the standard version of the IPR agreement that was used for most text providers. It is about ten pages long and arranges every possible dispute. In order to reassure the text provider, it clearly states that no competition is intended and that commercial use presupposes the text material to be unrecognizable as such. This implies that all text material can only be accessed via the corpus and that the text cannot be downloaded as such by the end user.
    • IPR for publishers: this agreement is similar to the standard one for commercial use, but here the texts also have to be made unrecognizable for non-commercial purposes, which implies that for research purposes, too, the texts are only accessible by means of the corpus. Since most publishers feared undue competition, this feature was added to make the agreement acceptable to them.
    • IPR short version: as the negotiations proceeded, it became clear that the standard agreement was a bit too long and included too much information on possible infringements, which alarmed some text providers. Therefore a short version of the standard IPR agreement was drawn up to simplify and accelerate negotiations by avoiding lengthy contract stipulations.
    • E-mail or letter with permission: when a data provider wanted to participate in the project but was unable to sign an agreement stating this, it was decided that an e-mail or letter with permission could also be accepted. This was only possible in exceptional cases and when little text material was involved.

Resource distributor

As a resource distributor, when making the resource available to end users, you should take responsibility for safeguarding the interests of the creator of the resource and of those who contributed to it. To this end, user licenses must detail which types of users (e.g. academic researcher, not-for-profit organization) may use the resource, under what conditions, and for what purposes (e.g. research, commercial, evaluation use).

Resource user

As a resource user, you may benefit from available resources. A resource may be publicly and freely available. More commonly, however, some form of license applies. Whether you will be granted permission to use the data, and under what conditions, usually depends on what type of user you are and what you intend to use the data for.
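
To make the relation between user type, purpose and conditions concrete, the following is a minimal sketch in Python of how license terms for a single resource might be recorded and checked. It is an illustration only, not a CLARIN facility: the names LicenseRule, RULES and check_access, as well as the example conditions, are hypothetical and chosen purely for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LicenseRule:
    """One permitted combination of user type and purpose (hypothetical model)."""
    user_type: str                     # e.g. "academic researcher"
    purpose: str                       # e.g. "research", "commercial", "evaluation"
    conditions: Tuple[str, ...] = ()   # obligations attached to this use


# Hypothetical license terms for one resource; the conditions are invented examples.
RULES = (
    LicenseRule("academic researcher", "research", ("cite the resource",)),
    LicenseRule("academic researcher", "evaluation"),
    LicenseRule("not-for-profit organization", "research", ("no redistribution",)),
)


def check_access(user_type: str, purpose: str,
                 rules: Tuple[LicenseRule, ...] = RULES) -> Optional[Tuple[str, ...]]:
    """Return the conditions for a permitted use, or None if the use is not permitted."""
    for rule in rules:
        if rule.user_type == user_type and rule.purpose == purpose:
            return rule.conditions
    return None
```

In this sketch a result of None means that the license does not cover the requested use, while an empty tuple means that the use is permitted without further conditions. A real license would of course be a legal text; a structure like this could at most mirror its main terms.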

