Responsible Conduct in Data Management

Data ownership refers to both the possession of and responsibility for information. Ownership implies power as well as control. The control of information includes not just the ability to access, create, modify, package, derive benefit from, sell or remove data, but also the right to assign these access privileges to others (Loshin, 2002).
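As a rough illustration only, the idea that ownership combines a set of privileges with the right to assign them to others can be sketched as a small data structure. The class and attribute names here (`Privilege`, `Dataset`, `assign`, `allowed`) and the example parties are hypothetical; they are not drawn from Loshin (2002) or from any particular system.

```python
from enum import Enum, auto


class Privilege(Enum):
    # Hypothetical labels for the abilities listed in the definition above.
    ACCESS = auto()
    CREATE = auto()
    MODIFY = auto()
    PACKAGE = auto()
    DERIVE_BENEFIT = auto()
    SELL = auto()
    REMOVE = auto()


class Dataset:
    """A dataset whose owner holds every privilege and may assign them."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner
        # The owner starts with the full set of privileges.
        self._grants = {owner: set(Privilege)}

    def assign(self, grantor, grantee, privilege):
        """Record a grant; in this sketch only the owner may assign."""
        if grantor != self.owner:
            raise PermissionError(
                f"{grantor} cannot assign privileges on {self.name}"
            )
        self._grants.setdefault(grantee, set()).add(privilege)

    def allowed(self, party, privilege):
        """Report whether a party currently holds a privilege."""
        return privilege in self._grants.get(party, set())


# Hypothetical example: a lab director lets a postdoc access the data.
survey = Dataset("survey_2004", owner="lab_director")
survey.assign("lab_director", "postdoc", Privilege.ACCESS)
```

In this sketch the postdoc can access the data but cannot sell it or pass privileges on, which is the distinction between holding a privilege and holding the right to assign it.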

Implicit in having control over access to data is the ability to share data with colleagues in ways that promote advancement in a field of investigation (the notable exception to the unqualified sharing of data would be research involving human subjects). Scofield (1998) suggests replacing the term ‘ownership’ with ‘stewardship’, “because it implies a broader responsibility where the user must consider the consequences of making changes over ‘his’ data”.

According to Garner (1999), individuals having intellectual property have rights to control intangible objects that are products of human intellect. The range of these products encompasses the fields of art, industry, and science. Research data is recognized as a form of intellectual property and subject to protection by U.S. law.

Importance of data ownership:

According to Loshin (2002), data has intrinsic value as well as added value as a byproduct of information processing: “at the core, the degree of ownership (and by corollary, the degree of responsibility) is driven by the value that each interested party derives from the use of that information”.

The general consensus of science emphasizes the principle of openness (Panel Sci. Responsib. Conduct Res., 1992). Sharing data therefore benefits society in general and helps protect the integrity of scientific data in particular. The Committee on National Statistics’ 1985 report on sharing data (Fienberg, Martin, & Straf, 1985) noted that sharing data reinforces open scientific inquiry, encourages a diversity of analyses and conclusions, and permits:

  1. reanalyses to verify or refute reported results
  2. alternative analyses to refine results
  3. analyses to check if the results are robust to varying assumptions

The costs and benefits of data sharing should be viewed in ethical, institutional, legal, and professional dimensions. Researchers should clarify at the beginning of a project whether data can be shared and, if so, under what circumstances, by and with whom, and for what purposes.
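One way to make these questions concrete at the start of a project is to record the answers in a simple structured form and check which ones are still open. The following is a minimal sketch under that assumption; the `SharingPlan` class, its field names, and the example values are hypothetical and do not represent a standard or required format.

```python
from dataclasses import dataclass, field


@dataclass
class SharingPlan:
    """Hypothetical record of the data sharing questions for one project."""
    shareable: bool
    conditions: list = field(default_factory=list)   # under what circumstances
    shared_by: list = field(default_factory=list)    # by whom
    shared_with: list = field(default_factory=list)  # with whom
    purposes: list = field(default_factory=list)     # for what purposes

    def open_questions(self):
        """Return the questions the plan still leaves unanswered."""
        if not self.shareable:
            # In this sketch, nothing further is checked if data cannot be shared.
            return []
        checks = {
            "under what circumstances": self.conditions,
            "by whom": self.shared_by,
            "with whom": self.shared_with,
            "for what purposes": self.purposes,
        }
        return [question for question, answer in checks.items() if not answer]


# Hypothetical example: two of the four questions have been answered.
plan = SharingPlan(
    shareable=True,
    shared_by=["principal investigator"],
    purposes=["reanalysis"],
)
print(plan.open_questions())  # ['under what circumstances', 'with whom']
```

A record like this is only a checklist aid; the substance of each answer still has to be negotiated among the interested parties.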

Considerations/issues in data ownership

Researchers need a full understanding of the issues related to data ownership in order to make sound decisions about it. These issues include the paradigm of ownership, data hoarding, data ownership policies, balance of obligations, and technology. Each of these issues gives rise to a number of considerations that affect decisions concerning data ownership.

Paradigm of Ownership

Loshin (2002) alludes to the complexity of ownership issues by identifying the range of possible paradigms used to claim data ownership. These claims are based on the type and degree of contribution involved in the research endeavor. Loshin (2002) identifies a list of parties laying a potential claim to data:

  • Creator - The party that creates or generates data
  • Consumer - The party that uses the data owns the data
  • Compiler - The entity that selects and compiles information from different information sources
  • Enterprise - All data that enters the enterprise or is created within the enterprise is completely owned by the enterprise
  • Funder - The user that commissions the data creation claims ownership
  • Decoder - In environments where information is “locked” inside particular encoded formats, the party that can unlock the information becomes an owner of that information
  • Packager - The party that collects information for a particular use and adds value through formatting the information for a particular market or set of consumers
  • Reader as owner - The value of any data that can be read is subsumed by the reader and, therefore, the reader gains value through adding that information to an information repository
  • Subject as owner - The subject of the data claims ownership of that data, mostly in reaction to another party claiming ownership of the same data
  • Purchaser/Licenser as owner - The individual or organization that buys or licenses data may stake a claim to ownership

Data Hoarding

Data hoarding, the withholding of access to data, is considered antithetical to the general norms of science, which emphasize the principle of openness. Factors influencing the decision to withhold access to data could include (Sieber, 1989):

  • (a) proprietary, economic, or security concerns
  • (b) documenting data, which can be extremely costly and time consuming
  • (c) providing all the materials needed to understand or extend the research
  • (d) technical obstacles to sharing computer-readable data
  • (e) confidentiality
  • (f) concerns about the qualifications of data requesters
  • (g) personal motives to withhold data
  • (h) costs to the borrowers
  • (i) costs to funders

Data Ownership Policies

Institutional policies lacking specificity, supervision, and formal documentation can increase the risk of compromising data integrity. Before research is initiated, it is important to delineate the rights, obligations, expectations, and roles played by all interested parties. Compromises to data integrity can occur when investigators are not aware of existing data ownership policies and fail to clearly describe rights and obligations regarding data ownership. Listed below are some relationships between interested parties that warrant the establishment of data ownership policies:

  • Between academic institution and industry (public/private sector) - This refers to the sharing of potential benefits resulting from research conducted by academic staff but funded by corporate sponsors. The failure to clearly delineate data ownership issues early in public/private relationships has created controversy concerning the rights of academic institutions and those of industry sponsors (Foote, 2003).
  • Between academic institution and research staff - According to Steneck (2003), research funding is awarded to research institutions and not to individual investigators. As recipients of funds, these institutions have responsibilities for overseeing a number of activities including budgets, regulatory compliance, and the management of data. Steneck (2003) notes, “To assure that they are able to meet these responsibilities, research institutions claim ownership rights over data collected with funds given to the institution. This means that researchers cannot automatically assume that they can take their data with them if they move to another institution. The research institution that received the funds may have rights and obligations to retain control over the data”. Fishbein (1991) recommended that institutions clearly state their policies regarding ownership of data, and presented guidelines for such a policy.
  • Collaboration between research colleagues - This is applicable to collaborative efforts that occur both within and between institutions. Whether collaborations are between faculty peers, students, or staff, all parties should have a clear understanding of who will determine how the data will be distributed and shared (if applicable) even before it is collected.
  • Between authors and journals - To reduce the likelihood of copyright infringement, some publishers require a copyright assignment to the journal at the time of submission of a manuscript. Authors should be aware of the implications of such copyright assignments and clarify the policies involved.

Balance of obligations

Investigators must learn to negotiate the delicate balance between their willingness to share data in order to facilitate scientific progress and their obligation to employers or sponsors, collaborators, and students to preserve and protect data (Last, 2003). Signed nondisclosure agreements between investigators and their corporate sponsors can block efforts to publish data or share it with colleagues. In some cases, such as research with human participants, data sharing may not be allowed for reasons of confidentiality.


Technology

Advances in technology have enabled investigators to explore new avenues of research, enhance productivity, and use data in ways unimagined before. However, careless application of new technologies has the potential to create a slew of unanticipated data ownership problems that can compromise research integrity. The following examples highlight data ownership issues resulting from the careless application of technology:

  • Computer - The use of computer technology has permitted rapid access to many forms of computer-generated data (Veronesi, 1999). This is particularly the case in the medical profession, where patient medical record data is becoming increasingly computerized. While this process facilitates data access for health care professionals for diagnostic and research purposes, unauthorized interception and disclosure of medical information can compromise patients’ right to privacy. While the primary justification for collecting medical data is to benefit the patient, Cios and Moore (2002) question whether medical data has a special status based on its applicability to all people.
  • Genetics - Due to advances in technology, investigators of the Human Genome Project have opportunities to make significant contributions by addressing previously untreatable diseases and other human conditions. However, the status of genetic material and genetic information remains unclear (de Witte & Welie, 1997). Wiesenthal and Wiener (1996) discuss the conflict between the rights of the individual to privacy and the need for societal protection. The critical issues that investigators need to be aware of include the ownership of genetic data, confidentiality rights to such information, and legislation to control genetic testing and its applications (Wiesenthal and Wiener, 1996).

The data ownership issues described above highlight potential challenges to preserving data integrity. While the ideal is to promote scientific openness, there are situations where it may not be appropriate to share data (especially in the case of human participants). The key is for researchers to understand the various issues affecting ownership and sharing of their research data and to make decisions that promote scientific inquiry and protect the interests of the parties involved.


References

Cios, K. J., Moore, G. W. (2002). Uniqueness of medical data mining. Artificial Intelligence in Medicine, 26(1-2), 1-24.

de Witte, J. I., Welie, J. V. (1997). The status of genetic material and genetic information in The Netherlands. Social Science & Medicine, 45(1), 45-9.

Fienberg, S. E., Martin, M. E., Straf, M. L. (1985). Sharing Research Data. Washington, DC: National Academy Press.

Fishbein, E. A. (1991). Ownership of research data. Academic Medicine, 66(3), 129-33.

Foote, M. (2003). Review of current authorship guidelines and the controversy regarding publication of clinical data. Biotechnology Annual Review, 9, 303-13.

Garner, B. A. (1999). Black’s Law Dictionary, 7th edition. St. Paul, MN: West Group.

Last, R. L. (2003). Sandbox ethics in science: sharing of data and materials in plant biology. Plant Physiology, 132(1), 17-8.

Loshin, D. (2002). Knowledge Integrity: Data Ownership (online), retrieved June 8, 2004.

Panel Sci. Responsib. Conduct Res. (1992). Responsible Science: Ensuring the Integrity of the Research Process. Vol. 1. Comm. Sci. Eng. Public Policy. Washington, DC: National Academy Press.

Scofield, M. (1998). Issues of Data Ownership (online), retrieved June 10, 2004.

Shamoo, A. E., Resnik, D. B. (2002). Intellectual Property. Responsible Conduct of Research. New York: Oxford University Press.

Sieber, J. E. (1989). Sharing scientific data I: new problems for IRBs. IRB: A Review of Human Subjects Research, 11(6), 4-7.

Steneck, N. H. (2003). ORI Introduction to the Responsible Conduct of Research. Department of Health and Human Services.

Veronesi, J. F. (1999). Ethical issues in computerized medical records. Critical Care Nursing Quarterly, 22(3), 75-80.

Wiesenthal, D. L., Wiener, N. I. (1996). Privacy and the Human Genome Project. Ethics & Behavior, 6(3), 189-202.