The Future of Research Communications and e-Scholarship

Why data citation is a computational problem

Peter Buneman (University of Edinburgh); Susan Davidson (University of Pennsylvania); James Frew (University of California, Santa Barbara)

February 23, 2016

Abstract: Most information is now published in complex, structured, evolving datasets or databases. There is increasing demand that this digital information be treated in the same way as conventional publications and be appropriately cited. While principles and standards have been developed for data citation, they are unlikely to be used unless we can couple the process of extracting information with that of providing a citation for it. We discuss how to generate citations automatically for data in a database, given both how the data was obtained (the query) and the content (the data). We show how the problem of generating a citation is related to a well-understood problem in databases and describe this in two examples with radically different citation requirements.
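To make the idea of coupling extraction with citation concrete, here is a minimal sketch in Python. It is not the authors' method: it only illustrates a single call that returns both the data and a citation derived from the query and from the returned content. The database title, version, table, and the function names `query_with_citation` and `format_citation` are all hypothetical.

```python
import hashlib
import json
import sqlite3
from datetime import date

# Hypothetical database-level metadata (an assumption, not from the paper).
DB_TITLE = "Example Ligand Database"
DB_VERSION = "1.0"


def query_with_citation(conn, sql, params=(), accessed=None):
    """Run a query and return its rows together with a citation record."""
    rows = conn.execute(sql, params).fetchall()
    # Fingerprint the returned content so the citation identifies the exact data.
    payload = json.dumps(rows).encode("utf-8")
    fingerprint = hashlib.sha256(payload).hexdigest()[:12]
    citation = {
        "title": DB_TITLE,
        "version": DB_VERSION,
        "query": sql,                # how the data was obtained
        "params": list(params),
        "accessed": accessed or date.today().isoformat(),
        "fingerprint": fingerprint,  # depends on the content returned
    }
    return rows, citation


def format_citation(c):
    """Render the citation record as a single human-readable line."""
    return (f'{c["title"]}, version {c["version"]}. Query: {c["query"]} '
            f'with parameters {c["params"]}. Accessed {c["accessed"]}. '
            f'Result SHA-256 prefix: {c["fingerprint"]}.')


# Hypothetical in-memory data used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ligand (id INTEGER PRIMARY KEY, name TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO ligand VALUES (?, ?, ?)",
    [(1, "aspirin", "NSAID"), (2, "ibuprofen", "NSAID"), (3, "morphine", "opioid")],
)

SQL = "SELECT id, name FROM ligand WHERE family = ? ORDER BY id"
rows, citation = query_with_citation(conn, SQL, ("NSAID",), accessed="2016-02-23")
print(format_citation(citation))
```

Because the citation is produced by the same call that extracts the data, a user cannot obtain a result without also receiving its citation. A realistic system would also need to decide which parts of the database (and which contributors) a given query result should credit, which this sketch does not attempt.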

