18th April 2016
Chairs: Marguerite Avery
Open Access let the world first see (access), and then use (license) research content. But access doesn’t stop at the point of reading, or even using text. How can we make research content, both text and data, truly usable? Could it even be welcoming? To other researchers? To interested publics, or to students or professionals? And how does all of this change as we move beyond the online article, to data, video, and other rich media? Copyright and permissions issues treat this kind of content differently. Do we risk losing the accessibility and usefulness of research content, just as it becomes richer and more welcoming?
Towards a vocabulary for reporting experimental methods in Food Chemistry
Descriptions of experimental methods in scientific publications are often incomplete or inadequate. In those cases the experimental work cannot be reproduced or verified due to lack of information. To facilitate the documentation of lab methods, minimum information guidelines have been developed in some domains. If followed, these guidelines ensure that the information about the method can be easily verified, analyzed and clearly interpreted by the wider scientific community. However, there is an evident lack of automated documentation tools to create and edit laboratory reports that follow these guidelines and at the same time do not impose too rigid a framework on the scientist. Our ultimate goal is to develop a semantically rich but free-text editor for creating descriptions of experimental methods. The editor should give knowledge-based guidance and semi-automatically add metadata. The first step in developing such an editor is to construct supporting vocabularies. Our initial application domain is Food Chemistry. MIAPE-CC is a minimum information guideline for reporting column chromatography-based experiments. Starting from this guideline we have constructed a vocabulary for this domain. Independently, we selected 62 material and method sections from the scientific journal literature in the domain of food chemistry. Our objective was to find out to what extent the concepts derived from the MIAPE guideline actually occur in these published methods and to check whether we can use the MIAPE-CC module as a source for developing the vocabulary.
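As a rough illustration of the kind of check described above, the following sketch counts how many method sections mention each vocabulary term at least once. It is not the authors' procedure: the function name `term_coverage`, the four terms, and the two sample sentences are hypothetical placeholders, and simple case-insensitive phrase matching is assumed in place of whatever concept matching the study actually used.

```python
import re
from collections import Counter


def term_coverage(terms, sections):
    """Count, for each term, the number of sections that mention it at least once."""
    counts = Counter()
    for term in terms:
        # Whole-phrase, case-insensitive match; an assumption made for this sketch.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        counts[term] = sum(1 for text in sections if pattern.search(text))
    return counts


# Hypothetical vocabulary terms and method-section excerpts, for illustration only.
terms = ["column", "mobile phase", "flow rate", "detector"]
sections = [
    "Samples were separated on a C18 column at a flow rate of 1 mL/min.",
    "The mobile phase was acetonitrile and water; the column was kept at 30 C.",
]

coverage = term_coverage(terms, sections)
for term in terms:
    print(f"{term}: found in {coverage[term]} of {len(sections)} sections")
```

Terms that appear in no section (here, "detector") point to guideline concepts that authors tend to leave unreported. Terms that appear in the texts but are missing from the vocabulary would need a separate pass in the other direction.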
Make it machine readable, or the public (doesn't) get it
Neil Chue Hong
Software Sustainability Institute – University of Edinburgh
Open Access has made research available to everyone by lifting the licensing restrictions on it. But in many cases it is still impenetrable and unusable to people outside the field it is published in. If we really want to open our research, we must make it available for reuse and reproduction by others, which can only be done effectively if it can be made machine understandable. In particular for data, this allows rich new tools and interfaces to be developed, enabling others to explore and understand the underlying information. This relies on researchers having basic computational literacy to help make the connections: the modern equivalent of being able to understand how to cite and reference works properly.
Costs and Benefits of Open Data in Biomedical Research
We present preliminary findings of a qualitative study comparing how costs and benefits of biomedical data sharing are perceived in technology-oriented and in science-oriented settings within an NIH-funded consortium for data sharing in craniofacial research. The biomedical field has a long and contentious history of data sharing. Today, large-scale genetic databases, such as GenBank, are freely accessible. Among the promised outcomes of sharing biomedical data are to increase the pace of drug discovery, reduce the duplication of clinical trials, enroll fewer research subjects, and reduce costs overall, while providing various benefits to patients and clinicians. However, while science stakeholders, from NIH to publishers, encourage data sharing practices, only a small percentage of biomedical data are available via open access. Among the reasons for lack of sharing are lack of credit, lack of incentives, and lack of resources and skills for curating data. Benefits of open data and open collaboration tend to be assumed, rather than studied as hypotheses. Most research on biomedical data sharing has focused on incentives and rewards, whereas our research addresses costs and benefits of data openness, both technical and social. Our investigation focuses on the array of costs and benefits that database creators (software engineers) and data creators (experimentalists; bioinformaticians) encounter as participants in the consortium. Points of contention are emerging from our observations. Data sharing is negotiated between technical and scientific actors around dimensions such as the status of the data (raw/clean), the scope of the data (active use/dark archive), the scale of the distribution (self/collective), the labor involved (analysis/maintenance), and the stages at which data are made open (raw, published, unpublished). Our work aims to reveal the tangled web of benefits and costs that characterizes a consortium for data sharing.
Stakeholders in open data in biomedicine include researchers, clinicians, patients, funders, librarians, and publishers.
The reaggregation of primary research outputs: a general framework and a case study from bioinformatics
University of North Carolina at Chapel Hill
The primary output of research, such as an article or dataset, may only realize its full potential as raw material for the future production of knowledge when the information within it is curated into a structured form. Curation disaggregates the originally published artifact into independently discoverable and reusable observations, which in turn makes it possible to reaggregate those observations into collections that are suited to new research questions. A large number of resources have been launched in recent years that curate primary scholarly outputs in order to enable such dynamic reaggregation of published observations. Here, I discuss some of the social, legal, technical, and scientific issues that are common to these efforts. In particular, I will focus on whether and how such derived resources can be developed and made available using collaborative, open information systems even when the primary sources are not themselves fully open. As a case study, I describe the Phenoscape project, which is concerned with enabling computation over phenotype information from species across the tree of life culled from the published literature.
Looking Beyond Gold: Open Access to Research Itself
Open access to research papers is like having access to the shortened, condensed versions of a million different novels. While this is enormously beneficial and appears to be a time-saver on the surface, important details and assumptions are often left out of the story, leaving readers to wonder and struggle to reproduce or even understand the premises or conclusions. What if we were able to share the entire journey of the researcher? What if we were able to share the research itself (every question, connection, dead end, data point, and analytical thought) in context, as it happened? What would be the motivations to do this? How would this affect our perception of what research is? Potential benefits as well as current obstacles to sharing more of the story of the research will be explored.