Open Research Tools and Technologies

Frictionless Data for Reproducible Research

AW1.126
Lilly Winfree
Generating insight and conclusions from research data is often not a straightforward process. Data can be hard to find, archived in difficult-to-use formats, poorly structured, or incomplete. These issues create “friction” and make it difficult to use, publish, and share data. The Frictionless Data initiative (https://frictionlessdata.io/) at Open Knowledge Foundation (http://okfn.org) aims to reduce friction in working with data, with the goal of making it effortless to transport data among different tools and platforms for further analysis, and with an emphasis on reproducible research and open data. The Frictionless Data project comprises a set of specifications (https://frictionlessdata.io/specs/) for data and metadata interoperability, a collection of open source software libraries (https://frictionlessdata.io/software/) that implement these specifications, and a range of best practices for data management. Over the past year and a half, we have been working specifically with the research community to prototype the use of Frictionless Data’s open source tools to improve researchers’ data workflows and champion reproducibility. This talk will discuss the technical ideas behind Frictionless Data for research and will also showcase recent collaborative use cases, such as how oceanographers integrated Frictionless Data tooling into their data ingest pipelines to combine disparate data while maintaining quality metadata in an easy-to-use interface.
Expected prior knowledge / intended audience

The audience should be familiar with research, with using data in various forms from various sources, and with scientific computing. The talk is intended for those who are interested in data management, data cleaning, metadata, and using open research data.

Speaker bio

Lilly Winfree is the Product Owner of the Frictionless Data for Reproducible Research Project at Open Knowledge Foundation, where she solves researchers’ technical data management issues. She has a PhD in neuroscience and has been active in the open data, open source, and open science communities for four years. Lilly has given numerous conference presentations and workshops over the past decade, and enjoys presenting on technical topics to technical and non-technical audiences.

Links to code / slides / material for the talk

https://github.com/frictionlessdata/
http://frictionlessdata.io/software/

Links to previous talks by the speaker

Workshop presentation: http://bit.ly/FDepfl
Talk from a previous position: https://youtu.be/4Jqu8mBXcmA

Additional information

Type devroom

More sessions

2/1/20
Open Research Tools and Technologies
Jan Grewe
AW1.126
The reproducibility crisis has shocked the scientific community. Different papers describe this issue and the scientific community has taken steps to improve on it. For example, several initiatives have been founded to foster openness and standardisation in different scientific communities (e.g. the INCF[1] for the neurosciences). Journals encourage sharing of the data underlying the presented results, and some even make it a requirement. What is the role of open source solutions in this respect? ...
2/1/20
Open Research Tools and Technologies
Julia Sprenger
AW1.126
The approaches used in software development in an industry setting and in a scientific environment exhibit a number of fundamental differences. In the former, modern team development tools and methods (version control, continuous integration, Scrum, ...) are used to develop software in teams with a focus on the final software product. In contrast, in the latter a large fraction of scientific code is produced by individual scientists lacking thorough ...
2/1/20
Open Research Tools and Technologies
Aniket Pradhan
AW1.126
NeuroFedora is an initiative to provide a ready-to-use, Fedora-based Free/Open source software platform for neuroscience. We believe that, similar to Free software, science should be free for all to use, share, modify, and study. The use of Free software also aids reproducibility, data sharing, and collaboration in the research community. By making the tools used in the scientific process more comfortable to use, NeuroFedora aims to take a step towards enabling this ideal.
2/1/20
Open Research Tools and Technologies
AW1.126
Health data is traditionally held and processed in large and complex mazes of hospital information systems. The market is dominated by vendors offering monolithic and proprietary software due to the critical nature of the supported processes and - in some cases - due to legal requirements. The “digital transformation”, “big data” and “artificial intelligence” are some of the hypes that demand improved exchange of health care data in routine health care and medical research alike. ...
2/1/20
Open Research Tools and Technologies
Michael Hanke
AW1.126
Contemporary sciences are heavily data-driven, but today's data management technologies and sharing practices lag at least a decade behind their software ecosystem counterparts. Merely providing file access is insufficient for a simple reason: data are not static. Data often (and should!) continue to evolve; file formats can change, bugs will be fixed, new data are added, and derived data need to be integrated. While (distributed) version control systems are a de facto standard for open source ...
2/1/20
Open Research Tools and Technologies
Mateusz Kuzak
AW1.126
ELIXIR is an intergovernmental organization that brings together life science resources across Europe. These resources include databases, software tools, training materials, cloud storage, and supercomputers.
2/1/20
Open Research Tools and Technologies
Antoine Fauchié
AW1.126
As an editor for WYSIWYM text, Stylo is designed to change the entire digital editorial chain of scholarly journals in the field of human sciences. Stylo (https://stylo.ecrituresnumeriques.ca) is designed to simplify the writing and editing of scientific articles in the humanities and social sciences. It is intended for authors and publishers engaged in high-quality scientific publishing. Although the structuring of documents is fundamental for digital distribution, this aspect is currently delayed ...