Open Research Tools and Technologies

AMENDMENT Transforming scattered analyses into a documented, reproducible and shareable workflow

AW1.126
Sébastien Rochette
This presentation is feedback from experience helping a researcher transform a series of scattered analyses into a documented, reproducible and shareable workflow. The time researchers can allocate to programming the analyses required to answer their scientific questions is usually small compared to their other tasks. As a result, multiple small experiments are developed and the outputs are gathered as best as possible to be presented in a scientific paper. However, science is not only about sharing results but also about sharing methods. How can we make our results reproducible when we developed multiple, usually undocumented, analyses? What do we do if the program only works with our own computer's directory structure? It is always possible to take the time to rewrite, rearrange and document analyses when we want, or have, to share them. Here, I will take the example of a "collaboration fest" where we dissected the R scripts of a researcher in ecology. We started a reproducible, documented and open-source R package along with its website, automatically built using continuous integration: https://cesco-lab.github.io/Vigie-Chiro_scripts/. However, can we think, earlier in the process, of a better way to use our small programming time slots by adopting a method that will save time in the future? To this end, I will present a documentation-first method that costs little time while writing analyses, but saves a lot when the time comes to share your work.
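As a minimal sketch of the documentation-first idea described above (the package, function and column names here are hypothetical, not taken from the Vigie-Chiro scripts): each analysis step is written as a small function whose roxygen2 help page and example are written at the same time as the code, so that the accumulated comments can later be turned into a package and a documentation website rebuilt by continuous integration.

```r
# Documentation-first sketch in R (hypothetical names, for illustration only).
# The help page below is written alongside the function, not reconstructed later.

# usethis::create_package("bats.analysis")   # start a package skeleton
# usethis::use_r("summarise_detections")     # one file per documented function

#' Summarise acoustic detections per night and species
#'
#' @param detections A data.frame with columns `night` and `species`.
#'
#' @return A data.frame with one row per night and species, and a column `n`
#'   counting the detections.
#' @export
#'
#' @examples
#' detections <- data.frame(
#'   night = c("2020-01-31", "2020-01-31", "2020-02-01"),
#'   species = c("Pipistrellus", "Pipistrellus", "Myotis")
#' )
#' summarise_detections(detections)
summarise_detections <- function(detections) {
  stats::aggregate(
    list(n = detections$species),
    by = list(night = detections$night, species = detections$species),
    FUN = length
  )
}

# Later, continuous integration can rebuild the documentation website on each
# commit, e.g. with pkgdown:
# usethis::use_pkgdown()
# usethis::use_github_action("pkgdown")
# devtools::document(); devtools::check()
```

With this workflow, the documentation effort is spread over the normal writing of the analyses, and sharing the work later mostly amounts to publishing the package and its generated website.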
Session type (Lecture or Lightning Talk): Lecture
Session length (20-40 min, 10 min for a lightning talk): 30 min
Expected prior knowledge / intended audience: No prior knowledge expected. The example will be about building documentation for R software, but any developer, using any programming language, may be interested in the method adopted.
Speaker bio: Sébastien Rochette has a PhD in marine ecology. After a few years as a researcher in ecology, he joined ThinkR (https://rtask.thinkr.fr), a company giving courses and consultancy around the R software. Along with commercial activities, he is highly involved in the development of open-source R packages. He also shares his experience with the R community through free tutorials, blog posts, online help and other conferences. https://statnmap.com/
Links to code / slides / material for the talk (optional): I wrote a blog post in French about what I am planning to present: https://thinkr.fr/transformer-plusieurs-scripts-eparpilles-en-beau-package-r/ This topic is also related to another blog post: https://rtask.thinkr.fr/when-development-starts-with-documentation/
Links to previous talks by the speaker: Talks about R are in my GitHub repository: https://github.com/statnmap/prez/. The README lists talks that have a live recorded video. As a researcher, I also gave multiple talks about marine science, modelling and other topics related to my research.
Please note that this talk was originally scheduled to be at 17h. The talk originally in this slot was "Developing from the field." by Audrey Baneyx and Robin de Mourat, which will now take place at 17h.

Additional information

Type: devroom

More sessions

2/1/20
Open Research Tools and Technologies
Jan Grewe
AW1.126
The reproducibility crisis has shocked the scientific community. Several papers describe this issue, and the community has taken steps to address it. For example, several initiatives have been founded to foster openness and standardisation in different scientific communities (e.g. the INCF[1] for the neurosciences). Journals encourage sharing of the data underlying the presented results; some even make it a requirement. What is the role of open source solutions in this respect? ...
2/1/20
Open Research Tools and Technologies
Julia Sprenger
AW1.126
The approaches used in software development in an industry setting and in a scientific environment exhibit a number of fundamental differences. In industry, modern team development tools and methods (version control, continuous integration, Scrum, ...) are used to develop software in teams, with a focus on the final software product. In a scientific environment, by contrast, a large fraction of scientific code is produced by individual scientists lacking thorough ...
2/1/20
Open Research Tools and Technologies
Aniket Pradhan
AW1.126
NeuroFedora is an initiative to provide a ready-to-use, Fedora-based Free/Open source software platform for neuroscience. We believe that, like Free software, science should be free for all to use, share, modify, and study. The use of Free software also aids reproducibility, data sharing, and collaboration in the research community. By making the tools used in the scientific process easier to use, NeuroFedora aims to take a step towards enabling this ideal.
2/1/20
Open Research Tools and Technologies
AW1.126
Health data is traditionally held and processed in large and complex mazes of hospital information systems. The market is dominated by vendors offering monolithic and proprietary software due to the critical nature of the supported processes and - in some cases - due to legal requirements. The "digital transformation", "big data" and "artificial intelligence" are some of the hypes that demand improved exchange of health care data in routine health care and medical research alike. ...
2/1/20
Open Research Tools and Technologies
Michael Hanke
AW1.126
Contemporary sciences are heavily data-driven, but today's data management technologies and sharing practices fall at least a decade behind their counterparts in the software ecosystem. Merely providing file access is insufficient for a simple reason: data are not static. Data often do (and should!) continue to evolve; file formats can change, bugs will be fixed, new data are added, and derived data needs to be integrated. While (distributed) version control systems are a de-facto standard for open source ...
2/1/20
Open Research Tools and Technologies
Lilly Winfree
AW1.126
Generating insight and conclusions from research data is often not a straightforward process. Data can be hard to find, archived in difficult-to-use formats, poorly structured, and/or incomplete. These issues create "friction" and make it difficult to use, publish and share data. The Frictionless Data initiative (https://frictionlessdata.io/) at Open Knowledge Foundation (http://okfn.org) aims to reduce friction in working with data, with a goal to make it effortless to transport data among ...
2/1/20
Open Research Tools and Technologies
Mateusz Kuzak
AW1.126
ELIXIR is an intergovernmental organization that brings together life science resources across Europe. These resources include databases, software tools, training materials, cloud storage, and supercomputers.