Open Research

Data science from the command line: a look back at 2 years of using xan

<p><a href="https://github.com/medialab/xan">Xan</a> is a command-line tool for manipulating CSV files directly from the terminal.</p> <p>Originally developed within a sociology research lab to perform common operations on very large datasets collected from the web (exploration, sorting, computing frequency tables, joins, aggregations, etc.), it has become a go-to solution for many more use cases, including lexicometric analysis, plotting histograms, time series or heatmaps, and even generating network graphs. And while the tool was initially created to deal with very large CSV files, people now also use it to process small files and other file formats. It has thus become part of its users' daily data manipulation practices, letting them stay in their shells without having to rely on GUIs or notebooks.</p> <p>This presentation, given by a research engineer after two years of regular use, examines the reasons for this adoption, which relate both to the constraints of research in the Humanities and Social Sciences and to the interface design choices that make xan effective.</p>

Additional information

Live Stream https://live.fosdem.org/watch/aw1120
Type devroom
Language English

More sessions

2/1/26
Open Research
Joe Knapper
AW1.120
<p>The OpenFlexure Microscope is an open source, laboratory-grade robotic microscope, used by a diverse community including academic researchers, engineers, educators, pathologists and hobbyists (https://openflexure.org/, https://openflexure.discourse.group/). Users from over 60 countries have developed and used the device for everything ranging from exploring their garden's wildlife, to training medical students to diagnose cancer. Joe presents his experience as an academic member of the ...
2/1/26
Open Research
Soulaine Theocharides
AW1.120
<p>At the current rate of digitization, it is estimated that it would take hundreds of years to fully digitize the natural science collections of Europe. In the face of the biodiversity crisis, we urgently need to scale up digitization to equip researchers with the tools to tackle this challenge. </p> <p>The Distributed System of Scientific Collections, DiSSCo, is a fully open source European infrastructure that is bringing together over 300 institutions into a unified, digital natural science ...
2/1/26
Open Research
AW1.120
<p>The exponential growth of scientific literature—doubling roughly every nine years—has made it increasingly difficult for researchers and decision-makers to locate, assess, and synthesize the evidence needed for sound policy and practice. Systematic maps and systematic reviews offer robust, unbiased ways to answer “what works?” but today they depend on manual search and screening workflows that are slow, costly, and vulnerable to human error. The result is a bottleneck: high-quality, ...
2/1/26
Open Research
Precious Onyewuchi
AW1.120
<p>AI has become an integral part of modern research, offering tremendous opportunities, but also raising important questions for the Open Science community.</p> <p>With the emergence of the Open Source AI Definition (OSAID) and its emphasis on the four freedoms, the “freedom to study” stands out as a cornerstone for achieving true reproducibility. You can read the OSAID definition here: https://opensource.org/ai/open-source-ai-definition.</p> <p>This talk will explore how researchers can ...
2/1/26
Open Research
Eldar Kurtić
AW1.120
<p>vLLM (https://github.com/vllm-project/vllm) has rapidly become a community-standard open-source engine for LLM inference, backed by a large and growing contributor base and widely adopted for production serving. This talk offers a practical blueprint for scaling inference in vLLM using two complementary techniques, quantization (https://github.com/vllm-project/llm-compressor) and speculative decoding (https://github.com/vllm-project/speculators). Drawing on extensive evaluations across ...
2/1/26
Open Research
AW1.120
<p>Quantum computing creates new opportunities, but building and operating a quantum cloud service remains a complex challenge, often relying on proprietary, black-box solutions. To bridge this gap, we introduce OQTOPUS (Open Quantum Toolchain for OPerators and USers) [1], a comprehensive open-source software stack designed to build and manage full-scale quantum computing systems. OQTOPUS provides a complete cloud architecture for quantum computers, covering three critical layers: 1. Frontend ...
2/1/26
Open Research
AW1.120
<p>NoiseModelling is an open-source platform for simulating environmental noise propagation and generating regulatory-compliant noise maps at urban and regional scales. Led since 2008 by the Joint Research Unit in Environmental Acoustics at Gustave Eiffel University, it provides researchers and practitioners with reproducible, transparent, and scalable modelling capabilities for environmental acoustics. As the modelling core of the Noise-Planet framework, NoiseModelling simulates noise ...