Lightning Talks

A lightning intro to re-Isearch

re-Isearch, the 27-year-old new kid on the search block
L.lightningtalks
Edward Zimmermann
<p>Project re-Isearch is a novel multimodal search and retrieval engine built on mathematical models and algorithms that differ from the all-too-common inverted index. In practice, its design places effectively no limits on word frequency, term length, number of fields or complexity of structured data, and it even supports overlap, where fields or structures cross each other's boundaries (common examples are quotes, lines/sentences, biblical verses and annotations). Its model enables a completely flexible unit of retrieval and flexible modes of search. Developed in a highly portable C++ subset to be RAM efficient, the engine also provides bindings to a number of other languages such as Python, Tcl and Java.</p>
"re-Isearch" is a project following in the spirit of the original isearch developed back in the 1990s; it was reborn in 2020, in the middle of the global COVID-19 pandemic, as Project re-Isearch. Like the original, it is not just about textual words but pushes the envelope: re-Isearch is multi-object and multi-modal, with an unconstrained unit of retrieval.
Mainstream search engines are about finding any information: "a list of all documents containing a specific word or phrase". So search engines paradoxically return both too much information (i.e. long lists of links) and too little information (i.e. links to content, not content itself). The re-Isearch engine is, by contrast, about exploiting document structure, both explicit (XML and other markup) and implicit (visual groupings such as paragraphs), to zero in on relevant sections of documents, not just links to documents.
This concept of search granularity is a radical departure from other designs. Typical text indexers have a concept of document or record, which is both the unit of indexing and the unit of retrieval. re-Isearch instead offers a dynamic, search-time unit of retrieval that is either user-specified or heuristically determined. The structure of documents can be exploited to identify which document elements (such as the appropriate chapter or page) to retrieve. Retrieval granularity may be at the level of sub-structures of a given document or page, such as a line or paragraph, but may also be part of a larger collection. Beyond textual words, the design also covers a large number of object types: numerical, range, geospatial etc.
It is unique among full-text systems in that it also provides numerous object types with their own methods of search and allows these to be viewed in parallel as text. A date field, for instance, can be searched as a date but also as text, by searching for the words in the field. The engine will be one of the first to support some key parts of the new ISO 8601:2019 standard date semantics.
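The idea of a search-time unit of retrieval over overlapping structures can be sketched in a few lines of Python. This is only an illustration of the concept, not re-Isearch's API or its internal algorithms: the `Span` class, the `search` function and the sample text are all hypothetical, and a linear scan stands in for whatever index the engine really uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A structural element over word positions (hypothetical, for illustration)."""
    field: str  # element name, e.g. "line" or "quote"
    start: int  # index of the first word in the element
    end: int    # index one past the last word


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def search(words, spans, term, unit):
    """Return the text of every `unit` element that contains `term`.

    The unit of retrieval is chosen per query, not fixed at index time.
    """
    hits = [i for i, word in enumerate(words) if word == term]
    results = []
    for span in spans:
        if span.field != unit:
            continue
        if any(span.start <= i < span.end for i in hits):
            results.append(" ".join(words[span.start:span.end]))
    return results


words = tokenize("he said to be or not to be and then left the stage")

# Two consecutive lines plus a quote that crosses the line boundary,
# so the "quote" element overlaps both "line" elements.
spans = [
    Span("line", 0, 6),
    Span("line", 6, 13),
    Span("quote", 2, 8),
]

by_line = search(words, spans, "not", unit="line")
by_quote = search(words, spans, "not", unit="quote")
print(by_line)
print(by_quote)
```

The same query term yields a different result depending on the unit requested at search time: asking for `line` returns the line containing the word, while asking for `quote` returns the whole quotation even though it spans two lines.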

Additional information

Type lightningtalk

More sessions

2/5/22
Lightning Talks
Thomas Lauf
L.lightningtalks
<p>Time tracking is a task many people have to deal with. Be it for writing bills for your client, creating time reports for your company, or simply because you are curious what you are doing with your time all day. Timewarrior is a tool that lets you track your time easily from the command line – it does its job then gets out of your way.</p>
2/5/22
Lightning Talks
Bradly Alicea
L.lightningtalks
<p>As a means of enabling distributed collaboration, open source lets people from many different disciplinary backgrounds participate in research projects to which they would otherwise not have access. Additionally, open source allows for reconfigurable expertise, or the ability to combine people from different backgrounds in ways that depend on the task at hand. This talk will discuss the challenges associated with spontaneous interdisciplinarity, in addition to opportunities provided by ...
2/5/22
Lightning Talks
Peter Czanik
L.lightningtalks
<p>A desktop thermometer that displays relative humidity is useful, but it does not provide continuous monitoring. In comes the Raspberry Pi: it is small, inexpensive, and has many sensor options, including temperature and relative humidity. It can collect data around the clock, do some alerting, and forward data for analysis.</p>
2/5/22
Lightning Talks
Eric Charles
L.lightningtalks
<p>Jupyter Notebook is a tool that lets data scientists analyse datasets. However, it is not easy to create a custom user interface integrated into an existing application.</p> <p><code>Jupyter React</code>, https://github.com/datalayer/jupyter-react, an open-source library, fills that gap and provides components that a developer can easily integrate into any React.js application.</p>
2/5/22
Lightning Talks
Sirio Bolaños Puchet
L.lightningtalks
<p><strong>C%</strong> (from "C with mods") is an experimental meta-programming language that aims to make coding in C more efficient and fun!</p> <p>Together with <strong>cmod</strong>, the reference pre-processor/code generator (written using <strong>C%</strong> itself), this project equips the C programmer with generic meta-programming constructs such as: parameterized verbatim code snippets, mapping code to static data tables (in TSV or JSON format), multi-pass code evaluation (allowing ...
2/5/22
Lightning Talks
Drew DeVault
L.lightningtalks
<p>qbe is an optimizing compiler backend which consumes programs in a simple intermediate language, optimizes them, and emits assembly for x86_64, aarch64, or riscv64, aiming to achieve "70% of the performance" of advanced compilers like LLVM in "10% of the code". This talk will briefly introduce qbe and its intermediate language, explain how it works and what it's capable of, and go over some sample programs which can be written in it.</p>
2/5/22
Lightning Talks
Huy Ngo
L.lightningtalks
<p>InterPlanetary Wheels (IPWHL) are platform-unique, singly-versioned Python built distributions backed by IPFS for security and reproducibility. Because they use the peer-to-peer file system IPFS, the distributions are easily replicated and have no single point of failure, and are thus more resilient. While this project targets Python packages in particular, the idea can similarly be applied to other software distributions such as Linux distributions.</p>