A couple of years ago I was intrigued by phiresky's post[0] about querying SQLite over HTTP. It made me think that if anyone can publish a database using GitHub Pages, I could probably build a frontend in which users can decide which database to query. TeaTime is like that - when you first visit it, you'll need to choose your database. Anyone can create additional databases[1]. TeaTime then queries the database you chose, and fetches files using an IPFS gateway (I'm looking into using Helia so that users are also contributing nodes in the network). Files are then rendered in the website itself. Everything is done in the browser - no accounts, no cookies, no tracking. LocalStorage and IndexedDB are used for saving your last readings and your position in each file.
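For the curious, the querying works along the lines of the sql.js-httpvfs pattern from phiresky's post. A rough sketch following that library's README (the database URL and query here are made up, not TeaTime's actual code):

    import { createDbWorker } from "sql.js-httpvfs";

    // Worker and wasm files ship with the library
    const workerUrl = new URL("sql.js-httpvfs/dist/sqlite.worker.js", import.meta.url);
    const wasmUrl = new URL("sql.js-httpvfs/dist/sql-wasm.wasm", import.meta.url);

    const worker = await createDbWorker(
      [{
        from: "inline",
        config: {
          serverMode: "full",        // one file holds the whole database
          url: "/books.sqlite3",     // hypothetical database location
          requestChunkSize: 4096,    // fetch 4 KiB pages via HTTP Range requests
        },
      }],
      workerUrl.toString(),
      wasmUrl.toString(),
    );

    // Only the pages needed to answer the query are actually downloaded
    const results = await worker.db.query("SELECT title, author FROM books LIMIT 10");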
Since TeaTime is a static site, it's super easy (and free) to deploy. GitHub repo tags are used for maintaining a list of public instances[2].
Note that a GitHub repository isn't mandatory for storing the SQLite files or the front end - it's only for the configuration file (config.json) of each database, and for listing instances. Both the instances themselves and the database files can be hosted on Netlify, Cloudflare Pages, your Raspberry Pi, or any other server that can host static files.
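For illustration, a config.json could point at database files hosted elsewhere, along these lines (the field names here are hypothetical, not TeaTime's actual schema):

    {
      "name": "My Library",
      "database": "https://my-server.example/books.sqlite3",
      "description": "A community-curated book index"
    }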
I'm curious to see what other kinds of databases people can create, and what other types of files TeaTime could be used for.
[0] https://news.ycombinator.com/item?id=27016630
[1] https://github.com/bjesus/teatime-json-database/
[2] https://github.com/bjesus/teatime/wiki/Creating-a-TeaTime-in...
I love this! Thanks for making it!
I had to look that term up <https://github.com/ipfs/helia#readme>, but while sniffing around in their wiki <https://github.com/ipfs/helia/wiki/Projects-using-Helia> I was reminded of https://github.com/orbitdb/orbitdb#readme which seems like it may involve much less rolling your own parts.
Another interesting use case would be to see if this can replace (or be an addition to) SQLite as the database in which the queries are run.
sphinx-build: https://www.sphinx-doc.org/en/master/man/sphinx-build.html
There may need to be a different Builder or an extension of sphinxcontrib.serializinghtml.JSONHTMLBuilder which serializes a doctree (basically a document object model, DOM) to the output representation: https://www.sphinx-doc.org/en/master/usage/builders/#sphinxc...
datasette and datasette-lite can load CSV, JSON, Parquet, and SQLite databases; support full-text search; and support search faceting. datasette-lite is a WASM build of datasette with the pyodide Python distribution.
datasette-lite > Loading SQLite databases: https://github.com/simonw/datasette-lite#loading-sqlite-data...
jupyter-lite is a WASM build of jupyter that also supports SQLite in notebooks in the browser, either via `import sqlite3` with the Python kernel or with a dedicated SQLite kernel: https://jupyter.org/try-jupyter/lab/
jupyterlite/xeus-sqlite-kernel: https://github.com/jupyterlite/xeus-sqlite-kernel
(edit)
xeus-sqlite-kernel > "Loading SQLite databases from a remote URL" https://github.com/jupyterlite/xeus-sqlite-kernel/issues/6#i...
%FETCH <url> <filename> https://github.com/jupyter-xeus/xeus-sqlite/blob/ce5a598bdab...
xlite.cpp > void fetch(const std::string url, const std::string filename) https://github.com/jupyter-xeus/xeus-sqlite/blob/main/src/xl...
I guess it depends on what you mean by "work with TeaTime". TeaTime itself is a static site, generated using Nuxt. Nothing it does requires this particular stack - in the end it's just HTML, CSS and JS. I haven't tried sphinx-build or jupyter-book, but there's no technical reason why Hugo couldn't build a TeaTime-like website using the same databases.
> datasette-lite > Loading SQLite databases: https://github.com/simonw/datasette-lite#loading-sqlite-data...
I haven't seen datasette before. What are the biggest benefits you think it has over sql.js-httpvfs (which I'm using now)? Is it about the ability to also use other formats, in addition to SQLite? I got the impression that sql.js-httpvfs was a bit more of a POC, and that some possibly better solutions came out later, but I haven't really gone down that rabbit hole to figure out which one would be best.
Edit: looking a little more into datasette-lite, it seems like one of the nice benefits of sql.js-httpvfs is that it doesn't download the whole SQLite database in order to query it. This makes it possible to have a 2GB database but still read it in chunks, skipping around efficiently until you find your data.
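The mechanism behind that is plain HTTP Range requests. A minimal sketch of the idea (the URL is made up), reading just the 100-byte SQLite header of a remote file instead of the whole thing:

    // Ask the server for bytes 0-99 only; a range-capable server
    // replies with 206 Partial Content instead of the whole file.
    const res = await fetch("https://example.com/library.sqlite3", {
      headers: { Range: "bytes=0-99" },
    });
    console.log(res.status); // 206

    const header = new Uint8Array(await res.arrayBuffer());
    // Bytes 16-17 of a SQLite file store the page size, big-endian.
    const pageSize = (header[16] << 8) | header[17];
    console.log("page size:", pageSize);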
datasette-lite > "Could this use the SQLite range header trick?" https://github.com/simonw/datasette-lite/issues/28
From xeus-sqlite-kernel > "Loading SQLite databases from a remote URL" https://github.com/jupyterlite/xeus-sqlite-kernel/issues/6#i... re: "Loading partial SQLite databases over HTTP":
> sql.js-httpvfs: https://github.com/phiresky/sql.js-httpvfs
> sqlite-wasm-http: https://github.com/mmomtchev/sqlite-wasm-http
>> This project is inspired from @phiresky/sql.js-httpvfs but uses the new official SQLite WASM distribution.
Datasette creates a JSON API from a SQLite database, and has an optional SQL query editor with canned queries, multi-database query support, and docs; https://docs.datasette.io/en/stable/sql_queries.html#cross-d... :
> SQLite has the ability to run queries that join across multiple databases. Up to ten databases can be attached to a single SQLite connection and queried together.
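For example (a Node-side sketch using the better-sqlite3 package, which isn't part of this thread's stack; the file and table names are made up):

    import Database from "better-sqlite3";

    const db = new Database("books.db");
    // Attach a second database file under the alias "authors"
    db.exec("ATTACH DATABASE 'authors.db' AS authors");

    // A single query can now join across both databases
    const rows = db
      .prepare(
        "SELECT b.title, a.name FROM books AS b " +
        "JOIN authors.people AS a ON a.id = b.author_id"
      )
      .all();
    console.log(rows);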
Couldn't this be a security issue, with bad actors using this tag?
https://cloudmersive.com/article/Understanding-Embedded-Java...
I do think this might change with the introduction of things like the Direct Sockets API, but for now they are too restricted and not widely supported.
It's my first time working with IPFS and I agree it hasn't always been 100% reliable, but I do hope that if I manage to get TeaTime users to also be contributing nodes, this might actually improve the reliability of the whole network. Once it's possible to use BitTorrent in the browser, I do think it would be a great addition (https://github.com/bjesus/teatime/issues/3).
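For reference, a minimal sketch of what "users as contributing nodes" could look like with Helia (the CID is a placeholder; createHelia with defaults starts an in-browser node that can, in principle, also serve the blocks it caches to peers that manage to connect):

    import { createHelia } from "helia";
    import { unixfs } from "@helia/unixfs";
    import { CID } from "multiformats/cid";

    // Starts an in-browser IPFS node with Helia's default transports
    const helia = await createHelia();
    const fs = unixfs(helia);

    const cid = CID.parse("bafybeigdyrzt..."); // placeholder CID
    const chunks: Uint8Array[] = [];
    for await (const chunk of fs.cat(cid)) {
      chunks.push(chunk); // pieces of the file, verified as they arrive
    }
    const file = new Blob(chunks);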
IPFS certainly has its flaws, but it's been getting much better in the last year. If you want to do P2P retrieval in browsers, it's practically the most advanced of all protocols.
We just published an update on this topic: https://blog.ipfs.tech/2024-shipyard-improving-ipfs-on-the-w...
My understanding is that IPFS always requires an IPFS → HTTP gateway to work in browsers, so I wonder what you mean by "P2P retrieval in browsers".
Verified Fetch is about verifying content retrieved from gateways on the client side, but the retrieval is still very much client-server (vs P2P) today, and the Service Worker Gateway seems to rely on gateways too (<trustless-gateway.link> in the demo linked).
Based on IPFS & HTTP: The Blueprint for Radical Interoperability - Lidel [0], I believe the situation is acknowledged by IPFS devs too. In my unsolicited opinion, using a custom protocol (libp2p transports?) for a project targeting web browsers was a mistake, so I'm glad to see that HTTP is being considered now.
By P2P retrieval, I mean retrieval directly from a peer that has the data without additional hops.
Historically HTTP gateways were the only way, because you couldn't dial (most) peers directly from a browser unless they had a CA-signed TLS certificate (needed in secure contexts).
A couple of things changed that:
- WebTransport and WebRTC-direct allow browser-to-server (peer) communication without a CA-signed certificate. WebTransport is very new and still has some problems in browser implementations.
- We just launched AutoTLS, which helps public IPFS nodes get a wildcard Let's Encrypt TLS certificate, making them "dialable" from browsers.
With these, it becomes a whole lot easier to establish connections from browsers to IPFS nodes. But it takes time for the network to upgrade.
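To make that concrete, here's a rough sketch of dialing a publicly reachable node from a browser with js-libp2p (the address is a placeholder, and option names have shifted between libp2p releases):

    import { createLibp2p } from "libp2p";
    import { webSockets } from "@libp2p/websockets";
    import { noise } from "@chainsafe/libp2p-noise";
    import { yamux } from "@chainsafe/libp2p-yamux";
    import { multiaddr } from "@multiformats/multiaddr";

    const node = await createLibp2p({
      transports: [webSockets()],        // wss works in secure contexts
      connectionEncrypters: [noise()],   // "connectionEncryption" in older releases
      streamMuxers: [yamux()],
    });

    // A node with an AutoTLS (Let's Encrypt) certificate is reachable over wss;
    // this multiaddr is hypothetical.
    const addr = multiaddr("/dns4/node.example.com/tcp/4002/tls/ws/p2p/12D3KooW...");
    const conn = await node.dial(addr);
    console.log("connected to", conn.remotePeer.toString());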
Note that Libp2p transports are wrappers over network transports to allow interoperability across runtimes, e.g. TCP, QUIC, WebSockets, WebTransport. These are not custom protocols.
Now, to your point about custom protocols - by which I presume you mean data exchange/retrieval protocols.
There are two IPFS data transfer protocols:
- Bitswap: the first and default data transfer protocol, which requires a streaming network transport (so HTTP isn't ideal).
- HTTP Gateways: initially, HTTP gateways were servers that would handle retrieval (over Bitswap) for you if they didn't have the data locally (sometimes we refer to these as Recursive Gateways, like trustless-gateway.link and ipfs.io). For all the reasons in Lidel's talk (caching, interop, composability), we extended this notion and made HTTP Gateway a general interface for data retrieval.
Today, there are large providers in the network (pinning services) that expose their data with an HTTP Gateway, which allows browsers to retrieve from them directly.
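As a sketch of what that HTTP retrieval interface looks like (the gateway URL is the one mentioned in this thread; the CID is a placeholder), a client can request a single raw block and verify it locally:

    const cid = "bafkreib..."; // placeholder CID
    const res = await fetch(`https://trustless-gateway.link/ipfs/${cid}`, {
      headers: { Accept: "application/vnd.ipld.raw" }, // raw block, not a web page
    });
    const block = new Uint8Array(await res.arrayBuffer());
    // A trustless client now hashes `block` and checks it matches the CID -
    // this verification step is what @helia/verified-fetch automates.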
We still have more work to do to expose the gateway in Kubo with the new AutoTLS certificate, but for the time being, Bitswap should work well enough even in browsers over a WebSockets connection.

---
Verified Fetch aims to ensure you get the data if it's available. If it fails to find a direct provider of the data, it will fall back on a recursive gateway like trustless-gateway.link, which is configured in the defaults. As more IPFS nodes upgrade to newer versions, reliance on such centralised recursive gateways will become unnecessary.
TL;DR: We're all in on HTTP. The real constraint on P2P is browser vendors and slow standards bodies.
Generally, I see the distributed nature of TeaTime as an instrument for being resilient, and less so a goal by itself. My thinking is that the GitHub database repositories are "resilient enough", especially if they can contain just a config.json file pointing to database files living elsewhere. But I'm not a lawyer and could be wrong.
Besides, HTTP requests against indices should be fast enough for a decent user experience and IPFS (and/or its gateways) aren’t great at that in my experience. I think using GitHub (or any other static hosting providers) was a good call in that regard.
How about a browse feature?
Also, thanks to whoever came up with thebookbay.org instance :)
IPFS contributor here.
I had a look at your code and saw how you handle downloading from multiple gateways.
There's a better way to do this which also leverages direct P2P retrieval in browsers - now a thing with IPFS! [0] If you just want to fetch, check out @helia/verified-fetch[1], which gives you a fetch-like API that accepts CIDs. It handles all the content routing and retrieval P2P magic, and can help reduce reliance on gateways.
You can also pass it gateways as a fallback for when it cannot connect to providers directly (due to transports).
[0] https://blog.ipfs.tech/2024-shipyard-improving-ipfs-on-the-w... [1] https://github.com/ipfs/helia-verified-fetch/tree/main/packa...
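A minimal sketch of the usage described above (the CID is a placeholder; gateways act as a fallback, and responses are verified client-side either way):

    import { createVerifiedFetch } from "@helia/verified-fetch";

    const verifiedFetch = await createVerifiedFetch({
      gateways: ["https://trustless-gateway.link"], // fallback recursive gateway
    });

    // Accepts ipfs:// URLs (and CIDs); returns a standard Response
    const resp = await verifiedFetch("ipfs://bafybeigdyrzt..."); // placeholder CID
    const bytes = new Uint8Array(await resp.arrayBuffer());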
Edit: I now realised we previously talked on the Discord channel - thank you for the help there!
Any reason you chose BLAKE2b-256?
BLAKE2/3 are amazing hashing functions and have a lot of cool properties, but they really suck on the web in terms of performance. This came up literally yesterday https://bsky.app/profile/norman.life/post/3lbw2qltokc2i and has been reported by multiple devs recently (who also shared excitement about BLAKE)
Doesn't IPFS require a server running? Where is the server code?
I think the app was built on the assumption that someone is providing these CIDs.
> Access to fetch at 'https://bafykbzacedvzdo4hru3wul5excjthwjuz5ggd6jcs4wg77tg7bk...' (redirected from 'https://w3s.link/ipfs/bafykbzacedvzdo4hru3wul5excjthwjuz5ggd...') from origin 'https://bjesus.github.io' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
$ curl -I https://bafykbzacedvzdo4hru3wul5excjthwjuz5ggd6jcs4wg77tg7bkn3c7hlvohq.ipfs.dweb.link/
HTTP/2 200
date: Thu, 28 Nov 2024 16:34:03 GMT
content-type: application/x-rar-compressed
content-length: 192215114
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
Can you try with another file? I do think that once I implement Helia's verified-fetch, issues like this should be less common.

I want people to be able to see which books are available, reserve books that are out, and get notifications when they're due back.
And preferably self-host it as a small site.
Anyone know a small open source physical library management tool like this?