📚 This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an LLM-friendly input by prefixing it with http://127.0.0.1:3000/, for example http://127.0.0.1:3000/https://website-to-scrape.com/

Reader

Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/. Get improved output for your agent and RAG systems at no cost. Find more at https://jina.ai/reader.

Usage

Standard

To use the Reader, simply prepend https://r.jina.ai/ to any URL. For example, to convert the URL https://en.wikipedia.org/wiki/Artificial_intelligence to an LLM-friendly input, use the following URL:

https://r.jina.ai/https://en.wikipedia.org/wiki/Artificial_intelligence
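The prefixing rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Reader codebase: the names `reader_url` and `fetch` are hypothetical, the target URL is concatenated as-is (as in the example above), and the response is assumed to be UTF-8 text.

```python
from urllib.request import urlopen

# Hosted endpoint from this README; a local deployment would use its own base,
# e.g. "http://127.0.0.1:3000/".
READER_BASE = "https://r.jina.ai/"


def reader_url(target: str, base: str = READER_BASE) -> str:
    """Prepend the Reader prefix to an absolute http(s) URL (hypothetical helper)."""
    if not target.startswith(("http://", "https://")):
        raise ValueError(f"expected an absolute http(s) URL, got {target!r}")
    # Normalize the base so exactly one slash separates prefix and target.
    return base.rstrip("/") + "/" + target


def fetch(target: str, base: str = READER_BASE, timeout: float = 30.0) -> str:
    """Fetch the LLM-friendly text for `target` (assumes a UTF-8 response body)."""
    with urlopen(reader_url(target, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(reader_url("https://en.wikipedia.org/wiki/Artificial_intelligence"))
```

Passing `base="http://127.0.0.1:3000/"` targets a locally deployed instance instead of the hosted one.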

Streaming mode

Use the Accept header to control streaming behavior. Setting it to text/event-stream requests the result as a server-sent event stream:

curl -H "Accept: text/event-stream" https://r.jina.ai/https://en.m.wikipedia.org/wiki/Main_Page
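To consume the stream from code rather than curl, the client needs to split the response into server-sent events. The sketch below handles only the generic `text/event-stream` framing (comment lines, `data:` fields, blank-line separators). What Reader puts inside each event is not described here, so the payload is returned as an opaque string. `iter_sse_events` and `stream_reader` are hypothetical names, and UTF-8 decoding is an assumption.

```python
from typing import Dict, Iterable, Iterator, List
from urllib.request import Request, urlopen


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group text/event-stream lines into events (generic SSE framing only)."""
    fields: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data:
                fields["data"] = "\n".join(data)
                yield fields
            fields, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is not part of the value
        if name == "data":
            data.append(value)
        else:
            fields[name] = value


def stream_reader(url: str) -> Iterator[Dict[str, str]]:
    """Request `url` with Accept: text/event-stream and yield events as they arrive."""
    req = Request(url, headers={"Accept": "text/event-stream"})
    with urlopen(req) as resp:
        # Assumption: the stream is UTF-8 encoded.
        yield from iter_sse_events(chunk.decode("utf-8") for chunk in resp)
```

For example, `for event in stream_reader("https://r.jina.ai/https://en.m.wikipedia.org/wiki/Main_Page"): print(event["data"])` would print each payload as it is received.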

JSON mode

JSON mode is still at an early stage. The result is not yet a rich JSON structure, only three simple fields (url, title, and content). Use the Accept header to select this output format:

curl -H "Accept: application/json" https://r.jina.ai/https://en.m.wikipedia.org/wiki/Main_Page
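On the client side, the JSON response can be mapped onto a small typed record. The following is a sketch under the assumption that the three fields named above sit at the top level of a JSON object; since the format is described as early-stage, the real envelope may differ. `ReaderResult` and `parse_reader_json` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class ReaderResult:
    """The three fields this README names for JSON mode."""
    url: str
    title: str
    content: str


def parse_reader_json(body: str) -> ReaderResult:
    """Parse a JSON-mode response body, failing loudly if a field is missing."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    # Assumption: url, title and content are top-level keys.
    missing = [key for key in ("url", "title", "content") if key not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return ReaderResult(
        url=payload["url"],
        title=payload["title"],
        content=payload["content"],
    )
```

Checking for the fields explicitly, instead of indexing directly, gives a clear error if the output format changes.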

Install

You will need the following tools to run the project:

  • Node v18 (the build fails on Node versions newer than 18)
  • Firebase CLI (npm install -g firebase-tools)

For the backend, clone the repository, go to its backend/functions directory, and install the npm dependencies.

git clone git@github.com:jina-ai/reader.git
cd reader/backend/functions
npm install

About [thinapps-shared](thinapps-shared)

You might notice a reference to the thinapps-shared submodule, an internal package we use to share code across our products. While it's not yet open-sourced and isn't integral to the Reader's primary functions, it helps with logging, syntax enhancements, etc. Feel free to disregard it for now.

License

Apache License 2.0