Update examples section
@ -0,0 +1,11 @@
# Required environment variables
FIRECRAWL_API_KEY=

# Optional environment variables
# LangSmith tracing from the web worker.
# WARNING: FOR DEVELOPMENT ONLY. DO NOT DEPLOY A LIVE VERSION WITH THESE
# VARIABLES SET AS YOU WILL LEAK YOUR LANGCHAIN API KEY.
NEXT_PUBLIC_LANGCHAIN_TRACING_V2=
NEXT_PUBLIC_LANGCHAIN_API_KEY=
NEXT_PUBLIC_LANGCHAIN_PROJECT=

38
examples/example_web_apps/local-website-chatbot/.gitignore
vendored
Normal file
@ -0,0 +1,38 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env*.local
.env

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts

.yarn
@ -0,0 +1 @@
{}
21
examples/example_web_apps/local-website-chatbot/LICENSE
Normal file
@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Jacob Lee

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@ -0,0 +1,7 @@
Copyright <YEAR> <COPYRIGHT HOLDER>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
72
examples/example_web_apps/local-website-chatbot/README.md
Normal file
@ -0,0 +1,72 @@
# Local Chat With Websites

Welcome to the Local Web Chatbot! This is a direct fork of [Jacob Lee's fully local PDF chatbot](https://github.com/jacoblee93/fully-local-pdf-chatbot) that replaces the chat-with-PDF functionality with chat-with-website support powered by [Firecrawl](https://www.firecrawl.dev/). It is a simple chatbot that lets you ask questions about a website by embedding it and running queries against the vector store using a local LLM and embeddings.
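
To make "running queries against the vector store" concrete, here is a minimal sketch of similarity search over embedded chunks. It is not how Voy is implemented; the names `EmbeddedChunk`, `cosineSimilarity`, and `topK`, and the toy 3-dimensional vectors, are assumptions made up for illustration.

```typescript
// Hypothetical stand-in for a vector store entry: a text chunk plus its embedding.
interface EmbeddedChunk {
  text: string;
  vector: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose vectors are most similar to the query vector.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}

// Toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions.
const store: EmbeddedChunk[] = [
  { text: "Pricing page", vector: [1, 0, 0] },
  { text: "API reference", vector: [0, 1, 0] },
  { text: "Pricing FAQ", vector: [0.9, 0.1, 0] },
];

const results = topK([1, 0, 0], store, 2).map((c) => c.text);
```

In the app itself the retrieved chunks are not shown directly; they are passed to the local LLM as context for the answer.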

## 🦙 Ollama

You can run more powerful, general models outside the browser using [Ollama's desktop app](https://ollama.ai). Download and set up the app, then run the following commands to give the site access to a locally running Mistral instance:

### Mac/Linux

```bash
$ OLLAMA_ORIGINS=https://webml-demo.vercel.app OLLAMA_HOST=127.0.0.1:11435 ollama serve
```

Then, in another terminal window:

```bash
$ OLLAMA_HOST=127.0.0.1:11435 ollama pull mistral
```

### Windows

```cmd
set OLLAMA_ORIGINS=https://webml-demo.vercel.app
set OLLAMA_HOST=127.0.0.1:11435
ollama serve
```

Then, in another terminal window:

```cmd
set OLLAMA_HOST=127.0.0.1:11435
ollama pull mistral
```
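
Once the server is running, it can be useful to confirm that the model was actually pulled. The sketch below is an illustration only: the helper names are hypothetical, `fetchJson` is an injected function you would supply (for example a thin wrapper around `fetch`), and the response shape is an assumption based on Ollama's `GET /api/tags` endpoint, reduced to the one field used here.

```typescript
// Assumed shape of the relevant part of Ollama's `GET /api/tags` response.
interface OllamaTagsResponse {
  models: { name: string }[];
}

// Turn an OLLAMA_HOST-style "host:port" value into an HTTP base URL.
function ollamaBaseUrl(hostAndPort: string): string {
  return `http://${hostAndPort}`;
}

// True if the model (with any tag, e.g. "mistral:latest") appears in the list.
function hasModel(tags: OllamaTagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model || m.name.startsWith(`${model}:`));
}

// Hypothetical helper: ask the local server which models are available.
// `fetchJson` is supplied by the caller so this sketch has no network dependency.
async function isModelAvailable(
  hostAndPort: string,
  model: string,
  fetchJson: (url: string) => Promise<unknown>,
): Promise<boolean> {
  const tags = (await fetchJson(`${ollamaBaseUrl(hostAndPort)}/api/tags`)) as OllamaTagsResponse;
  return hasModel(tags, model);
}
```

With the commands above, the call would be `isModelAvailable("127.0.0.1:11435", "mistral", fetchJson)`.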

## 🔥 Firecrawl

Additionally, you will need a Firecrawl API key for website embedding. Signing up for [Firecrawl](https://www.firecrawl.dev/) is easy, and you get 500 free credits. Enter your API key into the box below the URL in the embedding form.
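
The form's two values end up in the message sent to the web worker, which reads `url` and `firecrawlApiKey` from it. A minimal sketch of validating them first might look like this; `EmbedRequest`, `buildEmbedRequest`, the regular expression, and the placeholder key are assumptions for illustration, not code from this repo.

```typescript
// Message shape the worker reads when embedding a site.
interface EmbedRequest {
  url: string;
  firecrawlApiKey: string;
}

// Hypothetical helper: trim and sanity-check the form values before posting them.
function buildEmbedRequest(url: string, firecrawlApiKey: string): EmbedRequest {
  const trimmedUrl = url.trim();
  const trimmedKey = firecrawlApiKey.trim();
  // Deliberately loose check: only requires an http(s) scheme and no whitespace.
  if (!/^https?:\/\/\S+$/i.test(trimmedUrl)) {
    throw new Error("Enter a full http(s) URL to embed.");
  }
  if (trimmedKey === "") {
    throw new Error("A Firecrawl API key is required.");
  }
  return { url: trimmedUrl, firecrawlApiKey: trimmedKey };
}

// "fc-example-key" is a placeholder, not a real key.
const request = buildEmbedRequest(" https://www.firecrawl.dev/ ", "fc-example-key");
```

Failing early in the UI gives a clearer message than waiting for the scrape request to be rejected.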

## ⚡ Stack

It uses the following:

- [Voy](https://github.com/tantaraio/voy) as the vector store, fully WASM in the browser.
- [Ollama](https://ollama.ai/) to run the local Mistral LLM.
- [LangChain.js](https://js.langchain.com) to call the models, perform retrieval, and generally orchestrate all the pieces.
- [Transformers.js](https://huggingface.co/docs/transformers.js/index) to run open source embeddings in the browser (`app/worker.ts` defaults to `"Xenova/all-MiniLM-L6-v2"`).
  - For higher-quality embeddings, switch to the [Nomic](https://www.nomic.ai/) model `"nomic-ai/nomic-embed-text-v1"` in `app/worker.ts`.
- [Firecrawl](https://www.firecrawl.dev/) to scrape the webpages and deliver them in markdown format.
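
Before embedding, `app/worker.ts` splits the scraped markdown into chunks of 500 characters with a 50-character overlap. The sketch below shows the overlap idea with plain fixed-width windows. It is a simplification, not LangChain's `RecursiveCharacterTextSplitter` (which also tries to break on separators), and `chunkText` is a hypothetical name.

```typescript
// Split text into windows of `chunkSize` characters; consecutive windows
// share `overlap` characters so context is not lost at chunk boundaries.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Small example: windows of 4 characters overlapping by 1.
const chunks = chunkText("abcdefghij", 4, 1); // ["abcd", "defg", "ghij"]
```

Each chunk is embedded separately, so a smaller chunk size gives more precise retrieval at the cost of more embeddings to compute.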

## 🔱 Forking

To run/deploy this yourself, simply fork this repo and install the required dependencies with `yarn`.

There are no required environment variables, but you can optionally set up [LangSmith tracing](https://smith.langchain.com/) while developing locally to help debug the prompts and the chain. Copy the `.env.example` file into a `.env.local` file:

```ini
# No environment variables required!

# LangSmith tracing from the web worker.
# WARNING: FOR DEVELOPMENT ONLY. DO NOT DEPLOY A LIVE VERSION WITH THESE
# VARIABLES SET AS YOU WILL LEAK YOUR LANGCHAIN API KEY.
NEXT_PUBLIC_LANGCHAIN_TRACING_V2="true"
NEXT_PUBLIC_LANGCHAIN_API_KEY=
NEXT_PUBLIC_LANGCHAIN_PROJECT=
```

Just make sure you don't set this in production, as your LangChain API key will be public on the frontend!
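
The chat window only forwards tracing settings to the worker when tracing is switched on and a key is present. The sketch below restates that gating as a pure function. `TracingConfig` and `buildTracingConfig` are hypothetical names, and, as an assumption beyond the app's own check, an empty key is treated the same as a missing one.

```typescript
// Settings forwarded to the worker when tracing is enabled.
interface TracingConfig {
  LANGCHAIN_TRACING_V2: "true";
  LANGCHAIN_API_KEY: string;
  LANGCHAIN_PROJECT: string | undefined;
}

// Return tracing settings only if tracing is enabled and an API key is set.
function buildTracingConfig(
  env: Record<string, string | undefined>,
): TracingConfig | undefined {
  if (env.NEXT_PUBLIC_LANGCHAIN_TRACING_V2 !== "true") return undefined;
  const apiKey = env.NEXT_PUBLIC_LANGCHAIN_API_KEY;
  // Assumption: an empty value (as in the template above) counts as unset.
  if (apiKey === undefined || apiKey === "") return undefined;
  return {
    LANGCHAIN_TRACING_V2: "true",
    LANGCHAIN_API_KEY: apiKey,
    LANGCHAIN_PROJECT: env.NEXT_PUBLIC_LANGCHAIN_PROJECT,
  };
}

// Placeholder values for illustration only.
const enabled = buildTracingConfig({
  NEXT_PUBLIC_LANGCHAIN_TRACING_V2: "true",
  NEXT_PUBLIC_LANGCHAIN_API_KEY: "example-key",
  NEXT_PUBLIC_LANGCHAIN_PROJECT: "local-website-chatbot",
});
const disabled = buildTracingConfig({});
```

Because `NEXT_PUBLIC_` variables are bundled into client-side code, anything returned here is visible to every visitor, which is why this is for local development only.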

## 🙏 Thank you!

Huge thanks to Jacob Lee and the other contributors of the repo for making this happen! Be sure to give him a follow on Twitter [@Hacubu](https://x.com/hacubu)!
@ -0,0 +1,74 @@
@tailwind base;
@tailwind components;
@tailwind utilities;

body {
  color: #f8f8f8;
  background: #131318;
}

body input,
body textarea {
  color: black;
}

a {
  color: #5ba4f8;
}

a:hover {
  border-bottom: 1px solid;
}

p {
  margin: 8px 0;
}

code,
pre {
  color: #ffa500;
}

pre {
  background-color: black;
  color: #39ff14;
}

li {
  padding: 4px;
}

@layer base {
  label {
    @apply h-6 relative inline-block;
  }

  [type="checkbox"] {
    @apply w-11 h-0 cursor-pointer inline-block;
    @apply focus:outline-0 dark:focus:outline-0;
    @apply border-0 dark:border-0;
    @apply focus:ring-offset-transparent dark:focus:ring-offset-transparent;
    @apply focus:ring-transparent dark:focus:ring-transparent;
    @apply focus-within:ring-0 dark:focus-within:ring-0;
    @apply focus:shadow-none dark:focus:shadow-none;

    @apply after:absolute before:absolute;
    @apply after:top-0 before:top-0;
    @apply after:block before:inline-block;
    @apply before:rounded-full after:rounded-full;

    @apply after:content-[''] after:w-5 after:h-5 after:mt-0.5 after:ml-0.5;
    @apply after:shadow-md after:duration-100;

    @apply before:content-[''] before:w-10 before:h-full;
    @apply before:shadow-[inset_0_0_#000];

    @apply after:bg-white dark:after:bg-gray-50;
    @apply before:bg-gray-300 dark:before:bg-gray-600;
    @apply before:checked:bg-lime-500 dark:before:checked:bg-lime-500;
    @apply checked:after:duration-300 checked:after:translate-x-4;

    @apply disabled:after:bg-opacity-75 disabled:cursor-not-allowed;
    @apply disabled:checked:before:bg-opacity-40;
  }
}
@ -0,0 +1,49 @@
import "./globals.css";
import { Public_Sans } from "next/font/google";

import { Navbar } from "@/components/Navbar";

const publicSans = Public_Sans({ subsets: ["latin"] });

export default function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <html lang="en">
      <head>
        <title>Fully In-Browser Chat Over Documents</title>
        <link rel="shortcut icon" href="/images/favicon.ico" />
        <meta
          name="description"
          content="Upload a PDF, then ask questions about it - without a single remote request!"
        />
        <meta
          property="og:title"
          content="Fully In-Browser Chat Over Documents"
        />
        <meta
          property="og:description"
          content="Upload a PDF, then ask questions about it - without a single remote request!"
        />
        <meta property="og:image" content="/images/og-image.png" />
        <meta property="og:image:width" content="1200" />
        <meta property="og:image:height" content="630" />
        <meta name="twitter:card" content="summary_large_image" />
        <meta
          name="twitter:title"
          content="Fully In-Browser Chat Over Documents"
        />
        <meta
          name="twitter:description"
          content="Upload a PDF, then ask questions about it - without a single remote request!"
        />
        <meta name="twitter:image" content="/images/og-image.png" />
      </head>
      <body className={publicSans.className}>
        <div className="flex flex-col p-4 md:p-12 h-[100vh]">{children}</div>
      </body>
    </html>
  );
}
@ -0,0 +1,7 @@
import { ChatWindow } from "@/components/ChatWindow";

export default function Home() {
  return (
    <ChatWindow placeholder="Try asking something about the document you just uploaded!"></ChatWindow>
  );
}
232
examples/example_web_apps/local-website-chatbot/app/worker.ts
Normal file
@ -0,0 +1,232 @@
import { ChatWindowMessage } from "@/schema/ChatWindowMessage";

import { Voy as VoyClient } from "voy-search";

import { createRetrievalChain } from "langchain/chains/retrieval";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";

import { FireCrawlLoader } from "@langchain/community/document_loaders/web/firecrawl";

import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { VoyVectorStore } from "@langchain/community/vectorstores/voy";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
  PromptTemplate,
} from "@langchain/core/prompts";
import { RunnableSequence, RunnablePick } from "@langchain/core/runnables";
import {
  AIMessage,
  type BaseMessage,
  HumanMessage,
} from "@langchain/core/messages";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import type { BaseChatModel } from "@langchain/core/language_models/chat_models";
import type { LanguageModelLike } from "@langchain/core/language_models/base";

import { LangChainTracer } from "@langchain/core/tracers/tracer_langchain";
import { Client } from "langsmith";

import { ChatOllama } from "@langchain/community/chat_models/ollama";

const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: "Xenova/all-MiniLM-L6-v2",
});

const voyClient = new VoyClient();
const vectorstore = new VoyVectorStore(voyClient, embeddings);

const OLLAMA_RESPONSE_SYSTEM_TEMPLATE = `You are an experienced researcher, expert at interpreting and answering questions based on provided sources. Using the provided context, answer the user's question to the best of your ability using the resources provided.
Generate a concise answer for a given question based solely on the provided search results. You must only use information from the provided search results. Use an unbiased and journalistic tone. Combine search results together into a coherent answer. Do not repeat text.
If there is nothing in the context relevant to the question at hand, just say "Hmm, I'm not sure." Don't try to make up an answer.
Anything between the following \`context\` html blocks is retrieved from a knowledge bank, not part of the conversation with the user.
<context>
{context}
</context>

REMEMBER: If there is no relevant information within the context, just say "Hmm, I'm not sure." Don't try to make up an answer. Anything between the preceding 'context' html blocks is retrieved from a knowledge bank, not part of the conversation with the user.`;

const _formatChatHistoryAsMessages = async (
  chatHistory: ChatWindowMessage[],
) => {
  return chatHistory.map((chatMessage) => {
    if (chatMessage.role === "human") {
      return new HumanMessage(chatMessage.content);
    } else {
      return new AIMessage(chatMessage.content);
    }
  });
};

const embedWebsite = async (url: string, firecrawlApiKey: string) => {
  const webLoader = new FireCrawlLoader({
    url: url,
    apiKey: firecrawlApiKey,
    mode: "scrape",
  });

  const docs = await webLoader.load();

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 500,
    chunkOverlap: 50,
  });

  const splitDocs = await splitter.splitDocuments(docs);

  self.postMessage({
    type: "log",
    data: splitDocs,
  });

  await vectorstore.addDocuments(splitDocs);
};

const queryVectorStore = async (
  messages: ChatWindowMessage[],
  {
    chatModel,
    modelProvider,
    devModeTracer,
  }: {
    chatModel: LanguageModelLike;
    modelProvider: "ollama";
    devModeTracer?: LangChainTracer;
  },
) => {
  const text = messages[messages.length - 1].content;
  const chatHistory = await _formatChatHistoryAsMessages(messages.slice(0, -1));

  const responseChainPrompt = ChatPromptTemplate.fromMessages<{
    context: string;
    chat_history: BaseMessage[];
    input: string;
  }>([
    ["system", OLLAMA_RESPONSE_SYSTEM_TEMPLATE],
    new MessagesPlaceholder("chat_history"),
    ["user", `{input}`],
  ]);

  const documentChain = await createStuffDocumentsChain({
    llm: chatModel,
    prompt: responseChainPrompt,
    documentPrompt: PromptTemplate.fromTemplate(
      `<doc>\n{page_content}\n</doc>`,
    ),
  });

  const historyAwarePrompt = ChatPromptTemplate.fromMessages([
    new MessagesPlaceholder("chat_history"),
    ["user", "{input}"],
    [
      "user",
      "Given the above conversation, generate a natural language search query to look up in order to get information relevant to the conversation. Do not respond with anything except the query.",
    ],
  ]);

  const historyAwareRetrieverChain = await createHistoryAwareRetriever({
    llm: chatModel,
    retriever: vectorstore.asRetriever(),
    rephrasePrompt: historyAwarePrompt,
  });

  const retrievalChain = await createRetrievalChain({
    combineDocsChain: documentChain,
    retriever: historyAwareRetrieverChain,
  });

  const fullChain = RunnableSequence.from([
    retrievalChain,
    new RunnablePick("answer"),
  ]);

  const stream = await fullChain.stream(
    {
      input: text,
      chat_history: chatHistory,
    },
    {
      callbacks: devModeTracer !== undefined ? [devModeTracer] : [],
    },
  );

  for await (const chunk of stream) {
    if (chunk) {
      self.postMessage({
        type: "chunk",
        data: chunk,
      });
    }
  }

  self.postMessage({
    type: "complete",
    data: "OK",
  });
};

// Listen for messages from the main thread
self.addEventListener("message", async (event: { data: any }) => {
  self.postMessage({
    type: "log",
    data: `Received data!`,
  });

  let devModeTracer;
  if (
    event.data.DEV_LANGCHAIN_TRACING !== undefined &&
    typeof event.data.DEV_LANGCHAIN_TRACING === "object"
  ) {
    devModeTracer = new LangChainTracer({
      projectName: event.data.DEV_LANGCHAIN_TRACING.LANGCHAIN_PROJECT,
      client: new Client({
        apiKey: event.data.DEV_LANGCHAIN_TRACING.LANGCHAIN_API_KEY,
      }),
    });
  }

  if (event.data.url) {
    try {
      self.postMessage({
        type: "log",
        // Log only the URL: the Firecrawl API key is a secret and should not be logged.
        data: `Embedding website now: ${event.data.url}`,
      });
      await embedWebsite(event.data.url, event.data.firecrawlApiKey);
      self.postMessage({
        type: "log",
        data: `Embedded website: ${event.data.url} complete`,
      });
    } catch (e: any) {
      self.postMessage({
        type: "error",
        error: e.message,
      });
      throw e;
    }
  } else {
    const modelProvider = event.data.modelProvider;
    const modelConfig = event.data.modelConfig;
    let chatModel: BaseChatModel | LanguageModelLike;
    chatModel = new ChatOllama(modelConfig);
    try {
      await queryVectorStore(event.data.messages, {
        devModeTracer,
        modelProvider,
        chatModel,
      });
    } catch (e: any) {
      self.postMessage({
        type: "error",
        error: `${e.message}. Make sure you are running Ollama.`,
      });
      throw e;
    }
  }

  self.postMessage({
    type: "complete",
    data: "OK",
  });
});
@ -0,0 +1,125 @@
"use client";

import { toast } from 'react-toastify';
import 'react-toastify/dist/ReactToastify.css';

import { ChatWindowMessage } from '@/schema/ChatWindowMessage';

import { useState, type FormEvent } from "react";
import { Feedback } from 'langsmith';

export function ChatMessageBubble(props: {
  message: ChatWindowMessage;
  aiEmoji?: string;
  onRemovePressed?: () => void;
}) {
  const { role, content, runId } = props.message;

  const colorClassName =
    role === "human" ? "bg-sky-600" : "bg-slate-50 text-black";
  const alignmentClassName =
    role === "human" ? "ml-auto" : "mr-auto";
  const prefix = role === "human" ? "🧑" : props.aiEmoji;

  const [isLoading, setIsLoading] = useState(false);
  const [feedback, setFeedback] = useState<Feedback | null>(null);
  const [comment, setComment] = useState("");
  const [showCommentForm, setShowCommentForm] = useState(false);

  async function handleScoreButtonPress(e: React.MouseEvent<HTMLButtonElement, MouseEvent>, score: number) {
    e.preventDefault();
    setComment("");
    await sendFeedback(score);
  }

  async function handleCommentSubmission(e: FormEvent<HTMLFormElement>) {
    e.preventDefault();
    const score = typeof feedback?.score === "number" ? feedback.score : 0;
    await sendFeedback(score);
  }

  async function sendFeedback(score: number) {
    if (isLoading) {
      return;
    }

    setIsLoading(true);

    const response = await fetch("api/feedback", {
      method: feedback?.id ? "PUT" : "POST",
      body: JSON.stringify({
        id: feedback?.id,
        run_id: runId,
        score,
        comment,
      })
    });

    const json = await response.json();

    if (json.error) {
      toast(json.error, {
        theme: "dark"
      });
      // Reset the loading flag so later feedback attempts are not blocked.
      setIsLoading(false);
      return;
    } else if (feedback?.id && comment) {
      toast("Response recorded! Go to https://smith.langchain.com and check it out under your run's \"Feedback\" pane.", {
        theme: "dark",
        autoClose: 3000,
      });
      setComment("");
      setShowCommentForm(false);
    } else {
      setShowCommentForm(true);
    }

    if (json.feedback) {
      setFeedback(json.feedback);
    }

    setIsLoading(false);
  }
  return (
    <div
      className={`${alignmentClassName} ${colorClassName} rounded px-4 py-2 max-w-[80%] mb-8 flex flex-col`}
    >
      <div className="flex hover:group group">
        <div className="mr-2">
          {prefix}
        </div>
        <div className="whitespace-pre-wrap">
          {/* TODO: Remove. Hacky fix, stop sequences don't seem to work with WebLLM yet. */}
          {content.trim().split("\nInstruct:")[0].split("\nInstruction:")[0]}
        </div>
        <div className="cursor-pointer opacity-0 hover:opacity-100 relative left-2 bottom-1" onMouseUp={props?.onRemovePressed}>
          ✖️
        </div>
      </div>
      <div className={`${!runId ? "hidden" : ""} ml-auto mt-2`}>
        <button className={`p-2 border text-3xl rounded hover:bg-green-400 ${feedback && feedback.score === 1 ? "bg-green-400" : ""}`} onMouseUp={(e) => handleScoreButtonPress(e, 1)}>
          👍
        </button>
        <button className={`p-2 border text-3xl rounded ml-4 hover:bg-red-400 ${feedback && feedback.score === 0 ? "bg-red-400" : ""}`} onMouseUp={(e) => handleScoreButtonPress(e, 0)}>
          👎
        </button>
      </div>
      <div className={`${(feedback && showCommentForm) ? "" : "hidden"} min-w-[480px]`}>
        <form onSubmit={handleCommentSubmission} className="relative">
          <input
            className="mr-8 p-4 rounded w-full border mt-2"
            value={comment}
            placeholder={feedback?.score === 1 ? "Anything else you'd like to add about this response?" : "What would the correct or preferred response have been?"}
            onChange={(e) => setComment(e.target.value)}
          />
          <div role="status" className={`${isLoading ? "" : "hidden"} flex justify-center absolute top-[24px] right-[16px]`}>
            <svg aria-hidden="true" className="w-6 h-6 text-slate-200 animate-spin dark:text-slate-200 fill-sky-800" viewBox="0 0 100 101" fill="none" xmlns="http://www.w3.org/2000/svg">
              <path d="M100 50.5908C100 78.2051 77.6142 100.591 50 100.591C22.3858 100.591 0 78.2051 0 50.5908C0 22.9766 22.3858 0.59082 50 0.59082C77.6142 0.59082 100 22.9766 100 50.5908ZM9.08144 50.5908C9.08144 73.1895 27.4013 91.5094 50 91.5094C72.5987 91.5094 90.9186 73.1895 90.9186 50.5908C90.9186 27.9921 72.5987 9.67226 50 9.67226C27.4013 9.67226 9.08144 27.9921 9.08144 50.5908Z" fill="currentColor"/>
              <path d="M93.9676 39.0409C96.393 38.4038 97.8624 35.9116 97.0079 33.5539C95.2932 28.8227 92.871 24.3692 89.8167 20.348C85.8452 15.1192 80.8826 10.7238 75.2124 7.41289C69.5422 4.10194 63.2754 1.94025 56.7698 1.05124C51.7666 0.367541 46.6976 0.446843 41.7345 1.27873C39.2613 1.69328 37.813 4.19778 38.4501 6.62326C39.0873 9.04874 41.5694 10.4717 44.0505 10.1071C47.8511 9.54855 51.7191 9.52689 55.5402 10.0491C60.8642 10.7766 65.9928 12.5457 70.6331 15.2552C75.2735 17.9648 79.3347 21.5619 82.5849 25.841C84.9175 28.9121 86.7997 32.2913 88.1811 35.8758C89.083 38.2158 91.5421 39.6781 93.9676 39.0409Z" fill="currentFill"/>
            </svg>
            <span className="sr-only">Loading...</span>
          </div>
        </form>
      </div>
    </div>
  );
}
@ -0,0 +1,422 @@
"use client";

import { Id, ToastContainer, toast } from "react-toastify";
import "react-toastify/dist/ReactToastify.css";

import { useRef, useState, useEffect } from "react";
import type { FormEvent } from "react";

import { ChatMessageBubble } from "@/components/ChatMessageBubble";
import { ChatWindowMessage } from "@/schema/ChatWindowMessage";

export function ChatWindow(props: { placeholder?: string }) {
  const { placeholder } = props;
  const [messages, setMessages] = useState<ChatWindowMessage[]>([]);
  const [input, setInput] = useState("");
  const [isLoading, setIsLoading] = useState(true);

  const [selectedURL, setSelectedURL] = useState<string | null>(null);
  const [firecrawlApiKey, setFirecrawlApiKey] = useState("");
  const [readyToChat, setReadyToChat] = useState(false);
  const initProgressToastId = useRef<Id | null>(null);
  const titleText = "Local Chat With Websites";
  const emoji = "🔥";

  const worker = useRef<Worker | null>(null);

  async function queryStore(messages: ChatWindowMessage[]) {
    if (!worker.current) {
      throw new Error("Worker is not ready.");
    }
    return new ReadableStream({
      start(controller) {
        if (!worker.current) {
          controller.close();
          return;
        }
        const ollamaConfig = {
          baseUrl: "http://localhost:11435",
          temperature: 0.3,
          model: "mistral",
        };
        const payload: Record<string, any> = {
          messages,
          modelProvider: "ollama",
          modelConfig: ollamaConfig,
        };
        if (
          process.env.NEXT_PUBLIC_LANGCHAIN_TRACING_V2 === "true" &&
          process.env.NEXT_PUBLIC_LANGCHAIN_API_KEY !== undefined
        ) {
          console.warn(
            "[WARNING]: You have set your LangChain API key publicly. This should only be done in local development - remember to remove it before deploying!",
          );
          payload.DEV_LANGCHAIN_TRACING = {
            LANGCHAIN_TRACING_V2: "true",
            LANGCHAIN_API_KEY: process.env.NEXT_PUBLIC_LANGCHAIN_API_KEY,
            LANGCHAIN_PROJECT: process.env.NEXT_PUBLIC_LANGCHAIN_PROJECT,
          };
        }
        worker.current?.postMessage(payload);
        const onMessageReceived = async (e: any) => {
          switch (e.data.type) {
            case "log":
              console.log(e.data);
              break;
            case "init_progress":
              if (initProgressToastId.current === null) {
                initProgressToastId.current = toast(
                  "Loading model weights... This may take a while",
                  {
                    progress: e.data.data.progress || 0.01,
                    theme: "dark",
                  },
                );
              } else {
                if (e.data.data.progress === 1) {
                  await new Promise((resolve) => setTimeout(resolve, 2000));
                }
                toast.update(initProgressToastId.current, {
                  progress: e.data.data.progress || 0.01,
                });
              }
              break;
            case "chunk":
              controller.enqueue(e.data.data);
              break;
            case "error":
              worker.current?.removeEventListener("message", onMessageReceived);
              console.log(e.data.error);
              const error = new Error(e.data.error);
              controller.error(error);
              break;
            case "complete":
              worker.current?.removeEventListener("message", onMessageReceived);
              controller.close();
              break;
          }
        };
        worker.current?.addEventListener("message", onMessageReceived);
      },
    });
  }

  async function sendMessage(e: FormEvent<HTMLFormElement>) {
    e.preventDefault();

    if (isLoading || !input) {
      return;
    }

    const initialInput = input;
    const initialMessages = [...messages];
    const newMessages = [
      ...initialMessages,
      { role: "human" as const, content: input },
    ];

    setMessages(newMessages);
    setIsLoading(true);
    setInput("");

    try {
      const stream = await queryStore(newMessages);
      const reader = stream.getReader();

      let chunk = await reader.read();

      const aiResponseMessage: ChatWindowMessage = {
        content: "",
        role: "ai" as const,
      };

      setMessages([...newMessages, aiResponseMessage]);

      while (!chunk.done) {
        aiResponseMessage.content = aiResponseMessage.content + chunk.value;
        setMessages([...newMessages, aiResponseMessage]);
        chunk = await reader.read();
      }

      setIsLoading(false);
    } catch (e: any) {
      setMessages(initialMessages);
      setIsLoading(false);
      setInput(initialInput);
      toast(`There was an issue with querying your website: ${e.message}`, {
        theme: "dark",
      });
    }
  }
|
||||||
|
|
||||||
|
// We use the `useEffect` hook to set up the worker as soon as the `App` component is mounted.
|
||||||
|
useEffect(() => {
|
||||||
|
if (!worker.current) {
|
||||||
|
// Create the worker if it does not yet exist.
|
||||||
|
worker.current = new Worker(
|
||||||
|
new URL("../app/worker.ts", import.meta.url),
|
||||||
|
{
|
||||||
|
type: "module",
|
||||||
|
},
|
||||||
|
);
|
||||||
|
setIsLoading(false);
|
||||||
|
}
|
||||||
|
}, []);
|
||||||
|
|
||||||
|
async function embedWebsite(e: FormEvent<HTMLFormElement>) {
|
||||||
|
e.preventDefault();
|
||||||
|
// const reader = new FileReader();
|
||||||
|
if (!selectedURL) {
|
||||||
|
toast(`You must enter a URL to embed.`, {
|
||||||
|
theme: "dark",
|
||||||
|
});
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
setIsLoading(true);
|
||||||
|
worker.current?.postMessage({
|
||||||
|
url: selectedURL,
|
||||||
|
firecrawlApiKey: firecrawlApiKey,
|
||||||
|
});
|
||||||
|
const onMessageReceived = (e: any) => {
|
||||||
|
switch (e.data.type) {
|
||||||
|
case "log":
|
||||||
|
console.log(e.data);
|
||||||
|
break;
|
||||||
|
case "error":
|
||||||
|
worker.current?.removeEventListener("message", onMessageReceived);
|
||||||
|
setIsLoading(false);
|
||||||
|
console.log(e.data.error);
|
||||||
|
toast(`There was an issue embedding your website: ${e.data.error}`, {
|
||||||
|
theme: "dark",
|
||||||
|
});
|
||||||
|
break;
|
||||||
|
case "complete":
|
||||||
|
worker.current?.removeEventListener("message", onMessageReceived);
|
||||||
|
setIsLoading(false);
|
||||||
|
setReadyToChat(true);
|
||||||
|
toast(
|
||||||
|
`Embedding successful! Now try asking a question about your website.`,
|
||||||
|
{
|
||||||
|
theme: "dark",
|
||||||
|
},
|
||||||
|
);
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
};
|
||||||
|
worker.current?.addEventListener("message", onMessageReceived);
|
||||||
|
}
|
||||||
|
|
||||||
|
const chooseDataComponent = (
|
||||||
|
<>
|
||||||
|
<div className="p-4 md:p-8 rounded bg-[#25252d] w-full max-h-[85%] overflow-hidden flex flex-col">
|
||||||
|
<h1 className="text-3xl md:text-4xl mb-2 ml-auto mr-auto">
|
||||||
|
{emoji} Local Chat With Websites {emoji}
|
||||||
|
</h1>
|
||||||
|
<ul>
|
||||||
|
<li className="text-l">
|
||||||
|
🏡
|
||||||
|
<span className="ml-2">
|
||||||
|
Welcome to the Local Web Chatbot!
|
||||||
|
<br></br>
|
||||||
|
<br></br>
|
||||||
|
This is a direct fork of{" "}
|
||||||
|
<a href="https://github.com/jacoblee93/fully-local-pdf-chatbot">
|
||||||
|
Jacob Lee's fully local PDF chatbot
|
||||||
|
</a>{" "}
|
||||||
|
replacing the chat with PDF functionality with website support. It
|
||||||
|
is a simple chatbot that allows you to ask questions about a
|
||||||
|
website by embedding it and running queries against the vector
|
||||||
|
store using a local LLM and embeddings.
|
||||||
|
</span>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
⚙️
|
||||||
|
<span className="ml-2">
|
||||||
|
The default LLM is Mistral-7B run locally by Ollama. You'll
|
||||||
|
need to install{" "}
|
||||||
|
<a target="_blank" href="https://ollama.ai">
|
||||||
|
the Ollama desktop app
|
||||||
|
</a>{" "}
|
||||||
|
and run the following commands to give this site access to the
|
||||||
|
locally running model:
|
||||||
|
<br />
|
||||||
|
<pre className="inline-flex px-2 py-1 my-2 rounded">
|
||||||
|
$ OLLAMA_ORIGINS=https://webml-demo.vercel.app
|
||||||
|
OLLAMA_HOST=127.0.0.1:11435 ollama serve
|
||||||
|
</pre>
|
||||||
|
<br />
|
||||||
|
Then, in another window:
|
||||||
|
<br />
|
||||||
|
<pre className="inline-flex px-2 py-1 my-2 rounded">
|
||||||
|
$ OLLAMA_HOST=127.0.0.1:11435 ollama pull mistral
|
||||||
|
</pre>
|
||||||
|
<br />
|
||||||
|
Additionally, you will need a Firecrawl API key for website
|
||||||
|
embedding. Signing up at{" "}
|
||||||
|
<a target="_blank" href="https://firecrawl.dev">
|
||||||
|
firecrawl.dev
|
||||||
|
</a>{" "}
|
||||||
|
is easy and you get 500 credits free. Enter your API key into the
|
||||||
|
box below the URL in the embedding form.
|
||||||
|
</span>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li className="text-l">
|
||||||
|
🐙
|
||||||
|
<span className="ml-2">
|
||||||
|
Both this template and Jacob Lee's template are open source -
|
||||||
|
you can see the source code and deploy your own version{" "}
|
||||||
|
<a
|
||||||
|
href="https://github.com/ericciarla/local-web-chatbot"
|
||||||
|
target="_blank"
|
||||||
|
>
|
||||||
|
from the GitHub repo
|
||||||
|
</a>{" "}
|
||||||
|
or Jacob's{" "}
|
||||||
|
<a href="https://github.com/jacoblee93/fully-local-pdf-chatbot">
|
||||||
|
original GitHub repo
|
||||||
|
</a>
|
||||||
|
!
|
||||||
|
</span>
|
||||||
|
</li>
|
||||||
|
<li className="text-l">
|
||||||
|
👇
|
||||||
|
<span className="ml-2">
|
||||||
|
Try embedding a website below, then asking questions! You can even
|
||||||
|
turn off your WiFi after the website is scraped.
|
||||||
|
</span>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<form
|
||||||
|
onSubmit={embedWebsite}
|
||||||
|
className="mt-4 flex flex-col justify-between items-center w-full"
|
||||||
|
>
|
||||||
|
<input
|
||||||
|
id="url_input"
|
||||||
|
type="text"
|
||||||
|
placeholder="Enter a URL to scrape"
|
||||||
|
className="text-black mb-2 w-[300px] px-4 py-2 rounded-lg"
|
||||||
|
onChange={(e) => setSelectedURL(e.target.value)}
|
||||||
|
></input>
|
||||||
|
<input
|
||||||
|
id="api_key_input"
|
||||||
|
type="text"
|
||||||
|
placeholder="Enter your Firecrawl API Key"
|
||||||
|
className="text-black mb-2 w-[300px] px-4 py-2 rounded-lg"
|
||||||
|
onChange={(e) => setFirecrawlApiKey(e.target.value)}
|
||||||
|
></input>
|
||||||
|
<button
|
||||||
|
type="submit"
|
||||||
|
className="shrink-0 px-4 py-4 bg-sky-600 rounded w-42"
|
||||||
|
>
|
||||||
|
<div
|
||||||
|
role="status"
|
||||||
|
className={`${isLoading ? "" : "hidden"} flex justify-center`}
|
||||||
|
>
|
||||||
|
<svg
|
||||||
|
aria-hidden="true"
|
||||||
|
className="w-6 h-6 text-white animate-spin dark:text-white fill-sky-800"
|
||||||
|
viewBox="0 0 100 101"
|
||||||
|
fill="none"
|
||||||
|
xmlns="http://www.w3.org/2000/svg"
|
||||||
|
>
|
||||||
|
<path
|
||||||
|
d="M100 50.5908C100 78.2051 77.6142 100.591 50 100.591C22.3858 100.591 0 78.2051 0 50.5908C0 22.9766 22.3858 0.59082 50 0.59082C77.6142 0.59082 100 22.9766 100 50.5908ZM9.08144 50.5908C9.08144 73.1895 27.4013 91.5094 50 91.5094C72.5987 91.5094 90.9186 73.1895 90.9186 50.5908C90.9186 27.9921 72.5987 9.67226 50 9.67226C27.4013 9.67226 9.08144 27.9921 9.08144 50.5908Z"
|
||||||
|
fill="currentColor"
|
||||||
|
/>
|
||||||
|
<path
|
||||||
|
d="M93.9676 39.0409C96.393 38.4038 97.8624 35.9116 97.0079 33.5539C95.2932 28.8227 92.871 24.3692 89.8167 20.348C85.8452 15.1192 80.8826 10.7238 75.2124 7.41289C69.5422 4.10194 63.2754 1.94025 56.7698 1.05124C51.7666 0.367541 46.6976 0.446843 41.7345 1.27873C39.2613 1.69328 37.813 4.19778 38.4501 6.62326C39.0873 9.04874 41.5694 10.4717 44.0505 10.1071C47.8511 9.54855 51.7191 9.52689 55.5402 10.0491C60.8642 10.7766 65.9928 12.5457 70.6331 15.2552C75.2735 17.9648 79.3347 21.5619 82.5849 25.841C84.9175 28.9121 86.7997 32.2913 88.1811 35.8758C89.083 38.2158 91.5421 39.6781 93.9676 39.0409Z"
|
||||||
|
fill="currentFill"
|
||||||
|
/>
|
||||||
|
</svg>
|
||||||
|
<span className="sr-only">Loading...</span>
|
||||||
|
</div>
|
||||||
|
<span className={isLoading ? "hidden" : ""}>Embed Website</span>
|
||||||
|
</button>
|
||||||
|
</form>
|
||||||
|
</>
|
||||||
|
);
|
||||||
|
|
||||||
|
const chatInterfaceComponent = (
|
||||||
|
<>
|
||||||
|
<div className="flex flex-col-reverse w-full mb-4 overflow-auto grow">
|
||||||
|
{messages.length > 0
|
||||||
|
? [...messages].reverse().map((m, i) => (
|
||||||
|
<ChatMessageBubble
|
||||||
|
key={i}
|
||||||
|
message={m}
|
||||||
|
aiEmoji={emoji}
|
||||||
|
onRemovePressed={() =>
|
||||||
|
setMessages((previousMessages) => {
|
||||||
|
const displayOrderedMessages = [...previousMessages].reverse();
|
||||||
|
return [
|
||||||
|
...displayOrderedMessages.slice(0, i),
|
||||||
|
...displayOrderedMessages.slice(i + 1),
|
||||||
|
].reverse();
|
||||||
|
})
|
||||||
|
}
|
||||||
|
></ChatMessageBubble>
|
||||||
|
))
|
||||||
|
: ""}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<form onSubmit={sendMessage} className="flex w-full flex-col">
|
||||||
|
<div className="flex w-full mt-4">
|
||||||
|
<input
|
||||||
|
className="grow mr-8 p-4 rounded"
|
||||||
|
value={input}
|
||||||
|
placeholder={placeholder ?? "What's it like to be a pirate?"}
|
||||||
|
onChange={(e) => setInput(e.target.value)}
|
||||||
|
/>
|
||||||
|
<button
|
||||||
|
type="submit"
|
||||||
|
className="shrink-0 px-8 py-4 bg-sky-600 rounded w-28"
|
||||||
|
>
|
||||||
|
<div
|
||||||
|
role="status"
|
||||||
|
className={`${isLoading ? "" : "hidden"} flex justify-center`}
|
||||||
|
>
|
||||||
|
<svg
|
||||||
|
aria-hidden="true"
|
||||||
|
className="w-6 h-6 text-white animate-spin dark:text-white fill-sky-800"
|
||||||
|
viewBox="0 0 100 101"
|
||||||
|
fill="none"
|
||||||
|
xmlns="http://www.w3.org/2000/svg"
|
||||||
|
>
|
||||||
|
<path
|
||||||
|
d="M100 50.5908C100 78.2051 77.6142 100.591 50 100.591C22.3858 100.591 0 78.2051 0 50.5908C0 22.9766 22.3858 0.59082 50 0.59082C77.6142 0.59082 100 22.9766 100 50.5908ZM9.08144 50.5908C9.08144 73.1895 27.4013 91.5094 50 91.5094C72.5987 91.5094 90.9186 73.1895 90.9186 50.5908C90.9186 27.9921 72.5987 9.67226 50 9.67226C27.4013 9.67226 9.08144 27.9921 9.08144 50.5908Z"
|
||||||
|
fill="currentColor"
|
||||||
|
/>
|
||||||
|
<path
|
||||||
|
d="M93.9676 39.0409C96.393 38.4038 97.8624 35.9116 97.0079 33.5539C95.2932 28.8227 92.871 24.3692 89.8167 20.348C85.8452 15.1192 80.8826 10.7238 75.2124 7.41289C69.5422 4.10194 63.2754 1.94025 56.7698 1.05124C51.7666 0.367541 46.6976 0.446843 41.7345 1.27873C39.2613 1.69328 37.813 4.19778 38.4501 6.62326C39.0873 9.04874 41.5694 10.4717 44.0505 10.1071C47.8511 9.54855 51.7191 9.52689 55.5402 10.0491C60.8642 10.7766 65.9928 12.5457 70.6331 15.2552C75.2735 17.9648 79.3347 21.5619 82.5849 25.841C84.9175 28.9121 86.7997 32.2913 88.1811 35.8758C89.083 38.2158 91.5421 39.6781 93.9676 39.0409Z"
|
||||||
|
fill="currentFill"
|
||||||
|
/>
|
||||||
|
</svg>
|
||||||
|
<span className="sr-only">Loading...</span>
|
||||||
|
</div>
|
||||||
|
<span className={isLoading ? "hidden" : ""}>Send</span>
|
||||||
|
</button>
|
||||||
|
</div>
|
||||||
|
</form>
|
||||||
|
</>
|
||||||
|
);
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div
|
||||||
|
className={`flex flex-col items-center p-4 md:p-8 rounded grow overflow-hidden ${
|
||||||
|
readyToChat ? "border" : ""
|
||||||
|
}`}
|
||||||
|
>
|
||||||
|
<h2 className={`${readyToChat ? "" : "hidden"} text-2xl`}>
|
||||||
|
{emoji} {titleText}
|
||||||
|
</h2>
|
||||||
|
{readyToChat ? chatInterfaceComponent : chooseDataComponent}
|
||||||
|
<ToastContainer />
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
|
@ -0,0 +1,16 @@
|
||||||
|
"use client";
|
||||||
|
|
||||||
|
import { usePathname } from 'next/navigation';
|
||||||
|
|
||||||
|
export function Navbar() {
|
||||||
|
const pathname = usePathname();
|
||||||
|
return (
|
||||||
|
<nav className="mb-4">
|
||||||
|
<a className={`mr-4 ${pathname === "/" ? "text-white border-b" : ""}`} href="/">🏴‍☠️ Chat</a>
|
||||||
|
<a className={`mr-4 ${pathname === "/structured_output" ? "text-white border-b" : ""}`} href="/structured_output">🧱 Structured Output</a>
|
||||||
|
<a className={`mr-4 ${pathname === "/agents" ? "text-white border-b" : ""}`} href="/agents">🦜 Agents</a>
|
||||||
|
<a className={`mr-4 ${pathname === "/retrieval" ? "text-white border-b" : ""}`} href="/retrieval">🐶 Retrieval</a>
|
||||||
|
<a className={`mr-4 ${pathname === "/retrieval_agents" ? "text-white border-b" : ""}`} href="/retrieval_agents">🤖 Retrieval Agents</a>
|
||||||
|
</nav>
|
||||||
|
);
|
||||||
|
}
|
|
@ -0,0 +1,39 @@
|
||||||
|
/** @type {import('next').NextConfig} */
|
||||||
|
const nextConfig = {
|
||||||
|
// (Optional) Export as a static site
|
||||||
|
// See https://nextjs.org/docs/pages/building-your-application/deploying/static-exports#configuration
|
||||||
|
output: 'export', // Feel free to modify/remove this option
|
||||||
|
|
||||||
|
// Override the default webpack configuration
|
||||||
|
webpack: (config, { isServer }) => {
|
||||||
|
// See https://webpack.js.org/configuration/resolve/#resolvealias
|
||||||
|
config.resolve.alias = {
|
||||||
|
...config.resolve.alias,
|
||||||
|
"sharp$": false,
|
||||||
|
"onnxruntime-node$": false,
|
||||||
|
}
|
||||||
|
config.experiments = {
|
||||||
|
...config.experiments,
|
||||||
|
topLevelAwait: true,
|
||||||
|
asyncWebAssembly: true,
|
||||||
|
};
|
||||||
|
config.module.rules.push({
|
||||||
|
test: /\.md$/i,
|
||||||
|
use: "raw-loader",
|
||||||
|
});
|
||||||
|
// Fixes npm packages that depend on `fs` module
|
||||||
|
if (!isServer) {
|
||||||
|
config.resolve.fallback = {
|
||||||
|
...config.resolve.fallback, // Keep this spread: without it, the fallbacks that
|
||||||
|
// Next.js sets by default are dropped.
|
||||||
|
fs: false, // stub out `fs`, which does not exist in the browser
|
||||||
|
"node:fs/promises": false,
|
||||||
|
module: false,
|
||||||
|
perf_hooks: false,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
return config;
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
module.exports = nextConfig
|
47
examples/example_web_apps/local-website-chatbot/package.json
Normal file
|
@ -0,0 +1,47 @@
|
||||||
|
{
|
||||||
|
"name": "local-website-chatbot",
|
||||||
|
"version": "0.0.0",
|
||||||
|
"private": true,
|
||||||
|
"scripts": {
|
||||||
|
"dev": "next dev",
|
||||||
|
"build": "next build",
|
||||||
|
"start": "next start",
|
||||||
|
"lint": "next lint",
|
||||||
|
"format": "prettier --write \"app\""
|
||||||
|
},
|
||||||
|
"engines": {
|
||||||
|
"node": ">=18"
|
||||||
|
},
|
||||||
|
"dependencies": {
|
||||||
|
"@langchain/community": "^0.2.9",
|
||||||
|
"@langchain/weaviate": "^0.0.4",
|
||||||
|
"@mendable/firecrawl-js": "^0.0.26",
|
||||||
|
"@mlc-ai/web-llm": "^0.2.42",
|
||||||
|
"@types/node": "20.4.5",
|
||||||
|
"@types/react": "18.2.17",
|
||||||
|
"@types/react-dom": "18.2.7",
|
||||||
|
"@xenova/transformers": "^2.16.0",
|
||||||
|
"autoprefixer": "10.4.14",
|
||||||
|
"encoding": "^0.1.13",
|
||||||
|
"eslint": "8.46.0",
|
||||||
|
"eslint-config-next": "13.4.12",
|
||||||
|
"jest": "^29.7.0",
|
||||||
|
"langchain": "^0.2.5",
|
||||||
|
"next": "13.4.12",
|
||||||
|
"pdf-parse": "^1.1.1",
|
||||||
|
"postcss": "8.4.27",
|
||||||
|
"react": "18.2.0",
|
||||||
|
"react-dom": "18.2.0",
|
||||||
|
"react-toastify": "^10.0.5",
|
||||||
|
"tailwindcss": "3.3.3",
|
||||||
|
"ts-node": "^10.9.2",
|
||||||
|
"typescript": "^5.4.5",
|
||||||
|
"voy-search": "^0.6.3"
|
||||||
|
},
|
||||||
|
"devDependencies": {
|
||||||
|
"prettier": "3.0.0"
|
||||||
|
},
|
||||||
|
"resolutions": {
|
||||||
|
"@langchain/core": "0.2.6"
|
||||||
|
}
|
||||||
|
}
|
|
@ -0,0 +1,6 @@
|
||||||
|
module.exports = {
|
||||||
|
plugins: {
|
||||||
|
tailwindcss: {},
|
||||||
|
autoprefixer: {},
|
||||||
|
},
|
||||||
|
}
|
|
@ -0,0 +1,6 @@
|
||||||
|
export type ChatWindowMessage = {
|
||||||
|
content: string;
|
||||||
|
role: "human" | "ai";
|
||||||
|
runId?: string;
|
||||||
|
traceUrl?: string;
|
||||||
|
}
|
|
@ -0,0 +1,18 @@
|
||||||
|
/** @type {import('tailwindcss').Config} */
|
||||||
|
module.exports = {
|
||||||
|
content: [
|
||||||
|
'./pages/**/*.{js,ts,jsx,tsx,mdx}',
|
||||||
|
'./components/**/*.{js,ts,jsx,tsx,mdx}',
|
||||||
|
'./app/**/*.{js,ts,jsx,tsx,mdx}',
|
||||||
|
],
|
||||||
|
theme: {
|
||||||
|
extend: {
|
||||||
|
backgroundImage: {
|
||||||
|
'gradient-radial': 'radial-gradient(var(--tw-gradient-stops))',
|
||||||
|
'gradient-conic':
|
||||||
|
'conic-gradient(from 180deg at 50% 50%, var(--tw-gradient-stops))',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
plugins: [],
|
||||||
|
}
|
|
@ -0,0 +1,28 @@
|
||||||
|
{
|
||||||
|
"compilerOptions": {
|
||||||
|
"target": "es5",
|
||||||
|
"lib": ["dom", "dom.iterable", "esnext"],
|
||||||
|
"allowJs": true,
|
||||||
|
"skipLibCheck": true,
|
||||||
|
"strict": true,
|
||||||
|
"forceConsistentCasingInFileNames": true,
|
||||||
|
"noEmit": true,
|
||||||
|
"esModuleInterop": true,
|
||||||
|
"module": "esnext",
|
||||||
|
"moduleResolution": "bundler",
|
||||||
|
"resolveJsonModule": true,
|
||||||
|
"isolatedModules": true,
|
||||||
|
"jsx": "preserve",
|
||||||
|
"incremental": true,
|
||||||
|
"plugins": [
|
||||||
|
{
|
||||||
|
"name": "next"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"paths": {
|
||||||
|
"@/*": ["./*"]
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
|
||||||
|
"exclude": ["node_modules"]
|
||||||
|
}
|
5596
examples/example_web_apps/local-website-chatbot/yarn.lock
Normal file
|
@ -0,0 +1,3 @@
|
||||||
|
{
|
||||||
|
"extends": "next/core-web-vitals"
|
||||||
|
}
|
11
examples/scrape_and_analyze_airbnb_data_e2b/.env.template
Normal file
|
@ -0,0 +1,11 @@
|
||||||
|
# TODO: Get your E2B API key from https://e2b.dev/docs
|
||||||
|
E2B_API_KEY=""
|
||||||
|
|
||||||
|
# TODO: Get your Firecrawl API key from https://firecrawl.dev
|
||||||
|
FIRECRAWL_API_KEY=""
|
||||||
|
|
||||||
|
# TODO: Get your Anthropic API key from https://anthropic.com
|
||||||
|
ANTHROPIC_API_KEY=""
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -0,0 +1,2 @@
|
||||||
|
# Ignore artifacts:
|
||||||
|
node_modules
|
31
examples/scrape_and_analyze_airbnb_data_e2b/README.md
Normal file
|
@ -0,0 +1,31 @@
|
||||||
|
# Scrape and Analyze Airbnb Data with Firecrawl and E2B
|
||||||
|
|
||||||
|
This example demonstrates how to scrape Airbnb data and analyze it using [Firecrawl](https://www.firecrawl.dev/) and the [Code Interpreter SDK](https://github.com/e2b-dev/code-interpreter) from E2B.
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
- Node.js installed on your machine
|
||||||
|
- An E2B API key
|
||||||
|
- A Firecrawl API key
|
||||||
|
- An Anthropic API key
|
||||||
|
|
||||||
|
## Setup & run
|
||||||
|
|
||||||
|
### 1. Install dependencies
|
||||||
|
|
||||||
|
```
|
||||||
|
npm install
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Set up `.env`
|
||||||
|
|
||||||
|
1. Copy `.env.template` to `.env`
|
||||||
|
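2. Get an [E2B API key](https://e2b.dev/docs/getting-started/api-key) and set it as `E2B_API_KEY`
|
||||||
|
3. Get a [Firecrawl API key](https://firecrawl.dev) and set it as `FIRECRAWL_API_KEY`
|
||||||
|
4. Get an [Anthropic API key](https://anthropic.com) and set it as `ANTHROPIC_API_KEY`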
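
Because `.env.template` ships every key as an empty string, a forgotten key can look "set" while being blank. The following is a minimal sketch of a startup check, not code from this example: `missingKeys` and `REQUIRED_KEYS` are hypothetical names, and only the three variable names come from `.env.template`.

```typescript
// Hypothetical helper (not part of the example's source).
// The key names are the ones listed in .env.template.
const REQUIRED_KEYS = [
  "E2B_API_KEY",
  "FIRECRAWL_API_KEY",
  "ANTHROPIC_API_KEY",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required keys that are unset or blank.
function missingKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

One possible use is to call `missingKeys(process.env)` before doing any work and exit with a message naming the missing keys if the returned list is not empty.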
|
||||||
|
|
||||||
|
### 3. Run the example
|
||||||
|
|
||||||
|
```
|
||||||
|
npm run start
|
||||||
|
```
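
To sanity-check scraped data by hand, something like the sketch below could help. It assumes listings shaped like the entries in the `airbnb_listings.json` file included with this example (where `rating` and `reviews` are missing on some entries). `Listing` and `averagePriceByLocation` are hypothetical names and are not part of the example's code.

```typescript
// Shape inferred from airbnb_listings.json; rating and reviews are optional
// because some entries in that file omit them.
interface Listing {
  title: string;
  price_per_night: number;
  location: string;
  rating?: number;
  reviews?: number;
}

// Hypothetical helper: average nightly price for each location.
function averagePriceByLocation(listings: Listing[]): Map<string, number> {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const listing of listings) {
    const entry = totals.get(listing.location) ?? { sum: 0, count: 0 };
    entry.sum += listing.price_per_night;
    entry.count += 1;
    totals.set(listing.location, entry);
  }

  const averages = new Map<string, number>();
  totals.forEach((entry, location) => {
    averages.set(location, entry.sum / entry.count);
  });
  return averages;
}
```

The input could come from `JSON.parse` on the file's contents. Grouping by `location` is only one option. Many entries in the file use the generic value "San Francisco", so per-neighborhood averages will cover only some of the listings.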
|
453
examples/scrape_and_analyze_airbnb_data_e2b/airbnb_listings.json
Normal file
|
@ -0,0 +1,453 @@
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"title": "2br Victorian House with Breathtaking views",
|
||||||
|
"price_per_night": 356,
|
||||||
|
"location": "Potrero Hill",
|
||||||
|
"rating": 4.98,
|
||||||
|
"reviews": 184
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "543c - convenient cozy private bedroom for 1 person",
|
||||||
|
"price_per_night": 52,
|
||||||
|
"location": "Inner Richmond",
|
||||||
|
"rating": 4.72,
|
||||||
|
"reviews": 68
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Clean Bright Airy Private Apt in the Heart of SF",
|
||||||
|
"price_per_night": 269,
|
||||||
|
"location": "Marina District",
|
||||||
|
"rating": 4.95,
|
||||||
|
"reviews": 239
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Garden Suite by Golden Gate Park, Private Bathrm",
|
||||||
|
"price_per_night": 79,
|
||||||
|
"location": "Outer Richmond",
|
||||||
|
"rating": 4.82,
|
||||||
|
"reviews": 1113
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "#2 private bathroom next to The Ritz- Carlton",
|
||||||
|
"price_per_night": 98,
|
||||||
|
"location": "Union Square",
|
||||||
|
"rating": 4.96,
|
||||||
|
"reviews": 494
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Central, cozy one-bedroom condo in San Francisco",
|
||||||
|
"price_per_night": 262,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.98,
|
||||||
|
"reviews": 46
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Large Light Filled Quiet Artist built 2BR Apt",
|
||||||
|
"price_per_night": 273,
|
||||||
|
"location": "Mission District",
|
||||||
|
"rating": 4.99,
|
||||||
|
"reviews": 132
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Oceanside Getaway",
|
||||||
|
"price_per_night": 160,
|
||||||
|
"location": "Outer Sunset",
|
||||||
|
"rating": 4.93,
|
||||||
|
"reviews": 559
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Perfect getaway near Japantown",
|
||||||
|
"price_per_night": 159,
|
||||||
|
"location": "Japantown",
|
||||||
|
"rating": 4.92,
|
||||||
|
"reviews": 515
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Style & Comfort-Private Suite near UCSF and GGPark",
|
||||||
|
"price_per_night": 155,
|
||||||
|
"location": "Inner Sunset",
|
||||||
|
"rating": 4.98,
|
||||||
|
"reviews": 439
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Central quiet Victorian Flat",
|
||||||
|
"price_per_night": 224,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 5,
|
||||||
|
"reviews": 15
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Palm Trees private room near Ocean Beach Zoo GGPK",
|
||||||
|
"price_per_night": 76,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.95,
|
||||||
|
"reviews": 200
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Spacious 1BR in the Mission w/ huge living room",
|
||||||
|
"price_per_night": 195,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 5,
|
||||||
|
"reviews": 7
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Modern Hilltop Studio - Private Entry and Garden",
|
||||||
|
"price_per_night": 230,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.94,
|
||||||
|
"reviews": 196
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Bright Modern Private Soma Studio Street Entrance",
|
||||||
|
"price_per_night": 125,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.9,
|
||||||
|
"reviews": 214
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Castro private room & bath VIEW (no cleaning fee)",
|
||||||
|
"price_per_night": 112,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.94,
|
||||||
|
"reviews": 440
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Nob Hill Studio",
|
||||||
|
"price_per_night": 148,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 5,
|
||||||
|
"reviews": 42
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Spacious and Sunny Noe Valley Gem!",
|
||||||
|
"price_per_night": 115,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 5,
|
||||||
|
"reviews": 68
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "SF Ocean Beach In-law Suite",
|
||||||
|
"price_per_night": 162,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.91,
|
||||||
|
"reviews": 646
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Comfortable, cozy, private studio - Bernal Heights",
|
||||||
|
"price_per_night": 145,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.9,
|
||||||
|
"reviews": 866
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Casa Pinudo Queen Bed Room & City-View Roofdeck",
|
||||||
|
"price_per_night": 100,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.87,
|
||||||
|
"reviews": 47
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Bright bedroom in Victorian home",
|
||||||
|
"price_per_night": 114,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.95,
|
||||||
|
"reviews": 183
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "#1 SF 24th ave& kirkham st Master king room",
|
||||||
|
"price_per_night": 104,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.95,
|
||||||
|
"reviews": 59
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Cheerful 1 bedroom",
|
||||||
|
"price_per_night": 137,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.79,
|
||||||
|
"reviews": 111
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Luxury Studio Near SFO, SFSU ,BART, Walk to shops!",
|
||||||
|
"price_per_night": 116,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.96,
|
||||||
|
"reviews": 139
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "#4 SF Sunset 24th ave&kirkham st Deluxe king room",
|
||||||
|
"price_per_night": 104,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.96,
|
||||||
|
"reviews": 74
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Modern room & loft, private entrance",
|
||||||
|
"price_per_night": 78,
|
||||||
|
"location": "San Bruno",
|
||||||
|
"rating": 4.88,
|
||||||
|
"reviews": 868
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "1 Queen bedded room w/full bath",
|
||||||
|
"price_per_night": 117,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.93,
|
||||||
|
"reviews": 120
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Charming Noe Valley Garden Oasis",
|
||||||
|
"price_per_night": 249,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.89,
|
||||||
|
"reviews": 199
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Beautiful Garden Studio in heart of Nopa",
|
||||||
|
"price_per_night": 343,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.76,
|
||||||
|
"reviews": 259
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "#4 SF Sunset 24th ave&kirkham st Deluxe king room",
|
||||||
|
"price_per_night": 175,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.96,
|
||||||
|
"reviews": 74
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Noteworthy Large Private Bedroom - Best Location",
|
||||||
|
"price_per_night": 159,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.86,
|
||||||
|
"reviews": 63
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Primary Suite Golden Gate Bridge view Private Deck",
|
||||||
|
"price_per_night": 317,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.93,
|
||||||
|
"reviews": 445
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Private Room: Escape to the Mission",
|
||||||
|
"price_per_night": 186,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.72,
|
||||||
|
"reviews": 501
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "#1 SF 24th ave& kirkham st Master king room",
|
||||||
|
"price_per_night": 176,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.95,
|
||||||
|
"reviews": 59
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Private Suite",
|
||||||
|
"price_per_night": 154,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.91,
|
||||||
|
"reviews": 77
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "501 Post road, San Francisco 94102",
|
||||||
|
"price_per_night": 267,
|
||||||
|
"location": "San Francisco"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Most Desired Vacation Spot in San Francisco.",
|
||||||
|
"price_per_night": 650,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 5,
|
||||||
|
"reviews": 78
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "The House Protects The Dreamer - 2BR/1BA Victorian",
|
||||||
|
"price_per_night": 376,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.92,
|
||||||
|
"reviews": 179
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Private Junior Room in Artist's Flat",
|
||||||
|
"price_per_night": 130,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.7,
|
||||||
|
"reviews": 165
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Serenity by the Park , Your Golden Gate Getaway",
|
||||||
|
"price_per_night": 238,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.86,
|
||||||
|
"reviews": 21
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Golden Getaway • Spacious Private Room 15m to SFO",
|
||||||
|
"price_per_night": 113,
|
||||||
|
"location": "San Francisco",
|
||||||
|
"rating": 4.91,
    "reviews": 169
  },
  {
    "title": "Modern studio next to beach, G.G Park & transport",
    "price_per_night": 300,
    "location": "San Francisco",
    "rating": 4.87,
    "reviews": 318
  },
  {
    "title": "Affordable Cozy Private Room near San Francisco",
    "price_per_night": 88,
    "location": "San Francisco"
  },
  {
    "title": "Affordable Room w/ Great View in San Francisco",
    "price_per_night": 134,
    "location": "San Francisco",
    "rating": 4.79,
    "reviews": 121
  },
  {
    "title": "Nob Hill Studio",
    "price_per_night": 250,
    "location": "San Francisco",
    "rating": 5,
    "reviews": 42
  },
  {
    "title": "Cozy Sunset suite",
    "price_per_night": 120,
    "location": "San Francisco",
    "rating": 4.66,
    "reviews": 98
  },
  {
    "title": "Stay with Dongmei",
    "price_per_night": 48,
    "location": "San Francisco",
    "rating": 4.76,
    "reviews": 343
  },
  {
    "title": "Mediterranean style private studio",
    "price_per_night": 139,
    "location": "San Francisco",
    "rating": 4.93,
    "reviews": 489
  },
  {
    "title": "Cozy Garden Unit with All Amenities",
    "price_per_night": 122,
    "location": "San Francisco",
    "rating": 4.95,
    "reviews": 446
  },
  {
    "title": "Private Sunset Getaway",
    "price_per_night": 88,
    "location": "San Francisco",
    "rating": 4.85,
    "reviews": 26
  },
  {
    "title": "Pacifica studio- ocean view from deck",
    "price_per_night": 91,
    "location": "Pacifica",
    "rating": 4.76,
    "reviews": 352
  },
  {
    "title": "Sweet Suite w/ EV charger & Parking, 5 min to SFSU",
    "price_per_night": 167,
    "location": "San Francisco",
    "rating": 4.96,
    "reviews": 141
  },
  {
    "title": "Light-Filled Hillside Studio Apartment",
    "price_per_night": 128,
    "location": "San Francisco",
    "rating": 4.79,
    "reviews": 148
  },
  {
    "title": "Stay with Benoîte",
    "price_per_night": 72,
    "location": "San Francisco",
    "rating": 4.87,
    "reviews": 131
  },
  {
    "title": "Serene King Suite with Jacuzzi & Fireplace",
    "price_per_night": 225,
    "location": "San Francisco",
    "rating": 4.89,
    "reviews": 18
  },
  {
    "title": "Stay with Ryan",
    "price_per_night": 225,
    "location": "San Francisco",
    "rating": 4.89,
    "reviews": 18
  },
  {
    "title": "Perfectly located Castro",
    "price_per_night": 99,
    "location": "San Francisco",
    "rating": 4.93,
    "reviews": 488
  },
  {
    "title": "Sweet garden suite with free parking",
    "price_per_night": 169,
    "location": "San Francisco",
    "rating": 4.99,
    "reviews": 226
  },
  {
    "title": "Bright+Modern Brand New Guest House-Great Location",
    "price_per_night": 118,
    "location": "South San Francisco",
    "rating": 5,
    "reviews": 37
  },
  {
    "title": "Garden studio - Presidio, Baker Beach",
    "price_per_night": 194,
    "location": "San Francisco",
    "rating": 4.91,
    "reviews": 68
  },
  {
    "title": "Single Small private 1br 1ba no clean fee!",
    "price_per_night": 106,
    "location": "San Francisco",
    "rating": 4.78,
    "reviews": 310
  },
  {
    "title": "Private Cozy room 3",
    "price_per_night": 53,
    "location": "San Francisco",
    "rating": 4.86,
    "reviews": 332
  },
  {
    "title": "Central Mission Potrero 1BED-1BATH",
    "price_per_night": 164,
    "location": "San Francisco",
    "rating": 4.81,
    "reviews": 324
  },
  {
    "title": "Cozy Remodeled Suite in Oceanview With Parking",
    "price_per_night": 151,
    "location": "San Francisco",
    "rating": 4.95,
    "reviews": 310
  }
]
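The listing data above is what the example later asks the model to plot as a price histogram. As a rough illustration of the binning such a chart relies on, here is a minimal sketch in TypeScript. The `Listing` interface, the `priceHistogram` helper and the `binWidth` parameter are assumptions made for this illustration and are not part of the example app, which delegates the plotting to model-generated Python.

```typescript
// Hypothetical shape mirroring the fields that appear in the listing data.
interface Listing {
  title: string
  price_per_night: number
  location: string
  rating?: number
  reviews?: number
}

// Count listings per fixed-width price bin.
// The map key is the lower edge of each bin (e.g. 88 falls in bin 50 when binWidth is 50).
function priceHistogram(
  listings: Listing[],
  binWidth: number
): Map<number, number> {
  const bins = new Map<number, number>()
  for (const { price_per_night } of listings) {
    const lowerEdge = Math.floor(price_per_night / binWidth) * binWidth
    bins.set(lowerEdge, (bins.get(lowerEdge) ?? 0) + 1)
  }
  return bins
}

// A few rows taken from the data above.
const sample: Listing[] = [
  { title: 'Stay with Dongmei', price_per_night: 48, location: 'San Francisco' },
  { title: 'Private Sunset Getaway', price_per_night: 88, location: 'San Francisco' },
  { title: 'Pacifica studio- ocean view from deck', price_per_night: 91, location: 'Pacifica' },
  { title: 'Nob Hill Studio', price_per_night: 250, location: 'San Francisco' },
]

console.log(priceHistogram(sample, 50))
```

Keying bins by their lower edge keeps the result sparse, so empty price ranges simply do not appear in the map.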
@ -0,0 +1,26 @@
import { CodeInterpreter } from '@e2b/code-interpreter'

export async function codeInterpret(
  codeInterpreter: CodeInterpreter,
  code: string
) {
  console.log(
    `\n${'='.repeat(50)}\n> Running the following AI-generated code:\n${code}\n${'='.repeat(50)}`
  )

  const exec = await codeInterpreter.notebook.execCell(code, {
    // You can stream logs from the code interpreter
    // onStderr: (stderr: string) => console.log("\n[Code Interpreter stderr]", stderr),
    // onStdout: (stdout: string) => console.log("\n[Code Interpreter stdout]", stdout),
    //
    // You can also stream additional results like charts, images, etc.
    // onResult: ...
  })

  if (exec.error) {
    console.log('[Code Interpreter error]', exec.error) // Runtime error
    return undefined
  }

  return exec
}
118
examples/scrape_and_analyze_airbnb_data_e2b/index.ts
Normal file
@ -0,0 +1,118 @@
// @ts-ignore
import * as fs from 'fs'

import 'dotenv/config'
import { CodeInterpreter, Execution } from '@e2b/code-interpreter'
import Anthropic from '@anthropic-ai/sdk'
import { Buffer } from 'buffer'

import { MODEL_NAME, SYSTEM_PROMPT, tools } from './model'

import { codeInterpret } from './codeInterpreter'
import { scrapeAirbnb } from './scraping'

const anthropic = new Anthropic()

async function chat(
  codeInterpreter: CodeInterpreter,
  userMessage: string
): Promise<Execution | undefined> {
  console.log('Waiting for Claude...')

  const msg = await anthropic.beta.tools.messages.create({
    model: MODEL_NAME,
    system: SYSTEM_PROMPT,
    max_tokens: 4096,
    messages: [{ role: 'user', content: userMessage }],
    tools,
  })

  console.log(
    `\n${'='.repeat(50)}\nModel response: ${JSON.stringify(msg.content)}\n${'='.repeat(50)}`
  )
  console.log(msg)

  if (msg.stop_reason === 'tool_use') {
    const toolBlock = msg.content.find((block) => block.type === 'tool_use')
    // @ts-ignore
    const toolName = toolBlock?.name ?? ''
    // @ts-ignore
    const toolInput = toolBlock?.input ?? ''

    console.log(
      `\n${'='.repeat(50)}\nUsing tool: ${toolName}\n${'='.repeat(50)}`
    )

    if (toolName === 'execute_python') {
      const code = toolInput.code
      return codeInterpret(codeInterpreter, code)
    }
    return undefined
  }
}

async function run() {
  // Load the Airbnb listings from the JSON file, scraping them if the file is missing or empty
  let data
  const readDataFromFile = () => {
    try {
      return fs.readFileSync('airbnb_listings.json', 'utf8')
    } catch (err) {
      if (err.code === 'ENOENT') {
        console.log('File not found, scraping data...')
        return null
      } else {
        throw err
      }
    }
  }

  const fetchData = async () => {
    data = readDataFromFile()
    if (!data || data.trim() === '[]') {
      console.log('File is empty or contains an empty list, scraping data...')
      data = await scrapeAirbnb()
    }
  }

  await fetchData()

  // Parse the JSON data
  const prices = JSON.parse(data)

  // Serialize the listings back to a compact JSON string to embed in the prompt
  const pricesList = JSON.stringify(prices)

  const userMessage = `
  Load the Airbnb prices data from the Airbnb listings below and visualize the distribution of prices with a histogram. Listing data: ${pricesList}
`

  const codeInterpreter = await CodeInterpreter.create()
  const codeOutput = await chat(codeInterpreter, userMessage)
  if (!codeOutput) {
    console.log('No code output')
    return
  }

  const logs = codeOutput.logs
  console.log(logs)

  if (codeOutput.results.length === 0) {
    console.log('No results')
    return
  }

  const firstResult = codeOutput.results[0]
  console.log(firstResult.text)

  if (firstResult.png) {
    const pngData = Buffer.from(firstResult.png, 'base64')
    const filename = 'airbnb_prices_chart.png'
    fs.writeFileSync(filename, pngData)
    console.log(`✅ Saved chart to ${filename}`)
  }

  await codeInterpreter.close()
}

run()
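The `chat` function in index.ts reads the tool name and input from the `tool_use` block behind `// @ts-ignore` comments. One possible alternative is to narrow the block with a type guard, sketched below. The `TextBlock`, `ToolUseBlock` and `ContentBlock` types are local stand-ins written for this illustration (assumed shapes, not imported from the Anthropic SDK), and `extractPythonCode` is a hypothetical helper.

```typescript
// Local stand-ins for the SDK's content block types (assumed shapes).
type TextBlock = { type: 'text'; text: string }
type ToolUseBlock = {
  type: 'tool_use'
  id: string
  name: string
  input: unknown
}
type ContentBlock = TextBlock | ToolUseBlock

// Type guard so that Array.prototype.find returns ToolUseBlock | undefined.
function isToolUse(block: ContentBlock): block is ToolUseBlock {
  return block.type === 'tool_use'
}

// Return the Python source requested by the first tool call,
// or undefined if there is no execute_python call with a string `code` field.
function extractPythonCode(content: ContentBlock[]): string | undefined {
  const toolBlock = content.find(isToolUse)
  if (!toolBlock || toolBlock.name !== 'execute_python') return undefined

  const input = toolBlock.input
  if (typeof input === 'object' && input !== null && 'code' in input) {
    const code = (input as { code: unknown }).code
    return typeof code === 'string' ? code : undefined
  }
  return undefined
}
```

With a helper like this, a malformed tool input yields `undefined` rather than passing a non-string value on to the code interpreter.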
33
examples/scrape_and_analyze_airbnb_data_e2b/model.ts
Normal file
@ -0,0 +1,33 @@
import { Tool } from '@anthropic-ai/sdk/src/resources/beta/tools'

export const MODEL_NAME = 'claude-3-opus-20240229'

export const SYSTEM_PROMPT = `
## your job & context
you are a python data scientist. you are given tasks to complete and you run python code to solve them.
- the python code runs in a jupyter notebook.
- every time you call the \`execute_python\` tool, the python code is executed in a separate cell. it's okay to make multiple calls to \`execute_python\`.
- display visualizations using matplotlib or any other visualization library directly in the notebook. don't worry about saving the visualizations to a file.
- you have access to the internet and can make api requests.
- you also have access to the filesystem and can read/write files.
- you can install any pip package (if it exists) if you need to, but the usual packages for data analysis are already preinstalled.
- you can run any python code you want, everything is running in a secure sandbox environment.
`

export const tools: Tool[] = [
  {
    name: 'execute_python',
    description:
      'Executes Python code in a Jupyter notebook cell and returns any result, stdout, stderr, display_data, and error.',
    input_schema: {
      type: 'object',
      properties: {
        code: {
          type: 'string',
          description: 'The Python code to execute in a single cell.',
        },
      },
      required: ['code'],
    },
  },
]
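The system prompt tells the model that several `execute_python` calls are fine, while index.ts as written runs at most one tool call per request. A multi-call loop could look roughly like the sketch below. Everything here is an assumption made for illustration: `ModelReply`, `CallModel`, `RunCode`, `toolLoop` and `maxTurns` are hypothetical names, and the model and sandbox are injected as plain functions rather than real SDK clients.

```typescript
// Hypothetical, simplified reply shape; a real SDK response carries more fields.
type ModelReply =
  | { kind: 'final'; text: string }
  | { kind: 'tool_call'; code: string }

// Stand-ins for the model call and the sandboxed cell execution.
type CallModel = (history: string[]) => Promise<ModelReply>
type RunCode = (code: string) => Promise<string>

// Keep calling the model until it stops requesting code, or until maxTurns is reached.
async function toolLoop(
  callModel: CallModel,
  runCode: RunCode,
  userMessage: string,
  maxTurns = 5
): Promise<string | undefined> {
  const history = [userMessage]
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(history)
    if (reply.kind === 'final') return reply.text

    // Feed the cell output back so the next model call can build on it.
    const output = await runCode(reply.code)
    history.push(`tool output: ${output}`)
  }
  // The model kept requesting code for maxTurns rounds; give up.
  return undefined
}
```

The turn limit is a design choice in this sketch: it bounds cost when the model keeps requesting more cells.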
1035
examples/scrape_and_analyze_airbnb_data_e2b/package-lock.json
generated
Normal file
26
examples/scrape_and_analyze_airbnb_data_e2b/package.json
Normal file
@ -0,0 +1,26 @@
{
  "name": "hello-world",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "start": "tsx index.ts",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "devDependencies": {
    "@types/node": "^20.12.12",
    "prettier": "3.2.5",
    "tsx": "^4.7.3",
    "typescript": "^5.4.5"
  },
  "dependencies": {
    "@anthropic-ai/sdk": "^0.20.7",
    "@e2b/code-interpreter": "^0.0.2",
    "@mendable/firecrawl-js": "^0.0.21",
    "buffer": "^6.0.3",
    "dotenv": "^16.4.5"
  }
}
@ -0,0 +1,11 @@
// prettier.config.js, .prettierrc.js, prettier.config.mjs, or .prettierrc.mjs

/** @type {import("prettier").Config} */
const config = {
  trailingComma: 'es5',
  tabWidth: 2,
  semi: false,
  singleQuote: true,
}

export default config
98
examples/scrape_and_analyze_airbnb_data_e2b/scraping.ts
Normal file
@ -0,0 +1,98 @@
//@ts-ignore
import * as fs from 'fs'
import FirecrawlApp from '@mendable/firecrawl-js'
import 'dotenv/config'
import { config } from 'dotenv'
import { z } from 'zod'

config()

export async function scrapeAirbnb() {
  try {
    // Initialize the FirecrawlApp with your API key
    const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY })

    // Define the URL of the first listings page
    const listingsUrl =
      'https://www.airbnb.com/s/San-Francisco--CA--United-States/homes'

    const baseUrl = 'https://www.airbnb.com'
    // Define schema to extract pagination links
    const paginationSchema = z.object({
      page_links: z
        .array(
          z.object({
            link: z.string(),
          })
        )
        .describe('Pagination links at the bottom of the page.'),
    })

    const params2 = {
      pageOptions: {
        onlyMainContent: false,
      },
      extractorOptions: { extractionSchema: paginationSchema },
      timeout: 50000, // generous timeout, since Airbnb sometimes stalls
    }

    // Scrape the first page to get the pagination links
    const linksData = await app.scrapeUrl(listingsUrl, params2)
    console.log(linksData.data['llm_extraction'])

    let paginationLinks = linksData.data['llm_extraction'].page_links.map(
      (link) => baseUrl + link.link
    )

    // Fall back to the first page if no pagination links were extracted
    if (paginationLinks.length === 0) {
      paginationLinks = [listingsUrl]
    }

    // Define schema to extract listings
    const schema = z.object({
      listings: z
        .array(
          z.object({
            title: z.string(),
            price_per_night: z.number(),
            location: z.string(),
            rating: z.number().optional(),
            reviews: z.number().optional(),
          })
        )
        .describe('Airbnb listings in San Francisco'),
    })

    const params = {
      pageOptions: {
        onlyMainContent: false,
      },
      extractorOptions: { extractionSchema: schema },
    }

    // Function to scrape a single URL
    const scrapeListings = async (url) => {
      const result = await app.scrapeUrl(url, params)
      return result.data['llm_extraction'].listings
    }

    // Scrape all pagination links in parallel
    const listingsPromises = paginationLinks.map((link) => scrapeListings(link))
    const listingsResults = await Promise.all(listingsPromises)

    // Flatten the per-page results into a single array
    const allListings = listingsResults.flat()

    // Save the listings to a file
    fs.writeFileSync(
      'airbnb_listings.json',
      JSON.stringify(allListings, null, 2)
    )
    // Read the listings back from the file and return them as a JSON string
    const listingsData = fs.readFileSync('airbnb_listings.json', 'utf8')
    return listingsData
  } catch (error) {
    console.error('An error occurred:', error.message)
  }
}
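scraping.ts fetches all pagination links in parallel with `Promise.all`, so a single stalled or failed page rejects the whole batch. If partial results are acceptable, `Promise.allSettled` is one way to keep the pages that did succeed. The sketch below assumes a hypothetical `scrapeAllPages` helper that takes the per-page scraper as a parameter; it does not call Firecrawl itself.

```typescript
// Scrape every URL in parallel, collecting successes and remembering failures.
// `scrapePage` is a stand-in for a per-page scraper such as `scrapeListings`.
async function scrapeAllPages<T>(
  urls: string[],
  scrapePage: (url: string) => Promise<T[]>
): Promise<{ items: T[]; failedUrls: string[] }> {
  const settled = await Promise.allSettled(urls.map((url) => scrapePage(url)))

  const items: T[] = []
  const failedUrls: string[] = []
  settled.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      items.push(...result.value)
    } else {
      // Keep the URL so the caller can retry or log it.
      failedUrls.push(urls[i])
    }
  })
  return { items, failedUrls }
}
```

Returning the failed URLs alongside the items lets the caller decide whether to retry them or proceed with what was collected.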