1000 Commits

Author SHA1 Message Date
Nicolas
f10f3f886b
Merge pull request #410 from mendableai/feat/fire-engine-chrome-cdp
Support chrome-cdp and restructure sitemap fire-engine support.
2024-07-18 13:52:08 -04:00
Nicolas
9a1a227797 Update crawl-cancel.ts 2024-07-18 13:49:51 -04:00
Nicolas
11768571ed Update crawl-cancel.ts 2024-07-18 13:43:03 -04:00
Nicolas
ce804d3c20 Update crawl-cancel.ts 2024-07-18 13:40:24 -04:00
Nicolas
d2de01d342 Nick: fixes 2024-07-18 13:19:44 -04:00
Gergo Moricz
0b8047c7a0 fix(WebScraper): infinite regex leading to fly.io instance hangs 2024-07-18 19:13:43 +02:00
Nicolas
f11137352c Merge branch 'main' into feat/fire-engine-chrome-cdp 2024-07-18 12:48:42 -04:00
Nicolas
6d1d46a987
Merge pull request #433 from mendableai/mog/js-sdk-tests-fix
fix(js-sdk): transform tests with ts-jest and configure node
2024-07-18 12:40:59 -04:00
Nicolas
01b5e8fc73
Merge pull request #429 from mendableai/mog/fix-job-stuck-2
Fix queue stuck bug via lock settings changes
2024-07-18 12:39:21 -04:00
Nicolas
b134ba92bc
Merge pull request #427 from mendableai/docs/update-docs
[Docs] Updating docs
2024-07-18 11:49:08 -04:00
rafaelsideguide
f13ef02a08 Update openapi.json 2024-07-18 10:34:03 -03:00
Gergo Moricz
a23b125471 fix(js-sdk): transform tests with ts-jest and configure node 2024-07-18 14:20:51 +02:00
Gergo Moricz
361269974e fix(js-sdk): remove autogenerated index.d.ts from git and add to gitignore 2024-07-18 13:48:39 +02:00
Gergo Moricz
2e62de4f8b fix(js-sdk): remove built files from repo and add to gitignore 2024-07-18 13:45:51 +02:00
Gergo Moricz
a0b8a6cad3 feat(js-sdk): build both cjs and esm versions 2024-07-18 13:43:36 +02:00
Nicolas
2fab2d8d29 Update scrape.ts 2024-07-17 20:44:34 -04:00
Nicolas
6609c1b6e5
Update .env.local 2024-07-17 16:22:27 -04:00
Nicolas
17a1f9b55f
Update .env.example 2024-07-17 16:22:04 -04:00
rafaelsideguide
eda616d728 Merge remote-tracking branch 'origin/main' into docs/update-docs 2024-07-17 16:44:51 -03:00
rafaelsideguide
2b4ce12097 Update openapi.json 2024-07-17 16:43:22 -03:00
Gergo Moricz
8160c311c0 fix queue stuck bug via lock setting changes 2024-07-17 21:31:25 +02:00
Caleb Peffer
8d5ebc9b9f
Merge pull request #423 from mendableai/cjp/linksOnPage
Caleb: Return a list of links on a page by default
2024-07-17 12:36:07 -06:00
Caleb Peffer
5b24d26c84 Caleb: fixed test 2024-07-17 11:33:12 -07:00
Caleb Peffer
c5d1e7260d Caleb: made changes per Rafael's requests 2024-07-17 11:29:05 -07:00
rafaelsideguide
205cd63c2f Update openapi.json 2024-07-17 15:07:06 -03:00
Rafael Miller
f020048a46
Merge pull request #420 from mendableai/bugfix/empty-tags
Small fix for empty pageOptions
2024-07-17 10:10:24 -03:00
Caleb Peffer
da3c6bca37 Caleb: added a simple test 2024-07-16 21:23:22 -07:00
Caleb Peffer
0b3c0ede49 Added tests per @nicks request 2024-07-16 21:15:59 -07:00
Caleb Peffer
98c788ca7a Caleb: added a test to ensure links on page exists and isn't zero on mendable 2024-07-16 21:13:52 -07:00
Nicolas
d7f185428f
Merge pull request #424 from mendableai/nsc/seperate-rate-limit
Redis Health Checks
2024-07-16 22:53:28 -04:00
Nicolas
3c3412e893 Update rate-limiter.test.ts 2024-07-16 22:45:12 -04:00
Nicolas
ffc3b7c5fb Update index.ts 2024-07-16 22:42:40 -04:00
Nicolas
c9073a747c Nick: 2024-07-16 22:41:13 -04:00
Caleb Peffer
d39d3be649 Caleb: now extracting and returning a list of all links on the page for a customer 2024-07-16 18:38:03 -07:00
rafaelsideguide
dba1fb2dc8 Update removeUnwantedElements.ts 2024-07-16 18:22:56 -03:00
Rafael Miller
db0545014f
Merge pull request #391 from jhoseph88/feat/issue-387
[Feat] Pass along current, total, current_step, and current_url in js sdk
2024-07-16 15:56:42 -03:00
Nicolas
92202de12b Update rate-limiter.ts 2024-07-16 10:09:49 -04:00
Nicolas
4ef47f7765
Update models.ts 2024-07-15 22:52:17 -04:00
rentianyue-jk
1b7ae5457f support custom models 2024-07-16 10:22:54 +08:00
Thomas Kosmas
5c65ec58e5 Support chrome-cdp and restructure sitemap fire-engine support. 2024-07-15 18:40:43 +03:00
Nicolas
949791049f Nick: 2024-07-12 23:20:26 -04:00
Nicolas
d0c8d3ecde Merge branch 'main' into nsc/sitemap-fix-fire-engine 2024-07-12 22:15:06 -04:00
Nicolas
a3b1703b68 Update fireEngine.ts 2024-07-12 22:15:00 -04:00
Nicolas
09bc2c7a9c
Merge pull request #394 from mendableai/nsc/small-fe-print
Log Fire-engine page errors
2024-07-12 22:14:04 -04:00
Nicolas
e098e88ea7 Nick: 2024-07-12 22:02:08 -04:00
Nicolas
bfc7f5882e Update index.ts 2024-07-12 19:57:12 -04:00
Nicolas
436e8922a7 Nick: doing on the ci instead 2024-07-12 19:49:38 -04:00
Nicolas
fc3328f3d1 Update index.ts 2024-07-12 19:12:56 -04:00
Nicolas
fd18f2269b Nick: slack alerts 2024-07-12 19:07:59 -04:00
rafaelsideguide
f453bcf17c bugfix docker self hosting 2024-07-12 16:51:20 -03:00
Nicolas
0ddaac6ae0 Nick: fixed the other instances as well 2024-07-12 15:39:10 -04:00
Nicolas
5da03a8fbd Update fireEngine.ts 2024-07-12 14:59:49 -04:00
Kuniaki Shimizu
bd986a453c fix USE_DB_AUTHENTICATION checks 2024-07-13 03:50:46 +09:00
Nicolas
b5b75086c1 Update index.ts 2024-07-12 10:44:14 -04:00
Gergo Moricz
0d3e09e798 fix: try-catch job removal 2024-07-12 16:35:50 +02:00
Gergő Móricz
69d724714f
Merge branch 'main' into mog/job-stuck-fix 2024-07-12 16:33:34 +02:00
Nicolas
c3eecf7b9f Update index.ts 2024-07-12 10:22:06 -04:00
Gergo Moricz
10957b748b fix(bull): requeue jobs after restart 2024-07-12 13:55:53 +02:00
Nicolas
961b27811d
Merge pull request #386 from mendableai/feat/fire-engine-fallback-for-sitemap
[Feat] Added fire-engine fallback for getting sitemaps
2024-07-11 20:38:01 -04:00
Nicolas
84de63dbeb
Merge pull request #375 from StefanTerdell/self-host-qol
Self-hosting quality of life fixes
2024-07-11 20:37:39 -04:00
Nicolas
30c1118713
Merge pull request #326 from mendableai/feat/save-docs-on-supabase
[Feat] Added implementation for saving docs on supabase
2024-07-11 20:27:41 -04:00
jhoseph88
68828a5b5c Pass along current, total, current_step, and current_url in js sdk 2024-07-11 19:37:09 -04:00
Gergo Moricz
7e3a368684 fix: unpause globally 2024-07-12 00:05:35 +02:00
Gergo Moricz
ee1d41406e feat: unpause by http request 2024-07-11 23:56:36 +02:00
Gergo Moricz
f64a2d8668 fix: rename fly tomls to original 2024-07-11 23:21:02 +02:00
Gergo Moricz
bd84290b9e fix: reenable hyperdx 2024-07-11 23:20:51 +02:00
Gergo Moricz
09bca05b20 feat: fix iteration 3 (actually works) 2024-07-11 23:14:15 +02:00
Gergo Moricz
9cd7d79b64 feat: avoid double SIGINT crashing 2024-07-11 20:35:15 +02:00
Gergo Moricz
eaa8db4b19 fix(fly): raise kill timeout for graceful shutdown 2024-07-11 20:09:06 +02:00
Gergo Moricz
bffb9f8fd0 feat: stuck job restoration iteration 2 2024-07-11 20:08:21 +02:00
rafaelsideguide
86d0e88a91 removed hyperdx (they also have graceful shutdown) and tried to change the process for running on server. It didn't work. 2024-07-10 18:29:55 -03:00
rafaelsideguide
9ad06fdf56 added fire-engine fallback for getting sitemaps 2024-07-09 16:07:53 -03:00
Gergo Moricz
1a07e9d23b feat: pick up and commit interrupted jobs from/to DB 2024-07-09 15:57:38 +02:00
Gergo Moricz
77aa46588f feat: graceful exit handler 2024-07-09 14:29:32 +02:00
Nicolas
fcc67a3c9e
Merge pull request #370 from mendableai/bug/fixing-cicd
dependabot for security checks, fixed crawl test
2024-07-08 19:26:18 -03:00
Eric Ciarla
afb49e21e7 Update SDKs to MIT license 2024-07-08 13:37:53 -04:00
Stefan Terdell
188fe56203 Optional jobId webhook URL templating 2024-07-07 15:11:45 +02:00
Stefan Terdell
a2ae5f81d9 Only check Supabase if configured to 2024-07-07 15:06:31 +02:00
rafaelsideguide
c2bba54b4f Added veeva to special case params 2024-07-05 16:58:07 -03:00
rafaelsideguide
a2cdc520e6 dependabot for security checks, fixed crawl test 2024-07-05 14:49:03 -03:00
rafaelsideguide
0ab6cef471 Merge remote-tracking branch 'origin/main' into dependabot/npm_and_yarn/apps/api/prod-deps-5b38a50718 2024-07-05 14:00:10 -03:00
Nicolas
914897c9d2 Merge branch 'main' into feat/save-docs-on-supabase 2024-07-05 12:27:22 -03:00
rafaelsideguide
538dc63035 Fixing rate-limiter-flexible package version
Redis version <3.0.2 throws TS bug:
https://github.com/animir/node-rate-limiter-flexible/issues/228
2024-07-05 12:12:00 -03:00
Rafael Miller
c570fa92cf
Merge pull request #347 from mendableai/dependabot/npm_and_yarn/apps/test-suite/prod-deps-d16537e256
apps/test-suite(deps): bump the prod-deps group in /apps/test-suite with 6 updates
2024-07-05 10:18:35 -03:00
Nicolas
8f46b8218a
Merge pull request #361 from snippet/ts-playwright-service-docker
setting up docker to ts playwright service
2024-07-04 17:47:41 -03:00
Nicolas
32849b017f Nick: 2024-07-03 20:18:11 -03:00
Nicolas
5ecd9cb6f5
Merge pull request #363 from mendableai/nsc/logging-scrapers
Logging for all scraper methods
2024-07-03 18:47:22 -03:00
Nicolas
066d92f643 Update single_url.ts 2024-07-03 18:38:17 -03:00
Nicolas
f5b2fbd7e8 Nick: revision 2024-07-03 18:06:53 -03:00
Nicolas
2d30cc6117 Nick: comments 2024-07-03 18:01:54 -03:00
Nicolas
90c54c32fd Nick: refactor 2024-07-03 18:01:17 -03:00
Nicolas
90cf799a3c Update single_url.ts 2024-07-03 17:56:21 -03:00
Nicolas
b36406e465 Nick: log scrapers 2024-07-03 17:28:53 -03:00
Jeff Pereira
b4292c1ea3 setting up docker to ts playwright service 2024-07-03 11:55:39 -07:00
Nicolas
abb44bb112
Merge pull request #346 from mendableai/dependabot/pip/apps/playwright-service/prod-deps-8f04296377
apps/playwright-service(deps): bump the prod-deps group in /apps/playwright-service with 3 updates
2024-07-03 01:07:09 -03:00
Nicolas
f967daddcb
Merge pull request #325 from snippet/playwright-scraper-api
new playwright service
2024-07-03 01:04:52 -03:00
Eric Ciarla
2d0d5ac392 Update for llm-extraction-from-raw-html 2024-07-02 14:05:42 -04:00
rafaelsideguide
0175152577 Fixed PDF match custom scraping
It now works for both the `https://getgc.ai/privacy` and `https://prairie.cards/products/wood-designs` use cases.
2024-07-02 11:25:17 -03:00
rafaelsideguide
96de948d6b Update index.test.ts 2024-07-02 11:04:09 -03:00
rafaelsideguide
7b7154ba1e bugfixed pageStatusCode 2024-07-02 10:51:35 -03:00
Rafael Miller
50eecf04a9
Update licence pyproject.toml
Closes #345
2024-07-02 10:01:49 -03:00
dependabot[bot]
c2e00d1998
apps/api(deps): bump the prod-deps group in /apps/api with 28 updates
Bumps the prod-deps group in /apps/api with 28 updates:

| Package | From | To |
| --- | --- | --- |
| [@anthropic-ai/sdk](https://github.com/anthropics/anthropic-sdk-typescript) | `0.20.9` | `0.24.3` |
| [@bull-board/api](https://github.com/felixmosh/bull-board/tree/HEAD/packages/api) | `5.19.2` | `5.20.5` |
| [@bull-board/express](https://github.com/felixmosh/bull-board/tree/HEAD/packages/express) | `5.19.2` | `5.20.5` |
| [@hyperdx/node-opentelemetry](https://github.com/hyperdxio/hyperdx-js) | `0.7.0` | `0.8.0` |
| [@nangohq/node](https://github.com/NangoHQ/nango/tree/HEAD/packages/node-client) | `0.36.101` | `0.40.8` |
| [@sentry/node](https://github.com/getsentry/sentry-javascript) | `7.116.0` | `8.13.0` |
| [@supabase/supabase-js](https://github.com/supabase/supabase-js) | `2.43.4` | `2.44.2` |
| [ajv](https://github.com/ajv-validator/ajv) | `8.15.0` | `8.16.0` |
| [async-mutex](https://github.com/DirtyHairy/async-mutex) | `0.4.1` | `0.5.0` |
| [bull](https://github.com/OptimalBits/bull) | `4.12.9` | `4.15.0` |
| [date-fns](https://github.com/date-fns/date-fns) | `2.30.0` | `3.6.0` |
| [express-rate-limit](https://github.com/express-rate-limit/express-rate-limit) | `6.11.2` | `7.3.1` |
| [glob](https://github.com/isaacs/node-glob) | `10.4.1` | `10.4.2` |
| [json-schema-to-zod](https://github.com/StefanTerdell/json-schema-to-zod) | `2.1.0` | `2.3.0` |
| [keyword-extractor](https://github.com/michaeldelorenzo/keyword-extractor) | `0.0.25` | `0.0.28` |
| [langchain](https://github.com/langchain-ai/langchainjs) | `0.1.37` | `0.2.8` |
| [logsnag](https://github.com/LogSnag/logsnag.js) | `0.1.8` | `1.0.0` |
| [mongoose](https://github.com/Automattic/mongoose) | `8.4.1` | `8.4.4` |
| [natural](https://github.com/NaturalNode/natural) | `6.12.0` | `7.0.7` |
| [openai](https://github.com/openai/openai-node) | `4.47.3` | `4.52.2` |
| [promptable](https://github.com/promptable/Promptable.js) | `0.0.9` | `0.0.10` |
| [puppeteer](https://github.com/puppeteer/puppeteer) | `22.10.0` | `22.12.1` |
| [rate-limiter-flexible](https://github.com/animir/node-rate-limiter-flexible) | `2.4.2` | `5.0.3` |
| [resend](https://github.com/resendlabs/resend-node) | `3.2.0` | `3.4.0` |
| [stripe](https://github.com/stripe/stripe-node) | `12.18.0` | `16.1.0` |
| [unstructured-client](https://github.com/Unstructured-IO/unstructured-js-client) | `0.9.4` | `0.11.3` |
| [uuid](https://github.com/uuidjs/uuid) | `9.0.1` | `10.0.0` |
| [zod-to-json-schema](https://github.com/StefanTerdell/zod-to-json-schema) | `3.23.0` | `3.23.1` |


Updates `@anthropic-ai/sdk` from 0.20.9 to 0.24.3
- [Release notes](https://github.com/anthropics/anthropic-sdk-typescript/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-typescript/compare/sdk-v0.20.9...sdk-v0.24.3)

Updates `@bull-board/api` from 5.19.2 to 5.20.5
- [Release notes](https://github.com/felixmosh/bull-board/releases)
- [Changelog](https://github.com/felixmosh/bull-board/blob/master/CHANGELOG.md)
- [Commits](https://github.com/felixmosh/bull-board/commits/v5.20.5/packages/api)

Updates `@bull-board/express` from 5.19.2 to 5.20.5
- [Release notes](https://github.com/felixmosh/bull-board/releases)
- [Changelog](https://github.com/felixmosh/bull-board/blob/master/CHANGELOG.md)
- [Commits](https://github.com/felixmosh/bull-board/commits/v5.20.5/packages/express)

Updates `@hyperdx/node-opentelemetry` from 0.7.0 to 0.8.0
- [Release notes](https://github.com/hyperdxio/hyperdx-js/releases)
- [Commits](https://github.com/hyperdxio/hyperdx-js/compare/@hyperdx/node-opentelemetry@0.7.0...@hyperdx/node-opentelemetry@0.8.0)

Updates `@nangohq/node` from 0.36.101 to 0.40.8
- [Release notes](https://github.com/NangoHQ/nango/releases)
- [Changelog](https://github.com/NangoHQ/nango/blob/master/CHANGELOG.md)
- [Commits](https://github.com/NangoHQ/nango/commits/v0.40.8/packages/node-client)

Updates `@sentry/node` from 7.116.0 to 8.13.0
- [Release notes](https://github.com/getsentry/sentry-javascript/releases)
- [Changelog](https://github.com/getsentry/sentry-javascript/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-javascript/compare/7.116.0...8.13.0)

Updates `@supabase/supabase-js` from 2.43.4 to 2.44.2
- [Release notes](https://github.com/supabase/supabase-js/releases)
- [Changelog](https://github.com/supabase/supabase-js/blob/master/RELEASE.md)
- [Commits](https://github.com/supabase/supabase-js/compare/v2.43.4...v2.44.2)

Updates `ajv` from 8.15.0 to 8.16.0
- [Release notes](https://github.com/ajv-validator/ajv/releases)
- [Commits](https://github.com/ajv-validator/ajv/compare/v8.15.0...v8.16.0)

Updates `async-mutex` from 0.4.1 to 0.5.0
- [Changelog](https://github.com/DirtyHairy/async-mutex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/DirtyHairy/async-mutex/compare/v0.4.1...v0.5.0)

Updates `bull` from 4.12.9 to 4.15.0
- [Release notes](https://github.com/OptimalBits/bull/releases)
- [Changelog](https://github.com/OptimalBits/bull/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/OptimalBits/bull/compare/v4.12.9...v4.15.0)

Updates `date-fns` from 2.30.0 to 3.6.0
- [Release notes](https://github.com/date-fns/date-fns/releases)
- [Changelog](https://github.com/date-fns/date-fns/blob/main/CHANGELOG.md)
- [Commits](https://github.com/date-fns/date-fns/compare/v2.30.0...v3.6.0)

Updates `express-rate-limit` from 6.11.2 to 7.3.1
- [Release notes](https://github.com/express-rate-limit/express-rate-limit/releases)
- [Commits](https://github.com/express-rate-limit/express-rate-limit/compare/v6.11.2...v7.3.1)

Updates `glob` from 10.4.1 to 10.4.2
- [Changelog](https://github.com/isaacs/node-glob/blob/main/changelog.md)
- [Commits](https://github.com/isaacs/node-glob/compare/v10.4.1...v10.4.2)

Updates `json-schema-to-zod` from 2.1.0 to 2.3.0
- [Commits](https://github.com/StefanTerdell/json-schema-to-zod/commits)

Updates `keyword-extractor` from 0.0.25 to 0.0.28
- [Release notes](https://github.com/michaeldelorenzo/keyword-extractor/releases)
- [Commits](https://github.com/michaeldelorenzo/keyword-extractor/compare/0.0.25...0.0.28)

Updates `langchain` from 0.1.37 to 0.2.8
- [Release notes](https://github.com/langchain-ai/langchainjs/releases)
- [Changelog](https://github.com/langchain-ai/langchainjs/blob/main/release_workspace.js)
- [Commits](https://github.com/langchain-ai/langchainjs/compare/0.1.37...0.2.8)

Updates `logsnag` from 0.1.8 to 1.0.0
- [Commits](https://github.com/LogSnag/logsnag.js/compare/v0.1.8...v1.0.0)

Updates `mongoose` from 8.4.1 to 8.4.4
- [Release notes](https://github.com/Automattic/mongoose/releases)
- [Changelog](https://github.com/Automattic/mongoose/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Automattic/mongoose/compare/8.4.1...8.4.4)

Updates `natural` from 6.12.0 to 7.0.7
- [Release notes](https://github.com/NaturalNode/natural/releases)
- [Commits](https://github.com/NaturalNode/natural/compare/v6.12.0...v7.0.7)

Updates `openai` from 4.47.3 to 4.52.2
- [Release notes](https://github.com/openai/openai-node/releases)
- [Changelog](https://github.com/openai/openai-node/blob/master/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-node/compare/v4.47.3...v4.52.2)

Updates `promptable` from 0.0.9 to 0.0.10
- [Commits](https://github.com/promptable/Promptable.js/commits)

Updates `puppeteer` from 22.10.0 to 22.12.1
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/release-please-config.json)
- [Commits](https://github.com/puppeteer/puppeteer/compare/puppeteer-v22.10.0...puppeteer-v22.12.1)

Updates `rate-limiter-flexible` from 2.4.2 to 5.0.3
- [Release notes](https://github.com/animir/node-rate-limiter-flexible/releases)
- [Commits](https://github.com/animir/node-rate-limiter-flexible/commits/v5.0.3)

Updates `resend` from 3.2.0 to 3.4.0
- [Release notes](https://github.com/resendlabs/resend-node/releases)
- [Commits](https://github.com/resendlabs/resend-node/compare/v3.2.0...v3.4.0)

Updates `stripe` from 12.18.0 to 16.1.0
- [Release notes](https://github.com/stripe/stripe-node/releases)
- [Changelog](https://github.com/stripe/stripe-node/blob/master/CHANGELOG.md)
- [Commits](https://github.com/stripe/stripe-node/compare/v12.18.0...v16.1.0)

Updates `unstructured-client` from 0.9.4 to 0.11.3
- [Release notes](https://github.com/Unstructured-IO/unstructured-js-client/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured-js-client/blob/main/RELEASES.md)
- [Commits](https://github.com/Unstructured-IO/unstructured-js-client/compare/v0.9.4...v0.11.3)

Updates `uuid` from 9.0.1 to 10.0.0
- [Changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md)
- [Commits](https://github.com/uuidjs/uuid/compare/v9.0.1...v10.0.0)

Updates `zod-to-json-schema` from 3.23.0 to 3.23.1
- [Release notes](https://github.com/StefanTerdell/zod-to-json-schema/releases)
- [Changelog](https://github.com/StefanTerdell/zod-to-json-schema/blob/master/changelog.md)
- [Commits](https://github.com/StefanTerdell/zod-to-json-schema/commits)

---
updated-dependencies:
- dependency-name: "@anthropic-ai/sdk"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: "@bull-board/api"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: "@bull-board/express"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: "@hyperdx/node-opentelemetry"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: "@nangohq/node"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: "@sentry/node"
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: "@supabase/supabase-js"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: ajv
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: async-mutex
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: bull
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: date-fns
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: express-rate-limit
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: glob
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
- dependency-name: json-schema-to-zod
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: keyword-extractor
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
- dependency-name: langchain
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: logsnag
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: mongoose
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
- dependency-name: natural
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: openai
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: promptable
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
- dependency-name: puppeteer
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: rate-limiter-flexible
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: resend
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: stripe
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: unstructured-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: prod-deps
- dependency-name: zod-to-json-schema
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-02 12:52:43 +00:00
dependabot[bot]
ad3e73b445
apps/test-suite(deps): bump the prod-deps group
Bumps the prod-deps group in /apps/test-suite with 6 updates:

| Package | From | To |
| --- | --- | --- |
| [@anthropic-ai/sdk](https://github.com/anthropics/anthropic-sdk-typescript) | `0.20.8` | `0.24.3` |
| [@dqbd/tiktoken](https://github.com/dqbd/tiktoken) | `1.0.14` | `1.0.15` |
| [@supabase/supabase-js](https://github.com/supabase/supabase-js) | `2.43.1` | `2.44.2` |
| [openai](https://github.com/openai/openai-node) | `4.40.2` | `4.52.2` |
| [playwright](https://github.com/microsoft/playwright) | `1.43.1` | `1.45.0` |
| [ts-jest](https://github.com/kulshekhar/ts-jest) | `29.1.2` | `29.1.5` |


Updates `@anthropic-ai/sdk` from 0.20.8 to 0.24.3
- [Release notes](https://github.com/anthropics/anthropic-sdk-typescript/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-typescript/compare/sdk-v0.20.8...sdk-v0.24.3)

Updates `@dqbd/tiktoken` from 1.0.14 to 1.0.15
- [Release notes](https://github.com/dqbd/tiktoken/releases)
- [Changelog](https://github.com/dqbd/tiktoken/blob/main/CHANGELOG.md)
- [Commits](https://github.com/dqbd/tiktoken/compare/@dqbd/tiktoken@1.0.14...@dqbd/tiktoken@1.0.15)

Updates `@supabase/supabase-js` from 2.43.1 to 2.44.2
- [Release notes](https://github.com/supabase/supabase-js/releases)
- [Changelog](https://github.com/supabase/supabase-js/blob/master/RELEASE.md)
- [Commits](https://github.com/supabase/supabase-js/compare/v2.43.1...v2.44.2)

Updates `openai` from 4.40.2 to 4.52.2
- [Release notes](https://github.com/openai/openai-node/releases)
- [Changelog](https://github.com/openai/openai-node/blob/master/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-node/compare/v4.40.2...v4.52.2)

Updates `playwright` from 1.43.1 to 1.45.0
- [Release notes](https://github.com/microsoft/playwright/releases)
- [Commits](https://github.com/microsoft/playwright/compare/v1.43.1...v1.45.0)

Updates `ts-jest` from 29.1.2 to 29.1.5
- [Release notes](https://github.com/kulshekhar/ts-jest/releases)
- [Changelog](https://github.com/kulshekhar/ts-jest/blob/main/CHANGELOG.md)
- [Commits](https://github.com/kulshekhar/ts-jest/compare/v29.1.2...v29.1.5)

---
updated-dependencies:
- dependency-name: "@anthropic-ai/sdk"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: "@dqbd/tiktoken"
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
- dependency-name: "@supabase/supabase-js"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: openai
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: playwright
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: ts-jest
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-02 12:47:58 +00:00
dependabot[bot]
60de6bb6e3
apps/playwright-service(deps): bump the prod-deps group
Bumps the prod-deps group in /apps/playwright-service with 3 updates: [hypercorn](https://github.com/pgjones/hypercorn), [fastapi](https://github.com/tiangolo/fastapi) and [playwright](https://github.com/Microsoft/playwright-python).


Updates `hypercorn` from 0.16.0 to 0.17.3
- [Changelog](https://github.com/pgjones/hypercorn/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pgjones/hypercorn/compare/0.16.0...0.17.3)

Updates `fastapi` from 0.110.0 to 0.111.0
- [Release notes](https://github.com/tiangolo/fastapi/releases)
- [Commits](https://github.com/tiangolo/fastapi/compare/0.110.0...0.111.0)

Updates `playwright` from 1.42.0 to 1.44.0
- [Release notes](https://github.com/Microsoft/playwright-python/releases)
- [Commits](https://github.com/Microsoft/playwright-python/compare/v1.42.0...v1.44.0)

---
updated-dependencies:
- dependency-name: hypercorn
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
- dependency-name: playwright
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-02 12:47:09 +00:00
Rafael Miller
f0f449fe51
Merge pull request #336 from snippet/allow-external-content-links
[Proposal] new feature allowExternalContentLinks
2024-07-02 09:45:21 -03:00
rafaelsideguide
db4a743365 Added e2e test 2024-07-02 09:44:08 -03:00
Eric Ciarla
0821017f5b
Update README.md 2024-07-02 07:08:46 -04:00
Nicolas
42cd58a679
Merge pull request #332 from mendableai/feat/rawHtmlExtraction
Adds pageOptions.includeRawHtml and new extraction mode "llm-extraction-from-raw-html"
2024-07-01 18:23:26 -03:00
Nicolas
c4f423981f Update pnpm-lock.yaml 2024-07-01 18:22:22 -03:00
rafaelsideguide
16aac7f8c5 Update single_url.ts 2024-07-01 18:21:15 -03:00
Nicolas
6d0c7a9ccd
Merge pull request #323 from mendableai/tests/crawl-limit-unit-tests
[Tests] Added crawl limit unit test
2024-07-01 17:56:04 -03:00
rafaelsideguide
4d6e25619b minor spacing and comment stuff 2024-07-01 16:05:34 -03:00
Eric Ciarla
e1af815f8c Update scrape.ts 2024-07-01 08:48:21 -04:00
Eric Ciarla
7ae195bacc Update index.test.ts 2024-06-29 10:13:12 -04:00
Eric Ciarla
837b446390 Update index.test.ts 2024-06-29 08:48:42 -04:00
Eric Ciarla
fe6e3aeadc Update index.test.ts 2024-06-29 08:44:21 -04:00
Eric Ciarla
6c9f0dfc91 Add tests 2024-06-29 08:32:20 -04:00
Jeff Pereira
a5fb45988c new feature allowExternalContentLinks 2024-06-28 17:23:40 -07:00
Eric Ciarla
87b54488d3 update to includeRawHtml 2024-06-28 17:07:47 -04:00
Eric Ciarla
70fcf2ce03 init 2024-06-28 16:39:09 -04:00
Nicolas
9bf74bc774 Update single_url.ts 2024-06-28 15:51:18 -03:00
Nicolas
7e17498bcf Update single_url.ts 2024-06-28 15:45:16 -03:00
rafaelsideguide
7dffaaa3e2 Changed port and added "using with firecrawl" section on readme 2024-06-28 11:51:24 -03:00
rafaelsideguide
d66e1f7846 looking good 2024-06-27 16:00:45 -03:00
Nicolas
9e7298945c Update openapi.json 2024-06-26 21:25:38 -03:00
Nicolas
1ec0bf8adf Update openapi.json 2024-06-26 21:22:46 -03:00
Nicolas
042f81ddf2 Update removeUnwantedElements.test.ts 2024-06-26 21:20:11 -03:00
Nicolas
388ce3cbce Nick: small changes 2024-06-26 21:15:42 -03:00
Nicolas
1d4907acc9 Nick: 2024-06-26 21:02:58 -03:00
rafaelsideguide
c40da77be0 Added implementation for saving docs on supabase
- TODO: remove the comments on `log_job.ts` before deploying to prod
2024-06-26 18:23:28 -03:00
Jeff Pereira
d833a132a5 new playwright service 2024-06-26 12:32:30 -07:00
Nicolas
3b92fb8433
Merge pull request #322 from mendableai/tests/metadata
[Test] Added E2E tests for checking metadata values
2024-06-26 12:09:18 -03:00
rafaelsideguide
67d7650cf3 Added to e2e_noAuth 2024-06-26 12:07:55 -03:00
rafaelsideguide
009df6c930 Added crawl limit unit test
I think this test relies too heavily on mocks, but I don't see how to fix that without changing the code architecture.
2024-06-26 09:54:25 -03:00
rafaelsideguide
05eaa3c68d Update index.test.ts 2024-06-26 09:32:02 -03:00
rafaelsideguide
4381109dd8 added default values and fixed pdf bug 2024-06-26 09:00:54 -03:00
Nicolas
45f2765601
Merge pull request #316 from snippet/types-webscraper
add some types
2024-06-25 22:03:21 -03:00
Nicolas
768a131b5c
Merge pull request #318 from mendableai/bug/fix-custom-scrape-pdf-google-drive
[Bug] Fixed the regex test for google drive pdf files
2024-06-25 18:27:11 -03:00
rafaelsideguide
5f69fc7677 Fixed the regex test 2024-06-25 18:24:01 -03:00
rafaelsideguide
d02829d335 fixed clean jobs 2024-06-25 17:49:29 -03:00
Jeff Pereira
199cbe8bcb add some types 2024-06-25 12:20:25 -07:00
Nicolas
749b0c05dc Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-06-25 15:21:15 -03:00
Nicolas
e7be17db92 Nick: metadata fixes and lock duration for bull decreased to 2 hrs 2024-06-25 15:21:14 -03:00
Nicolas
f84fb4b331
Merge pull request #313 from snippet/google-search-term-fix
fix multi-word search term issue: /search (w/o Serp)
2024-06-24 19:24:58 -03:00
Jeff Pereira
6ddf3a58a1 fix multi-word search term issue: /search (w/o Serp) 2024-06-24 14:21:52 -07:00
Nicolas
90b7fff366
Update crawler.ts 2024-06-24 16:52:01 -03:00
Nicolas
08c1fa799b
Update queue-worker.ts 2024-06-24 16:51:32 -03:00
rafaelsideguide
3ebdf93342 removed console.logs 2024-06-24 16:43:12 -03:00
Nicolas
56d42d9c9b Nick: 2024-06-24 16:33:07 -03:00
rafaelsideguide
21d29de819 testing crawl with new.abb.com case
many unnecessary console.logs for tracing the code execution
2024-06-24 16:25:07 -03:00
Nicolas
3c7b7e7242 Nick: fixes fallback 2024-06-23 18:59:08 -03:00
Caleb Peffer
e59ba758f5 Caleb: changed posthog logging so that it associates jobs with a group. No 2024-06-18 17:42:21 -07:00
Caleb Peffer
5a91d8425f Caleb: solve for typechecking on idempotencyKey on my machine 2024-06-18 17:07:38 -07:00
rafaelsideguide
9c539e9113 Fixed includeHTML to use cleanedHtml as response 2024-06-18 16:26:54 -03:00
Rafael Miller
f5a9acc4c6
Merge branch 'main' into feat/removeTags-regex 2024-06-18 14:39:59 -03:00
rafaelsideguide
9f7afd1e88 fix for some complex cases 2024-06-18 14:36:51 -03:00
Nicolas
d0c05accf6 Nick: 2024-06-18 13:21:50 -04:00
Nicolas
818751a256
Merge pull request #294 from mendableai/tests/e2e-to-unit
[Test] Transcribed from e2e to unit tests for many cases
2024-06-18 13:09:22 -04:00
Nicolas
754c9fa08d Update package.json 2024-06-18 12:58:57 -04:00
Nicolas
90a807c547 Update index.ts 2024-06-18 12:56:13 -04:00
Nicolas
26e8bfc23a Merge branch 'main' into pr/296 2024-06-18 12:55:45 -04:00
Nicolas
b53ba58bc0
Merge pull request #282 from mendableai/nsc/rate-limiter-tests
test: Rate Limit Unit Tests
2024-06-18 11:01:28 -04:00
rafaelsideguide
727e5de8c5 Update index.test.ts 2024-06-18 11:54:10 -03:00
rafaelsideguide
c54e797eb1 (╯°□°)╯︵ ┻━┻ 2024-06-18 11:51:28 -03:00
rafaelsideguide
6e32522fa2 Improvements on response document types 2024-06-18 11:43:06 -03:00
rafaelsideguide
20f14bcf7f Added some types 2024-06-18 10:55:07 -03:00
rafaelsideguide
c2fc69af1c removed some e2e tests that are making the ci get stuck 2024-06-18 09:57:05 -03:00
rafaelsideguide
6c726a02eb Moved to utils/removeUnwantedElements, added unit tests 2024-06-18 09:46:42 -03:00
AndyMik90
8b3c3aae91 Added support for RegEx in removeTags 2024-06-18 07:31:46 +02:00
neev jewalkar
e5ffda1eec Added local host support for the javascript SDK 2024-06-18 05:42:25 +05:30
rafaelsideguide
b2bd562bb2 transcribed from e2e to unit tests for many cases 2024-06-17 17:09:44 -03:00
Nicolas
ab038051e9 Merge branch 'main' into nsc/rate-limiter-tests 2024-06-17 15:06:12 -04:00
rafaelsideguide
a20d002a6b Delete test-run-report.json 2024-06-17 09:25:29 -03:00
Eric Ciarla
519ab1aecb Update unit tests 2024-06-15 17:14:09 -04:00
Eric Ciarla
f0d4146b42 Merge branch 'feat/maxDepthRelative' of https://github.com/mendableai/firecrawl into feat/maxDepthRelative 2024-06-15 16:52:00 -04:00
Eric Ciarla
ff7b52cab1 Delete one more e2e test 2024-06-15 16:51:50 -04:00
Eric Ciarla
b1eb608295
Merge branch 'main' into feat/maxDepthRelative 2024-06-15 16:50:27 -04:00
Eric Ciarla
34e37c5671 Add unit tests to replace e2e 2024-06-15 16:43:37 -04:00
Eric Ciarla
2b40729cc2 Update index.test.ts 2024-06-15 08:56:32 -04:00
Eric Ciarla
f22759b2e7 Update index.test.ts 2024-06-14 19:42:11 -04:00
Eric Ciarla
a6b7197737 Fix for maxDepth 2024-06-14 19:40:37 -04:00
Nicolas
4ec863718b
Merge pull request #283 from mendableai/nsc/crawler-fixes
Fixes crawler getting confused with base paths that contain www.
2024-06-14 13:50:32 -07:00
Nicolas
43767360d8 Merge branch 'main' into nsc/rate-limiter-tests 2024-06-14 13:50:21 -07:00
Nicolas
e88cb314c8 Update crawler.ts 2024-06-14 13:44:54 -07:00
Rafael Miller
361cba4119
Merge pull request #175 from mendableai/test/load-testing
Test/load testing
2024-06-14 17:39:01 -03:00
Nicolas
7b11ace87d Create rate-limiter.test.ts 2024-06-14 12:31:42 -07:00
rafaelsideguide
e369d1dd0e Update index.test.ts 2024-06-14 16:17:54 -03:00
Nicolas
e37aa3db57 Nick: fixed rate limit on status 2024-06-14 12:13:02 -07:00
rafaelsideguide
a6ed2e693f Update index.test.ts 2024-06-14 15:22:52 -03:00
rafaelsideguide
ad7795f973 Merge remote-tracking branch 'origin/main' into test/load-testing 2024-06-14 15:14:01 -03:00
rafaelsideguide
354712a8a3 just changed the name for the test? 2024-06-14 13:02:04 -03:00
Eric Ciarla
2c5f5c0ea2
Merge branch 'main' into feat/maxDepthRelative 2024-06-14 11:49:12 -04:00
Eric Ciarla
80c10393b4 Update index.test.ts 2024-06-14 11:32:30 -04:00
Eric Ciarla
42ed1f4479 Update index.test.ts 2024-06-14 11:20:24 -04:00
Eric Ciarla
8830acce07 Update index.test.ts 2024-06-14 11:11:58 -04:00
Eric Ciarla
278bb311cb Update index.test.ts 2024-06-14 11:02:39 -04:00
Eric Ciarla
36a62727b8 Update index.test.ts 2024-06-14 10:52:43 -04:00
Rafael Miller
f9c7ca9388
Merge branch 'main' into feat/issue-266 2024-06-14 11:47:58 -03:00
Rafael Miller
3e2e76311c
Merge branch 'main' into feat/issue-205 2024-06-14 11:25:20 -03:00
Eric Ciarla
59451754f5 Add tests 2024-06-14 10:14:07 -04:00
rafaelsideguide
afee5684a3 Fixed tests' message and updated version 2024-06-14 11:05:19 -03:00
Eric Ciarla
9b254c1cd0 Update index.test.ts 2024-06-14 09:48:14 -04:00
Rafael Miller
5a5c532bea
Merge branch 'main' into py-sdk-improve-response-handling 2024-06-14 10:42:51 -03:00
Eric Ciarla
9aba451b18 Update index.test.ts 2024-06-14 09:33:43 -04:00
Rafael Miller
cc2e3f05b0
Merge pull request #256 from mattjoyce/feat-254-sdk-py-logging
Added logging to python sdk FIRECRAWL_LOGGING_LEVEL
2024-06-14 10:22:40 -03:00
rafaelsideguide
6963a490f1 Updated version 2024-06-14 10:21:44 -03:00
rafaelsideguide
5dd18ca79b fixed edge cases 2024-06-14 09:46:55 -03:00
Eric Ciarla
ab9de0f5ab Update maxDepth tests 2024-06-13 18:46:30 -04:00
Eric Ciarla
393bd45237 Update index.test.ts 2024-06-13 18:13:15 -04:00
Eric Ciarla
71c98d8b80 Update logic 2024-06-13 18:00:52 -04:00
Eric Ciarla
095951aa4d Update test 2024-06-13 17:40:00 -04:00
Eric Ciarla
5e8aa92788 Update index.ts 2024-06-13 17:33:13 -04:00
Eric Ciarla
bf10e9d392 Update index.test.ts 2024-06-13 17:28:59 -04:00
Eric Ciarla
65d63bae45 Update index.ts 2024-06-13 17:17:44 -04:00
Eric Ciarla
32e814bedc Update index.ts 2024-06-13 17:02:30 -04:00
Nicolas
6fc1ee32fd
Merge pull request #275 from mendableai/feat/issue-273
Added pageOptions.removeTags
2024-06-13 13:27:01 -07:00
rafaelsideguide
bb859ae9a7 Added metadata.pageStatusCode and metadata.pageError properties to the responses 2024-06-13 17:08:40 -03:00
rafaelsideguide
676d6e8ab5 Added pageOptions.removeTags 2024-06-13 10:51:05 -03:00
Nicolas
182f8d4d6c Update index.ts 2024-06-12 18:07:05 -07:00
Nicolas
11b6d5afa5 Update fly.toml 2024-06-12 18:00:22 -07:00
Nicolas
67dc46b454 Nick: clusters 2024-06-12 17:53:04 -07:00
rafaelsideguide
d20af257ba Added jobId to webhook data 2024-06-12 15:38:41 -03:00
rafaelsideguide
e37d151404 added parsePDF option to pageOptions
user can decide if they are going to let us take care of the parse or they are going to parse the pdf by themselves
2024-06-12 15:06:47 -03:00
rafaelsideguide
01c9f071fa fixed 2024-06-12 11:27:06 -03:00
rafaelsideguide
dc6acbf1f0 Merge remote-tracking branch 'origin/main' into feat/allowbackwardcrawling-option 2024-06-12 11:01:05 -03:00
Nicolas
f93231499f
Merge pull request #265 from mendableai/feat/issue-264
[Feat] Added route to clean completed jobs and a github action cron that triggers every 24h
2024-06-11 21:33:52 -07:00
Nicolas
45dee63943
Merge pull request #262 from mendableai/nsc/webhook-self-host-fix
Only fetch webhook from db if self host webhook not set and using db auth
2024-06-11 15:46:57 -07:00
rafaelsideguide
157fbe4a1e added bull auth key 2024-06-11 17:52:01 -03:00
rafaelsideguide
df3a678cf4 getting back the cancel test, this should work 2024-06-11 17:46:56 -03:00
rafaelsideguide
def2ba9987 added tests 2024-06-11 17:46:25 -03:00
Nicolas
1e3e06a1d5 Update replacePaths.test.ts 2024-06-11 13:02:39 -07:00
Nicolas
2239e03269 Update replacePaths.test.ts 2024-06-11 12:54:02 -07:00
Nicolas
520739c9f4 Nick: fixed bugs associated with absolute path replacements 2024-06-11 12:43:16 -07:00
Nicolas
b87725c683 Update openapi.json 2024-06-11 12:08:49 -07:00
rafaelsideguide
ee282c3d55 Added allowBackwardCrawling option 2024-06-11 15:24:39 -03:00
rafaelsideguide
a9f93c2f1e Added route to clean completed jobs and a github action cron that triggers every 24h 2024-06-11 14:18:05 -03:00
Nicolas
da38dad9a7 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-06-10 18:26:31 -07:00
Nicolas
9390816c1b Update openapi.json 2024-06-10 18:26:25 -07:00
Nicolas
f6b06ac27a Nick: ignoreSitemap, better crawling algo 2024-06-10 18:12:41 -07:00
Nicolas
1bd0327e1a Merge branch 'main' into nsc/pageoptions-crawler 2024-06-10 17:15:10 -07:00
Nicolas
99f2ffd6d5 Update webhook.ts 2024-06-10 17:03:10 -07:00
Nicolas
7ae9778642 Update single_url.ts 2024-06-10 16:57:31 -07:00
Nicolas
913c1dd568 Nick: fetch -> axios and fix timeouts 2024-06-10 16:49:03 -07:00
Nicolas
3091f0134c Nick: 2024-06-10 16:27:10 -07:00
Matt Joyce
827354a116 Added logging to python sdk FIRECRAWL_LOGGING_LEVEL
Instantiates the logger early and depends on env to set.
2024-06-10 21:21:23 +10:00
Nicolas
aafd23fa8a
Merge pull request #252 from mattjoyce/fix-208-py-sdk-interval-poll-name
Fix 208 py sdk interval poll name
2024-06-08 21:33:17 -07:00
Matt Joyce
6fd9ce1c89 type hints and linting 2024-06-08 11:46:52 +10:00
Matt Joyce
7477c5e5bd Use error handler consistently 2024-06-08 11:28:51 +10:00
Matt Joyce
9f306736af More detailed error handling 2024-06-08 11:18:30 +10:00
Matt Joyce
c71ea7a795 Prepare headers consistently 2024-06-08 11:08:26 +10:00
Matt Joyce
8f9a165c2f Lint - whitespace 2024-06-08 08:03:02 +10:00
Matt Joyce
5f0df596ec Align param name with JS SDK
timeout becomes poll_interval
2024-06-08 07:37:08 +10:00
Nicolas
f24ca76618 Nick: removing rate limit emails for now 2024-06-07 10:39:11 -07:00
Nicolas
98d82c4cec Update search.ts 2024-06-06 20:02:21 -07:00
Nicolas
5e80f8af87 Nick: llm extract 50 2024-06-06 18:35:44 -07:00
rafaelsideguide
7b7a6f8a39 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-06-06 17:51:28 -03:00
rafaelsideguide
f2695df215 updated sdk versions 2024-06-06 17:51:12 -03:00
rafaelsideguide
560f256a35 fixing minor problems on workflow 2024-06-06 17:36:48 -03:00
rafaelsideguide
f5318ea7d7 Update index.test.ts 2024-06-06 16:50:20 -03:00
rafaelsideguide
cd7f9abcec Update index.test.ts 2024-06-06 16:44:46 -03:00
rafaelsideguide
7b9b668b95 Update index.test.ts 2024-06-06 16:36:51 -03:00
rafaelsideguide
82e0ed4cd3 Update index.test.ts 2024-06-06 16:33:27 -03:00
rafaelsideguide
dac7612be2 Merge branch 'main' of https://github.com/mendableai/firecrawl into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk 2024-06-06 16:07:25 -03:00
Nicolas
c2ad358390 Nick: 2024-06-06 12:05:20 -07:00
rafaelsideguide
79ec9f04dc Merge branch 'main' of https://github.com/mendableai/firecrawl into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk 2024-06-06 15:58:14 -03:00
Nicolas
de06b13deb Update rate-limiter.ts 2024-06-06 11:56:22 -07:00
Nicolas
27a8fd0c3c Update rate-limiter.ts 2024-06-06 11:56:00 -07:00
Nicolas
1129d33321 Update rate-limiter.ts 2024-06-06 11:53:12 -07:00
rafaelsideguide
b234b4be5a Merge branch 'main' into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk 2024-06-06 15:44:29 -03:00
rafaelsideguide
af0bfca847 Merge branch 'main' into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk 2024-06-06 15:36:28 -03:00
rafaelsideguide
8132f22c73 nice 2024-06-06 15:36:20 -03:00
Nicolas
f1b5ec8517 Nick: fixes 2024-06-06 11:23:10 -07:00
Nicolas
deae7dcd61 Update email_notification.ts 2024-06-06 10:41:54 -07:00
Nicolas
f725fa5a97 Update email_notification.ts 2024-06-06 10:41:23 -07:00
rafaelsideguide
fb758fa05e go 2024-06-06 14:01:16 -03:00
Nicolas
0310da6729 Update rate-limiter.ts 2024-06-06 09:31:44 -07:00
Nicolas
01503c1fbf Nick: 2024-06-06 09:29:25 -07:00
rafaelsideguide
b3cae4c858 adding js and testing twine 2024-06-06 13:27:31 -03:00
rafaelsideguide
bc1c1e5053 updating version to check if it runs 2024-06-06 11:41:01 -03:00
Rafael Miller
7686ad5702
Merge pull request #196 from mattjoyce/main
Python-SDK transitional build setup for pyproject.toml
2024-06-06 10:26:16 -03:00
Nicolas
525b4f2a83 Update rate-limiter.ts 2024-06-05 14:38:10 -07:00
Nicolas
d7f8208cdb Update email_notification.ts 2024-06-05 13:53:31 -07:00
Nicolas
ec10eb09f3 Update credit_billing.ts 2024-06-05 13:22:03 -07:00
Nicolas
5991000d2b Update credit_billing.ts 2024-06-05 13:21:15 -07:00
Nicolas
5683bb2cc8 Nick: 2024-06-05 13:20:26 -07:00
rafaelsideguide
164676c70a bugfix screenshot for readme pages 2024-06-05 15:34:42 -03:00
rafaelsideguide
935406b96a Merge branch 'main' into pr/196 2024-06-05 15:19:25 -03:00
Nicolas
b4c6819a54 Nick: 2024-06-05 11:11:09 -07:00
rafaelsideguide
0d51b11dcd missing breaks 2024-06-05 15:02:28 -03:00
Rafael Miller
64423441b2
Merge branch 'main' into main 2024-06-05 14:44:29 -03:00
Nicolas
beb7526d1d Update webhook.ts 2024-06-05 10:38:05 -07:00
Nicolas
1a16378fe8
Merge pull request #234 from JakobStadlhuber/feat/webhook-self-hosted
Add support for Self-Hosted Webhook URL Usage and added project_id into the webhook payload
2024-06-05 10:25:05 -07:00
Nicolas
7cb14edec8 Nick: 2024-06-05 10:13:52 -07:00
Rafael Miller
9e000ded03
Merge branch 'main' into feat/better-gdrive-pdf-fetch 2024-06-05 14:07:56 -03:00
rafaelsideguide
ccc55127d6 Added scroll xpaths on fire-engine for handling readme docs 2024-06-05 11:48:41 -03:00
rafaelsideguide
b5045d1661 [feat] improved the scrape for gdrive pdfs 2024-06-04 17:47:28 -03:00
Nicolas
96257b7b17 Update handleCustomScraping.ts 2024-06-04 12:22:46 -07:00
Nicolas
674500affa Nick: 2024-06-04 12:15:39 -07:00
rafaelsideguide
5ae4d1caf5 Update single_url.ts 2024-06-04 15:28:09 -03:00
Jakob Stadlhuber
9e5ddec207 Remove default webhook URL from .env.example
The default value for the SELF_HOSTED_WEBHOOK_URL in the .env.example file was removed to prevent unintentional exposure or usage. The users are now required to explicitly specify
2024-06-04 19:56:35 +02:00