Eric Ciarla
|
837b446390
|
Update index.test.ts
|
2024-06-29 08:48:42 -04:00 |
|
Eric Ciarla
|
fe6e3aeadc
|
Update index.test.ts
|
2024-06-29 08:44:21 -04:00 |
|
Eric Ciarla
|
6c9f0dfc91
|
Add tests
|
2024-06-29 08:32:20 -04:00 |
|
Jeff Pereira
|
a5fb45988c
|
new feature allowExternalContentLinks
|
2024-06-28 17:23:40 -07:00 |
|
Eric Ciarla
|
87b54488d3
|
update to includeRawHtml
|
2024-06-28 17:07:47 -04:00 |
|
Eric Ciarla
|
70fcf2ce03
|
init
|
2024-06-28 16:39:09 -04:00 |
|
Nicolas
|
9bf74bc774
|
Update single_url.ts
|
2024-06-28 15:51:18 -03:00 |
|
Nicolas
|
7e17498bcf
|
Update single_url.ts
|
2024-06-28 15:45:16 -03:00 |
|
rafaelsideguide
|
d66e1f7846
|
looking good
|
2024-06-27 16:00:45 -03:00 |
|
Nicolas
|
9e7298945c
|
Update openapi.json
|
2024-06-26 21:25:38 -03:00 |
|
Nicolas
|
1ec0bf8adf
|
Update openapi.json
|
2024-06-26 21:22:46 -03:00 |
|
Nicolas
|
042f81ddf2
|
Update removeUnwantedElements.test.ts
|
2024-06-26 21:20:11 -03:00 |
|
Nicolas
|
388ce3cbce
|
Nick: small changes
|
2024-06-26 21:15:42 -03:00 |
|
Nicolas
|
1d4907acc9
|
Nick:
|
2024-06-26 21:02:58 -03:00 |
|
rafaelsideguide
|
c40da77be0
|
Added implementation for saving docs on supabase
- TODO: remove the comments on `log_job.ts` before deploying to prod
|
2024-06-26 18:23:28 -03:00 |
|
Nicolas
|
3b92fb8433
|
Merge pull request #322 from mendableai/tests/metadata
[Test] Added E2E tests for checking metadata values
|
2024-06-26 12:09:18 -03:00 |
|
rafaelsideguide
|
67d7650cf3
|
Added to e2e_noAuth
|
2024-06-26 12:07:55 -03:00 |
|
rafaelsideguide
|
009df6c930
|
Added crawl limit unit test
I think this test relies too heavily on mocks, but I don't know how to fix that without changing the code architecture
|
2024-06-26 09:54:25 -03:00 |
|
rafaelsideguide
|
05eaa3c68d
|
Update index.test.ts
|
2024-06-26 09:32:02 -03:00 |
|
rafaelsideguide
|
4381109dd8
|
added default values and fixed pdf bug
|
2024-06-26 09:00:54 -03:00 |
|
Nicolas
|
45f2765601
|
Merge pull request #316 from snippet/types-webscraper
add some types
|
2024-06-25 22:03:21 -03:00 |
|
Nicolas
|
768a131b5c
|
Merge pull request #318 from mendableai/bug/fix-custom-scrape-pdf-google-drive
[Bug] Fixed the regex test for google drive pdf files
|
2024-06-25 18:27:11 -03:00 |
|
rafaelsideguide
|
5f69fc7677
|
Fixed the regex test
|
2024-06-25 18:24:01 -03:00 |
|
rafaelsideguide
|
d02829d335
|
fixed clean jobs
|
2024-06-25 17:49:29 -03:00 |
|
Jeff Pereira
|
199cbe8bcb
|
add some types
|
2024-06-25 12:20:25 -07:00 |
|
Nicolas
|
749b0c05dc
|
Merge branch 'main' of https://github.com/mendableai/firecrawl
|
2024-06-25 15:21:15 -03:00 |
|
Nicolas
|
e7be17db92
|
Nick: metadata fixes and lock duration for bull decreased to 2 hrs
|
2024-06-25 15:21:14 -03:00 |
|
Nicolas
|
f84fb4b331
|
Merge pull request #313 from snippet/google-search-term-fix
fix multi-word search term issue: /search (w/o Serp)
|
2024-06-24 19:24:58 -03:00 |
|
Jeff Pereira
|
6ddf3a58a1
|
fix multi-word search term issue: /search (w/o Serp)
|
2024-06-24 14:21:52 -07:00 |
|
Nicolas
|
90b7fff366
|
Update crawler.ts
|
2024-06-24 16:52:01 -03:00 |
|
Nicolas
|
08c1fa799b
|
Update queue-worker.ts
|
2024-06-24 16:51:32 -03:00 |
|
rafaelsideguide
|
3ebdf93342
|
removed console.logs
|
2024-06-24 16:43:12 -03:00 |
|
Nicolas
|
56d42d9c9b
|
Nick:
|
2024-06-24 16:33:07 -03:00 |
|
rafaelsideguide
|
21d29de819
|
testing crawl with new.abb.com case
many unnecessary console.logs for tracing the code execution
|
2024-06-24 16:25:07 -03:00 |
|
Nicolas
|
3c7b7e7242
|
Nick: fixes fallback
|
2024-06-23 18:59:08 -03:00 |
|
Caleb Peffer
|
e59ba758f5
|
Caleb: changed posthog logging so that it associates jobs with a group. No
|
2024-06-18 17:42:21 -07:00 |
|
Caleb Peffer
|
5a91d8425f
|
Caleb: solve for typechecking on idempotencyKey on my machine
|
2024-06-18 17:07:38 -07:00 |
|
rafaelsideguide
|
9c539e9113
|
Fixed includeHTML to use cleanedHtml as response
|
2024-06-18 16:26:54 -03:00 |
|
Rafael Miller
|
f5a9acc4c6
|
Merge branch 'main' into feat/removeTags-regex
|
2024-06-18 14:39:59 -03:00 |
|
rafaelsideguide
|
9f7afd1e88
|
fix for some complex cases
|
2024-06-18 14:36:51 -03:00 |
|
Nicolas
|
d0c05accf6
|
Nick:
|
2024-06-18 13:21:50 -04:00 |
|
Nicolas
|
818751a256
|
Merge pull request #294 from mendableai/tests/e2e-to-unit
[Test] Transcribed from e2e to unit tests for many cases
|
2024-06-18 13:09:22 -04:00 |
|
rafaelsideguide
|
727e5de8c5
|
Update index.test.ts
|
2024-06-18 11:54:10 -03:00 |
|
rafaelsideguide
|
c54e797eb1
|
(╯°□°)╯︵ ┻━┻
|
2024-06-18 11:51:28 -03:00 |
|
rafaelsideguide
|
20f14bcf7f
|
Added some types
|
2024-06-18 10:55:07 -03:00 |
|
rafaelsideguide
|
c2fc69af1c
|
removed some e2e tests that are making the ci get stuck
|
2024-06-18 09:57:05 -03:00 |
|
rafaelsideguide
|
6c726a02eb
|
Moved to utils/removeUnwantedElements, added unit tests
|
2024-06-18 09:46:42 -03:00 |
|
AndyMik90
|
8b3c3aae91
|
Added support for RegEx in removeTags
|
2024-06-18 07:31:46 +02:00 |
|
rafaelsideguide
|
b2bd562bb2
|
transcribed from e2e to unit tests for many cases
|
2024-06-17 17:09:44 -03:00 |
|
Nicolas
|
ab038051e9
|
Merge branch 'main' into nsc/rate-limiter-tests
|
2024-06-17 15:06:12 -04:00 |
|
Eric Ciarla
|
519ab1aecb
|
Update unit tests
|
2024-06-15 17:14:09 -04:00 |
|
Eric Ciarla
|
f0d4146b42
|
Merge branch 'feat/maxDepthRelative' of https://github.com/mendableai/firecrawl into feat/maxDepthRelative
|
2024-06-15 16:52:00 -04:00 |
|
Eric Ciarla
|
ff7b52cab1
|
Delete one more e2e test
|
2024-06-15 16:51:50 -04:00 |
|
Eric Ciarla
|
b1eb608295
|
Merge branch 'main' into feat/maxDepthRelative
|
2024-06-15 16:50:27 -04:00 |
|
Eric Ciarla
|
34e37c5671
|
Add unit tests to replace e2e
|
2024-06-15 16:43:37 -04:00 |
|
Eric Ciarla
|
2b40729cc2
|
Update index.test.ts
|
2024-06-15 08:56:32 -04:00 |
|
Eric Ciarla
|
f22759b2e7
|
Update index.test.ts
|
2024-06-14 19:42:11 -04:00 |
|
Eric Ciarla
|
a6b7197737
|
Fix for maxDepth
|
2024-06-14 19:40:37 -04:00 |
|
Nicolas
|
4ec863718b
|
Merge pull request #283 from mendableai/nsc/crawler-fixes
Fixes crawler getting confused with base paths that contain www.
|
2024-06-14 13:50:32 -07:00 |
|
Nicolas
|
43767360d8
|
Merge branch 'main' into nsc/rate-limiter-tests
|
2024-06-14 13:50:21 -07:00 |
|
Nicolas
|
e88cb314c8
|
Update crawler.ts
|
2024-06-14 13:44:54 -07:00 |
|
Rafael Miller
|
361cba4119
|
Merge pull request #175 from mendableai/test/load-testing
Test/load testing
|
2024-06-14 17:39:01 -03:00 |
|
Nicolas
|
7b11ace87d
|
Create rate-limiter.test.ts
|
2024-06-14 12:31:42 -07:00 |
|
rafaelsideguide
|
e369d1dd0e
|
Update index.test.ts
|
2024-06-14 16:17:54 -03:00 |
|
Nicolas
|
e37aa3db57
|
Nick: fixed rate limit on status
|
2024-06-14 12:13:02 -07:00 |
|
rafaelsideguide
|
a6ed2e693f
|
Update index.test.ts
|
2024-06-14 15:22:52 -03:00 |
|
rafaelsideguide
|
ad7795f973
|
Merge remote-tracking branch 'origin/main' into test/load-testing
|
2024-06-14 15:14:01 -03:00 |
|
rafaelsideguide
|
354712a8a3
|
just changed the name for the test?
|
2024-06-14 13:02:04 -03:00 |
|
Eric Ciarla
|
2c5f5c0ea2
|
Merge branch 'main' into feat/maxDepthRelative
|
2024-06-14 11:49:12 -04:00 |
|
Eric Ciarla
|
80c10393b4
|
Update index.test.ts
|
2024-06-14 11:32:30 -04:00 |
|
Eric Ciarla
|
42ed1f4479
|
Update index.test.ts
|
2024-06-14 11:20:24 -04:00 |
|
Eric Ciarla
|
8830acce07
|
Update index.test.ts
|
2024-06-14 11:11:58 -04:00 |
|
Eric Ciarla
|
278bb311cb
|
Update index.test.ts
|
2024-06-14 11:02:39 -04:00 |
|
Eric Ciarla
|
36a62727b8
|
Update index.test.ts
|
2024-06-14 10:52:43 -04:00 |
|
Rafael Miller
|
f9c7ca9388
|
Merge branch 'main' into feat/issue-266
|
2024-06-14 11:47:58 -03:00 |
|
Rafael Miller
|
3e2e76311c
|
Merge branch 'main' into feat/issue-205
|
2024-06-14 11:25:20 -03:00 |
|
Eric Ciarla
|
59451754f5
|
Add tests
|
2024-06-14 10:14:07 -04:00 |
|
Eric Ciarla
|
9b254c1cd0
|
Update index.test.ts
|
2024-06-14 09:48:14 -04:00 |
|
Eric Ciarla
|
9aba451b18
|
Update index.test.ts
|
2024-06-14 09:33:43 -04:00 |
|
rafaelsideguide
|
5dd18ca79b
|
fixed edge cases
|
2024-06-14 09:46:55 -03:00 |
|
Eric Ciarla
|
ab9de0f5ab
|
Update maxDepth tests
|
2024-06-13 18:46:30 -04:00 |
|
Eric Ciarla
|
393bd45237
|
Update index.test.ts
|
2024-06-13 18:13:15 -04:00 |
|
Eric Ciarla
|
71c98d8b80
|
Update logic
|
2024-06-13 18:00:52 -04:00 |
|
Eric Ciarla
|
095951aa4d
|
Update test
|
2024-06-13 17:40:00 -04:00 |
|
Eric Ciarla
|
5e8aa92788
|
Update index.ts
|
2024-06-13 17:33:13 -04:00 |
|
Eric Ciarla
|
bf10e9d392
|
Update index.test.ts
|
2024-06-13 17:28:59 -04:00 |
|
Eric Ciarla
|
65d63bae45
|
Update index.ts
|
2024-06-13 17:17:44 -04:00 |
|
Eric Ciarla
|
32e814bedc
|
Update index.ts
|
2024-06-13 17:02:30 -04:00 |
|
Nicolas
|
6fc1ee32fd
|
Merge pull request #275 from mendableai/feat/issue-273
Added pageOptions.removeTags
|
2024-06-13 13:27:01 -07:00 |
|
rafaelsideguide
|
bb859ae9a7
|
Added metadata.pageStatusCode and metadata.pageError properties to the responses
|
2024-06-13 17:08:40 -03:00 |
|
rafaelsideguide
|
676d6e8ab5
|
Added pageOptions.removeTags
|
2024-06-13 10:51:05 -03:00 |
|
Nicolas
|
182f8d4d6c
|
Update index.ts
|
2024-06-12 18:07:05 -07:00 |
|
Nicolas
|
11b6d5afa5
|
Update fly.toml
|
2024-06-12 18:00:22 -07:00 |
|
Nicolas
|
67dc46b454
|
Nick: clusters
|
2024-06-12 17:53:04 -07:00 |
|
rafaelsideguide
|
d20af257ba
|
Added jobId to webhook data
|
2024-06-12 15:38:41 -03:00 |
|
rafaelsideguide
|
e37d151404
|
added parsePDF option to pageOptions
users can decide whether to let us handle the parsing or to parse the PDF themselves
|
2024-06-12 15:06:47 -03:00 |
|
rafaelsideguide
|
01c9f071fa
|
fixed
|
2024-06-12 11:27:06 -03:00 |
|
rafaelsideguide
|
dc6acbf1f0
|
Merge remote-tracking branch 'origin/main' into feat/allowbackwardcrawling-option
|
2024-06-12 11:01:05 -03:00 |
|
Nicolas
|
f93231499f
|
Merge pull request #265 from mendableai/feat/issue-264
[Feat] Added route to clean completed jobs and a github action cron that triggers every 24h
|
2024-06-11 21:33:52 -07:00 |
|
Nicolas
|
45dee63943
|
Merge pull request #262 from mendableai/nsc/webhook-self-host-fix
Only fetch webhook from db if self host webhook not set and using db auth
|
2024-06-11 15:46:57 -07:00 |
|
rafaelsideguide
|
157fbe4a1e
|
added bull auth key
|
2024-06-11 17:52:01 -03:00 |
|
rafaelsideguide
|
df3a678cf4
|
getting back the cancel test, this should work
|
2024-06-11 17:46:56 -03:00 |
|
rafaelsideguide
|
def2ba9987
|
added tests
|
2024-06-11 17:46:25 -03:00 |
|
Nicolas
|
1e3e06a1d5
|
Update replacePaths.test.ts
|
2024-06-11 13:02:39 -07:00 |
|
Nicolas
|
2239e03269
|
Update replacePaths.test.ts
|
2024-06-11 12:54:02 -07:00 |
|
Nicolas
|
520739c9f4
|
Nick: fixed bugs associated with absolute path replacements
|
2024-06-11 12:43:16 -07:00 |
|
Nicolas
|
b87725c683
|
Update openapi.json
|
2024-06-11 12:08:49 -07:00 |
|
rafaelsideguide
|
ee282c3d55
|
Added allowBackwardCrawling option
|
2024-06-11 15:24:39 -03:00 |
|
rafaelsideguide
|
a9f93c2f1e
|
Added route to clean completed jobs and a github action cron that triggers every 24h
|
2024-06-11 14:18:05 -03:00 |
|
Nicolas
|
da38dad9a7
|
Merge branch 'main' of https://github.com/mendableai/firecrawl
|
2024-06-10 18:26:31 -07:00 |
|
Nicolas
|
9390816c1b
|
Update openapi.json
|
2024-06-10 18:26:25 -07:00 |
|
Nicolas
|
f6b06ac27a
|
Nick: ignoreSitemap, better crawling algo
|
2024-06-10 18:12:41 -07:00 |
|
Nicolas
|
1bd0327e1a
|
Merge branch 'main' into nsc/pageoptions-crawler
|
2024-06-10 17:15:10 -07:00 |
|
Nicolas
|
99f2ffd6d5
|
Update webhook.ts
|
2024-06-10 17:03:10 -07:00 |
|
Nicolas
|
7ae9778642
|
Update single_url.ts
|
2024-06-10 16:57:31 -07:00 |
|
Nicolas
|
913c1dd568
|
Nick: fetch -> axios and fix timeouts
|
2024-06-10 16:49:03 -07:00 |
|
Nicolas
|
3091f0134c
|
Nick:
|
2024-06-10 16:27:10 -07:00 |
|
Nicolas
|
f24ca76618
|
Nick: removing rate limit emails for now
|
2024-06-07 10:39:11 -07:00 |
|
Nicolas
|
98d82c4cec
|
Update search.ts
|
2024-06-06 20:02:21 -07:00 |
|
Nicolas
|
5e80f8af87
|
Nick: llm extract 50
|
2024-06-06 18:35:44 -07:00 |
|
rafaelsideguide
|
f5318ea7d7
|
Update index.test.ts
|
2024-06-06 16:50:20 -03:00 |
|
rafaelsideguide
|
cd7f9abcec
|
Update index.test.ts
|
2024-06-06 16:44:46 -03:00 |
|
rafaelsideguide
|
7b9b668b95
|
Update index.test.ts
|
2024-06-06 16:36:51 -03:00 |
|
rafaelsideguide
|
82e0ed4cd3
|
Update index.test.ts
|
2024-06-06 16:33:27 -03:00 |
|
rafaelsideguide
|
dac7612be2
|
Merge branch 'main' of https://github.com/mendableai/firecrawl into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk
|
2024-06-06 16:07:25 -03:00 |
|
Nicolas
|
c2ad358390
|
Nick:
|
2024-06-06 12:05:20 -07:00 |
|
rafaelsideguide
|
79ec9f04dc
|
Merge branch 'main' of https://github.com/mendableai/firecrawl into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk
|
2024-06-06 15:58:14 -03:00 |
|
Nicolas
|
de06b13deb
|
Update rate-limiter.ts
|
2024-06-06 11:56:22 -07:00 |
|
Nicolas
|
27a8fd0c3c
|
Update rate-limiter.ts
|
2024-06-06 11:56:00 -07:00 |
|
Nicolas
|
1129d33321
|
Update rate-limiter.ts
|
2024-06-06 11:53:12 -07:00 |
|
rafaelsideguide
|
b234b4be5a
|
Merge branch 'main' into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk
|
2024-06-06 15:44:29 -03:00 |
|
rafaelsideguide
|
af0bfca847
|
Merge branch 'main' into 194-sdk-ci-pipeline-for-publishing-pythonnode-sdk
|
2024-06-06 15:36:28 -03:00 |
|
rafaelsideguide
|
8132f22c73
|
nice
|
2024-06-06 15:36:20 -03:00 |
|
Nicolas
|
f1b5ec8517
|
Nick: fixes
|
2024-06-06 11:23:10 -07:00 |
|
Nicolas
|
deae7dcd61
|
Update email_notification.ts
|
2024-06-06 10:41:54 -07:00 |
|
Nicolas
|
f725fa5a97
|
Update email_notification.ts
|
2024-06-06 10:41:23 -07:00 |
|
Nicolas
|
0310da6729
|
Update rate-limiter.ts
|
2024-06-06 09:31:44 -07:00 |
|
Nicolas
|
01503c1fbf
|
Nick:
|
2024-06-06 09:29:25 -07:00 |
|
Nicolas
|
525b4f2a83
|
Update rate-limiter.ts
|
2024-06-05 14:38:10 -07:00 |
|
Nicolas
|
d7f8208cdb
|
Update email_notification.ts
|
2024-06-05 13:53:31 -07:00 |
|
Nicolas
|
ec10eb09f3
|
Update credit_billing.ts
|
2024-06-05 13:22:03 -07:00 |
|
Nicolas
|
5991000d2b
|
Update credit_billing.ts
|
2024-06-05 13:21:15 -07:00 |
|
Nicolas
|
5683bb2cc8
|
Nick:
|
2024-06-05 13:20:26 -07:00 |
|
rafaelsideguide
|
164676c70a
|
bugfix screenshot for readme pages
|
2024-06-05 15:34:42 -03:00 |
|
Nicolas
|
b4c6819a54
|
Nick:
|
2024-06-05 11:11:09 -07:00 |
|
rafaelsideguide
|
0d51b11dcd
|
missing breaks
|
2024-06-05 15:02:28 -03:00 |
|
Nicolas
|
beb7526d1d
|
Update webhook.ts
|
2024-06-05 10:38:05 -07:00 |
|
Nicolas
|
1a16378fe8
|
Merge pull request #234 from JakobStadlhuber/feat/webhook-self-hosted
Add support for Self-Hosted Webhook URL Usage and added project_id into the webhook payload
|
2024-06-05 10:25:05 -07:00 |
|
Nicolas
|
7cb14edec8
|
Nick:
|
2024-06-05 10:13:52 -07:00 |
|
Rafael Miller
|
9e000ded03
|
Merge branch 'main' into feat/better-gdrive-pdf-fetch
|
2024-06-05 14:07:56 -03:00 |
|