| Author | Commit | Message | Date |
|---|---|---|---|
| Nicolas | 3242872503 | Update single_url.ts | 2024-07-25 17:43:55 -04:00 |
| Nicolas | 11e6b2680e | Merge pull request #455 from mendableai/feat/scrape-monitoring: Add scrape monitoring | 2024-07-25 16:27:07 -04:00 |
| Nicolas | e5b797549e | Merge branch 'main' into feat/scrape-monitoring | 2024-07-25 16:21:02 -04:00 |
| Nicolas | 50d2426fc4 | Update scrape-events.ts | 2024-07-25 16:20:29 -04:00 |
| Nicolas | a75d6889c7 | Merge pull request #450 from mendableai/feat/logger: [wip] Added logger | 2024-07-25 14:40:19 -04:00 |
| rafaelsideguide | 1f1c068eea | changing from error to debug | 2024-07-25 10:00:50 -03:00 |
| rafaelsideguide | e720e1bacf | Merge remote-tracking branch 'origin/main' into feat/logger | 2024-07-25 09:49:27 -03:00 |
| rafaelsideguide | 309728a482 | updated logs | 2024-07-25 09:48:06 -03:00 |
| Nicolas | 2c1221750b | Merge pull request #449 from mendableai/bugfix/malformed-url-sitemap: Added regex for links in sitemap | 2024-07-24 20:37:35 -04:00 |
| Nicolas | 6ad7e24403 | Update ingestion.tsx | 2024-07-24 18:15:51 -04:00 |
| Nicolas | 92843a356d | Merge branch 'main' of https://github.com/mendableai/firecrawl | 2024-07-24 18:13:36 -04:00 |
| Nicolas | 1e13ddbe8e | Nick: changes to the ui component | 2024-07-24 18:13:34 -04:00 |
| Gergő Móricz | 623b547292 | fix(fly.toml): scale up memory limit | 2024-07-24 23:39:00 +02:00 |
| Nicolas | 15890772be | Scale bump | 2024-07-24 16:56:19 -04:00 |
| Eric Ciarla | a4bccbe3bb | Firecrawl UI Template | 2024-07-24 15:05:55 -04:00 |
| Eric Ciarla | a62c0730c1 | Delete package-lock.json | 2024-07-24 15:00:19 -04:00 |
| Eric Ciarla | 4cb091ad05 | Update .gitignore | 2024-07-24 14:59:34 -04:00 |
| Eric Ciarla | 4596d0b2e6 | Add ReadMe and LICENSE | 2024-07-24 14:56:53 -04:00 |
| Eric Ciarla | 9654721bf2 | Vite commit | 2024-07-24 14:27:50 -04:00 |
| rafaelsideguide | cc98f83fda | added failed and completed log events | 2024-07-24 15:25:36 -03:00 |
| Gergo Moricz | 60c74357df | feat(ScrapeEvents): log queue events | 2024-07-24 18:44:14 +02:00 |
| rafaelsideguide | 4eca6bd301 | fix/check-for-auth-on-scrape-log | 2024-07-24 12:54:14 -03:00 |
| Nicolas | 4ead89f983 | Merge pull request #453 from mendableai/nsc/notion-fix: Notion Website Fixes | 2024-07-24 11:40:19 -04:00 |
| Nicolas | 3a1b8a9797 | Update website_params.ts | 2024-07-24 11:04:47 -04:00 |
| Nicolas | 8b48ec8d30 | Update website_params.ts | 2024-07-24 11:02:20 -04:00 |
| Gergo Moricz | 4d35ad073c | feat(monitoring/scrape): include url, worker, response_size | 2024-07-24 16:43:39 +02:00 |
| Gergo Moricz | 64bcedeefc | fix(monitoring): bad success check on scrape | 2024-07-24 16:21:59 +02:00 |
| Gergo Moricz | d57dbbd0c6 | fix: add jobId for scrape | 2024-07-24 15:18:12 +02:00 |
| Gergo Moricz | 71072fef3b | fix(scrape-events): bad logic | 2024-07-24 14:46:41 +02:00 |
| Gergo Moricz | 7cd9bf92e3 | feat: scrape event logging to DB | 2024-07-24 14:31:25 +02:00 |
| Rafael Miller | 5e728c1a4d | Update apps/api/src/scraper/WebScraper/crawler.ts: no need for regex. Co-authored-by: Gergő Móricz <mo.geryy@gmail.com> | 2024-07-24 08:33:00 -03:00 |
| Eric Ciarla | 1b7a00624d | Delete old comp | 2024-07-23 21:51:08 -04:00 |
| Eric Ciarla | 565bc09439 | Basic react app | 2024-07-23 21:48:11 -04:00 |
| rafaelsideguide | 6208ecdbc0 | added logger | 2024-07-23 17:30:46 -03:00 |
| Eric Ciarla | a0d89169ed | init | 2024-07-23 15:48:12 -04:00 |
| Nicolas | f0b07b509b | Update index.ts | 2024-07-23 15:15:56 -04:00 |
| rafaelsideguide | a684bd3c5d | added regex for links in sitemap | 2024-07-23 09:07:23 -03:00 |
| Nicolas | 252bc09ee2 | Merge pull request #447 from mendableai/nsc/speed-improvements: /scrape should now be 600ms-900ms faster | 2024-07-22 19:18:24 -04:00 |
| Nicolas | ac692ef09c | Update CONTRIBUTING.md | 2024-07-22 19:17:53 -04:00 |
| Nicolas | 30e706b43f | Update scrape.ts | 2024-07-22 19:15:24 -04:00 |
| Nicolas | 8916fec66c | Update index.ts | 2024-07-22 19:14:53 -04:00 |
| Nicolas | 575ddc9e6e | Update scrape.ts | 2024-07-22 19:12:51 -04:00 |
| Nicolas | e31a5007d5 | Nick: speed improvements | 2024-07-22 18:30:58 -04:00 |
| Nicolas | 1bc36e1a56 | Update fly-direct.yml | 2024-07-22 14:12:55 -04:00 |
| Nicolas | b229fbebd8 | Update scrape_log.ts | 2024-07-19 12:53:26 -04:00 |
| rafaelsideguide | 5c02dbe20c | fix(isFile): added .tiff extension | 2024-07-18 17:07:21 -03:00 |
| Gergo Moricz | f0e95ce399 | fix(WebCrawler): filter out file URLs when taking URLs from sitemap | 2024-07-18 21:49:37 +02:00 |
| Gergo Moricz | 95c6c63b85 | fix(fly): raise heap limit to 4G per process | 2024-07-18 20:56:54 +02:00 |
| Nicolas | 5f14f4f788 | Update blocklist.ts | 2024-07-18 14:20:19 -04:00 |
| Nicolas | 6161b83890 | Update scrape_log.ts | 2024-07-18 14:17:08 -04:00 |