Commit Graph

  • fa8875d64d Update single_url.ts nsc/rm-wait-before-click Nicolas 2024-10-28 15:09:50 -0300
  • 4ee3dec9f1 get rid of WebScraper (mostly) Gergő Móricz 2024-10-28 16:49:49 +0100
  • 007e3edfc5 Update README.md Nicolas 2024-10-28 12:40:04 -0300
  • 5913437612 Merge branch 'main' into mog/webscraper-refactor Gergő Móricz 2024-10-28 16:36:49 +0100
  • e3e8375c7d Add AgentOps Monitoring Eric Ciarla 2024-10-28 11:13:33 -0400
  • 8ac6b81e2a fix formatting e2e v1 tests txrp0x9 2024-10-28 16:27:22 +0530
  • be004b9eab apps/api(deps): bump the prod-deps group across 1 directory with 38 updates dependabot[bot] 2024-10-28 07:05:59 +0000
  • 465b476ca6 apps/api(deps-dev): bump the dev-deps group across 1 directory with 12 updates dependabot[bot] 2024-10-28 07:01:10 +0000
  • 1002b1c0e7 apps/playwright-service(deps): bump the prod-deps group across 1 directory with 2 updates dependabot[bot] 2024-10-28 06:51:24 +0000
  • 44c8fb905e apps/test-suite(deps-dev): bump the dev-deps group dependabot[bot] 2024-10-28 06:11:00 +0000
  • 04a3898a70 apps/test-suite(deps): bump the prod-deps group dependabot[bot] 2024-10-28 06:10:08 +0000
  • b48eed5716 chore(README.md): use satisfies instead of as for ts example Twilight 2024-10-28 09:40:35 +0545
  • 877d5e4383 Update types.ts Nicolas 2024-10-27 23:17:20 -0300
  • 68b2e1b209 Update log_job.ts nsc/prevent-single-url-logs Nicolas 2024-10-27 23:14:25 -0300
  • 3bec0090bc links are now extracted after applying excludeTags txrp0x9 2024-10-28 04:51:58 +0530
  • a858cb8970 (typo):I've made some corrections to your Documentation Prathamesh Pawar 2024-10-27 13:51:37 +0530
  • 8a4f4cb9d9 Update README.md Nicolas 2024-10-26 16:10:51 -0300
  • 9593ab80e1 Update README.md Nicolas 2024-10-26 16:03:07 -0300
  • 801f0f773e Nick: fix auto charge failing when payment is through Link Nicolas 2024-10-26 03:59:15 -0300
  • d87467a158 Merge branch 'main' into support-anthropic-for-extraction aar2dee2 2024-10-26 09:44:57 +0530
  • 20e5348e9a Merge pull request #809 from mendableai/nsc/pay-as-you-go-lw2 Nicolas 2024-10-25 16:15:08 -0300
  • 97b8d6c333 Update auto_charge.ts nsc/pay-as-you-go-lw2 Nicolas 2024-10-25 16:05:39 -0300
  • 95c4652fd4 Nick: 10min cooldown on auto charge Nicolas 2024-10-25 16:05:23 -0300
  • 4468a49aee concurrency limit fix PoC II. Gergő Móricz 2024-10-25 20:21:12 +0200
  • 256a98b86e always replace paths with absolutes fix/issue-736 rafaelsideguide 2024-10-25 15:14:43 -0300
  • 4590577cba Merge branch 'main' into nsc/pay-as-you-go-lw2 Nicolas 2024-10-25 14:02:24 -0300
  • a4f5e47a93 added tests for google drive content .txt and .pdf txrp0x9 2024-10-25 21:47:44 +0530
  • 91e7eaaf33 added verification_token to sdks txrp0x9 2024-10-25 21:27:52 +0530
  • 97b36c4755 webhook verification with verificationToken txrp0x9 2024-10-24 23:55:02 +0530
  • 177496d739 fix anthropic output parsing for extraction without schema - likely to be inconsistent because anthropic does not have a response_format param aar2dee2 2024-10-25 11:25:09 +0530
  • c7c846ae45 Merge branch 'main' into fly-dep aar2dee2 2024-10-25 10:53:25 +0530
  • dbcf2d7ff6 Nick: fix loggin for batch scrape Nicolas 2024-10-24 23:44:08 -0300
  • 73e6db45de Update email_notification.ts Nicolas 2024-10-24 23:14:41 -0300
  • d965f2ce7d Nick: fixes Nicolas 2024-10-24 23:13:30 -0300
  • 29b34270c8 Merge branch 'main' into nsc/pay-as-you-go-lw2 Nicolas 2024-10-24 22:31:04 -0300
  • 9a4ccd0801 Claude Web Crawler with Batch Scrape Eric Ciarla 2024-10-24 14:40:53 -0400
  • 629bf34bad Improved google drive support, added downloadable text type txrp0x9 2024-10-24 19:33:49 +0530
  • a1e8d2ee28 made secretkey optional txrp0x9 2024-10-24 21:32:36 +0530
  • d987f727fc added support for webhook authentication txrp0x9 2024-10-24 21:26:17 +0530
  • 2840991a50 add ANTHROPIC_API_KEY to .env.exampel aar2dee2 2024-10-24 20:45:59 +0530
  • 0e2a68d9ae get structured json using anthropic claude aar2dee2 2024-10-24 14:23:49 +0530
  • 8870cce75b cleanup: remove mongo commands from package.json skeptrune 2024-10-23 19:47:24 -0700
  • 6188f0f2f4 bugfix: do not exclude <prefix><excludeurl> patterns + require a subdomain skeptrune 2024-10-23 19:20:02 -0700
  • 23242d7150 cleanup: fully remove supabase dependency skeptrune 2024-10-23 17:58:59 -0700
  • 2a15d55e14 cleanup: add additional ignore URL patterns skeptrune 2024-10-23 16:58:17 -0700
  • b024f08639 cleanup: remove SLACK_WEBHOOK_URL env and code skeptrune 2024-10-23 16:36:25 -0700
  • a1756b0a38 cleanup: remove fire-engine skeptrune 2024-10-23 16:24:59 -0700
  • 8153225c43 cleanup: completely remove all code which uses supabase skeptrune 2024-10-23 14:01:30 -0700
  • c1b2a1c664 bugfix: use regex to exclude all subdomains for social media URLs + add wikipedia skeptrune 2024-10-23 13:15:37 -0700
  • 1da6360b77 feat(batch/scrape): restructure logs, add webhooks Gergő Móricz 2024-10-23 21:55:21 +0200
  • e3cb00990a Merge branch 'mog/bulk-scrape' Nicolas 2024-10-23 16:31:14 -0300
  • 19cac2220f Nick: mog/bulk-scrape Nicolas 2024-10-23 16:31:01 -0300
  • 76ca7fdcb5 Merge pull request #789 from mendableai/mog/bulk-scrape Nicolas 2024-10-23 16:12:42 -0300
  • b11035814a Nick: Nicolas 2024-10-23 16:10:21 -0300
  • f0054da934 Nick: lgtm Nicolas 2024-10-23 16:06:08 -0300
  • c7f2170980 Update example.py Nicolas 2024-10-23 16:04:46 -0300
  • 60b6e6b1d4 Nick: fixes Nicolas 2024-10-23 15:59:40 -0300
  • d8abd15716 Nick: from bulk to batch Nicolas 2024-10-23 15:37:24 -0300
  • 70c4e7c334 feat(bulk/scrape): check credits via url list length Gergő Móricz 2024-10-23 19:42:02 +0200
  • 66e505317e Merge branch 'main' into mog/bulk-scrape Nicolas 2024-10-23 14:36:26 -0300
  • e0d3b761fc Merge pull request #808 from mendableai/feat/skipTlsVerification Nicolas 2024-10-22 20:47:13 -0300
  • 7432f25523 Merge pull request #807 from mendableai/mog/acuc-cache-clear Nicolas 2024-10-22 20:43:18 -0300
  • d375bca167 Update acuc-cache-clear.ts mog/acuc-cache-clear Nicolas 2024-10-22 20:42:59 -0300
  • bbfdda8867 Nick: init Nicolas 2024-10-22 19:47:23 -0300
  • acde353e56 skipTlsVerification on robots.txt scraping feat/skipTlsVerification Thomas Kosmas 2024-10-23 01:07:03 +0300
  • 69bcbbe3c0 bugfix: only exclude social links w/ the full http:// skeptrune 2024-10-22 14:45:31 -0700
  • bd55464b52 skipTlsVerification Thomas Kosmas 2024-10-22 22:28:02 +0300
  • 6ed3104eb6 feat: clear ACUC cache endpoint based on team ID Gergő Móricz 2024-10-22 20:28:10 +0200
  • 3cd328cf93 feat(bulk/scrape): add node and python SDK integration + docs Gergő Móricz 2024-10-22 18:58:48 +0200
  • 5ab3a8824e chore(test): adapt src and add phpunit tests sanix-darker 2024-10-21 22:02:20 +0200
  • 76c0073829 Nick: grok 2 example Nicolas 2024-10-21 16:27:15 -0300
  • a7cb973aa1 chore(php-sdk): initial commit sanix-darker 2024-10-21 06:20:26 +0200
  • d2344aa14b Revert "Nick: improved map ranking algorithm" Nicolas 2024-10-21 16:11:32 -0300
  • edac6850c6 Merge pull request #797 from rishi-raj-jain/patch-2 Eric Ciarla 2024-10-21 12:05:00 -0400
  • 22d375ad29 Updates Eric Ciarla 2024-10-21 12:01:09 -0400
  • 9ab922837c Merge pull request #796 from mendableai/fix/issue-663 Nicolas 2024-10-21 12:24:57 -0300
  • d31b85fa91 Merge pull request #793 from mendableai/fix/issue-665 Nicolas 2024-10-21 12:24:46 -0300
  • 9a2658768e Merge pull request #799 from Mefisto04/contribute Nicolas 2024-10-21 12:23:56 -0300
  • e1d8e1584e Update SELF_HOST.md Nicolas 2024-10-21 12:23:27 -0300
  • 209bbd1346 Merge pull request #798 from mendableai/nsc/improved-map-search Nicolas 2024-10-21 12:22:19 -0300
  • b37d2a0000 apps/test-suite(deps-dev): bump the dev-deps group dependabot[bot] 2024-10-21 07:11:05 +0000
  • 85a296fe4d apps/test-suite(deps): bump the prod-deps group dependabot[bot] 2024-10-21 07:10:12 +0000
  • 2b338c08c4 apps/api(deps-dev): bump the dev-deps group in /apps/api with 12 updates dependabot[bot] 2024-10-21 06:35:00 +0000
  • 5014d6972a apps/api(deps): bump the prod-deps group in /apps/api with 37 updates dependabot[bot] 2024-10-21 06:32:34 +0000
  • 0a448608bf apps/playwright-service(deps): bump the prod-deps group dependabot[bot] 2024-10-21 06:15:40 +0000
  • cf98d69bbb Update requirements.txt Rishi Raj Jain 2024-10-20 18:09:38 +0530
  • d113199a29 Update app.py Rishi Raj Jain 2024-10-20 18:08:38 +0530
  • 2b0c52ff67 Update SELF_HOST.md Mayur Kawale 2024-10-20 12:33:45 +0530
  • 7acd8d2edb Nick: improved map ranking algorithm nsc/improved-map-search Nicolas 2024-10-19 13:27:47 -0300
  • 8a4ee4482d Create output_01f6efd5-1297-4745-94b5-5972c10f17d6.json Rishi Raj Jain 2024-10-19 03:54:14 +0530
  • 42ec08c76e Update websites.csv Rishi Raj Jain 2024-10-19 03:53:41 +0530
  • 7d8519218a Update app.py Rishi Raj Jain 2024-10-19 02:27:39 +0530
  • 2022db7f0a Update websites.csv Rishi Raj Jain 2024-10-19 02:27:25 +0530
  • f5af938ea2 Update requirements.txt Rishi Raj Jain 2024-10-19 02:27:17 +0530
  • ba3ee8ead6 Create .env.example Rishi Raj Jain 2024-10-19 00:52:47 +0530
  • adfc493c9b Create websites.csv Rishi Raj Jain 2024-10-19 00:52:26 +0530
  • 11fd630e55 Create requirements.txt Rishi Raj Jain 2024-10-19 00:52:14 +0530
  • 10381b5d3c Create app.py Rishi Raj Jain 2024-10-19 00:51:18 +0530
  • 18f69c90b1 fix/missing error in response fix/issue-663 rafaelsideguide 2024-10-18 15:18:57 -0300
  • aed11e72a6 fix encoding if error fix/issue-665 rafaelsideguide 2024-10-18 11:50:58 -0300