Email Extractor · Crawlbase Docs

API usage

Add &scraper=email-extractor to your Crawling API request. The target URL goes in the url parameter and must be URL-encoded.

curl

curl 'https://api.crawlbase.com/?token=YOUR_TOKEN' \
  --data-urlencode 'url=https://letsencrypt.org/contact/' \
  --data-urlencode 'scraper=email-extractor' -G

Python

import json

from crawlbase import CrawlingAPI

api = CrawlingAPI({'token': 'YOUR_TOKEN'})
res = api.get(
    'https://letsencrypt.org/contact/',
    {'scraper': 'email-extractor'}
)

# The response body is a JSON string; decode it into a dict.
data = json.loads(res['body'])

Node.js

const { CrawlingAPI } = require('crawlbase');
const api = new CrawlingAPI({ token: 'YOUR_TOKEN' });

// await is only valid inside an async function in a CommonJS script.
(async () => {
  const res = await api.get(
    'https://letsencrypt.org/contact/',
    { scraper: 'email-extractor' }
  );
  const data = JSON.parse(res.body);
})();

Ruby

require 'json'
require 'crawlbase'

api = Crawlbase::API.new(token: 'YOUR_TOKEN')

res = api.get('https://letsencrypt.org/contact/', scraper: 'email-extractor')
data = JSON.parse(res.body)

Example input URL

The URL passed in the url parameter (shown URL-decoded for readability):

https://letsencrypt.org/contact/
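When a request is assembled by hand rather than through an SDK, the decoded URL above has to be percent-encoded before it goes into the url parameter. The following is a minimal sketch using only the Python standard library. The helper name build_request_url is hypothetical, and the endpoint is the one from the curl example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint taken from the curl example in this page.
API_ENDPOINT = "https://api.crawlbase.com/"


def build_request_url(token: str, target_url: str) -> str:
    """Return a request URL with every query value percent-encoded.

    Hypothetical helper for illustration; not part of any Crawlbase SDK.
    """
    query = urlencode({
        "token": token,
        "url": target_url,  # urlencode escapes ':' and '/' in the target URL
        "scraper": "email-extractor",
    })
    return f"{API_ENDPOINT}?{query}"


request_url = build_request_url("YOUR_TOKEN", "https://letsencrypt.org/contact/")
print(request_url)

# Decoding the query string recovers the original target URL.
decoded = parse_qs(urlsplit(request_url).query)["url"][0]
print(decoded)
```

The SDK calls and curl's --data-urlencode flag shown earlier handle this encoding themselves, so the helper is only relevant for hand-built requests.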

Response structure

The response body is JSON. A field's value may be null when the source page omits the corresponding data.

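url (string): The final URL.

title (string): The page title.

emails (array): Email address strings, deduplicated.

emails_with_context (array): Each email together with its surrounding text.

emails_with_context[].email (string): The email address.

emails_with_context[].context (string): The surrounding text.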
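Because any field may be null, client code benefits from normalizing the body before using it. The sketch below maps the fields listed above onto dataclasses and treats missing or null arrays as empty lists. The class names, the parse_result helper, and the sample body are all hypothetical and use placeholder data, not real API output.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EmailContext:
    email: Optional[str]
    context: Optional[str]


@dataclass
class EmailExtractorResult:
    url: Optional[str]
    title: Optional[str]
    emails: List[str] = field(default_factory=list)
    emails_with_context: List[EmailContext] = field(default_factory=list)


def parse_result(body: str) -> EmailExtractorResult:
    """Parse a response body, treating missing or null arrays as empty."""
    raw = json.loads(body)
    contexts = [
        EmailContext(email=item.get("email"), context=item.get("context"))
        for item in (raw.get("emails_with_context") or [])
    ]
    return EmailExtractorResult(
        url=raw.get("url"),
        title=raw.get("title"),
        emails=list(raw.get("emails") or []),
        emails_with_context=contexts,
    )


# Hypothetical body for illustration only; the address is a placeholder.
sample_body = json.dumps({
    "url": "https://example.com/contact",
    "title": None,
    "emails": ["info@example.com"],
    "emails_with_context": [
        {"email": "info@example.com", "context": "Write to info@example.com"}
    ],
})

result = parse_result(sample_body)
print(result.emails)
print(result.title)
```

Scalar fields stay Optional so callers can tell a null title apart from an empty one, while the array fields are always safe to iterate.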

Example response

{
  "url": "https://letsencrypt.org/contact/",
  "title": "Contact - Let's Encrypt",
  "emails": [
    "[email protected]",
    "[email protected]"
  ]
}