Bytespider
ByteDance's aggressive training crawler. Frequently blocked because of its heavy crawl volume and inconsistent robots.txt compliance.
| Operator | ByteDance |
|---|---|
| Powers | ByteDance / TikTok AI model training |
| Purpose | Model training |
| User-agent token | Bytespider |
| Respects robots.txt | Inconsistent |
Bytespider gathers data for ByteDance (the company behind TikTok) to train its AI models. It is known for high-volume crawling.
Bytespider has a reputation for not consistently honoring robots.txt, so some operators enforce blocking at the server or firewall level in addition to a robots.txt rule.
Allow Bytespider
You want your content represented in ByteDance's AI models.
User-agent: Bytespider
Allow: /
Block Bytespider
Reduce server load from aggressive crawling and keep content out of ByteDance training, though a robots.txt rule alone may not be fully respected.
User-agent: Bytespider
Disallow: /
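One way to confirm that the block rule parses as intended is to feed it to a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`. The `example.com` URL, the `Googlebot` comparison token, and the helper name `is_allowed` are placeholders chosen for illustration. It shows only how a compliant parser reads the rule, not how Bytespider itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The block rule from this page, one directive per line.
BLOCK_RULE = """\
User-agent: Bytespider
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if a compliant parser would let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder URL, for illustration only.
URL = "https://example.com/some/page"

bytespider_allowed = is_allowed(BLOCK_RULE, "Bytespider", URL)
other_allowed = is_allowed(BLOCK_RULE, "Googlebot", URL)
print("Bytespider allowed:", bytespider_allowed)
print("Other crawler allowed:", other_allowed)
```

Because the rule names only the `Bytespider` token, other crawlers are unaffected by it.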
Heads up: Bytespider does not reliably honor robots.txt. To enforce a block, combine the rule above with server- or firewall-level filtering of the user-agent.
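As a rough illustration of user-agent filtering at the application layer, here is a minimal sketch of a Python WSGI middleware that answers 403 to any request whose User-Agent header contains the `Bytespider` token. The names `BlockBytespider`, `is_bytespider`, `demo_app`, and `status_for` are hypothetical, and the sample header value is made up for the demo rather than being Bytespider's actual user-agent string.

```python
from typing import Callable, Iterable, Optional

BLOCKED_TOKEN = "bytespider"  # user-agent token from the table above, lowercased


def is_bytespider(user_agent: Optional[str]) -> bool:
    """Case-insensitive substring match on the User-Agent header value."""
    return bool(user_agent) and BLOCKED_TOKEN in user_agent.lower()


class BlockBytespider:
    """WSGI middleware (hypothetical name) that returns 403 to matching requests."""

    def __init__(self, app: Callable) -> None:
        self.app = app

    def __call__(self, environ: dict, start_response: Callable) -> Iterable[bytes]:
        if is_bytespider(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Every other request passes through to the wrapped application.
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def status_for(user_agent: str) -> str:
    """Send one fake request through the middleware and return its status line."""
    seen = []
    wrapped = BlockBytespider(demo_app)
    body = wrapped(
        {"HTTP_USER_AGENT": user_agent},
        lambda status, headers: seen.append(status),
    )
    list(body)  # consume the response body
    return seen[0]


# Made-up header values for the demo.
print(status_for("Mozilla/5.0 (compatible; Bytespider)"))
print(status_for("Mozilla/5.0"))
```

A client can set the User-Agent header to any value, so a filter like this catches only requests that identify themselves as Bytespider. The same match can usually be expressed in a web server or firewall rule, which rejects the request before it reaches the application.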