Bytespider

ByteDance's aggressive training crawler. Frequently blocked because of its heavy crawl volume and inconsistent robots.txt compliance.

Operator: ByteDance
Powers: ByteDance / TikTok AI model training
Purpose: Model training
User-agent token: Bytespider
Respects robots.txt: Inconsistent

Bytespider gathers data for ByteDance (the company behind TikTok) to train its AI models. It is known for high-volume crawling.

Bytespider has a reputation for not consistently honoring robots.txt, so some operators enforce blocking at the server or firewall level in addition to a robots.txt rule.

Allow Bytespider

You want your content represented in ByteDance's AI models.

User-agent: Bytespider
Allow: /

Block Bytespider

You want to reduce server load from aggressive crawling and keep your content out of ByteDance's training data. Note that a robots.txt rule alone may not be fully respected.

User-agent: Bytespider
Disallow: /
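
Before deploying a rule, it can help to confirm that it parses the way you expect. The sketch below uses Python's standard-library urllib.robotparser to evaluate the rules against the Bytespider token. The helper name is_allowed and the example.com URL are illustrative assumptions, and the check only shows how a compliant parser reads the rule, not how Bytespider itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The block rule from above, as it would appear in robots.txt.
BLOCK_RULES = """\
User-agent: Bytespider
Disallow: /
"""

# The allow rule from the earlier section.
ALLOW_RULES = """\
User-agent: Bytespider
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant parser would let user_agent fetch url.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/page"  # placeholder URL
    print(is_allowed(BLOCK_RULES, "Bytespider", url))
    print(is_allowed(ALLOW_RULES, "Bytespider", url))
```

Crawlers that the rule does not name are unaffected, so the block rule leaves other user agents allowed.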

Heads up: Bytespider does not reliably honor robots.txt. To enforce a block, combine the rule above with server- or firewall-level filtering of the user-agent.
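
As one possible form of server-level filtering, here is a minimal sketch of a WSGI middleware that rejects requests whose User-Agent header contains the Bytespider token. The names is_bytespider and BlockBytespider are hypothetical, and the sample user-agent string in the usage section is illustrative rather than the crawler's exact header. This only catches requests that identify themselves as Bytespider; a client that sends a different user-agent would pass through.

```python
def is_bytespider(user_agent: str) -> bool:
    """Case-insensitive check for the Bytespider token in a User-Agent value."""
    return "bytespider" in user_agent.lower()


class BlockBytespider:
    """WSGI middleware (hypothetical name) that answers 403 to Bytespider."""

    def __init__(self, app):
        self.app = app  # the wrapped WSGI application

    def __call__(self, environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if is_bytespider(user_agent):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Any other client is passed through to the application unchanged.
        return self.app(environ, start_response)


if __name__ == "__main__":
    def demo_app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok\n"]

    wrapped = BlockBytespider(demo_app)
    statuses = []
    # Illustrative user-agent string, not the crawler's exact header.
    environ = {"HTTP_USER_AGENT": "Mozilla/5.0 (compatible; Bytespider)"}
    wrapped(environ, lambda status, headers: statuses.append(status))
    print(statuses)
```

The same substring match can be expressed in a web server or firewall rule instead; doing it in application code is just the easiest variant to show in a self-contained way.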
