dbaimakov
Feb 01, 2024

Bytespider webcrawler unintentional DoS

We've been experiencing constant hits from the Bytespider web crawler, generating a massive amount of traffic every second. This bot, associated with TikTok, has led to what I'd describe as an unintentional DoS attack. Web crawlers from large corporations should use a reasonable approach when scraping websites, but this one falls short. Notably, it fails to comply with robots.txt rules. Even our F5 Bot Defense system doesn't seem to block it. Does anyone have any recommendations for creating an effective custom bot attack signature to counter this?

I've attempted the following:

headercontent:"spider-feedback@bytedance.com"; useragentonly; nocase;
headercontent:"Bytespider"; useragentonly; nocase;

But to no avail. I've also implemented a browser verification challenge, which initially seemed to slow it down, but the crawler eventually bypassed that too.

Any help would be appreciated. In particular, how would one use advanced rule mode to create a bot signature for the User-Agent below?
Reference: Writing Custom Bot Signatures (f5.com)

Here's the request:

Connection: keep-alive
Content-Length: 0
Upgrade-Insecure-Requests: 1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/heif,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
User-Agent: Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
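
As a sanity check outside of F5, here is a minimal Python sketch (not F5 signature syntax; the function name `matches_nocase` is hypothetical) that tests whether the two strings from the attempted signatures appear in the User-Agent above when compared case-insensitively. It only approximates the intent of `nocase` as a plain substring match, so it says nothing about how the signature engine itself evaluates the rule.

```python
# Sample User-Agent copied from the request above.
USER_AGENT = (
    "Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)"
)

# The two strings used in the attempted signatures.
PATTERNS = ["spider-feedback@bytedance.com", "Bytespider"]


def matches_nocase(user_agent: str, pattern: str) -> bool:
    """Case-insensitive substring test (an assumed approximation of 'nocase')."""
    return pattern.lower() in user_agent.lower()


if __name__ == "__main__":
    for pattern in PATTERNS:
        print(pattern, matches_nocase(USER_AGENT, pattern))
```

If both patterns match here, the strings themselves are probably not the problem, and it may be worth checking whether the signature is actually enabled and enforced in the bot defense profile handling this traffic.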