Classify HTTP User-Agent strings to determine whether traffic comes from a human, a bot, or an AI/LLM crawler.
Send a `POST /classify` request with a User-Agent string and, optionally, an IP address. The API runs the input through several detection layers:
| Layer | What it does |
|---|---|
| Pattern Matching | Matches against 784 known bot patterns from arcjet/well-known-bots, monperrus/crawler-user-agents, and ai-robots-txt/ai.robots.txt |
| IP → ASN | Looks up the Autonomous System to identify the network operator |
| Bot IP Verification | Checks if the IP matches officially published ranges (Google, Bing, OpenAI) |
| Datacenter Detection | Identifies if the IP belongs to AWS, GCP, or other cloud providers |
| Confidence Scoring | Combines all signals into a `high`, `medium`, or `low` confidence rating |
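
To make the Bot IP Verification and Confidence Scoring layers more concrete, here is a minimal Python sketch. It is not the service's actual implementation. The names `VERIFIED_RANGES`, `verified_bot_for_ip`, and `score_confidence` are hypothetical, the hard-coded CIDR range is illustrative only, and the scoring rule is an assumption, since the API's real weighting is not documented here. Only the `verified_bot_ip` signal name comes from the example response.

```python
import ipaddress
from typing import List, Optional

# Illustrative range only (assumption). A real deployment would load the
# ranges that each operator officially publishes instead of hard-coding them.
VERIFIED_RANGES = {
    "Google Crawler": [ipaddress.ip_network("66.249.64.0/19")],
}


def verified_bot_for_ip(ip: str) -> Optional[str]:
    """Return the bot name whose range contains `ip`, or None if no range matches."""
    addr = ipaddress.ip_address(ip)
    for name, networks in VERIFIED_RANGES.items():
        if any(addr in net for net in networks):
            return name
    return None


def score_confidence(signals: List[str]) -> str:
    """Hypothetical rule: a verified IP is 'high', any other signal 'medium', none 'low'."""
    if "verified_bot_ip" in signals:
        return "high"
    if signals:
        return "medium"
    return "low"


if __name__ == "__main__":
    signals = []
    if verified_bot_for_ip("66.249.66.1") is not None:
        signals.append("verified_bot_ip")
    print(signals, score_confidence(signals))
```

Keeping verification separate from scoring means further signals (ASN, datacenter, pattern match) could be appended to the same list without changing the scoring function's interface.
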
```bash
curl -X POST https://ua-api.lab.c2comms.cloud/classify \
  -H "Content-Type: application/json" \
  -d '{
    "userAgent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "ip": "66.249.66.1"
  }'
```
```json
{
  "classification": {
    "isBot": true,
    "isLLM": false,
    "category": "search_engine",
    "botName": "Google Crawler",
    "verified": true,
    "confidence": "high",
    "signals": ["verified_bot_ip"]
  }
}
```
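
The same request can be made from code. The sketch below uses only the Python standard library and assumes the request and response shapes shown in the example above. The `Classification` dataclass and the helpers `build_request`, `parse_classification`, and `classify` are hypothetical names introduced for illustration, and the optional handling of `category` and `botName` is an assumption about what the API returns for human traffic.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import List, Optional

API_URL = "https://ua-api.lab.c2comms.cloud/classify"


@dataclass
class Classification:
    is_bot: bool
    is_llm: bool
    category: Optional[str]
    bot_name: Optional[str]
    verified: bool
    confidence: str
    signals: List[str] = field(default_factory=list)


def build_request(user_agent: str, ip: Optional[str] = None) -> urllib.request.Request:
    """Build the POST /classify request; the IP address is optional."""
    payload = {"userAgent": user_agent}
    if ip is not None:
        payload["ip"] = ip
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_classification(body: str) -> Classification:
    """Map the JSON response onto a dataclass, using the field names from the example."""
    c = json.loads(body)["classification"]
    return Classification(
        is_bot=c["isBot"],
        is_llm=c["isLLM"],
        category=c.get("category"),   # assumed to be absent or null for humans
        bot_name=c.get("botName"),    # assumed to be absent or null for humans
        verified=c.get("verified", False),
        confidence=c["confidence"],
        signals=list(c.get("signals", [])),
    )


def classify(user_agent: str, ip: Optional[str] = None, timeout: float = 10.0) -> Classification:
    """Send the request and return the parsed classification."""
    with urllib.request.urlopen(build_request(user_agent, ip), timeout=timeout) as resp:
        return parse_classification(resp.read().decode("utf-8"))
```

Calling `classify("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", "66.249.66.1")` performs the network request; `build_request` and `parse_classification` can be exercised offline.
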
| Source | Purpose | License |
|---|---|---|
| arcjet/well-known-bots | Primary bot DB (600+ bots) | Apache 2.0 |
| monperrus/crawler-user-agents | Additional regex patterns | MIT |
| ai-robots-txt/ai.robots.txt | AI/LLM bot identification | MIT |
| Google / Bing / OpenAI | Bot IP verification ranges | Public |
| ip-location-db ASN MMDB | IP → ASN lookup | CC0 |
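
As a rough illustration of how the pattern sources above might feed the Pattern Matching layer, the following Python sketch compiles a few regexes and returns the first match. The two sample entries are hand-written stand-ins, not data copied from the listed projects. The `name` and `is_llm` keys, and the function `match_bot`, are hypothetical. How the service actually merges its sources is not described here.

```python
import re
from typing import Optional, Tuple

# Hand-written sample entries (assumption); a real build would load the
# patterns from the upstream projects listed in the table.
SAMPLE_PATTERNS = [
    {"pattern": r"Googlebot\/", "name": "Googlebot", "is_llm": False},
    {"pattern": r"GPTBot", "name": "GPTBot", "is_llm": True},
]

# Compile once at import time so each lookup is only a regex search.
_COMPILED = [
    (re.compile(entry["pattern"]), entry["name"], entry["is_llm"])
    for entry in SAMPLE_PATTERNS
]


def match_bot(user_agent: str) -> Optional[Tuple[str, bool]]:
    """Return (bot name, is_llm) for the first matching pattern, or None."""
    for regex, name, is_llm in _COMPILED:
        if regex.search(user_agent):
            return name, is_llm
    return None


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(match_bot(ua))
```

A pattern match on its own only shows what the client claims to be, which is why the API also checks the IP address before marking a bot as verified.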