The Problem
Websites deploy multi-layered anti-scraping defenses: IP blocking, rate limiting, CAPTCHAs, and increasingly, TLS/HTTP/2 fingerprinting that detects automated clients regardless of IP rotation. For teams maintaining legacy TypeScript codebases built on axios or Node's native http module, the challenge is hardening those scrapers against modern defenses without breaking functionality that already works.
What Works in Legacy Systems
Proxy rotation remains foundational. Residential proxies outperform datacenter IPs for evading blocks, as they mimic legitimate user traffic patterns. A basic implementation cycles through proxy pools:
// Pool of proxy endpoints to rotate through (residential IPs in practice)
const proxies = [
  { host: 'proxy1.example.com', port: 8080 },
  { host: 'proxy2.example.com', port: 8080 }
];

// Pick a proxy at random for each outgoing request
function getRandomProxy() {
  return proxies[Math.floor(Math.random() * proxies.length)];
}
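The throttling snippet below calls a fetchWithProxy helper. One minimal way to wire the pool into a legacy axios codebase (a sketch only, assuming axios's built-in proxy option and plain, unauthenticated HTTP proxies) might look like this:

import axios from 'axios';

// Sketch: route each request through a randomly chosen proxy from the pool above
async function fetchWithProxy(url: string) {
  const { host, port } = getRandomProxy();
  return axios.get(url, {
    proxy: { host, port },
    timeout: 10_000 // fail fast on dead or slow proxies
  });
}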
Request throttling with randomized delays (2-5 seconds) mimics human behavior:
// Minimal promise-based sleep helper
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

async function scrapeData(urls: string[]) {
  for (const url of urls) {
    // Random 2-5 second pause before each request
    await sleep(Math.random() * 3000 + 2000);
    const response = await fetchWithProxy(url);
    // ...parse and persist the response here
  }
}
Header rotation (User-Agent, Referer) adds another evasion layer, though sophisticated sites now fingerprint TLS handshakes and HTTP/2 prioritization, which headers alone can't mask.
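A rough sketch of header rotation follows; the User-Agent strings are illustrative placeholders, and a real project would keep the pool current:

// Illustrative User-Agent pool; keep these values updated in a real deployment
const userAgents = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36'
];

// Build a randomized header set for each request
function randomHeaders(referer = 'https://www.google.com/') {
  return {
    'User-Agent': userAgents[Math.floor(Math.random() * userAgents.length)],
    'Referer': referer,
    'Accept-Language': 'en-US,en;q=0.9'
  };
}

These headers can be merged into the config passed to fetchWithProxy via axios's headers option.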
The Trade-offs
These tactics suit small-scale projects but fail at enterprise scale. Sites deploying WAFs and behavioral analysis detect patterns across IP pools. Headless browsers (Puppeteer, Playwright) handle JavaScript-rendered content but introduce resource overhead in legacy systems not designed for browser automation.
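Where JavaScript rendering is unavoidable, a minimal Puppeteer sketch (headless Chromium, with the resource overhead noted above) looks roughly like this:

import puppeteer from 'puppeteer';

// Render a JavaScript-heavy page in headless Chromium and return its HTML
async function renderPage(url: string): Promise<string> {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
    return await page.content();
  } finally {
    await browser.close();
  }
}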
Commercial scraping APIs (Zyte, ScrapingBee) offload proxy management and CAPTCHA solving, trading cost for reliability. For organizations where data collection supports revenue-critical workflows, the build-versus-buy calculus increasingly favors external services.
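The integration surface for these services is usually a single HTTP call. The sketch below shows only the general pattern; the endpoint and parameter names are placeholders, not any specific provider's API:

import axios from 'axios';

// Generic pattern: the provider handles proxies, retries, and CAPTCHAs server-side.
// PROVIDER_ENDPOINT and the parameter names are placeholders; check your provider's docs.
const PROVIDER_ENDPOINT = 'https://api.scraping-provider.example/v1/';

async function fetchViaProvider(targetUrl: string, apiKey: string) {
  const response = await axios.get(PROVIDER_ENDPOINT, {
    params: { api_key: apiKey, url: targetUrl, render_js: true }
  });
  return response.data; // rendered HTML returned by the service
}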
Legal Context Matters
Ignoring robots.txt or terms of service invites legal risk alongside technical blocks. Enterprise teams should validate scraping compliance before deploying evasion techniques.
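One lightweight guard is checking each URL against the site's robots.txt before it is queued. This sketch assumes the robots-parser npm package as a dependency:

import axios from 'axios';
import robotsParser from 'robots-parser'; // assumed third-party dependency

// Check a target URL against the site's robots.txt before scraping it
async function isAllowedByRobots(targetUrl: string, userAgent: string): Promise<boolean> {
  const robotsUrl = new URL('/robots.txt', targetUrl).toString();
  const { data } = await axios.get(robotsUrl, { responseType: 'text' });
  const robots = robotsParser(robotsUrl, data);
  return robots.isAllowed(targetUrl, userAgent) ?? true; // treat "no matching rule" as allowed
}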
What to Watch
Defensive technology evolves faster than evasion methods. Sites adding Cloudflare Turnstile or similar CAPTCHA systems require solver integrations that legacy codebases struggle to accommodate without significant refactoring. The pattern is clear: tactical fixes buy time, but long-term data acquisition strategies need architectural review.
Bottom line: Proxy rotation and throttling still work for basic scraping. For production systems facing modern defenses, evaluate whether legacy code modifications justify the maintenance burden versus transitioning to managed scraping infrastructure.