In the early days of the web, a bot was easy to spot. It had no cookies, a weird User-Agent string, and it moved with the grace of a brick. Fast forward to 2026, and the landscape has shifted. Today’s attackers use Headless Chrome, often orchestrated by frameworks like Puppeteer or Playwright, to mimic human behaviour with terrifying accuracy.
For e-commerce merchants and developers, these ghost browsers are the primary tools used for card testing, account takeovers (ATO) and, most importantly for growing brands, competitive intelligence.

What is a Headless Browser?
A headless browser is a web browser without a graphical user interface (GUI). It provides a real browser environment capable of executing complex JavaScript and rendering CSS, but it runs on a server or a command-line interface.
- Performance: Significantly faster and more resource-efficient because it doesn’t spend CPU/RAM on drawing pixels to a screen.
- Environment: Ideal for servers and CI/CD pipelines that lack display capabilities.
- Control: Managed programmatically via APIs or command-line interfaces.
While essential for legitimate developers running automated tests, these tools have become the Swiss Army knife of fraudsters and corporate spies alike.

Primary Use Cases
Headless browsers are essential for tasks that require browsing behaviour without human intervention:
- End-to-End (E2E) Testing: Simulating user interactions to verify frontend flows in automated testing environments.
- Web Automation: Executing repetitive tasks, such as filling out complex forms or interacting with internal web tools at scale.
- Web Scraping: Essential for dynamic websites (SPAs) where content is rendered via JavaScript and cannot be captured by simple GET requests.
Why New Headless Changed the Game
Recently, Google introduced a new Headless mode. Unlike the old version, which was a separate, stripped-down implementation, the new mode is the actual Chrome browser simply running without a window. This made traditional detection methods, like checking for a telltale headless User-Agent, far less reliable on their own. If the browser is real Chrome, how do you tell whether a human or a script is clicking the buttons?

The Competitor’s Spy: Why They Are Targeting Your Store
If your brand has recently gone viral on Instagram or you’ve scaled up your Meta Ads in the market, you will likely see Shadow Traffic within 48 hours. Tools like Koala Inspector or PPSPY use headless browsers to scrape:
- Live Sales Data: They don’t guess your revenue; they monitor your Recent Purchase pop-ups and inventory levels to see exactly how much you are making daily.
- The /collections/all Backdoor: Bots hit this URL with a sort_by=created-desc parameter to see your newest product launches before you even market them.
- Inventory Level Cart-Testing: Bots add 999 items to a cart to leak your exact stock levels, allowing competitors to know exactly when you’re running low so they can undercut your ads.
Sophisticated Detection Methods for 2026
Security platforms like Sensfrx now look for automation artefacts deep within the browser’s execution layer.
1. Detecting Automation via User-Agent and WebDriver Flags
The simplest detection method involves checking for automation flags that headless browsers often expose.
Example Code (Server-Side Detection):
from flask import Flask, request
import re

app = Flask(__name__)

def detect_headless_browser(user_agent, headers):
    """Detect headless browsers through User-Agent and header analysis."""
    # Common headless browser signatures
    headless_patterns = [
        r'headless',
        r'phantomjs',
        r'selenium',
        r'puppeteer',
        r'playwright',
    ]

    # Check the User-Agent for automation signatures
    ua_lower = user_agent.lower()
    for pattern in headless_patterns:
        if re.search(pattern, ua_lower):
            return True, f"Headless pattern detected: {pattern}"

    # Check for missing headers that real browsers always send
    common_headers = ['accept-language', 'accept-encoding']
    missing_headers = [h for h in common_headers if h not in headers]
    if missing_headers:
        return True, f"Missing headers: {missing_headers}"

    # Check the Client Hints brand list for a Headless brand
    webdriver_flag = headers.get('sec-ch-ua')
    if webdriver_flag and 'Headless' in webdriver_flag:
        return True, "Headless Chrome detected in sec-ch-ua"

    return False, "Appears to be genuine browser"

@app.route('/api/check')
def check_request():
    user_agent = request.headers.get('User-Agent', '')
    is_bot, reason = detect_headless_browser(user_agent, request.headers)
    return {
        'is_bot': is_bot,
        'reason': reason,
        'user_agent': user_agent,
    }

if __name__ == '__main__':
    app.run(debug=True)
Result: When a headless browser visits this endpoint, the server detects patterns like "HeadlessChrome" in the User-Agent or missing standard headers like Accept-Language. For example, a Puppeteer bot might return: {'is_bot': True, 'reason': 'Headless pattern detected: headless', 'user_agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 HeadlessChrome/120.0.0.0'}.
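Beyond the raw User-Agent, the new Headless mode also advertises itself through the Client Hints brand list in sec-ch-ua unless the operator overrides it. A minimal sketch of parsing that header (the header value below is an illustrative example, not captured traffic):

```python
import re

def sec_ch_ua_brands(header_value):
    """Extract the brand names from a sec-ch-ua Client Hints header."""
    # A sec-ch-ua value looks like: "Chromium";v="120", "HeadlessChrome";v="120"
    return re.findall(r'"([^"]*)"\s*;\s*v=', header_value)

header = '"Chromium";v="120", "HeadlessChrome";v="120", "Not?A_Brand";v="99"'
brands = sec_ch_ua_brands(header)
print(brands)                      # brand names parsed out of the header
print('HeadlessChrome' in brands)  # True for an unmasked headless client
```

Like the User-Agent itself, this header is trivially spoofable, which is why it only works as one signal among many.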
2. Behavioural Timing Analysis (The Human Element)
Bots move in perfect lines or “teleport” between coordinates. Human mouse movement consists of non-linear curves and variable acceleration.
Example Code (Analysing Movement Data):
import numpy as np
from scipy.spatial.distance import euclidean

def analyse_mouse_movements(movement_data):
    """
    Analyse mouse movement patterns to detect bots.
    movement_data: list of dicts with 'x', 'y' and 'time' (ms) keys
    """
    if len(movement_data) < 10:
        return False, "Insufficient data"

    # Extract coordinates and timestamps
    points = [(m['x'], m['y']) for m in movement_data]
    times = [m['time'] for m in movement_data]

    # Calculate point-to-point velocities
    velocities = []
    for i in range(1, len(points)):
        dist = euclidean(points[i], points[i - 1])
        time_diff = (times[i] - times[i - 1]) / 1000.0  # Convert ms to seconds
        if time_diff > 0:
            velocities.append(dist / time_diff)

    # Count consecutive point triples that form a perfect straight line
    straight_line_count = 0
    for i in range(2, len(points)):
        p1, p2, p3 = points[i - 2], points[i - 1], points[i]
        # The cross product is ~0 when the three points are collinear
        cross_product = abs(
            (p2[0] - p1[0]) * (p3[1] - p1[1]) -
            (p2[1] - p1[1]) * (p3[0] - p1[0])
        )
        if cross_product < 1.0:  # Nearly perfect line
            straight_line_count += 1
    linearity_ratio = straight_line_count / (len(points) - 2)

    # Check for constant velocity (bot signature); human speed varies
    if len(velocities) > 5:
        velocity_variance = np.var(velocities)
        velocity_mean = np.mean(velocities)
        if velocity_variance < 100 and velocity_mean > 500:
            return True, f"Constant velocity detected (var: {velocity_variance:.2f})"

    # Check linearity
    if linearity_ratio > 0.7:
        return True, f"Too linear: {linearity_ratio*100:.1f}% straight lines"

    return False, "Movement appears human"

# Example usage: a perfect diagonal traced at a fixed 50 ms cadence
movement_data = [
    {'x': 100, 'y': 100, 'time': 1000},
    {'x': 150, 'y': 150, 'time': 1050},
    {'x': 200, 'y': 200, 'time': 1100},
    {'x': 250, 'y': 250, 'time': 1150},
    {'x': 300, 'y': 300, 'time': 1200},
    {'x': 350, 'y': 350, 'time': 1250},
    {'x': 400, 'y': 400, 'time': 1300},
    {'x': 450, 'y': 450, 'time': 1350},
    {'x': 500, 'y': 500, 'time': 1400},
    {'x': 550, 'y': 550, 'time': 1450},
]
is_bot, reason = analyse_mouse_movements(movement_data)
print(f"Bot detected: {is_bot}")
print(f"Reason: {reason}")
Result: For the example data, a perfect diagonal traced at constant speed, the output is: Bot detected: True, Reason: Constant velocity detected (var: 0.00). The constant-velocity check fires first here; even if the speed varied, the 100% straight-line ratio would still trip the linearity check. Human users produce organic curves with variable acceleration, while bots create geometric patterns.
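Velocity is only the first derivative. A complementary signal, sketched below with an illustrative tolerance rather than any vendor's actual threshold, is the variance of acceleration, which stays at essentially zero for scripted movement:

```python
import numpy as np

def acceleration_variance(points, times_ms):
    """Variance of acceleration along a pointer path.

    points: list of (x, y) tuples; times_ms: matching timestamps in ms.
    """
    pts = np.asarray(points, dtype=float)
    t = np.asarray(times_ms, dtype=float) / 1000.0       # ms to seconds
    step = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # step lengths (px)
    dt = np.diff(t)
    speed = step / dt                # px/s per step
    accel = np.diff(speed) / dt[1:]  # px/s^2 per step
    return float(np.var(accel))

# Scripted diagonal at a fixed cadence: acceleration variance is ~0
bot_pts = [(100 + 50 * i, 100 + 50 * i) for i in range(10)]
bot_t = [1000 + 50 * i for i in range(10)]
print(acceleration_variance(bot_pts, bot_t) < 1.0)  # True: no speed variation
```

Real pointer traces produce large, irregular acceleration values, so a near-zero variance over dozens of samples is a strong automation signal.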
3. Hardware & API Inconsistencies Detection
Most headless servers lack a GPU or sound card. By analysing browser fingerprints from the client side and validating them server-side, we can detect mismatches.
Example Code (Fingerprint Analysis):
class FingerprintAnalyser:
    def __init__(self):
        # Pure software rasterisers indicate a GPU-less (typically headless)
        # host. ANGLE and Mesa strings also appear on genuine hardware, so
        # they are deliberately NOT treated as headless signals on their own.
        self.headless_renderers = [
            'swiftshader',
            'llvmpipe',
        ]
        self.headless_vendors = [
            'brian paul',
            'vmware',
        ]

    def analyse_fingerprint(self, fingerprint_data):
        """
        Analyse a browser fingerprint for inconsistencies.
        fingerprint_data: dict with webgl_renderer, webgl_vendor,
                          platform, user_agent, audio_devices, canvas_hash
        """
        flags = []
        renderer = fingerprint_data.get('webgl_renderer', '').lower()
        vendor = fingerprint_data.get('webgl_vendor', '').lower()
        platform = fingerprint_data.get('platform', '')
        user_agent = fingerprint_data.get('user_agent', '')
        audio_count = fingerprint_data.get('audio_devices', 0)

        # Check for headless GPU signatures
        for sig in self.headless_renderers:
            if sig in renderer:
                flags.append(f"Headless GPU detected: {sig}")
        for sig in self.headless_vendors:
            if sig in vendor:
                flags.append(f"Headless vendor detected: {sig}")

        # Check for hardware inconsistencies
        if 'mac' in user_agent.lower() or 'macintosh' in platform.lower():
            if 'swiftshader' in renderer or 'llvmpipe' in renderer:
                flags.append("MacOS claimed but software renderer detected")
            if audio_count == 0:
                flags.append("MacOS claimed but no audio devices")
        if 'windows' in user_agent.lower() or 'win' in platform.lower():
            if 'llvmpipe' in renderer:
                flags.append("Windows claimed but Linux renderer detected")

        # Check canvas consistency
        canvas_hash = fingerprint_data.get('canvas_hash', '')
        if canvas_hash in self.get_known_headless_hashes():
            flags.append("Known headless canvas signature")

        is_bot = len(flags) > 0
        return is_bot, flags

    def get_known_headless_hashes(self):
        """Returns known headless browser canvas hashes"""
        return [
            'a3c4b8f9e2d1',  # Common Puppeteer signature
            'f7e8d9c6b5a4',  # Common Playwright signature
        ]

# Example usage
analyser = FingerprintAnalyser()

# Suspicious fingerprint (headless browser)
suspicious_fp = {
    'webgl_renderer': 'ANGLE (Intel, Mesa DRI Intel(R) HD Graphics, SwiftShader)',
    'webgl_vendor': 'Google Inc. (Intel)',
    'platform': 'MacIntel',
    'user_agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)',
    'audio_devices': 0,
    'canvas_hash': 'a3c4b8f9e2d1'
}
is_bot, flags = analyser.analyse_fingerprint(suspicious_fp)
print(f"Bot detected: {is_bot}")
print(f"Flags: {flags}")

# Genuine fingerprint
genuine_fp = {
    'webgl_renderer': 'ANGLE (Apple, Apple M1, OpenGL 4.1)',
    'webgl_vendor': 'Google Inc. (Apple)',
    'platform': 'MacIntel',
    'user_agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)',
    'audio_devices': 2,
    'canvas_hash': '8d7c6b5a4e3f'
}
is_bot2, flags2 = analyser.analyse_fingerprint(genuine_fp)
print(f"\nGenuine browser - Bot detected: {is_bot2}")
print(f"Flags: {flags2}")
Result: For the suspicious fingerprint, output would be:
Bot detected: True
Flags: ['Headless GPU detected: swiftshader', 'MacOS claimed but software renderer detected', 'MacOS claimed but no audio devices', 'Known headless canvas signature']
For the genuine fingerprint: Bot detected: False, Flags: []
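The renderer checks above generalise to any pair of attributes that must agree on real hardware. As one more hedged sketch, the mapping below of User-Agent OS tokens to plausible navigator.platform values is illustrative rather than exhaustive, and it catches the common case of a Linux data-centre bot claiming to be a Mac:

```python
# Illustrative mapping of UA OS tokens to navigator.platform values a real
# browser on that OS would report (not an exhaustive list).
EXPECTED_PLATFORMS = {
    'windows nt': {'Win32'},
    'macintosh': {'MacIntel'},
    'linux': {'Linux x86_64', 'Linux aarch64'},
}

def platform_matches_ua(user_agent, platform):
    """Return False when the claimed OS and navigator.platform disagree."""
    ua = user_agent.lower()
    for token, plausible in EXPECTED_PLATFORMS.items():
        if token in ua:
            return platform in plausible
    return True  # Unknown OS token: don't flag on this signal alone

print(platform_matches_ua(
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)', 'MacIntel'))      # True
print(platform_matches_ua(
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)', 'Linux x86_64'))  # False
```

A mismatch here is cheap to compute server-side and pairs naturally with the GPU and audio checks above.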
4. Request Pattern Analysis for Scraping Detection
Bots often exhibit distinctive request patterns when scraping e-commerce sites.
Example Code (Traffic Pattern Analysis):
from datetime import datetime, timedelta
from collections import defaultdict
import re

class ScraperDetector:
    def __init__(self):
        self.request_log = defaultdict(list)
        self.suspicious_patterns = {
            'collections_backdoor': r'/collections/all\?.*sort_by=created',
            'cart_testing': r'/cart/add.*quantity=(999|9999)',
            'rapid_product_viewing': r'/products/',
            'admin_probing': r'/wp-admin|/admin',
        }

    def log_request(self, ip_address, path, timestamp=None):
        """Log incoming requests for pattern analysis"""
        if timestamp is None:
            timestamp = datetime.now()
        self.request_log[ip_address].append({
            'path': path,
            'time': timestamp
        })

    def analyse_ip(self, ip_address, time_window_minutes=5):
        """Analyse request patterns for a specific IP"""
        if ip_address not in self.request_log:
            return False, "No data"

        requests = self.request_log[ip_address]
        now = datetime.now()
        recent_requests = [
            r for r in requests
            if (now - r['time']).total_seconds() < time_window_minutes * 60
        ]
        if len(recent_requests) < 5:
            return False, "Insufficient activity"

        flags = []

        # Check for backdoor access
        backdoor_hits = [
            r for r in recent_requests
            if re.search(self.suspicious_patterns['collections_backdoor'], r['path'])
        ]
        if backdoor_hits:
            flags.append(f"Collections backdoor accessed {len(backdoor_hits)} times")

        # Check for cart testing
        cart_tests = [
            r for r in recent_requests
            if re.search(self.suspicious_patterns['cart_testing'], r['path'])
        ]
        if cart_tests:
            flags.append(f"Inventory cart testing detected ({len(cart_tests)} attempts)")

        # Check request rate
        request_rate = len(recent_requests) / time_window_minutes
        if request_rate > 20:  # More than 20 requests per minute
            flags.append(f"High request rate: {request_rate:.1f} req/min")

        # Check for sequential product browsing (bot pattern)
        product_times = [
            r['time'] for r in recent_requests
            if '/products/' in r['path']
        ]
        if len(product_times) > 10:
            # Check whether products are being accessed too quickly
            avg_time_between = sum(
                (product_times[i] - product_times[i - 1]).total_seconds()
                for i in range(1, len(product_times))
            ) / (len(product_times) - 1)
            if avg_time_between < 2:  # Under 2 seconds between products
                flags.append(f"Rapid product scanning: {avg_time_between:.2f}s avg")

        is_scraper = len(flags) > 0
        return is_scraper, flags

# Example usage
detector = ScraperDetector()

# Simulate bot activity
bot_ip = "203.0.113.45"
base_time = datetime.now()

# Bot accessing products rapidly
for i in range(15):
    detector.log_request(
        bot_ip,
        f"/products/product-{i}",
        base_time + timedelta(seconds=i * 1.5)
    )

# Bot using the backdoor
detector.log_request(bot_ip, "/collections/all?sort_by=created-desc", base_time)

# Bot testing inventory
detector.log_request(bot_ip, "/cart/add?id=12345&quantity=999", base_time)

is_scraper, flags = detector.analyse_ip(bot_ip)
print(f"Scraper detected: {is_scraper}")
print("Flags:")
for flag in flags:
    print(f" - {flag}")

# Simulate a normal user
normal_ip = "198.51.100.23"
detector.log_request(normal_ip, "/", base_time)
detector.log_request(normal_ip, "/products/shirt-1", base_time + timedelta(seconds=30))
detector.log_request(normal_ip, "/products/shirt-1", base_time + timedelta(seconds=45))
detector.log_request(normal_ip, "/cart", base_time + timedelta(seconds=60))

is_scraper2, flags2 = detector.analyse_ip(normal_ip)
print(f"\nNormal user - Scraper detected: {is_scraper2}")
print(f"Reason: {flags2}")
Result: For the bot IP, output would be:
Scraper detected: True
Flags:
 - Collections backdoor accessed 1 times
 - Inventory cart testing detected (1 attempts)
 - Rapid product scanning: 1.50s avg
(The 17 requests in the five-minute window average only 3.4 req/min, below the 20 req/min threshold, so the high-rate check does not fire.)
For the normal user: Scraper detected: False, Reason: Insufficient activity
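Another cheap signal on top of the checks above is the regularity of the gaps between requests: scripted loops fire on a near-constant interval, while humans pause unevenly. A sketch using the coefficient of variation (the 0.1 cut-off is an illustrative assumption, not a tuned threshold):

```python
import statistics

def interval_regularity(timestamps):
    """Coefficient of variation of inter-request gaps; near 0 means clockwork timing."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None  # Not enough requests to judge
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return None
    return statistics.stdev(gaps) / mean_gap

bot_times = [i * 1.5 for i in range(15)]        # one request every 1.5 s
human_times = [0, 4.2, 31.0, 33.5, 80.1, 95.7]  # uneven browsing
print(interval_regularity(bot_times))           # 0.0 for perfectly even gaps
print(interval_regularity(human_times) > 0.1)   # True: human-like jitter
```

Sophisticated bots add random sleeps to defeat this, which is why it belongs alongside, not instead of, the path-based checks.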
Detection Layer Comparison
This table compares the main layers of defence against automated attacks: the technique each layer relies on and the kind of malicious traffic it catches.
| Layer | Technique | What it Catches |
| --- | --- | --- |
| Traditional WAF | IP Blacklisting / Rate Limiting | Basic, high-volume scrapers |
| Client-Side JS | navigator.webdriver check | Amateur Puppeteer scripts |
| Advanced Fingerprinting | CDP Detection & WebGL Hashes | Professional New Headless bots |
| Sensfrx AI | Behavioural Biometrics | Residential Proxies & Spy Extensions |
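None of these layers is decisive on its own; in practice the signals are combined into a weighted risk score, and traffic is challenged or blocked only past a threshold. The weights and cut-offs below are illustrative placeholders, not Sensfrx's actual model:

```python
# Illustrative per-layer weights; production systems tune these empirically.
LAYER_WEIGHTS = {
    'ua_or_header_flag': 0.30,     # layer 1: User-Agent / header anomalies
    'robotic_movement': 0.35,      # layer 2: timing and linearity signals
    'fingerprint_mismatch': 0.25,  # layer 3: hardware inconsistencies
    'scraping_pattern': 0.10,      # layer 4: request-pattern flags
}

def risk_score(signals):
    """Sum the weights of the layers that fired, capped at 1.0."""
    return min(1.0, sum(LAYER_WEIGHTS.get(s, 0.0) for s in signals))

def decide(signals, block_at=0.6, challenge_at=0.2):
    score = risk_score(signals)
    if score >= block_at:
        return 'block'
    if score >= challenge_at:
        return 'challenge'  # e.g. serve a CAPTCHA or step-up check
    return 'allow'

print(decide(['ua_or_header_flag', 'robotic_movement']))  # 'block' (0.65)
print(decide(['fingerprint_mismatch']))                   # 'challenge' (0.25)
print(decide([]))                                         # 'allow'
```

Scoring instead of hard-blocking on any single check keeps false positives low: a real user on an unusual setup may trip one layer, but rarely several at once.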
Action Plan for Merchants
If your e-commerce site, whether WooCommerce or Shopify, is seeing a rising rate of fraudulent transactions (card testing) or non-human traffic from data centres, commonly known as Ghost Visitors, the following steps will help you contain the incident and strengthen your defences.
- Identify the Endpoint: Immediately identify and prioritise security for high-value transaction endpoints. Malicious bots predominantly target critical processing paths such as /checkout and platform-specific administrative routes like /wp-admin/admin-ajax.php.
- Muzzle Notifications: During an ongoing card-testing incident, disable automated Failed Order email notifications. This stops the flood of outgoing mail from damaging your server’s sender reputation and getting your messages marked as spam.
- Hide the Backdoor: Make competitive intelligence harder to gather by hiding or restricting access to pages that disclose your product strategy. Consider redirecting or password-protecting aggregation endpoints such as /collections/all so research tools cannot scrape them for winning products.
- Adopt Behavioural Defence: Move beyond static, rule-based security measures (e.g., IP blacklisting). Implement security platforms that use Behavioural Biometrics to analyse how users interact with the checkout funnel, focusing on the quality of interaction rather than solely on IP geography. This dynamic approach is essential for detecting sophisticated botnets.
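The Hide the Backdoor step can be sketched framework-agnostically as a guard evaluated before routing. The protected path list and the staff-session flag below are illustrative placeholders for whatever your platform actually exposes:

```python
from urllib.parse import urlparse

# Aggregation endpoints worth shielding from research tools (illustrative).
PROTECTED_PATHS = {'/collections/all'}

def should_block(url, is_staff_session=False):
    """Gate catalogue-wide listing pages for anonymous visitors."""
    path = urlparse(url).path
    # Staff can still browse the full catalogue; everyone else gets a
    # redirect or a password challenge from the caller.
    return path in PROTECTED_PATHS and not is_staff_session

print(should_block('/collections/all?sort_by=created-descending'))  # True
print(should_block('/collections/summer'))                          # False
print(should_block('/collections/all', is_staff_session=True))      # False
```

When should_block returns True, the caller would respond with a redirect or an authentication challenge rather than the catalogue page.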
Conclusion
The arms race between e-commerce security and automated attacks will go on. But by understanding how headless browsers work and layering detection techniques beyond User-Agent checks, merchants can safeguard their competitive intelligence, prevent fraud, and preserve the integrity of their stores. The solution lies in moving from static, rule-based detection to dynamic, behaviour-based detection that weighs the full context of how users interact with your store.
Frequently Asked Questions (FAQs)
What is a headless browser, and why do attackers use it?
A headless browser is a web browser without a graphical user interface (GUI). It runs on a server and is capable of executing complex JavaScript and rendering CSS just like a normal browser. Attackers use frameworks like Puppeteer or Playwright with Headless Chrome because they can mimic human behaviour with terrifying accuracy, making them effective for card testing, account takeovers (ATO), and competitive intelligence scraping.
How did Chrome’s new Headless mode change bot detection?
The new mode is the actual Chrome browser running without a window, unlike the old, stripped-down implementation. This change made traditional detection methods, such as checking for a telltale “headless” User-Agent string, far less reliable on their own. Now, telling a human from a script requires looking for automation artefacts deep within the browser’s execution layer.
How do competitors use headless browsers to spy on your store?
Competitor spies use headless browsers to generate “Shadow Traffic” and gain a competitive advantage by:
• Monitoring Live Sales Data: Scraping “Recent Purchase” pop-ups and inventory levels to determine daily revenue.
• Using the /collections/all Backdoor: Hitting this URL with a sort_by=created-desc parameter to see a store’s newest product launches before they are marketed.
• Inventory Cart-Testing: Adding a large quantity (e.g., 999) of an item to a cart to “leak” the exact stock levels.
Which detection techniques work against modern headless bots?
Effective detection moves from static rules to dynamic, behaviour-based analysis:
• Detecting Automation via User-Agent and WebDriver Flags: Checking for automation flags or missing standard headers (accept-language, accept-encoding).
• Behavioural Timing Analysis (The Human Element): Analysing mouse movement data for bot signatures like constant velocity and perfectly straight (linear) lines, which contrast with human users’ non-linear curves and variable acceleration.
• Hardware & API Inconsistencies Detection: Analysing client-side browser fingerprints for telltale signs like missing audio devices, or using software-based GPU renderers (e.g., swiftshader, llvmpipe) while claiming to be a desktop OS.
• Request Pattern Analysis for Scraping Detection: Flagging suspicious traffic patterns like rapid sequential product viewing or attempts to access backdoors like /collections/all.
What should merchants do during an active attack?
The recommended action plan includes:
• Identify the Endpoint: Determine if bots are targeting known points like /checkout or /wp-admin/admin-ajax.php.
• Muzzle Notifications: Disable “Failed Order” emails during a card-testing attack to prevent the server from being blacklisted as a spammer.
• Hide the Backdoor: Redirect or password-protect the /collections/all page.
• Adopt Behavioural Defence: Implement advanced security platforms to analyse how a user interacts with the checkout process, moving beyond simple IP analysis.