    Data · Python · Scrapy · PostgreSQL · Google Sheets

    Web Crawler & Price Intelligence

    Built a competitor price monitoring crawler tracking 50,000+ SKUs daily across 12 e-commerce platforms, delivering live pricing intelligence directly to the sales team.

    50,000+ SKUs tracked daily
    12 platforms monitored
    2×/day crawl frequency
    Real-time price alert notifications

    The Challenge

    A consumer electronics distributor was losing deals to competitors who had faster access to market pricing. Their sales team was manually checking competitor prices on 3–4 sites for key SKUs — a process that took hours and was always incomplete. They had no systematic way to know when a competitor dropped prices or ran a promotion.

    What We Built

    We built a Scrapy-based crawler covering 12 competitor and marketplace websites. The crawler runs twice daily, extracting prices, availability, and promotional badges for 50,000+ tracked SKUs. Data lands in PostgreSQL with full price history. A Google Sheets dashboard — auto-refreshed via Apps Script — shows each SKU's current price, 7-day trend, and a price alert flag when a competitor drops below a threshold.

    How It Works

    In commoditised electronics distribution, price is often the deciding factor in a deal. The client's sales team was essentially flying blind — quoting based on gut feel about where competitors were pricing, occasionally losing deals by a margin of 0.5–1% that they could have matched if they'd known.

    The crawler architecture uses Scrapy with rotating proxies and realistic browser fingerprints to reliably extract data from 12 different site structures. Each site has a custom spider that handles pagination, dynamic JavaScript rendering where needed, and product variant matching.

    The hardest part of the project was SKU matching across platforms. The same product might be listed as 'Samsung Galaxy S24 128GB Phantom Black' on one site and 'Samsung S24 (128 GB, Black)' on another. We built a fuzzy-matching pipeline using model number extraction and embedding similarity to reliably link SKUs across platforms.

    Price data loads into PostgreSQL with full history — every price point, timestamped, for every SKU. The sales team's Google Sheet connects via Apps Script to a database view showing each SKU's current competitor low price, the client's current price, the price gap (positive or negative), and a 7-day sparkline.
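What that view computes can be sketched in Python over hypothetical (platform, price, observed_at) history rows. In production this is a SQL view over the PostgreSQL history table, and the field names here are assumptions:

```python
from datetime import datetime, timedelta


def sku_summary(our_price: float,
                history: list[tuple[str, float, datetime]],
                now: datetime) -> dict:
    """Summarise one SKU for the dashboard from raw price-history rows."""
    week_ago = now - timedelta(days=7)
    recent = sorted((r for r in history if r[2] >= week_ago), key=lambda r: r[2])
    # Most recent observed price per platform wins.
    latest = {}
    for platform, price, _ in recent:
        latest[platform] = price
    competitor_low = min(latest.values())
    return {
        "competitor_low": competitor_low,
        "our_price": our_price,
        "gap_pct": round((our_price - competitor_low) / competitor_low * 100, 2),
        # Chronological price series feeding the 7-day sparkline.
        "trend_7d": [price for _, price, _ in recent],
    }
```

Keeping every timestamped observation, rather than only the latest price, is what makes the 7-day trend and retrospective promotion analysis possible.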

    Alert rows highlight in red when a competitor is more than 2% below the client's price on a high-volume SKU. The sales team can then take action — adjust pricing, call the customer proactively, or flag to procurement — before losing the deal.
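The alert rule itself reduces to a small predicate. The volume cutoff below is an illustrative stand-in for however 'high-volume' is defined for the client:

```python
def needs_alert(our_price: float, competitor_low: float, weekly_volume: int,
                gap_threshold: float = 0.02, volume_floor: int = 50) -> bool:
    """Flag a SKU when a competitor undercuts us by more than 2%
    and the SKU sells in volume (volume_floor is an assumed cutoff)."""
    undercut = (our_price - competitor_low) / our_price
    return undercut > gap_threshold and weekly_volume >= volume_floor
```

Restricting alerts to high-volume SKUs keeps the red rows rare enough that the sales team actually acts on them.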
