E-Commerce ETL Pipeline
End-to-end ETL pipeline pulling data from 5 e-commerce platforms into a centralised data warehouse for unified reporting and inventory management.

The Challenge
A D2C brand selling across Amazon, Flipkart, Meesho, their own Shopify store, and a B2B wholesale portal had no unified view of their business. Each platform had its own reporting interface, export format, and update frequency. Reconciling inventory and revenue across all five took a full day each week.
What We Built
We built a Python-based ETL pipeline for each platform, orchestrated by SnapLogic. Data from all five platforms lands in AWS S3 in normalised Parquet format, then loads into PostgreSQL nightly. A single unified schema covers orders, inventory, returns, and revenue, with platform as a dimension. Power BI connects on top for cross-platform reporting.
How It Works
The client's ops team spent every Monday doing a painful reconciliation: download CSVs from five portals, align columns, deal with different date formats, manually flag discrepancies, and build a master spreadsheet. By the time the report was ready, another week of data had already started accumulating.
We began with an integration for each platform: Amazon SP-API, Flipkart Seller API, Meesho Supplier API, Shopify Admin API, and a custom database connector for the B2B portal. Each has its own authentication flow, rate limits, and response schema.
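Because each platform enforces its own rate limits, every extractor needs some way to pause and retry. The following is a minimal sketch of one common approach, exponential backoff. It is not taken from the project's code: the `RateLimited` exception, the `fetch_with_backoff` helper, and its parameters are hypothetical names used for illustration.

```python
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Hypothetical signal that a platform rejected a call due to rate limiting."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def fetch_with_backoff(
    fetch: Callable[[], dict],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(), retrying with exponential backoff while it is rate limited."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Prefer the server's hint when present, else double the delay each time.
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. A platform-specific extractor would wrap its own API call in `fetch` and raise `RateLimited` when the platform reports that the limit was hit.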
Python extraction scripts normalise each platform's data into a standard schema: order_id, platform, product_sku, quantity, revenue, order_date, fulfillment_status, return_flag. All data lands in AWS S3 as Parquet files, partitioned by platform and date.
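As an illustration of the normalisation step, the sketch below maps one raw order into the standard schema and derives its partition prefix. The target field names come from the schema above. The raw keys (`id`, `sku`, `qty`, `total`, `created_at`, `status`, `returned`) and the Hive-style `platform=.../order_date=.../` layout are assumptions, since real payloads differ per platform and the exact S3 layout is not specified here.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class OrderRecord:
    """One row in the standard schema shared by all platforms."""
    order_id: str
    platform: str
    product_sku: str
    quantity: int
    revenue: float
    order_date: date
    fulfillment_status: str
    return_flag: bool


def normalise_order(raw: dict, platform: str) -> OrderRecord:
    """Map one raw order (hypothetical keys) to the standard schema."""
    return OrderRecord(
        order_id=str(raw["id"]),
        platform=platform,
        product_sku=raw["sku"].strip().upper(),
        quantity=int(raw["qty"]),
        revenue=float(raw["total"]),
        # Parse an ISO 8601 timestamp and keep only the calendar date.
        order_date=datetime.fromisoformat(raw["created_at"]).date(),
        fulfillment_status=raw.get("status", "unknown").lower(),
        return_flag=bool(raw.get("returned", False)),
    )


def partition_key(record: OrderRecord) -> str:
    """Build an assumed Hive-style prefix, partitioned by platform and date."""
    return f"platform={record.platform}/order_date={record.order_date.isoformat()}/"
```

In practice each platform would have its own mapping function, since field names and date formats vary. The normalised records would then be written as Parquet under the prefix returned by `partition_key`, using a Parquet library of choice.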
SnapLogic orchestrates the nightly load into PostgreSQL, handling deduplication (some orders appear across platforms with slightly different identifiers), currency normalisation, and data quality checks. If any pipeline fails, alerts fire to Slack before the team starts their day.
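The deduplication and quality-check logic can be sketched in Python for illustration. The SnapLogic configuration itself is not described here, so the matching rule (compare order IDs after stripping punctuation and case, together with SKU and date) and the specific checks are assumptions, and all function names are hypothetical.

```python
import re
from typing import Iterable, List

REQUIRED_FIELDS = ("order_id", "platform", "product_sku", "order_date")


def canonical_order_id(order_id: str) -> str:
    """Strip punctuation and case so 'ord-00123' and 'ORD 00123' compare equal."""
    return re.sub(r"[^A-Z0-9]", "", order_id.upper())


def deduplicate(rows: Iterable[dict]) -> List[dict]:
    """Keep the first row seen for each (canonical id, SKU, date) key."""
    seen = set()
    unique = []
    for row in rows:
        key = (
            canonical_order_id(row["order_id"]),
            row["product_sku"],
            row["order_date"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def quality_issues(row: dict) -> List[str]:
    """Return a list of problems found in one row; an empty list means it passed."""
    issues = [f"missing {name}" for name in REQUIRED_FIELDS if not row.get(name)]
    if row.get("quantity", 0) <= 0:
        issues.append("quantity must be positive")
    if row.get("revenue", 0) < 0:
        issues.append("revenue must not be negative")
    return issues
```

The dedup key deliberately leaves out the platform, because the duplicates in question span platforms. Rows that fail a check could be held back from the load and summarised in the Slack alert.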
The result is a single PostgreSQL database the team can query directly, or explore through Power BI. Cross-platform inventory reconciliation — previously a day-long task — now runs automatically overnight.



