eBay’s enormous data warehouses

Curt Monash met with eBay’s Oliver Ratzesberger and got us numbers on two of the largest data warehouses in the world. Look at these eBay stats!

Metrics on eBay’s main Teradata data warehouse include:

  • >2 petabytes of user data
  • 10s of 1000s of users
  • Millions of queries per day
  • 72 nodes
  • >140 GB/sec of I/O, or about 2 GB/node/sec, though that may be a peak figure for when the workload is scan-heavy
  • 100s of production databases being fed in
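
As a quick sanity check on the per-node I/O figure, here is a minimal Python sketch that divides the quoted aggregate throughput by the node count. The constant and function names are just illustrative, and 140 GB/sec is treated as an exact value even though the post gives it as a lower bound.

```python
# Figures quoted above for the Teradata system.
TOTAL_IO_GB_PER_SEC = 140  # quoted as ">140", so this is a floor, not an exact value
NODES = 72


def per_node_io(total_gb_per_sec: float, nodes: int) -> float:
    """Return the average I/O per node in GB/sec."""
    return total_gb_per_sec / nodes


teradata_per_node = per_node_io(TOTAL_IO_GB_PER_SEC, NODES)
print(f"{teradata_per_node:.2f} GB/node/sec")  # a little under 2 GB/node/sec
```

So the "2 GB/node/sec" in the list is the aggregate figure rounded up slightly.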

Metrics on eBay’s Greenplum data warehouse (or, if you like, data mart) include:

  • 6.5 petabytes of user data
  • 17 trillion records
  • 150 billion new records/day, which seems to suggest an ingest rate well over 50 terabytes/day
  • 96 nodes
  • 200 MB/node/sec of I/O (that’s the order of magnitude difference that triggered my post on disk drives)
  • 4.5 petabytes of storage
  • 70% compression
  • A small number of concurrent users
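
The "well over 50 terabytes/day" estimate can be roughly reproduced from the numbers above. The sketch below is a back-of-the-envelope calculation, not anything eBay or Greenplum published: it assumes decimal units (1 PB = 10^15 bytes), assumes the 6.5 petabytes are spread evenly over the 17 trillion records, and assumes new records have the same average size. It also compares the two per-node I/O rates quoted in the lists.

```python
# Decimal units are an assumption; the post does not say which convention is used.
PB = 10**15
TB = 10**12

# Figures quoted above for the Greenplum system.
user_data_bytes = 6.5 * PB
total_records = 17 * 10**12
new_records_per_day = 150 * 10**9

# Assumption: every record, old or new, has the same average size.
avg_record_bytes = user_data_bytes / total_records
ingest_tb_per_day = new_records_per_day * avg_record_bytes / TB

print(f"average record size: {avg_record_bytes:.0f} bytes")
print(f"estimated ingest:    {ingest_tb_per_day:.1f} TB/day")

# Per-node I/O as quoted: about 2 GB/node/sec (Teradata) vs. 200 MB/node/sec (Greenplum).
teradata_mb_per_node = 2000
greenplum_mb_per_node = 200
io_ratio = teradata_mb_per_node / greenplum_mb_per_node
print(f"per-node I/O ratio:  {io_ratio:.0f}x")
```

Under those assumptions the average record comes to a few hundred bytes and the ingest lands in the high 50s of terabytes per day, which fits the estimate in the list. The per-node I/O ratio works out to the order of magnitude mentioned there.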

More details.
