Friday, December 18, 2015

Facebook Photo Storage


1. Introduction

Binary Large OBject (BLOB)

- Unstructured: e.g., videos, JPEG images
- Immutable: the contents of a post are not modified once written

Current scale

- Over 400 billion photos stored in the BLOB system
- The world's largest photo store

Blob hotness and age

- Access cools down over time
- Data is split into hot data and warm data






2. Storage Solutions

Haystack 2008

  • High throughput
    • In-memory index (stored in group)
    • Single I/O per request
    • Multiple copies
  • Failure tolerant
    • RAID6
      • 2 redundant drives
      • 1.2x replication
      • With 3 hosts: 3.6x space, e.g., 100 PB raw = 28 PB usable
    • Multiple copies 
      • 3 copies in different data centers
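The "in-memory index" and "single I/O per request" points above can be illustrated with a small sketch. It assumes a hypothetical append-only volume file; the `Volume` class and its methods are illustrative names, not Haystack's actual format or API. The idea is only that the location of every photo is resolved in memory, so a read costs a single disk access.

```python
import os
import tempfile


class Volume:
    """Toy append-only volume with an in-memory index (hypothetical layout)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # photo_id -> (offset, size), kept entirely in memory
        open(path, "ab").close()  # create the file if it does not exist yet

    def put(self, photo_id, data):
        # Append the photo bytes and remember where they landed.
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        offset, size = self.index[photo_id]  # memory lookup, no disk access
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(size)  # the single disk read


def demo():
    with tempfile.TemporaryDirectory() as d:
        vol = Volume(os.path.join(d, "volume.dat"))
        vol.put("a", b"first photo")
        vol.put("b", b"second")
        return vol.get("a"), vol.get("b")
```

Calling `demo()` stores two photos and reads each back with one index lookup and one file read.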

Warm Storage


  • Redundancy is still required
  • Read throughput is much lower
  • The amount of warm data is ever growing
  • Can we do better than 3.6x?
    • Yes, with f4

f4

  • Reed-Solomon (RS) encoding
    • Redundancy != replication
    • Exploits the lower throughput needs of warm data
  • Space saving
    • 2.1x, down from 3.6x
  • Example
    • Hot content is served through Haystack
    • Warm content is served through f4
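To make the space saving concrete, the following sketch redoes the notes' "100 PB raw" arithmetic for both replication factors. The function name `usable_capacity` is illustrative; the factors 1.2x, 3 copies, and 2.1x come from the notes above.

```python
def usable_capacity(raw_pb, overhead):
    """Usable capacity in PB for a given raw capacity and effective replication factor."""
    return raw_pb / overhead


HAYSTACK_OVERHEAD = 1.2 * 3  # RAID6 (1.2x) times 3 copies = 3.6x
F4_OVERHEAD = 2.1            # f4's effective replication factor

haystack_usable = usable_capacity(100, HAYSTACK_OVERHEAD)  # roughly 27.8 PB
f4_usable = usable_capacity(100, F4_OVERHEAD)              # roughly 47.6 PB
```

With the same 100 PB of raw disk, the lower factor leaves roughly 20 PB more usable space.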



3. f4

Warm storage problem


  • Need to store (warm) data efficiently
  • Storage must be highly fault tolerant
  • Read latency should be comparable to Haystack
  • Load is NOT a primary concern

Solution: f4

  • 2.1x replication factor, compared to Haystack's 3.6x
  • Yet more fault tolerant than Haystack!


Design of f4


  • Data splitting with RS(5,2)
    • Two parity blocks are computed for every 5 data blocks; the 7 blocks together form a stripe


  • RS rebuild
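The RS(5,2) split and rebuild can be demonstrated with a toy erasure code. This is a minimal sketch, not f4's implementation: it works on single integer symbols over the prime field GF(257) using Lagrange interpolation, whereas production Reed-Solomon codes typically operate on whole blocks over GF(2^8). The function names are illustrative, and `pow(den, -1, P)` requires Python 3.8 or later.

```python
P = 257  # prime modulus; a toy field large enough to hold byte values 0..255


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, m):
    """Return the data symbols followed by m parity symbols (systematic code)."""
    k = len(data)
    points = list(enumerate(data))  # data symbol i is the polynomial's value at x = i
    return list(data) + [lagrange_eval(points, x) for x in range(k, k + m)]


def rebuild(stripe, k):
    """Fill in lost symbols (marked None) from any k surviving symbols."""
    known = [(x, y) for x, y in enumerate(stripe) if y is not None]
    if len(known) < k:
        raise ValueError("too many lost blocks to rebuild the stripe")
    basis = known[:k]
    return [y if y is not None else lagrange_eval(basis, x) for x, y in enumerate(stripe)]


data = [10, 200, 33, 0, 255]
stripe = encode(data, 2)          # 5 data symbols + 2 parity symbols
damaged = list(stripe)
damaged[1] = None                 # lose one data symbol
damaged[5] = None                 # lose one parity symbol
restored = rebuild(damaged, 5)
```

Any 5 of the 7 symbols determine the polynomial, so the stripe survives the loss of any 2 symbols; losing a third makes `rebuild` raise an error.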







  • Block placement policy
    • The blocks of each stripe are placed on different racks (and therefore on different hosts)
    • RS(10,4) is used in practice (1.4x overhead)
    • Tolerates the failure of up to 4 racks (and therefore 4 disks/hosts)
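The placement policy and the 1.4x figure can be checked with a short sketch. It assumes a hypothetical cluster of 14 racks named `rack-0` to `rack-13` and one block per rack; the helper names are illustrative. The check enumerates every combination of failed racks to confirm that 4 failures are always survivable and 5 never are.

```python
from itertools import combinations

N_DATA, N_PARITY = 10, 4  # RS(10,4)


def place_stripe(racks):
    """Assign each of the 14 blocks in a stripe to a distinct rack."""
    n_blocks = N_DATA + N_PARITY
    if len(racks) < n_blocks:
        raise ValueError("need at least one rack per block")
    return {block: racks[block] for block in range(n_blocks)}


def survives(placement, failed_racks):
    """A stripe is recoverable if at least N_DATA blocks sit on healthy racks."""
    alive = sum(1 for rack in placement.values() if rack not in failed_racks)
    return alive >= N_DATA


racks = [f"rack-{i}" for i in range(14)]  # hypothetical rack names
placement = place_stripe(racks)
overhead = (N_DATA + N_PARITY) / N_DATA   # 14 / 10 = 1.4

all_4_ok = all(survives(placement, set(f)) for f in combinations(racks, 4))
any_5_ok = any(survives(placement, set(f)) for f in combinations(racks, 5))
```

Because no rack holds more than one block of the stripe, a rack failure costs at most one block, which is what lets 4 parity blocks cover 4 rack failures.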






  • f4 cell anatomy
    • f4 storage consists of a set of cells
    • Each cell resides completely in one datacenter
    • A cell consists of 3 kinds of nodes; the index is distributed across the storage nodes
      • Storage
      • Compute
      • Coordinator





  • f4 Reads




  • Reads with datacenter failures (2.1x)
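The notes do not spell out where 2.1x comes from. As described in the f4 paper, datacenter failures are covered by XOR coding across datacenters: two RS-coded volumes in two datacenters are XORed, and the result is stored in a third. The sketch below illustrates this with made-up block contents; the variable names are illustrative.

```python
def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


block_a = b"photo-in-dc1"              # stored in datacenter 1
block_b = b"photo-in-dc2"              # stored in datacenter 2
parity = xor_blocks(block_a, block_b)  # stored in datacenter 3

# Datacenter 1 is down: rebuild its block from the other two datacenters.
recovered_a = xor_blocks(block_b, parity)

RS_OVERHEAD = 14 / 10            # RS(10,4) inside each datacenter
effective = RS_OVERHEAD * 3 / 2  # three RS-coded volumes hold two volumes of data
```

Three RS-coded volumes (1.4x each) protect two volumes of data, which gives 1.4 × 3 / 2 = 2.1.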








  • Haystack vs. f4









4. Tips

  • Hot data is cached by the CDN

Reference
[1] https://www.quora.com/How-have-Facebook-distributed-systems-has-been-designed-to-look-as-a-single-system-Transparency
[2] https://code.facebook.com/videos/334113483447122/f4-photo-storage-at-facebook-scale-presentation/
