(Deprecated) Lisa's Tech Blog: Facebook Cassandra

Saturday, December 19, 2015

Facebook Cassandra

Introduction

Cassandra is an open source distributed database system that is designed for storing and managing large amounts of data across commodity servers.

Provide scalability and high availability without compromising performance
Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platfrom for mission critical data.
Support replicating across multiple datacenters and providing low latency for your users
Offer the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching.

History

combined Google BigTable with Amazon Dynamo

Architecture

Overview

Peer-to-Peer distributed system
All nodes the same
Data partitioned among all nodes in the cluster
Custom data replication to ensure fault tolerance
Read/Write anywhere design

Communication

Each node communicates with each other through the Gossip protocol, which exchanges information across cluster every second
A commit log is used on each node to capture write activity
Data durability is assured
Data also written to an in-memory structure (memtable) and then to disk once the memory structure is full (an SStable)

Advantage

Scalability

Since it uses peer to peer distributed systems, it is easy to add nodes to increase from TB to PB

Fault-tolerance

Replication

Post-relation database

Since it column index, it has dynamic schema design to allows for much more flexibile data storage

Data compression

Uses Google's Snappy data compression algorithm
Compresses data on a per column family level

Current status

Facebook used to use Cassandra for inbox search. Now they are using HBase for the same.

Reference

[1] http://cassandra.apache.org/
[2] http://www.planetcassandra.org/blog/interview/facebooks-instagram-making-the-switch-to-cassandra-from-redis-a-75-insta-savings/
[3] https://www.quora.com/Do-facebook-still-make-use-of-Cassandra-in-any-of-its-main-leaving-out-analytics-architecture

No comments:

Post a Comment

Subscribe to: Post Comments (Atom)