Ubuntu Unleashed 2019 Edition: Covering 18.04, 18.10, 19.04


Cassandra


Cassandra was developed by Facebook for its inbox searching feature. It was
released as an open source project when Facebook turned it over to Apache.

Cassandra is a key/value store that runs on a flexible cluster of nodes and is
also a wide column store, like HBase, discussed in the “Wide Column Store”
section, later in this chapter. Nodes may be added to and removed from the
cluster. Data is replicated across multiple nodes of the cluster. There is no
central node, and data can be accessed from any node; if the node receiving the
request does not house the specific data requested, it still services the
request by retrieving the data from a node that does and sending it on. The
main goal of Cassandra is fast retrieval of data, with fault tolerance handled
through replication across nodes and speed adjusted by adding nodes to create
more access points.
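The masterless design described above can be illustrated with a small toy
model. The following Python sketch is not Cassandra's implementation; the
`Cluster` class, its method names, and the SHA-256 placement scheme are all
assumptions made for illustration (Cassandra uses its own partitioners and
token ring). It only shows the idea that any node can coordinate a request and
that replication lets reads survive a node failure.

```python
import hashlib


class Cluster:
    """Toy masterless cluster: any live node can coordinate any request."""

    def __init__(self, node_names, replicas=2):
        self.replicas = replicas
        # Each node gets a private dict standing in for its local storage.
        self.nodes = {name: {} for name in node_names}
        self.down = set()  # names of nodes currently simulated as failed

    def _check(self, coordinator):
        if coordinator not in self.nodes or coordinator in self.down:
            raise ValueError("coordinator %r is not available" % coordinator)

    def _owners(self, key):
        """Choose the replica nodes for a key by hashing it onto the node list."""
        names = sorted(self.nodes)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(names)
        count = min(self.replicas, len(names))
        return [names[(start + i) % len(names)] for i in range(count)]

    def write(self, coordinator, key, value):
        """The coordinator forwards the write to every live replica."""
        self._check(coordinator)
        for owner in self._owners(key):
            if owner not in self.down:
                self.nodes[owner][key] = value

    def read(self, coordinator, key):
        """The coordinator fetches the value from the first live replica."""
        self._check(coordinator)
        for owner in self._owners(key):
            if owner not in self.down and key in self.nodes[owner]:
                return self.nodes[owner][key]
        return None


cluster = Cluster(["a", "b", "c", "d"], replicas=2)
cluster.write("a", "user:1", "hello")
# Every node can answer, whether or not it holds the data itself.
print([cluster.read(name, "user:1") for name in ["a", "b", "c", "d"]])
```

Marking one of the two replica nodes as failed (by adding its name to
`cluster.down`) still leaves the value readable through any remaining node,
which is the fault tolerance that replication is meant to provide.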


One interesting feature is that Cassandra may be tuned to adjust the trade-off
between speed of transactions and consistency of data. When data is stored, it
is initially held in memory and is written to disk only when specific criteria
are met, which makes interaction very quick. In fact, not all data stored in
Cassandra is designed to persist over time, and some data might never be
written to disk at all. As a result, a reader might not always find a specific
piece of data, but in cases like Facebook’s need to store inbox search data
that has only limited time value (such as search results that could be different
tomorrow or even 10 minutes from now), this might not matter at all. In these
cases, access speed and convenience are more important than guaranteed
persistence.
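The memory-first write path can be sketched in a few lines of Python. This is
a toy model of the behavior described here, not Cassandra's actual storage
engine; the `WriteBuffer` class, the `flush_threshold` parameter, and the use
of an entry count as the flush criterion are all hypothetical choices made to
keep the example small.

```python
class WriteBuffer:
    """Toy write buffer: writes land in memory and flush to 'disk' in batches."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.memory = {}  # recent writes, fast to reach
        self.disk = {}    # stand-in for persistent storage

    def put(self, key, value):
        self.memory[key] = value
        # The flush criterion here is simply the number of buffered entries.
        if len(self.memory) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Move everything buffered in memory to the persistent store."""
        self.disk.update(self.memory)
        self.memory.clear()

    def get(self, key):
        """Check memory first, then fall back to disk."""
        if key in self.memory:
            return self.memory[key]
        return self.disk.get(key)

    def crash(self):
        """Simulate losing the process: anything not yet flushed is gone."""
        self.memory.clear()


buf = WriteBuffer(flush_threshold=3)
for key, value in [("a", 1), ("b", 2), ("c", 3), ("d", 4)]:
    buf.put(key, value)
buf.crash()
print(buf.get("a"), buf.get("d"))  # "a" was flushed; "d" was only in memory
```

The trade-off is visible in `crash()`: a higher threshold means fewer disk
writes and faster interaction, but more recent data at risk if the buffer is
lost before it is flushed.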


Cassandra is being used by Facebook, Twitter, Reddit, and many others.


etcd


The open source project behind etcd is CoreOS, which works in the container
world. (See Chapter 32, “Containers and Ubuntu,” for more about containers.)
This key/value store is designed specifically for containerized deployment
across a cluster of machines. It is written in Go and is in production use by
many large companies, including Cloud Foundry and anyone using Kubernetes.


The focus of etcd is fourfold: simplicity, security, speed, and reliability. It
includes a user-facing API, and its complete source code is available on
GitHub.
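As a rough illustration of the user-facing API, the following Python sketch
stores one key through etcd's JSON gateway using only the standard library. It
assumes an unsecured etcd member listening on the default client address
`http://127.0.0.1:2379` and a release that serves the gateway under the `/v3`
prefix; older releases used `/v3beta` or `/v3alpha` instead, so the path might
need adjusting. The helper names `build_put_body` and `put` are hypothetical.

```python
import base64
import json
import urllib.request

ETCD_URL = "http://127.0.0.1:2379"  # assumption: local etcd, no TLS or auth


def _b64(text):
    """The JSON gateway expects keys and values as base64-encoded strings."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_put_body(key, value):
    """Build the JSON body for a put request without sending it."""
    return json.dumps({"key": _b64(key), "value": _b64(value)}).encode("utf-8")


def put(key, value, base_url=ETCD_URL):
    """Send the put request; this needs a reachable etcd server."""
    request = urllib.request.Request(
        base_url + "/v3/kv/put",  # assumption: gateway prefix is /v3
        data=build_put_body(key, value),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Inspect the request body; call put("foo", "bar") only with etcd running.
print(build_put_body("foo", "bar").decode("utf-8"))
```

Keeping the body construction separate from the network call makes it possible
to check the encoding without a running cluster.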


Memcached and MemcacheDB
