Red Is Incredible
Redis is an in-memory key-value datastore written in ANSI C by Salvatore Sanfilippo. Beyond plain strings, it supports lists, sets, sorted sets and hashes, and provides a rich set of operations for working with these types. If you have worked with Memcached, an in-memory object caching system, you will find Redis very similar, but think of Redis as Memcached++: in addition to the richer datatypes, it supports data replication and can save data to disk. The key advantages of Redis are:
- Exceptionally Fast : Redis can perform on the order of 110,000 SETs and 81,000 GETs per second. The exact figures depend on your hardware, and you can measure them on your own machine with the redis-benchmark utility.
- Supports Rich data types : Redis natively supports datatypes that most developers already know, such as lists, sets, sorted sets and hashes. This makes it easier to solve a variety of problems, because we already know which kind of problem is handled best by which datatype.
- Operations are atomic : Every individual Redis command executes atomically, so when two clients access the Redis server concurrently, each one sees a fully updated value and never a partially applied change.
- MultiUtility Tool : Redis is a multi-utility tool and can be used in a number of use cases, such as caching, messaging queues (Redis natively supports Publish/Subscribe), and any short-lived data in your application, like web application sessions, web page hit counts, etc. Many people use Redis, and a list of them can be found on the Redis website.
Here are a few things we suggest thinking about when you are utilising the superpowers of Redis.
- Choose consistent ways to name and prefix your keys. Manage your namespace.
- Create a “registry” of key prefixes which maps each prefix to your internal documentation for the applications which “own” it.
- For every class of data you put into your Redis infrastructure: design, implement and test the mechanisms for garbage collection and/or data migration to archival storage.
- Design, implement and test a sharding (consistent hashing) library before you’ve invested much into your application deployment and ensure that you keep a registry of “shards” replicated on each server.
Let us explain each of these points in brief.
Consistent key prefixing & Managing your namespace:
You should assume, from the outset, that your Redis infrastructure will be a common resource used by a number of applications or separate modules. You can have multiple databases on each server, numbered 0 through 15 by default, though you can increase the number with the databases setting in the Redis configuration. However, it’s best to assume that you’ll need to use key prefixes to avoid collisions among the various applications/modules.
Your applications/modules should provide the flexibility to change these key prefixes dynamically. Be sure that all keys are synthesized from the application/module prefix concatenated with the key that you’re manipulating; make hard-coding of key strings verboten.
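As a rough illustration of synthesizing every key from a configurable prefix, here is a minimal Python sketch. The class name KeyNamespace, the ":" separator and the prefix "webapp.sessions" are assumptions made for the example, not anything Redis prescribes.

```python
class KeyNamespace:
    """Builds Redis keys from a configurable application/module prefix."""

    def __init__(self, prefix, separator=":"):
        if not prefix or separator in prefix:
            raise ValueError("prefix must be non-empty and must not contain the separator")
        self.prefix = prefix
        self.separator = separator

    def key(self, *parts):
        """Join the prefix and the given parts into one key string."""
        if not parts:
            raise ValueError("at least one key part is required")
        return self.separator.join([self.prefix, *map(str, parts)])


# Hypothetical module: the prefix would normally come from configuration,
# so it can be changed without touching the code that builds keys.
sessions = KeyNamespace("webapp.sessions")
print(sessions.key("user", 42))  # webapp.sessions:user:42
```

Because callers only ever go through key(), changing the prefix is a configuration change rather than a search through hard-coded strings.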
Registry: Document and Track your namespace
We suggest that you treat certain key patterns (prefixes or glob patterns) as “reserved” on your Redis servers. For example, you can have __key_registry__ (named in the style of Python’s reserved method/attribute names) as a hash that maps key prefixes to URLs in your wiki, Trac, or whatever internal documentation site you use. You can then perform housekeeping on your database contents and track down who or what is responsible for every key you find in any database. Institute a policy that any key which doesn’t match any pattern in your registry can and will be summarily removed by your automated housekeeping.
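The housekeeping policy described above could be sketched roughly as follows. This is only an illustration: the function names, the example patterns and the wiki.example.com URLs are hypothetical, the client is assumed to be a redis-py style object created with decode_responses=True, and Python's fnmatch globbing is used as an approximation of whatever pattern syntax you settle on.

```python
from fnmatch import fnmatchcase

REGISTRY_KEY = "__key_registry__"  # the reserved hash of pattern -> documentation URL


def find_unregistered(keys, patterns):
    """Return the keys that match none of the registered glob patterns."""
    return [
        key for key in keys
        if key != REGISTRY_KEY
        and not any(fnmatchcase(key, pattern) for pattern in patterns)
    ]


def housekeeping_candidates(client):
    """List keys in the current database that no registry pattern claims.

    Assumes `client` offers redis-py style hkeys() and scan_iter() methods
    and returns keys as str.
    """
    patterns = client.hkeys(REGISTRY_KEY)
    return find_unregistered(client.scan_iter(), patterns)


# Hypothetical registry contents and keys, for illustration only.
registry = {
    "webapp.sessions:*": "https://wiki.example.com/webapp/sessions",
    "billing:*": "https://wiki.example.com/billing",
}
keys = ["webapp.sessions:user:42", "billing:invoice:7", "tmp_debug", "__key_registry__"]
print(find_unregistered(keys, registry))  # ['tmp_debug']
```

The sketch only reports candidates. Whether to delete them automatically, and after what grace period, is a policy decision for your own housekeeping job.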
Garbage collection and data migration:
In a persistent, shared key/value store, and in Redis in particular, the collection of garbage is probably the single biggest maintenance issue.
So you need to consider how you’re going to select the data that needs to be migrated out of Redis, perhaps into your SQL/RDBMS or into some other form of archival storage, and how you’re going to track and purge data which is out-of-date or useless.
The obvious approaches involve the EXPIRE or EXPIREAT commands. These allow Redis to manage the garbage collection for you, either relative to your manipulation of any given key, or in terms of an absolute time specification. The only trick about Redis expiration is that you must reset it every single time you overwrite a key: commands that replace a key’s value, such as SET, also clear any timeout previously set on it.
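One way to avoid forgetting the expiry is to route every write through a helper that always sets it. The sketch below assumes a redis-py style client whose set() accepts an ex argument (seconds) and which provides expire(). The helper names write_with_ttl and touch are made up for this example.

```python
def write_with_ttl(client, key, value, ttl_seconds):
    """Overwrite `key` and set its time-to-live in a single SET command."""
    client.set(key, value, ex=ttl_seconds)


def touch(client, key, ttl_seconds):
    """Push the expiry of an existing key `ttl_seconds` into the future.

    Returns True if the key existed and the timeout was applied.
    """
    return bool(client.expire(key, ttl_seconds))
```

With a real server you would pass in a connected client object. The helpers only rely on the two methods named above, so they are easy to exercise against a stub in tests.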
Sharding:
Redis on its own (outside of Redis Cluster, a separate deployment mode available since version 3.0) doesn’t provide sharding, and you should probably assume that you’ll grow beyond the capacity of a single Redis server. Slaves are for redundancy, not for scaling, though you can offload some read-only operations to slaves if you have some way to manage the data consistency. For example, a ZSET of key/timestamp values, such as one maintained to track expiry, can also be used for some offline bulk processing operations, and the pub/sub features can be used by the master to provide hints regarding the quiescence of selected keys/data.
So you should consider writing your own abstraction layer to provide sharding. Basically, imagine that you have implemented a consistent hashing method and that you run every synthesized key through it before you use the key. While you have only a single Redis server, the hash-to-server mapping always ends up pointing to that server. Later, if you need to add more servers, you can adjust the mapping so that half or a third of your keys resolve to the other servers. Of course, you’ll want to implement this so that a failure on a primary server causes your library/service module to automatically retry on the secondary and possibly any tertiary server. Depending on your application, you might even have the tertiary attempts fetch certain types of data from another data source entirely.
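To make the idea concrete, here is a minimal consistent-hashing sketch in Python. It is one possible design, not a recommended library: the class name HashRing, the server names "redis-a" and "redis-b", the use of MD5 (chosen only as a stable, well-spread hash, not for security) and the default of 100 virtual nodes per server are all assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Maps keys to server names using consistent hashing with virtual nodes."""

    def __init__(self, servers, replicas=100):
        self.replicas = replicas
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> server name
        for server in servers:
            self.add_server(server)

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def add_server(self, server):
        """Place `replicas` virtual nodes for `server` on the ring."""
        for i in range(self.replicas):
            point = self._hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def servers_for(self, key, count=1):
        """Return up to `count` distinct servers for `key`, primary first."""
        if not self._points:
            raise ValueError("the ring has no servers")
        start = bisect.bisect(self._points, self._hash(key))
        found = []
        for offset in range(len(self._points)):
            point = self._points[(start + offset) % len(self._points)]
            server = self._owner[point]
            if server not in found:
                found.append(server)
                if len(found) == count:
                    break
        return found


# With a single server, every key resolves to it.
ring = HashRing(["redis-a"])
print(ring.servers_for("webapp.sessions:user:42"))  # ['redis-a']

# After adding a server, only part of the key space moves to it.
ring.add_server("redis-b")
primary, secondary = ring.servers_for("webapp.sessions:user:42", count=2)
```

A library built on this would try the primary first and fall back to the secondary (and any tertiary) in the order returned by servers_for(). Keeping the list of servers identical on every client is what the “registry of shards” is for.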