What database should I use?

Nowadays there are a lot of different databases. They work in very different ways, so it can be tricky to know which one would suit a specific project best. The aim of this tool is to be a quick and easy way to compare different databases and see which one would work best for you.

Relational vs Nonrelational

Relational databases are the 'old school' databases, the kind you query with SQL. Nonrelational databases are the 'trendier' databases, such as MongoDB. Relational databases store data much like an Excel spreadsheet, with each value sitting at a row and a column. Nonrelational databases do not store data in this tabular way (how the data is actually stored varies by database). Because the data is stored differently, the choice affects the speed, scaling, and complexity of the database. If your data would fit well in a table, then you most likely want a relational database. When choosing between a relational and a nonrelational database, the most important question to ask is: do I care about the relations between the data? An example of a system that does is one that uses lots of joins to combine rows from multiple tables.

| Category | Relational | Nonrelational |
| --- | --- | --- |
| Queries | Can handle more complicated queries (joins, for example) | Better at simpler queries |
| Ease of Scaling | Harder to scale (vertical scaling) | Easier to scale (horizontal scaling) |
| Data type | Structured data only | Unstructured data |
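
To make the join question concrete, here is a minimal sketch using Python's built-in sqlite3 module for the relational side and plain dictionaries as a stand-in for a document store. The table names, column names, and sample records are all made up for illustration, and the dictionaries only mimic the shape of document data; they are not any particular database's API.

```python
import sqlite3

# Relational: two tables linked by a key, combined at query time with a join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO books VALUES (1, 'Notes', 1), (2, 'Compilers', 2), (3, 'Engines', 1);
""")
rows = conn.execute("""
    SELECT authors.name, books.title
    FROM books JOIN authors ON books.author_id = authors.id
    ORDER BY books.id
""").fetchall()
conn.close()

# Document style (hypothetical data): each record carries its related data
# with it, so reading one author's books needs no join.
documents = [
    {"name": "Ada", "books": [{"title": "Notes"}, {"title": "Engines"}]},
    {"name": "Grace", "books": [{"title": "Compilers"}]},
]
titles_by_ada = [
    book["title"]
    for doc in documents
    if doc["name"] == "Ada"
    for book in doc["books"]
]
```

If most of your questions look like the SQL query (combine rows from several tables), that points toward a relational database. If most look like the second lookup (fetch one record and everything inside it), a nonrelational store may be a better fit.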

Different types of NRDBs

There are a lot of different types of nonrelational databases (NRDBs). A whole lot. They differ in how data is stored, in the type of data that can be stored, and in how difficult the database is to set up. I would recommend using this chart to get an overall view of the different types of NRDBs, but you'll definitely want to read a little more about the architecture of the different systems.

NRDBs usually split up the data and store it in different, individually scalable areas, rather than keeping it all together as a relational database does. While the database is typically very scalable and there is usually no single point of failure, there are downsides, such as very expensive joins.
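
As a rough illustration of splitting data across individually scalable areas, the sketch below spreads records over a few in-memory "shards" by hashing the key. The shard count, key format, and helper names (`shard_for`, `put`, `get`, `find_where`) are assumptions made up for this example; real systems use their own partitioning schemes.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical cluster size

def shard_for(key: str) -> int:
    """Map a key to a shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Each dict stands in for one independently scalable storage node.
shards = [dict() for _ in range(NUM_SHARDS)]

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    # A lookup by key touches exactly one shard.
    return shards[shard_for(key)].get(key)

def find_where(predicate):
    """A query that is not by key has to visit every shard."""
    hits = []
    for shard in shards:
        hits.extend(v for v in shard.values() if predicate(v))
    return hits

for i in range(10):
    put(f"user:{i}", {"id": i, "plan": "free" if i % 2 else "paid"})

paid = find_where(lambda user: user["plan"] == "paid")
```

Lookups by key stay cheap no matter how many shards there are, which is where the scalability comes from. Anything that has to combine records living on different shards, as a join does, ends up scanning all of them, which is why joins are expensive in this kind of layout.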

An important theorem for NRDBs is the CAP theorem: Consistency, Availability, and Partition Tolerance. It states that a distributed system cannot fully guarantee all three at once, so tradeoffs need to be made between them. In practice, when the network partitions, a system that stays highly available has to give up some consistency.
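
The tradeoff can be seen in a toy model. The sketch below is not how any real database works; it is a two-replica store invented for this example, where `mode` is either "CP" (refuse writes during a partition to stay consistent) or "AP" (accept writes during a partition and let the replicas drift apart).

```python
class Replica:
    def __init__(self):
        self.data = {}

class TwoNodeStore:
    """Toy two-replica store; mode is "CP" or "AP"."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True when the replicas cannot reach each other

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write rather than let the replicas disagree.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the other replica is now stale.
            replica.data[key] = value
            return
        # No partition: replicate to both nodes.
        self.a.data[key] = value
        self.b.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)

def demo(mode):
    store = TwoNodeStore(mode)
    store.write(store.a, "x", 1)
    store.partitioned = True
    try:
        store.write(store.a, "x", 2)
        accepted = True
    except RuntimeError:
        accepted = False
    return accepted, store.read(store.a, "x"), store.read(store.b, "x")

cp_result = demo("CP")  # (False, 1, 1): write rejected, replicas agree
ap_result = demo("AP")  # (True, 2, 1): write accepted, replicas disagree
```

The "CP" run stays consistent but turns a client away; the "AP" run keeps serving writes but the two replicas now return different answers until they are reconciled.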

| NRDB | MongoDB | Cassandra | Redis | Hadoop |
| --- | --- | --- | --- | --- |
| Storage Type | Documents | Wide columns | Data structures | Distributed file system/MapReduce |
| Schema | Extremely flexible | Least flexible | Flexible in that it supports abstract data strings | Flexible data types |
| Ease of Use | Easy set up | Moderate set up | Most difficult set up | Fairly difficult set up |
| Architecture | Master/Slave | Distributed clusters | Master/Slave | Distributed clusters |
| General Speed | Slower writes due to a lock system | Constant write time | Read/write at the same rate | Very slow, especially with small data sets |
| Speed at Scale | The more writes, the slower it is | Performs fast even at scale | Fastest | Slowest |
| General Notes | Good if you don't know how you're going to query your data yet | Eventually consistent; no single point of failure | Data must fit in memory; supports very advanced queries | Flexible data types; can store lots of data |
| The bottom line | Good for new users looking to try an NRDB. Also, can take any kind of doc you throw at it. Great if you aren't fully sure how you plan to query your data, as it is very flexible. | Splits your data up into 'keyspaces', with 'column families', then rows, and then columns and values. Spreads data around clusters. A little tricky to set up, but performs very well at scale. | Complex data type key-value store. The in-memory database can perform some complicated queries, and fast. | Perfect for storing huge sets of data that you plan on performing batch jobs on at the end of the day. |
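
The Cassandra layout described in the chart (keyspaces, then column families, then rows, then columns and values) can be pictured as nested dictionaries. This is only a mental model written in plain Python, not Cassandra's API; the keyspace, column family, and row names are hypothetical, as is the `read_column` helper.

```python
# keyspace -> column family -> row key -> {column name: value}
cluster = {
    "shop": {                      # keyspace (hypothetical name)
        "users": {                 # column family
            "user:1": {"name": "Ada", "email": "ada@example.com"},
            "user:2": {"name": "Grace"},
        },
    },
}

def read_column(keyspace, family, row_key, column, default=None):
    """Walk the four levels; return `default` if any level is missing."""
    row = cluster.get(keyspace, {}).get(family, {}).get(row_key, {})
    return row.get(column, default)

ada_name = read_column("shop", "users", "user:1", "name")
grace_email = read_column("shop", "users", "user:2", "email")  # falls back to None
```

Reading a value means knowing the full path down to the column, which hints at why it pays to think about your queries up front with this kind of store.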