Saying "Yes" to NoSQL: Overview
Overview:
The Relational Model
Structured Query Language (SQL)
The “original” NoSQL Movement
NoSQL Today
Associations:
Raymond F. Boyce
Hugh Darwen
C.J. Date
Nikos Lorentzos
David McGoveran
Fabian Pascal
The Relational Model
“A Relational Model of Data for Large Shared Data Banks,” E.F. Codd, Communications of the ACM, Vol. 13,
“Further Normalization of the Data Base Relational Model,” E.F. Codd, Data Base Systems, Proceedings of
“Relational Completeness of Data Base Sublanguages,” E.F. Codd, Data Base Systems, Proceedings of 6th
Plus others…
The Relational Model
“Employee”
ID     Last-Name  Date-of-Birth  Job-Category
15394  Jones      11/3/75        Software
21621  Smith      6/24/69        Management
17852  Brown      8/14/72        Hardware
32904  Carson     10/29/64       Software
:      :
The basic data model:
Relations, tuples, attributes, domains
Primary & foreign keys
Normal forms
Query model:
Relational algebra – cartesian product, selection, projection, union, set-difference
Relational calculus
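The relational algebra operators listed above can be sketched in a few lines of Python. This is only an illustration, assuming a relation is modeled as a set of tuples with a separate tuple of attribute names; the function names (select, project, cartesian_product) are hypothetical, and the sample rows are taken from the “Employee” relation on this slide.

```python
from itertools import product

# Attribute names and rows of the "Employee" relation shown on the slide.
EMPLOYEE_ATTRS = ("ID", "Last-Name", "Date-of-Birth", "Job-Category")
EMPLOYEE = {
    (15394, "Jones", "11/3/75", "Software"),
    (21621, "Smith", "6/24/69", "Management"),
    (17852, "Brown", "8/14/72", "Hardware"),
    (32904, "Carson", "10/29/64", "Software"),
}


def select(rows, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return {row for row in rows if predicate(row)}


def project(rows, attrs, keep):
    """Projection: keep only the named attributes (duplicates collapse, as in a set)."""
    indexes = [attrs.index(name) for name in keep]
    return {tuple(row[i] for i in indexes) for row in rows}


def cartesian_product(left, right):
    """Cartesian product: every tuple of left concatenated with every tuple of right."""
    return {a + b for a, b in product(left, right)}


# Union and set-difference map directly onto Python's set operators.
software = select(EMPLOYEE, lambda row: row[3] == "Software")
software_names = project(software, EMPLOYEE_ATTRS, ["Last-Name"])
non_software = EMPLOYEE - software           # set-difference
everyone = software | non_software           # union
pairs = cartesian_product(software_names, {("Software",), ("Hardware",)})
```

Because each operator takes relations and returns a relation, the operators compose freely, which is the property a query language built on the algebra relies on.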
A primary theme:
Physical data independence
Relational Database Management Systems (RDBMS)
Structured Query Language (SQL)
History:
Developed at IBM San Jose Research Laboratory, early 1970s, for System R
Credited to Donald D. Chamberlin and Raymond F. Boyce
Based on relational algebra and tuple calculus
Originally called SEQUEL
Language Elements:
Clauses, expressions, predicates, queries, statements, transactions, operators, nesting etc.
group by o_orderpriority
order by o_orderpriority;
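Only the closing clauses of the example query survive on this slide, and the rest is not reconstructed here. As a rough illustration of what group by and order by do, the sketch below runs a small query through Python's built-in sqlite3 module. The orders table, its rows, and the count(*) column are assumptions made up for the example; only the column name o_orderpriority comes from the slide.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table (not from the slide).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (o_orderkey integer primary key, o_orderpriority text)"
)
conn.executemany(
    "insert into orders values (?, ?)",
    [(1, "1-URGENT"), (2, "2-HIGH"), (3, "1-URGENT"), (4, "3-MEDIUM")],
)

# group by forms one group per priority; order by sorts the resulting groups.
rows = conn.execute(
    """
    select o_orderpriority, count(*) as order_count
    from orders
    group by o_orderpriority
    order by o_orderpriority;
    """
).fetchall()
conn.close()
```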
SQL and the Relational Model
A text search of E.F. Codd’s early papers for “SQL” (or SEQUEL) reveals:
Relational Query Languages
The NoSQL RDBMS
One of the first uses of the phrase NoSQL is due to Carlo Strozzi, circa 1998.
NoSQL:
A fast, portable, open-source RDBMS
A derivative of the RDB database system (Walter Hobbs, RAND)
Not a full-function DBMS, per se, but a shell-level tool
User interface – Unix shell
Based on the “operator/stream paradigm”
http://www.strozzi.it/cgi-bin/CSA/tw7/I/en_US/nosql/Home%20Page
Operator/stream Paradigm
“…almost all are software prisons that you must get into and leave the power of UNIX behind.”
“…large, complex programs which degrade total system performance, especially when they are run in a multi-user
environment.”
“…put walls between the user and UNIX, and the power of UNIX is thrown away.”
In summary:
Relational model => yes
UNIX => big yes
Big, COTS, relational DBMS => no
SQL => no
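One way to picture the operator/stream paradigm is as a chain of small operators, each reading a stream of rows and writing a stream of rows, much like commands joined by a Unix pipe. The sketch below mimics that idea with Python generators. It assumes tables are tab-separated text with a header row, and the operator names (read_table, where, columns) are hypothetical, not the actual commands of the NoSQL RDBMS.

```python
import csv
import io


def read_table(text):
    """Source operator: turn tab-separated text with a header row into dict rows."""
    yield from csv.DictReader(io.StringIO(text), delimiter="\t")


def where(rows, predicate):
    """Selection operator: pass through only the rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def columns(rows, names):
    """Projection operator: keep only the named columns of each row."""
    for row in rows:
        yield {name: row[name] for name in names}


TABLE = (
    "ID\tLast-Name\tJob-Category\n"
    "15394\tJones\tSoftware\n"
    "21621\tSmith\tManagement\n"
    "32904\tCarson\tSoftware\n"
)

# Operators are chained like a shell pipeline: read | where | columns.
pipeline = columns(
    where(read_table(TABLE), lambda row: row["Job-Category"] == "Software"),
    ["Last-Name"],
)
result = list(pipeline)
```

Each stage is lazy and knows nothing about the others, so stages can be recombined freely, which is the appeal the quotes above attribute to staying inside UNIX.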
The NoSQL RDBMS
In that sense, and interpreted literally, NoSQL means “no sql,” i.e., we are not using the SQL language.
NoSQL Today
More recently:
The term has taken on different meanings
One common interpretation is “not only SQL”
Most modern NoSQL systems diverge from the relational model or standard RDBMS functionality:
The data model: relations, tuples, attributes, domains, normalization vs. documents, graphs, key/values
In that sense, NoSQL today is more commonly meant to be something like “non-relational”
NoSQL Today
Is this another grand conspiracy by the government and, you know, that guy….
NoSQL Today
(a partial, unrefined list)
NoSQL Today
Primary NoSQL Categories
Key/Value Store
“Dynamo: Amazon’s Highly Available Key-value Store,” DeCandia, G., et al., SOSP’07, 21st ACM Symposium on Operating Systems Principles.
The basic data model:
Database is a collection of key/value pairs
The key for each pair is unique
No requirement for normalization (and consequently dependency preservation or lossless join)
Primary operations:
insert(key,value)
delete(key)
update(key,value)
lookup(key)
Additional operations:
variations on the above, e.g., reverse lookup
iterators
Examples: DynamoDB, Azure Table Storage, Riak, Redis, Aerospike, FoundationDB, LevelDB, Berkeley DB, Oracle NoSQL Database, GenieDb, BangDB, Chordless, Scalaris, Tokyo Cabinet/Tyrant, Scalien, Voldemort, Dynomite, KAI, MemcacheDB, Faircom C-Tree, LSM, KitaroDB, HamsterDB, STSdb, TarantoolBox, Maxtable, Quasardb, Pincaster, RaptorDB, TIBCO Active Spaces, Allegro-C, nessDB, HyperDex, SharedHashFile, Symas LMDB, Sophia, PickleDB, Mnesia, LightCloud, Hibari, OpenLDAP, Genomu, BinaryRage, Elliptics, Dbreeze, RocksDB, TreodeDB
(www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
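The key/value operations named on this slide can be sketched as a thin wrapper around a Python dictionary. This is a minimal single-process illustration, not how any of the listed systems is implemented. The class name KeyValueStore and the choice to raise KeyError on a duplicate insert or a missing key are assumptions made for the example.

```python
class KeyValueStore:
    """A toy in-memory key/value store; each key maps to exactly one value."""

    def __init__(self):
        self._data = {}

    def insert(self, key, value):
        # Keys are unique, so inserting an existing key is treated as an error here.
        if key in self._data:
            raise KeyError(f"duplicate key: {key!r}")
        self._data[key] = value

    def delete(self, key):
        del self._data[key]

    def update(self, key, value):
        # Unlike insert, update requires that the key already exists.
        if key not in self._data:
            raise KeyError(f"unknown key: {key!r}")
        self._data[key] = value

    def lookup(self, key):
        return self._data[key]

    def reverse_lookup(self, value):
        """Additional operation: all keys whose value equals the given value."""
        return [key for key, stored in self._data.items() if stored == value]

    def __iter__(self):
        """Additional operation: iterate over (key, value) pairs."""
        return iter(self._data.items())


store = KeyValueStore()
store.insert("a", 1)
store.insert("b", 1)
store.update("a", 2)
keys_with_one = store.reverse_lookup(1)
store.delete("b")
remaining = list(store)
```

Note that reverse lookup has to scan every pair, since the store indexes only by key.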
Wide Column Store
“Bigtable: A Distributed Storage System for Structured Data,” Chang, F., et al., OSDI’06: Seventh
The basic data model:
Database is a collection of key/value pairs
Key consists of 3 parts – a row key, a column key, and a time-stamp (i.e., the version)
Flexible schema - the set of columns is not fixed, and may differ from row-to-row
One last column detail:
Column key consists of two parts – a column family, and a qualifier
Warning #1!
Examples: Accumulo, Amazon SimpleDB, BigTable, Cassandra, Cloudata, Cloudera, Druid, Flink, HBase, Hortonworks, HPCC, Hypertable, KAI, KDI, MapR, MonetDB, OpenNeptune, Qbase, Splice Machine, Sqrrl
(www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
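The three-part key described above (row key, column key, time-stamp), with the column key itself split into a family and a qualifier, can be sketched as a nested mapping. This is an illustrative toy, not the storage layout of Bigtable or any listed system. The class name WideColumnTable, the "family:qualifier" string form, and the patient rows are assumptions made for the example.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide column table: (row key, column key, time-stamp) -> value."""

    def __init__(self):
        # (row key, "family:qualifier") -> {time-stamp: value}
        self._cells = defaultdict(dict)

    def put(self, row_key, family, qualifier, timestamp, value):
        self._cells[(row_key, f"{family}:{qualifier}")][timestamp] = value

    def get(self, row_key, family, qualifier, timestamp=None):
        """Return the value at a time-stamp, or the latest version if none is given."""
        versions = self._cells.get((row_key, f"{family}:{qualifier}"), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def columns(self, row_key):
        """Column keys present in one row; the set may differ from row to row."""
        return sorted(col for (row, col) in self._cells if row == row_key)


table = WideColumnTable()
table.put("patient-1", "vitals", "pulse", 0, 72)
table.put("patient-1", "vitals", "pulse", 1, 80)   # a newer version of the same cell
table.put("patient-2", "labs", "glucose", 0, 5.4)  # a row with different columns

latest = table.get("patient-1", "vitals", "pulse")
older = table.get("patient-1", "vitals", "pulse", timestamp=0)
```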
Wide Column Store
[Figure: labels “Column families,” “Row key,” “Column qualifiers”]
[Figure: medical data, one “table”]
[Figure: one “row,” with row key and time-stamps t0 and t1]
Graph Store
Neo4j - “The Neo Database – A Technology Introduction,” 2006.
The basic data model:
Directed graphs
Nodes & edges, with properties, i.e., “labels”
Examples: AllegroGraph, ArangoDB, Bigdata, Bitsy, BrightstarDB, DEX/Sparksee, Execom IOG, Fallen *, Filament, FlockDB, GraphBase, Graphd, Horton, HyperGraphDB, IBM System G Native Store, InfiniteGraph, InfoGrid, jCoreDB Graph, MapGraph, Meronymy, Neo4j, Orly, OpenLink virtuoso, Oracle Spatial and Graph, Oracle NoSQL Database, OrientDB, OQGraph, Ontotext OWLIM, R2DF, ROIS, Sones GraphDB, SPARQLCity, Sqrrl Enterprise, Stardog, Teradata Aster, Titan, Trinity, TripleBit, VelocityGraph, VertexDB, WhiteDB
(www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
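A directed graph whose nodes and edges carry properties can be sketched with two plain containers. This is a minimal illustration of the data model only, not the API of Neo4j or any listed system. The class name PropertyGraph, its methods, and the sample nodes are assumptions made for the example.

```python
class PropertyGraph:
    """Toy directed property graph: nodes and edges both carry key/value properties."""

    def __init__(self):
        self.nodes = {}   # node id -> properties
        self.edges = []   # (source id, target id, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.edges.append((source, target, properties))

    def successors(self, node_id, **match):
        """Targets of outgoing edges whose properties include every given key/value."""
        return [
            target
            for source, target, props in self.edges
            if source == node_id
            and all(props.get(key) == value for key, value in match.items())
        ]


graph = PropertyGraph()
graph.add_node("alice", kind="person")
graph.add_node("bob", kind="person")
graph.add_node("report", kind="document")
graph.add_edge("alice", "bob", label="knows")
graph.add_edge("alice", "report", label="wrote")

known_by_alice = graph.successors("alice", label="knows")
known_by_bob = graph.successors("bob", label="knows")  # edges are directed
```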
Document Store
MongoDB - “How a Database Can Make Your Organization Faster, Better, Leaner,” February 2015.
The basic data model:
The general notion of a document – words, phrases, sentences, paragraphs, sections, subsections, footnotes, etc.
Flexible schema – subcomponent structure may be nested, and vary from document-to-document.
Metadata – title, author, date, embedded tags, etc.
Key/identifier.
One implementation detail:
Formats vary greatly – PDF, XML, JSON, BSON, plain text, various binary, scanned image.
Examples: AmisaDB, ArangoDB, BaseX, Cassandra, Cloudant, Clusterpoint, Couchbase, CouchDB, Densodb, Djondb, EJDB, Elasticsearch, eXist, FleetDB, iBoxDB, Inquire, JasDB, MarkLogic, MongoDB, MUMPS, NeDB, NoSQL embedded db, OrientDB, RaptorDB, RavenDB, RethinkDB, SDB, SisoDB, Terrastore, ThruDB
(www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
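The document model above (a key/identifier, metadata, and nested structure that may vary from document to document) can be sketched with JSON-like Python dictionaries. This is a toy illustration, not the API of MongoDB or any listed system. The class name DocumentStore, the dotted-path find method, and the sample documents are assumptions made for the example.

```python
import json

_MISSING = object()  # sentinel for "this path does not exist in the document"


class DocumentStore:
    """Toy document store: identifier -> nested, JSON-like document."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, document):
        # Round-trip through JSON to store an independent copy of plain nested data.
        self._docs[doc_id] = json.loads(json.dumps(document))

    def get(self, doc_id):
        return self._docs[doc_id]

    def find(self, path, value):
        """Identifiers of documents whose field at a dotted path equals value."""
        hits = []
        for doc_id, doc in self._docs.items():
            node = doc
            for part in path.split("."):
                if isinstance(node, dict) and part in node:
                    node = node[part]
                else:
                    node = _MISSING
                    break
            if node is not _MISSING and node == value:
                hits.append(doc_id)
        return hits


store = DocumentStore()
# Two documents with different shapes: no shared schema is required.
store.put("d1", {
    "metadata": {"title": "Report", "author": "A. Author"},
    "sections": [{"heading": "Intro", "paragraphs": ["First paragraph."]}],
})
store.put("d2", {"metadata": {"title": "Notes"}, "body": "plain text"})

by_title = store.find("metadata.title", "Notes")
by_author = store.find("metadata.author", "A. Author")
```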
ACID vs. BASE
CAP theorem - At most two of consistency, availability, and partition tolerance can be enforced at any given time.
Conjecture – Eric Brewer, ACM Symposium on the Principles of Distributed Computing, 2000.
Proved – Seth Gilbert & Nancy Lynch, ACM SIGACT News, 2002.
ACID vs. BASE
Thus, distributed NoSQL systems are typically said to support some form of BASE:
Basic Availability
Soft state
Eventual consistency*
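Eventual consistency can be pictured with two replicas that accept writes independently and reconcile later. The sketch below is a simplified illustration that assumes last-write-wins reconciliation by time-stamp, which is only one of several strategies real systems use. The Replica class, its methods, and the sample data are hypothetical.

```python
class Replica:
    """Toy replica that keeps the newest (time-stamp, value) it has seen per key."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-write-wins: accept the write only if it is newer than what is stored.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Pull the other replica's entries and merge them with the same rule."""
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica("a"), Replica("b")
a.write("cart", ["book"], timestamp=1)
b.write("cart", ["book", "pen"], timestamp=2)

# Before synchronization the replicas disagree: both stay available, but reads may be stale.
stale_read = a.read("cart")

# After exchanging state in both directions, the replicas converge.
a.sync_from(b)
b.sync_from(a)
converged = a.read("cart") == b.read("cart")
```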
“We’d really like everything to be structured, consistent and harmonious,…, but what we are faced with is a
little bit of punk-style anarchy. And actually, whilst it might scare our grandmothers, it’s OK...”
-Julian Browne
https://www.youtube.com/watch?v=pOe9PJrbo0s