Big data analytics with Neo4j and Java, Part 1
See the original posting on JavaWorld
Relational databases have dominated data management for decades, but they’ve recently lost ground to NoSQL alternatives. While NoSQL data stores aren’t right for every use case, they are often a better fit for big data, which is shorthand for systems that process massive volumes of data. Four types of NoSQL data store are commonly used for big data:
- Key/value stores such as Memcached and Redis
- Document-oriented databases such as MongoDB, CouchDB, and DynamoDB
- Column-oriented data stores such as Cassandra and HBase
- Graph databases such as Neo4j and OrientDB
This article introduces Neo4j, a graph database designed for working with highly related data. While relational databases are good at managing direct relationships between data, graph databases are better at managing n-th degree relationships, meaning connections that are several hops away and that a relational query would have to reach through a chain of joins. As an example, take a social network, where you want to analyze patterns involving friends, friends of friends, and so on. A graph database would make it easy to answer a question like, “Given five degrees of separation, what are five movies popular with my social network that I have not yet seen?” Such questions are common for recommendation software, and graph databases are well suited to answering them.
Additionally, graph databases are good at representing hierarchical and networked data, such as access controls, product catalogs, movie databases, or even network topologies and organization charts. When you have objects with multiple relationships, you’ll quickly find that graph databases offer an elegant, object-oriented paradigm for managing those objects.
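To make the friends-of-friends question concrete, here is a minimal sketch of the traversal behind it, written in plain Java with in-memory maps. It does not use the Neo4j API; the class name SocialGraphSketch, its methods, and the sample people and movies are all hypothetical and exist only for illustration. The idea is a breadth-first walk outward from one person, stopping after a given number of hops, followed by a count of the movies that network has seen and the starting person has not.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a graph store; not the Neo4j API.
class SocialGraphSketch {
    // person -> direct friends (friendship is treated as undirected)
    private final Map<String, Set<String>> friends = new HashMap<>();
    // person -> movies that person has seen
    private final Map<String, Set<String>> seen = new HashMap<>();

    void addFriendship(String a, String b) {
        friends.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    void addSeen(String person, String movie) {
        seen.computeIfAbsent(person, k -> new HashSet<>()).add(movie);
    }

    // Breadth-first search: everyone reachable within maxDegrees hops of start.
    Set<String> network(String start, int maxDegrees) {
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            int d = depth.get(current);
            if (d == maxDegrees) {
                continue; // do not expand beyond the hop limit
            }
            for (String friend : friends.getOrDefault(current, Set.of())) {
                if (!depth.containsKey(friend)) {
                    depth.put(friend, d + 1);
                    queue.add(friend);
                }
            }
        }
        Set<String> result = new HashSet<>(depth.keySet());
        result.remove(start);
        return result;
    }

    // Movies seen by the network but not by person, most popular first
    // (ties broken alphabetically), limited to `limit` titles.
    List<String> recommend(String person, int maxDegrees, int limit) {
        Set<String> mine = seen.getOrDefault(person, Set.of());
        Map<String, Integer> counts = new HashMap<>();
        for (String other : network(person, maxDegrees)) {
            for (String movie : seen.getOrDefault(other, Set.of())) {
                if (!mine.contains(movie)) {
                    counts.merge(movie, 1, Integer::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    // Made-up sample data: a chain alice - bob - carol - dave.
    static SocialGraphSketch sample() {
        SocialGraphSketch g = new SocialGraphSketch();
        g.addFriendship("alice", "bob");
        g.addFriendship("bob", "carol");
        g.addFriendship("carol", "dave");
        g.addSeen("alice", "Alien");
        g.addSeen("bob", "Alien");
        g.addSeen("bob", "Heat");
        g.addSeen("carol", "Heat");
        g.addSeen("carol", "Brazil");
        g.addSeen("dave", "Vertigo");
        return g;
    }

    public static void main(String[] args) {
        // Two degrees of separation from alice reaches bob and carol.
        System.out.println(sample().recommend("alice", 2, 5)); // prints [Heat, Brazil]
    }
}
```

In this sketch the application has to maintain the adjacency maps and the traversal itself. A graph database stores the relationships directly and lets you ask for the same multi-hop pattern in a query, which is the convenience the paragraph above describes.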