Apache Cassandra… Best of Big Data and Relational Databases
Published on: Author: Gail Djaspan Category: Data Science
Apache Cassandra was originally developed at Facebook to power its inbox search feature. With more than one billion users and no existing software that could handle search on that scale, the social network had to be creative. Its engineers wrote their own software, Cassandra, and eventually released it as an open source project on Google Code. It became so successful that it moved on to become an Apache Incubator project and finally a top-level Apache project.
Cassandra is designed especially for processing huge volumes of transactions across multiple servers. It provides high availability, supports clusters that span multiple datacenters, and places a high value on performance. Cassandra can also be used alongside utilities such as OpsCenter, and it can be integrated with other Apache projects such as Hadoop.
Big data?
Wherever large and complex datasets come together, you are likely to hear people speak of Big Data, with challenges such as capture, curation, storage, search, sharing, transfer, analysis and visualization. At Qualogy Caribbean, a group of Java developers faces these kinds of challenges daily. So far, Cassandra has proved to be ideal for them: the team found that it stored their data with ease, without slowing down production.
Will Apache Cassandra eventually replace relational databases altogether?
We don't think so. In most cases, big data databases, or NoSQL (Not Only SQL) databases, don't exist because relational database technology fails; they usually arise from the need to combine datasets. Therefore, databases like Cassandra should only be used when the circumstances require it.
However, if combining two database technologies helps speed up the development of an application, why not choose that option? Take a messaging application, for example: you can use relational technology to manage the user accounts and big data technology to manage the messages and files. With Apache Cassandra you can always opt for the best of both worlds.
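To make the messaging example a little more concrete, here is a minimal sketch of that split in Python. It is an illustration under stated assumptions, not a recommended implementation: SQLite stands in for the relational database, and the `MessageStore` class is a hypothetical in-memory stand-in for a Cassandra table that groups messages per recipient and keeps them ordered by timestamp. All names (`add_user`, `send_message`, `MessageStore`) are made up for this example.

```python
import sqlite3
from bisect import insort
from collections import defaultdict

# Relational side: user accounts live in SQLite (stand-in for any RDBMS).
accounts = sqlite3.connect(":memory:")
accounts.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
)


def add_user(name):
    """Insert a user account and return its generated id."""
    cur = accounts.execute("INSERT INTO users (name) VALUES (?)", (name,))
    accounts.commit()
    return cur.lastrowid


def user_exists(user_id):
    """Check the relational store for an account with this id."""
    row = accounts.execute(
        "SELECT 1 FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row is not None


class MessageStore:
    """Hypothetical stand-in for a Cassandra message table.

    Assumption: messages are grouped per recipient (like a partition key)
    and kept sorted by timestamp (like a clustering column).
    """

    def __init__(self):
        self._partitions = defaultdict(list)

    def append(self, recipient_id, sent_at, sender_id, body):
        # insort keeps each recipient's list ordered by sent_at.
        insort(self._partitions[recipient_id], (sent_at, sender_id, body))

    def inbox(self, recipient_id, limit=10):
        """Return the newest messages first, at most `limit` of them."""
        rows = self._partitions.get(recipient_id, [])
        return rows[-limit:][::-1]


messages = MessageStore()


def send_message(sender_id, recipient_id, sent_at, body):
    """Validate accounts relationally, then write to the message store."""
    if not (user_exists(sender_id) and user_exists(recipient_id)):
        raise ValueError("unknown sender or recipient")
    messages.append(recipient_id, sent_at, sender_id, body)
```

In a real deployment the `MessageStore` would be replaced by calls to an actual Cassandra cluster; the point of the sketch is only the division of labour, with account checks going to the relational side and the high-volume message writes going to the other store.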