Databases In Review

Can the newer NoSQL database formats, such as key-value stores, graph databases, column-family datastores, and document-oriented databases, compete with already optimized relational databases like Oracle, MySQL, etc.?

Traditional relational databases have little room for improvement: they are highly optimized and already in place in a large number of applications. But they don't scale well, and that is where the different types of NoSQL databases come in. They give up some things, such as transactions, or compromise on other features, but they are built to be fast and scalable.

Applications also change quickly, and the database has to adapt to changing requirements and a changing structure. Traditional databases are harder to change. Changing one means changing the whole database structure and then changing all the applications that use the database, which makes it hard to accommodate change. In less critical areas such as social network data, where change is normal and traditional databases are too difficult to use and to scale, NoSQL provides the flexibility to change the structure for new data and to merge data in different formats without changing existing applications. Sharding and replication also work well with large databases.

With the growth of the internet, organizations collect more and more data, and existing databases fail to accommodate big data. To handle it, there is a need for technologies like NoSQL databases, Hadoop, MapReduce, and similar techniques that break problems into smaller chunks, together with cloud computing, to do what is no longer possible in traditional databases.
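To make the sharding and replication idea above more concrete, here is a minimal sketch in plain Python. It is not how any particular NoSQL product works; the names `shard_for` and `ShardedStore` are made up for illustration, and the "servers" are just in-memory dictionaries. It assumes keys are routed to a shard by hashing, and that every write is copied to all replicas of that shard.

```python
import hashlib


def shard_for(key, num_shards):
    """Pick a shard index for a key using a stable hash (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy key-value store: each shard is a list of replica dictionaries."""

    def __init__(self, num_shards=4, replicas=2):
        self.num_shards = num_shards
        self.shards = [
            [{} for _ in range(replicas)] for _ in range(num_shards)
        ]

    def put(self, key, value):
        # Write the value to every replica of the shard that owns the key.
        for replica in self.shards[shard_for(key, self.num_shards)]:
            replica[key] = value

    def get(self, key, replica_index=0):
        # Read from one replica; any replica of the owning shard would do.
        shard = self.shards[shard_for(key, self.num_shards)]
        return shard[replica_index].get(key)


store = ShardedStore(num_shards=4, replicas=2)
# Records do not have to share the same fields.
store.put("user:1", {"name": "Ann"})
store.put("user:2", {"name": "Bob", "friends": ["user:1"]})
print(store.get("user:2", replica_index=1))
```

Note that the two records have different fields and nothing else had to change to store them, which is the kind of flexibility described above. A real system would also have to deal with failed replicas and with moving data when shards are added, which this sketch ignores.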

In the past, if you had a lot of data with a lot of columns and wanted to find a pattern between those variables and the output you were interested in, you would use statistical analysis. These statistical models become too difficult to use when the data approaches a large scale, i.e. big data. At that size the models become slow and eventually impractical to run. To keep using them, there is a need for algorithms that distribute the data into buckets, use Hadoop and MapReduce to apply the calculation we are interested in to each smaller problem, and then merge the partial results to get the result we want. This now typically involves cloud computing.
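The bucket, calculate, and merge steps described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not Hadoop code: everything runs in one process, the function names are made up, and the calculation is assumed to be a simple mean of one column. In a real cluster each bucket would be processed on a different machine.

```python
from functools import reduce


def split_into_buckets(values, bucket_size):
    """Distribute the data into fixed-size buckets."""
    return [
        values[i:i + bucket_size]
        for i in range(0, len(values), bucket_size)
    ]


def map_bucket(bucket):
    """Map step: reduce one bucket to a small partial result (sum, count)."""
    return (sum(bucket), len(bucket))


def merge(left, right):
    """Reduce step: combine two partial results into one."""
    return (left[0] + right[0], left[1] + right[1])


def distributed_mean(values, bucket_size):
    """Compute the mean by solving smaller problems and merging them."""
    buckets = split_into_buckets(values, bucket_size)
    partials = map(map_bucket, buckets)
    total, count = reduce(merge, partials, (0.0, 0))
    if count == 0:
        raise ValueError("cannot take the mean of no values")
    return total / count


print(distributed_mean([1, 2, 3, 4, 5, 6, 7], bucket_size=3))  # prints 4.0
```

The point of keeping only a sum and a count per bucket is that the partial results are tiny and can be merged in any order, so the buckets never have to sit on one machine together. Not every statistical model splits up this cleanly, which is part of why new algorithms are needed.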