Don't Rush into the MongoDB IPO



MongoDB (MDB), one of the biggest names in NoSQL databases, has gone public! Investors rejoiced as the stock shot up over 30% in its debut on the NASDAQ last week, putting its market cap at over $1.5 billion.

In a CNBC interview on IPO day, CEO Dev Ittycheria said that MongoDB sees itself as a partner to cloud giants like Amazon, and he criticized traditional database technologies as archaic. The company also touts the Stack Overflow survey of 64,000 developers, which reported MongoDB as developers' preferred database to work with.

For the average retail investor looking to jump on the MongoDB ship, there are a few contextual points you will want to understand with regard to these comments. Unless you work with both NoSQL and traditional relational databases (RDBMS) on a daily basis, there will likely be knowledge gaps you will want to fill before deciding whether or not to invest.

First, a three-sentence description of SQL and NoSQL.

SQL - involves tables (made of rows and columns) that might not be very useful when looked at separately but become very useful when joined together to answer specific questions. For example, a movie production company can have a cast table, a movie table, a movie revenue table, etc. To get insights, one would write SQL code (a query) to grab the relevant information.
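To make the join idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and figures are made up for illustration; they are not from any real production database.

```python
import sqlite3

# In-memory database with hypothetical tables and made-up numbers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE movie_cast (actor TEXT, movie_id INTEGER);
    CREATE TABLE movie_revenue (movie_id INTEGER, revenue_musd REAL);

    INSERT INTO movie VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO movie_cast VALUES ('Actor X', 1), ('Actor X', 2), ('Actor Y', 2);
    INSERT INTO movie_revenue VALUES (1, 100.0), (2, 250.0);
""")

# Neither table answers the question alone; the JOIN ties cast to revenue.
revenue_by_actor = conn.execute("""
    SELECT c.actor, SUM(r.revenue_musd) AS total_revenue_musd
    FROM movie_cast AS c
    JOIN movie_revenue AS r ON r.movie_id = c.movie_id
    GROUP BY c.actor
    ORDER BY c.actor
""").fetchall()

for actor, total in revenue_by_actor:
    print(actor, total)
```

The cast table on its own says nothing about money, and the revenue table says nothing about people; the query is what turns the two into an answer.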



NoSQL - the movie production company would store individual movies or people as unique objects. Imagine an array that holds all of an object's characteristics and can be added to at any time, extending indefinitely to capture all of the relevant properties for that individual object. Instead of having a cast table that serves as a centerpiece for mapping the data you want, such as total revenue by actor, each actor could be an individual object that includes characteristics like the movies they starred in and the revenues generated.
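As a rough sketch of the same data in document form, plain Python dictionaries stand in here for the JSON-like objects a document store would hold. The field names and values are hypothetical.

```python
import json

# Each actor is one self-contained object; names and numbers are made up.
actors = [
    {
        "name": "Actor X",
        "movies": [
            {"title": "Movie A", "revenue_musd": 100.0},
            {"title": "Movie B", "revenue_musd": 250.0},
        ],
    },
    {
        "name": "Actor Y",
        "movies": [{"title": "Movie B", "revenue_musd": 250.0}],
        # A property only this object has; nothing forces the others to match.
        "awards": ["Hypothetical Award"],
    },
]


def total_revenue(actor_doc):
    """Sum the revenue already embedded inside a single actor object."""
    return sum(movie["revenue_musd"] for movie in actor_doc["movies"])


totals = {doc["name"]: total_revenue(doc) for doc in actors}
print(json.dumps(totals, indent=2))
```

Note that Movie B's revenue is stored twice, once inside each actor. That duplication is the price of having the answer "premapped" into the object.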



From that description it seems this “premapped” object is a winner, but the catch is that these objects have no relational link to other objects (hence the name NoSQL). I will discuss more of the nuances of that later.


The Context

If you’re a potential investor, you probably know that traditional relational databases (which run some version of SQL) are what Dev refers to when he claims that most apps run on 1970s database technology. Traditional SQL databases (think Oracle) have indeed been around for a long time. However, and this is a big however, that does not mean that they have not evolved - and evolved they have. Amazon (S3 – Redshift), Microsoft (Azure), and Google (BigQuery) services are the evolved forms of traditional SQL databases. These forms are highly scalable, cloud based (rent the amount of processing power and storage you need), and easy to migrate to. Thus, I interpret Dev’s partner comment to really mean that he simply isn’t competing with these big players (which predominantly sell SQL solutions), so sure, they can be partners. If it is true that he doesn’t see them as going for the same market, then his argument that SQL is outdated and therefore you should use NoSQL becomes a non sequitur, since the evolved cloud SQL solutions are not at all what he was initially criticizing. He would have to actually prove that NoSQL > SQL (the current evolved forms).

I believe this is where one major difficulty lies – ease of analytical use. Given the needs of today’s businesses, SQL is generally the more suitable product if you had to pick one. This is also why the big 3 predominantly sell SQL instead of NoSQL, even though they could certainly focus on either market.

First, it's simply easier to help big enterprise clients migrate their old SQL databases to the new and improved cloud SQL. Second, NoSQL databases aren’t optimized for big data querying (NO JOINS) – syntactically or operationally. This is getting better over time, but without the relational aspect there is no easy way to grab related data, which makes storing all of the information within each object a challenge. It’s easy for simple queries, but more complex multistep aggregations and procedures are better executed in SQL environments. Training a new analyst to pull the same kind of data via NoSQL takes way more time – trust me.

NoSQL qualities

The huge advantage of MongoDB’s NoSQL environment is that it's easily deployable, and quick setups mean happy clients. However, this perspective is from a developer’s POV (hence the survey). But analysts matter too – they are the people who need to wade through the junk to find gold.

Another unique quality of NoSQL is that there is no schema. A table schema for SQL requires that data be read and written in a specific format (ex. no string values in a numeric field). In NoSQL, attributes can simply be appended onto the array (including unstructured data like tweets or jpegs). For example, I imagine Facebook (back when it was built on NoSQL) would store objects with each person as the unique key defining that object. Let's also say that the object starts off with qualities of Sex: “M/F”, Occupation: “analyst”, Age: “30”. No schema in this case means you can add on another dimension, like relationship status, or a piece of unstructured data such as a video. But if you were to create another class of objects, i.e. another class of people, it would exist on the same level as the original objects, and you could run into some organizational headaches when you’re trying to do more complex analysis. That is why it is critical to have database experts map out how everything should look. In SQL, you would have to go through a process to modify the schema and perform tedious updates. But at the very least you could always create new tables and join the data. In the end, whether SQL or NoSQL wins here depends on the use case, as both will need database architecture experts.
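Here is a small sketch of what "no schema" can mean for the analyst downstream. The person objects are hypothetical and plain Python dictionaries stand in for stored documents: one has its age as a string (as in the example above), one as a number, and one has no age at all.

```python
# Hypothetical person objects; nothing enforces matching fields or types.
people = [
    {"_id": 1, "sex": "M", "occupation": "analyst", "age": "30"},
    {"_id": 2, "sex": "F", "occupation": "engineer", "age": 41,
     "relationship_status": "single"},  # extra dimension added later
    {"_id": 3, "occupation": "designer"},  # no age recorded at all
]


def age_of(person):
    """Normalize age to an int, or None when the field is missing."""
    raw = person.get("age")
    return int(raw) if raw is not None else None


ages = [age_of(p) for p in people]
known_ages = [a for a in ages if a is not None]
average_age = sum(known_ages) / len(known_ages)
print(ages, average_age)
```

Writing the data was effortless, but every analysis now has to carry its own cleanup logic, which a table schema would have enforced once at write time.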

There are other differences between the two types of databases that I won't discuss much (for example, NoSQL's JSON syntax makes it very easy to use with JavaScript frameworks), as they probably won’t drive your growth investment thesis. The last property I will mention is that NoSQL databases generally don't offer full ACID transactions. This means that when you are updating a NoSQL database and hit an error, you can end up with partially updated objects, since updates occur in a distributed manner. It jibes with the no-schema approach, where seemingly anything can be added or updated. In a traditional RDBMS, the entire transaction must complete in totality: no partial updates can happen, and any disruption causes the entire process to roll back to its pre-update state. This is considered ACID safe. Imagine these differences at massive scale - they would greatly impact the ease of use depending on the task at hand.
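The all-or-nothing behavior on the RDBMS side can be sketched with Python's built-in sqlite3 module. The account table and balances are made up; the second update deliberately violates a constraint so the whole transaction rolls back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()

try:
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes it.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + 200 WHERE name = 'bob'"
        )
        # This step fails: alice cannot go below zero.
        conn.execute(
            "UPDATE account SET balance = balance - 200 WHERE name = 'alice'"
        )
except sqlite3.IntegrityError:
    print("transfer failed, transaction rolled back")

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)
```

Bob's credit succeeded on its own, yet it is undone because the transfer as a whole failed. Without transactional guarantees, that half-finished state is what could be left behind.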

Overall,
  • I believe NoSQL could be used as a complementary tool to SQL databases; most large companies have some instance of both. Otherwise, SQL would be the prerequisite.
  • I would look to evaluate the NoSQL market relative to the SQL market, and not treat them as direct competitors.
  • I would keep tabs on MongoDB's main competitor, Cassandra, as it makes headway on the NoSQL complaints outlined here.
  • I do believe a healthy integration partnership with the big 3 is a big revenue booster for MongoDB.



If you’re buying MongoDB, don’t buy it to replace SQL – it just won’t happen.
