Recently, I talked with a bunch of people across several Discord and Slack communities about their thoughts on ORMs and SQL. In this post, I'll be breaking down when we should use each, and the sacrifices we would make along the way.
Before we begin, I would like to clarify that I am not writing this post to bash your technology choices. You are welcome to disagree with me and discuss this topic constructively on Twitter or in the comments below. Feel more than comfortable to continue using what you use, this post just contains my views on why ORMs aren't always a great way to represent and model relational data.
ORM is simply an acronym for object-relational mapping. In simple terms, it's basically serves as a layer between relational databases and your applications.
So, ORMs are basically an abstraction on-top of standard SQL to make interacting with your database even simpler. With that being said, there are some consequences (positive and negative) of this abstraction that we'll talk about in this post.
Now that you have an idea of what exactly an ORM is, let's get into potential problems we can run into while using them.
In my opinion, there are a lot of use cases where an ORM could become an anti-pattern. The fundamental problem is that an ORM tries to abstract away the underlying database, and reduces the feature set to the 'lowest common denominator' -- essentially meaning that they try to plaster a fits-all API over different database management systems.
This also raises the question of whether or not an abstraction like ORMs are even necessary. We already have a succinct language for interacting with databases, and that language is called SQL.
I think it's completely appropriate to bring in other arguments here as well. When talking with this to other people about this, a common feeling people got was:
So, essentially, this means that generally abstractions are oftentimes necessary for us as developers, which I totally agree with. Now, the problem here is that not all abstractions are good, which segues onto my next point about how ORMs abstract away features of relational databases.
When talking in the context of relational databases, the primary point is that they're not arbitrarily interchangeable. Every database is unique, and has it's fair share of strengths and weaknesses.
As I mentioned above, ORMs have a tendency to plaster a general, one-size 'fits-all' API over several different types of databases, pretending that the databases are in-fact arbitrarily interchangeable. You can apply this same sort of logic onto pretending that all different types of message ques are interchangeable, or that all operating systems are interchangeable.
Please don't pretend here that all relational databases are just columns and rows. They're not, and to get a perspective of this we can compare something like MySQL and PostgreSQL. Although at a foundational level they're both relational databases, they have vastly different features.
ORMs are decent for regular CRUD operations, but in-reality those operations don't even represent the majority of database interactions for large-scale products that have any level of complexity. Thinking about it, these operations are really simple to do in SQL itself. If your application primarily just uses simple CRUD queries, is there a point in having an extra layer of abstraction unless you're at a crucial time constraint?
A common reply people gave me is that ORMs help smaller-scale startups ship things. Of course, with large companies such as Meta they'd want to spend time and energy into having their engineers optimize a custom low-level solution. In fact, since we're talking about Meta, I know someone who works at Meta that has told me that they use stored procedures for just about everything, which is the other end of the extreme.
But anyways, back to the topic -- for the average startup, ORMs can help them ship their product more easily, which is something I totally agree with. HOWEVER, I also believe that this 'development time' argument is primarily used by short-sighted managers in these startups who ignore the long-term costs of their decisions. Unless you plan on throwing out your startup and your company in a few months, the time you save now will be paid tenfold in maintenance over the years as your startup grows. There's a reason that companies hire DBAs (database administrators).
As mentioned above, the fact that ORMs don't give you complete control over the queries you run in your database might cause you to overcomplicate simple queries.
Let me to explain: say that you're building a social media platform and you have to fetch a list of users from your database, fetch their recent posts, do something with them, and then write them back to your database. If you were using an ORM, this would require FOUR queries and trips to your database when in reality this could be done rather easily with just a single plain SQL query. If you have to keep on doing this, then it's definitely not great for performance. Along with this, ORMs tend to produce sub-par JOINs and further reduce efficiency in that way.
To add onto that, if you have a bit of prior knowledge on ORMs, you are probably well aware of the 'N+1 selects' problem. In case you're not already familiar, I found this great Stack Overflow Answer which describes it in detail.
The N+1 query problem happens when the data access framework executed N additional SQL statements to fetch the same data that could have been retrieved when executing the primary SQL query.
The larger the value of N, the more queries will be executed, the larger the performance impact. And, unlike the slow query log that can help you find slow running queries, the N+1 issue won’t be spot because each individual additional query runs sufficiently fast to not trigger the slow query log.
The problem is executing a large number of additional queries that, overall, take sufficient time to slow down response time.
Something to note is that this problem occurs in SQL as well, but you can fix it rather easily (as the answer shows). However, if you're using an ORM, due to the fact that you have no control over the code that it generates, you simply have ZERO control over whether or not you can run into this problem.
Moreover, another issue with ORMs is that they try to hide the semantics of relational databases from the developer. Instead of writing explicit joins of various kinds and whatnot, they provide you with a linear API and try to turn that into somewhat efficient or inefficient queries which have terrible performance.
Ultimately, the only way that can be as efficient as SQL is if SQL could be reduced to that API. Worse, you have little control over the queries they do generate, and they can change between versions (or on variables hidden to you).
At a foundational level, the problem is in the name. Object-relational mapping. Relational databases aren't about objects. Treating them as such you might as well use a simple key/value store instead (and that itself is a disingenuous statement as different K/V stores have different trade-offs).
ORMs tend to just obscure what is going on. If I know SQL and PostgreSQL, I can essentially understand just about ANY project that uses them, without having to learn the specifics of one of two dozen ORMs with its own idiosyncrasies. If you are a startup that uses an ORM, then essentially the developers working on your product will always interact with the database through an ORM preventing them from developing any understanding of how the underlying database technology works. Due to the nature of different databases, you will have use cases where powerful database features might come into use. The fact that an ORM gives you an advantage in migrating to other database easily is something which simply isn't that common in reality. If a company does end up doing that however, it isn't that hard to incorporate those change into your SQL. As I've mentioned countless times above, for any non-trivial use case, you will need to use SQL.
Now another argument that people bring up here is that ORMs generally ship with some sort of way to run
raw code. For example, Prisma, an ORM that I personally use in almost all of my projects, ships with useful methods to execute raw queries.
My response here is while you can absolutely use an ORM with raw, you can just as easily NOT use an ORM in applications that you plan to scale 😛.
I think we've discussed the downsides of using an ORM enough now. Let's talk about the upsides. A big one is that it speeds up development.
You would say, it's pretty hypocritical for me use an ORM in my projects while writing this. I would definitely agree with you here, besides the fact that most (if not all) my projects are built during hackathons, when I actually have the motivation to finish something.
As mentioned above, ORMs probably aren't a great long-term decision, but they come with a lot of upsides. With something like Prisma, you can generate types from your schema and use them in your application, saving you a tremendous amount of development time in places like hackathons where every minute of your time is valuable.
However, as I've mentioned above, if you're a startup then this should not be a sacrifice you're willing to make unless you plan on throwing out your project within a two weeks. The time you save during development now is what you'll be paying in tenfold in maintenance as your startup scales.
Unfortunately, most beginners aren't very eager to learn another language just to communicate with their database. This leads to beginners either using NoSQL databases or using an ORM to connect with their database. Considering the simplicity applications that most beginners write, it will not matter much if they decide to use an ORM or not. In fact, if a beginner doesn't properly validate values received from their web forms or cookies before passing them onto SQL queries executed by the database server, they're at a risk for a SQL injection attack.
I think more experienced developers can definitely easily avoid SQL injections from occurring in their applications, however in the context of beginners it's very possible that they overlook a vulnerability like this if they don't have a prior understanding of what exactly it is. SQL injection is one of the easiest and potentially one of the most dangerous threats to application security. In-fact, 65.1% or two-thirds of all attacks on software between 2017-2019 were SQL injections alone.
With that being said, you probably got the idea that SQL injections aren't a joke. You face little to no risk of SQL injections using an ORM unless you use plain SQL somewhere else in your application.
Although it's worth asking yourself: considering that you don't need an ORM to prevent SQL injection, would it really a good idea to for you use an ORM just because they prevent SQL injections if you're not at a critical time constraint?
In the end, I think ORMs are an nice tool if used appropriately in the correct scenario. Once you balance out the positives and negatives however, it may seem that including an ORM in your stack may not be the best idea once your startup scales. I love tools which make SQL easier to work with. There's something called sqlc which basically compiles SQL to type-safe Go. Tools like this provide us with some of the same advantages as an ORM, just without effecting our actual database workflow.
That's all for today, until next time 👋