Published on 2021-04-25
In my years working as a software developer I've had to design data models for a variety of web applications. I consider data and referential integrity fundamental to developing reliable systems. If a particular combination of input data should be treated as invalid in my application, I prefer to have that validation in the database. Clearly, we want these same validations in the application server and/or the web frontend for usability, but I have found that bad data will find its way to your app if you don't validate it in your database. I can usually cover our needs with the tools provided by modern RDBM systems (PostgreSQL in most cases) like foreign keys, (partial) unique constraints, and check constraints. I try to avoid triggers if I can, but they're there as a last resort.
However, I've detected a pattern in more than one project with which I struggle to implement in a way that referential integrity is preserved. I call this pattern the "triple relation problem".
Imagine the following model. Three entities all related to one another: an Organization entity, a Department entity which belongs to an Organization, and an Employee entity which can belong to many Organizations and/or Departments. We want employees having a direct relation to organizations so they can be moved around between its departments without being accidentally "removed" from the organization entirely.
The obvious solution
This is a simple model, we just need two extra tables to implement the many-to-many relations, and that's basically it. But there is a catch: we can't let employees be a part of a department
from an organization to which they don't belong.
I've encountered this exact pattern in three different occasions in projects developed with Django. I wanted a solution that doesn't involve triggers and that doesn't rely (at least entirely) on multi-column foreign keys, as Django simply doesn't support them.
After giving it some thought, it became evident that composite foreign keys will be required for a model to perform this check. To overcome Django's limitations, we will need to create those constraints in a migration and add a couple of simple foreign keys to provide the same relations in a way that Django can see them.
An improved, albeit more complex, solution
This solution makes three changes:
Now we make it impossible for employees to join a department if they don't belong to that organization first. In the same way, if we delete an employee from an organization, the database operation will either fail or drop all relations between that employee and the organization's departments, depending on what we set as the ON DELETE action.
With all our requirements now satisfied, we need to keep in mind that this is a more complex model to understand and that and an insertion in EmployeeDepartment now has to process 4 foreign keys instead of 2, potentially hurting performance. Make sure that the benefits of this approach exceed its drawbacks.