Over the course of Wikijump development, both analyzing Wikidot code and implementing new Wikijump code, we’ve come across a fair number of table schemas. And we noticed something: a lot of database tables essentially just link one thing to another, possibly tacking on some data.
Now this is actually kind of a problem, because each table needs its own boilerplate for CRUD operations, as well as checks for sanity and permissions. This is a place where Wikidot definitely comes up short: user blocks only affect private messages, some systems for inputting or maintaining relational data are way better than others, and overall there is inconsistency. And honestly, I kind of get it. It’s a lot of essentially duplicated code.
So again, the question: what are relationships?
This system began with bluesoul pioneering a PHP version of what were then called “interactions”. The core idea is you have one database table which stores two different objects (via ID) and a relationship type (the kind of relation the two objects have). You can optionally include other JSON data with each relationship. The pair of the objects and the relationship type form the primary key. That’s it.
It’s a simple but very powerful concept. For instance you don’t need a separate “user blocks” table, just have a `block` relationship type that goes between user / user. Similarly, bans are just a `ban` relationship type between site / user. Membership is also a site / user relationship, but of type `member`. The extra data section can be used to store the fields unique to each type, such as who accepted a user (membership) or ban expiration date (bans).
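To make the shape of that one table concrete, here is a minimal in-memory sketch in Rust. It is not Wikijump's actual schema: the names `RelationshipType`, `RelationshipKey`, and `RelationshipTable` are made up for illustration, a `HashMap` stands in for the database table, and the JSON data is kept as a raw string rather than parsed.

```rust
use std::collections::HashMap;

/// The relationship types mentioned above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum RelationshipType {
    Block,  // user / user
    Ban,    // site / user
    Member, // site / user
}

/// The primary key: the relationship type plus the two object IDs.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct RelationshipKey {
    kind: RelationshipType,
    from_id: i64,
    to_id: i64,
}

/// In-memory stand-in for the single relationships table.
/// The value is the optional JSON data, stored here as a raw string.
#[derive(Default)]
struct RelationshipTable {
    rows: HashMap<RelationshipKey, Option<String>>,
}

impl RelationshipTable {
    /// Inserts a row. Returns false if that key is already present,
    /// mirroring a primary key conflict.
    fn create(&mut self, key: RelationshipKey, data: Option<String>) -> bool {
        if self.rows.contains_key(&key) {
            return false;
        }
        self.rows.insert(key, data);
        true
    }

    /// Checks whether the relationship exists.
    fn exists(&self, key: &RelationshipKey) -> bool {
        self.rows.contains_key(key)
    }

    /// Returns the JSON data attached to a relationship, if any.
    fn data(&self, key: &RelationshipKey) -> Option<&str> {
        self.rows.get(key).and_then(|d| d.as_deref())
    }
}
```

In this sketch a ban of user 42 on site 1 is just the key `(Ban, 1, 42)` with something like an expiration date in the data, and a block or a membership reuses exactly the same `create`, `exists`, and `data` code.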
Let’s say in the future we wanted to add a bookmarking feature. With Wikidot, they’d add a new table, have to add new modules for every possible interaction with bookmarks, and then we’d all have to cross our fingers that they properly dotted all their i’s and crossed their t’s.
But for relationships, we add a new entry to our relationship enum, give it a constant string (say `bookmark`), and then add the types, in this case user / page. Use of a Rust macro auto-implements all the basic CRUD interactions for us, and exposes an escape hatch for us to implement the `create` method ourselves, in case the relationship requires extra logic. That’s all you need to do.
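As a rough illustration of how a macro like that could work (this is not the real Wikijump macro), the sketch below uses a trait with default CRUD methods plus a `macro_rules!` macro that fills in the constants. Everything here is hypothetical: the `Relationship` trait, the `relationship!` macro and its syntax, the `Store` alias, and the "cannot block yourself" rule, which is only there to show the escape hatch.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ObjectType {
    User,
    Page,
    Site,
}

/// In-memory stand-in for the relationships table: (type name, from ID, to ID).
type Store = HashSet<(&'static str, i64, i64)>;

/// Default CRUD operations shared by every relationship type.
trait Relationship {
    const NAME: &'static str;
    const FROM: ObjectType;
    const TO: ObjectType;

    /// Default create; a type can override this when it needs extra logic.
    fn create(store: &mut Store, from: i64, to: i64) -> Result<(), String> {
        if store.insert((Self::NAME, from, to)) {
            Ok(())
        } else {
            Err(format!("{} relationship already exists", Self::NAME))
        }
    }

    fn exists(store: &Store, from: i64, to: i64) -> bool {
        store.contains(&(Self::NAME, from, to))
    }

    fn remove(store: &mut Store, from: i64, to: i64) -> bool {
        store.remove(&(Self::NAME, from, to))
    }
}

macro_rules! relationship {
    // Plain form: every operation uses the trait defaults.
    ($ty:ident, $name:literal, $from:ident / $to:ident) => {
        struct $ty;
        impl Relationship for $ty {
            const NAME: &'static str = $name;
            const FROM: ObjectType = ObjectType::$from;
            const TO: ObjectType = ObjectType::$to;
        }
    };
    // Escape hatch: the caller names its own `create` function.
    ($ty:ident, $name:literal, $from:ident / $to:ident, create = $create:ident) => {
        struct $ty;
        impl Relationship for $ty {
            const NAME: &'static str = $name;
            const FROM: ObjectType = ObjectType::$from;
            const TO: ObjectType = ObjectType::$to;

            fn create(store: &mut Store, from: i64, to: i64) -> Result<(), String> {
                $create(store, from, to)
            }
        }
    };
}

// Adding bookmarks is one line: a name and the two object types.
relationship!(Bookmark, "bookmark", User / Page);
relationship!(Block, "block", User / User, create = create_block);

/// Hypothetical extra logic: refuse to let a user block themselves.
fn create_block(store: &mut Store, from: i64, to: i64) -> Result<(), String> {
    if from == to {
        return Err("cannot block yourself".to_string());
    }
    store.insert(("block", from, to));
    Ok(())
}
```

With this in place, `Bookmark::create`, `Bookmark::exists`, and `Bookmark::remove` all exist without any bookmark-specific code, while `Block::create` runs the hand-written function instead of the default.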
So, to take a brief digression, what’s “extra logic”?
Well, when you block a user, you remove any contact / following status that exists between the two of you. That means in the code to create a user block, we must delete the user1 / user2 contact, user2 / user1 contact, etc.
Or, let’s say you’re trying to follow a user, meaning you’re creating a follow relationship. Well, such a relationship should be prohibited from being created if you’re blocked, right? So in the create method we can do such checks and then reject if a block is found. Same thing with applying for membership while banned.
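Both of those cases can be sketched in a few lines. This is a simplified in-memory model, not the real service code: the `Kind` enum, the `Relationships` struct, and the method names are invented for the example, and the follow check only looks at whether the target has blocked the follower, which is the case described above.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Kind {
    Block,
    Contact,
    Follow,
}

/// In-memory stand-in for the relationships table: (type, from ID, to ID).
#[derive(Default)]
struct Relationships {
    rows: HashSet<(Kind, i64, i64)>,
}

impl Relationships {
    fn exists(&self, kind: Kind, from: i64, to: i64) -> bool {
        self.rows.contains(&(kind, from, to))
    }

    /// Creating a block also clears any contact or follow rows
    /// between the two users, in both directions.
    fn create_block(&mut self, blocker: i64, blocked: i64) {
        for kind in [Kind::Contact, Kind::Follow] {
            self.rows.remove(&(kind, blocker, blocked));
            self.rows.remove(&(kind, blocked, blocker));
        }
        self.rows.insert((Kind::Block, blocker, blocked));
    }

    /// Creating a follow is rejected if the target has blocked the follower.
    fn create_follow(&mut self, follower: i64, target: i64) -> Result<(), String> {
        if self.exists(Kind::Block, target, follower) {
            return Err("cannot follow a user who has blocked you".to_string());
        }
        self.rows.insert((Kind::Follow, follower, target));
        Ok(())
    }
}
```

The membership / ban case would look the same: the custom `create` for an application checks for an existing ban row on that site / user pair and rejects if it finds one.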
There’s another advantage to buying into the relationship system rather than rolling your own. The relationship system features a history. Whenever you delete a relationship, it’s soft-deleted, meaning it’s marked as gone, but isn’t actually removed from the database. This is an aspect of Wikidot CRUD that is severely lacking: if you unban a user and you don’t independently maintain records of bans and the like, there isn’t actually any indication they were ever banned. It’s bare, basic CRUD: once it’s deleted, it’s gone.
But with the relationship system, you can easily look at the ban history for a user, or their membership history, or check if you’ve blocked or followed someone before (what if they changed their name?). And you don’t have to even write any extra code for it!
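Here is a small sketch of what soft deletion buys you. It assumes each row carries `created_at` and `deleted_at` timestamps and that only one non-deleted row may exist per type and object pair; those column names, the plain integer timestamps, and the `Relationships` struct are assumptions made for the example, not the actual schema.

```rust
/// One relationship row, including rows that were later soft-deleted.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    kind: &'static str,
    from_id: i64,
    to_id: i64,
    created_at: u64,
    deleted_at: Option<u64>, // None means the relationship is still active
}

#[derive(Default)]
struct Relationships {
    rows: Vec<Row>,
}

impl Relationships {
    fn create(&mut self, kind: &'static str, from_id: i64, to_id: i64, now: u64) {
        self.rows.push(Row { kind, from_id, to_id, created_at: now, deleted_at: None });
    }

    /// Soft delete: mark the active row as gone instead of removing it.
    /// Returns false if there was no active row to delete.
    fn remove(&mut self, kind: &str, from_id: i64, to_id: i64, now: u64) -> bool {
        let active = self.rows.iter_mut().find(|r| {
            r.kind == kind && r.from_id == from_id && r.to_id == to_id && r.deleted_at.is_none()
        });
        match active {
            Some(row) => {
                row.deleted_at = Some(now);
                true
            }
            None => false,
        }
    }

    fn is_active(&self, kind: &str, from_id: i64, to_id: i64) -> bool {
        self.rows.iter().any(|r| {
            r.kind == kind && r.from_id == from_id && r.to_id == to_id && r.deleted_at.is_none()
        })
    }

    /// Every row of this type pointing at `to_id`, active or not.
    /// For bans (site / user), this is the ban history of a user.
    fn history(&self, kind: &str, to_id: i64) -> Vec<&Row> {
        self.rows
            .iter()
            .filter(|r| r.kind == kind && r.to_id == to_id)
            .collect()
    }
}
```

Banning, unbanning, and re-banning a user leaves two rows in `history`, one with a `deleted_at` value and one without, and the same method works unchanged for memberships, blocks, or follows.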
For completeness’ sake, it should be noted that there are two systems that are not planned to be moved to relationships.
The first is page links, which actually have three components:
- Page external links – This refers to the list of URLs that a page contains within it. I don’t think it’s actually exposed anywhere in Wikidot, but it does in fact track this information, which is useful for seeing if people are using your site to spamdex or what sites are commonly linked to.
- Page internal links – These are called “page connections”, since they aren’t just crosslinks (for instance one wiki page linking to another). They also cover includes and other relationships, since all of these are connections between pages.
- Page missing internal links – Also called “orange links”, these are like page connections except that the target page is missing. So we cannot link to its page ID, and instead store the slug which is purported to exist.
As you may notice, two of these three use strings, so they cannot fit in our relationship system, which links objects by ID. Instead of having only one of these as a relationship and the other two using a different system, all three tables use their own separate system. (The governing logic for this is in the LinkService.)
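To show why these don't fit the ID / ID shape, here is a small sketch of the three kinds of link record. It is not the LinkService's actual code: the `LinkRecord` enum (one variant per table), its field names, and the `record_internal_link` helper are all invented for illustration, and the slug-to-ID map stands in for a page lookup.

```rust
use std::collections::HashMap;

/// One variant per link table. Only `Connection` is a pure ID / ID pair.
#[derive(Debug, PartialEq)]
enum LinkRecord {
    /// External link: the target is a URL string.
    External { from_page_id: i64, url: String },
    /// Page connection: both sides are page IDs.
    Connection { from_page_id: i64, to_page_id: i64 },
    /// Missing ("orange") link: the target is only known by its slug.
    Missing { from_page_id: i64, to_slug: String },
}

/// Records an internal link as a connection if the target slug exists,
/// and as a missing link otherwise.
fn record_internal_link(
    pages: &HashMap<String, i64>,
    from_page_id: i64,
    slug: &str,
) -> LinkRecord {
    match pages.get(slug) {
        Some(&to_page_id) => LinkRecord::Connection { from_page_id, to_page_id },
        None => LinkRecord::Missing { from_page_id, to_slug: slug.to_string() },
    }
}
```

Two of the three variants carry a `String` on the target side, which is exactly the part a table keyed on two object IDs has no place for.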
The other is page votes. These are an obvious candidate for relationship-ization! It’s just a page / user relationship with an integer as the data. However, I think it makes sense to leave this as its own table. There will be large numbers of votes, and computing them will be a common task. Needing to peer inside the JSON to get the vote value (since there will also be the “disabled” flag in addition to just the value) will add overhead (though it is possible that in real-world use the `JSONB` type eliminates this issue).
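For a sense of what the dedicated table buys, here is a sketch of tallying votes when the value and the “disabled” flag are plain typed columns. The `PageVote` struct, its field names and types, and the `page_score` function are assumptions for illustration, not the real schema.

```rust
/// One row of a dedicated votes table, with typed columns
/// rather than a JSON blob.
#[derive(Debug, Clone, Copy)]
struct PageVote {
    page_id: i64,
    user_id: i64,
    value: i16,
    disabled: bool,
}

/// Sums the non-disabled votes for a page.
/// No JSON parsing is needed to reach `value` or `disabled`.
fn page_score(votes: &[PageVote], page_id: i64) -> i64 {
    votes
        .iter()
        .filter(|v| v.page_id == page_id && !v.disabled)
        .map(|v| i64::from(v.value))
        .sum()
}
```

With the generic relationship table, the same tally would first have to pull both fields out of each row's JSON data, which is the overhead in question.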
But with those exceptions out of the way, that’s the relationship system! As you spend time using Wikidot, think about what operations are actually just relationships and could benefit from leveraging a unified system rather than needing to roll your own every time. (If you need inspiration, here are Wikidot’s table schemas in PHP.)