Wikijump Updates – 10/22

It’s been a bit since the last development blog post, and as you can see from the post’s title, I meant to write this sooner. That said, things have generally been proceeding gradually and smoothly. In this blog post I wanted to highlight one particular improvement made to DEEPWELL, our backend / internal API service.

To first provide some context, within the current architecture, DEEPWELL is what stands in front of the various datastores (PostgreSQL, Redis, S3) to provide logical Wikijump operations. This means that it provides operations we think of when it comes to wikis, such as “edit this page” or “unban this user”. When it comes to some operations, like for pages, things can actually get quite involved. This is why “logical operations” also include sub-operations which are not exposed to end users, like “create a page revision”. The idea is that after this method has returned, you can count on a set of postconditions being true, and other, more complicated operations can be built on top of that.

Our web server and client-side code are both served by Framerail, our frontend service. This is a SvelteKit app which calls DEEPWELL, our internal API. Framerail performs the highest-level operations, such as page undeletion (which checks destination page presence, creates a page revision, etc.) and fetching view data (used for rendering web pages).

Originally, DEEPWELL’s role was unclear. It is currently an internal API, meaning that all requests it receives are assumed to be trusted: if you tell it to make a page revision with user ID 123 as the creator, it will do so. But earlier versions also included features like per-IP ratelimiting middleware, which is clearly meant for an external API use case.

Eventually the pattern became clearer: Framerail will serve as the public-facing web server and will call DEEPWELL, our internal API, from within a VPC. This way all requests come from a trusted server that untangles web requests and determines (via other calls to DEEPWELL!) whether a given user is allowed to perform some action.

But the other legacy of that possibly-public API was that it used REST. After all, this is a very standard paradigm that basically everything has tooling for. But if we have an internal-only API, we don’t need to be constrained by it, and should use whatever best fits our needs.

So with no need to be easily accessible to the public, and no use for HTTP headers or cookies (those are managed by the Framerail web server, not DEEPWELL), why are we using REST?

REST, as it turns out, has some downsides. For instance, we have a rich variety of error codes we’d like to return, but they don’t map particularly well to HTTP statuses, causing multiple unrelated errors to be collapsed into values like 500 Internal Server Error or 404 Not Found instead of more granular ones.
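To illustrate the problem, here is a minimal sketch of the kind of mapping involved. The error variants below are hypothetical, not DEEPWELL’s actual error types, but the shape of the issue is the same: several unrelated errors collapse onto the same HTTP status.

```typescript
// Hypothetical service-level errors (illustrative names only).
type ServiceError =
  | { kind: "PageNotFound" }
  | { kind: "UserNotFound" }
  | { kind: "SiteNotFound" }
  | { kind: "RevisionConflict" }
  | { kind: "Internal" };

function toHttpStatus(error: ServiceError): number {
  switch (error.kind) {
    // Three unrelated "missing entity" errors all become 404,
    // so the caller can no longer tell them apart.
    case "PageNotFound":
    case "UserNotFound":
    case "SiteNotFound":
      return 404;
    // Everything else falls through to a generic 500.
    default:
      return 500;
  }
}
```

A caller receiving a bare 404 cannot distinguish a missing page from a missing user without inspecting an out-of-band error body, which defeats the point of using the status code at all.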

Additionally, REST’s calling conventions imposed a bit of boilerplate. REST lets you pass values via the route itself, such as PUT /blogpost/123 to update the entity with ID 123, which then had to be parsed out in our handlers and placed into the input structures DEEPWELL uses internally. Requests can also carry query parameters, which are yet another thing to parse and combine, and in the case of GET requests they must be used, because most servers and clients do not allow HTTP bodies on such requests!
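A sketch of what that boilerplate looks like in practice. The route shape, field names, and query parameter here are all hypothetical; the point is that one logical input arrives split across the route and the query string, and must be hand-assembled.

```typescript
// Hypothetical input structure the service layer expects.
interface GetPageInput {
  siteId: number;
  pageId: number;
  includeWikitext: boolean;
}

function parseGetPageRequest(path: string, query: URLSearchParams): GetPageInput {
  // e.g. path = "/site/55/page/123" -- the IDs must be pulled out by hand.
  const match = path.match(/^\/site\/(\d+)\/page\/(\d+)$/);
  if (match === null) {
    throw new Error(`Route does not match: ${path}`);
  }
  return {
    siteId: parseInt(match[1], 10),
    pageId: parseInt(match[2], 10),
    // GET requests cannot carry a body in most stacks, so extra fields
    // have to be smuggled in as query parameters and parsed separately.
    includeWikitext: query.get("wikitext") === "true",
  };
}
```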

REST is ultimately not a good choice for our internal DEEPWELL API.

As such, I recently merged a series of changes switching DEEPWELL over to JSONRPC 2.0 as our transport. (Main PR found here.) It is easy to use from server-side JavaScript (i.e. Framerail), fits our pattern of each method being effectively a remote procedure call, allows us to pass in data even for fetch-only requests, and lets us return a rich list of error codes for every reasonable situation to aid in debugging.
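The request envelope itself is tiny, which is part of the appeal. Here is a minimal sketch of building one; the envelope fields (jsonrpc, method, params, id) come from the JSONRPC 2.0 spec, but the method name and parameters are illustrative, not DEEPWELL’s actual API surface.

```typescript
// The JSONRPC 2.0 request envelope, per the spec.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  method: string;
  params?: unknown;
  id: number;
}

let nextId = 0;

function buildRequest(method: string, params: unknown): JsonRpcRequest {
  // Even a read-only "fetch" call can carry a structured params object,
  // something GET plus query strings made awkward under REST.
  return { jsonrpc: "2.0", method, params, id: nextId++ };
}

// Usage, e.g. as the body of a fetch() from server-side code:
const request = buildRequest("page_get", { site_id: 55, page_id: 123 });
```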

Because requests in this context are just JSON, we can save a lot of boilerplate by deserializing them directly into the structures that DEEPWELL’s internal services expect, rather than constructing those ourselves. Additionally, I finally got around to making an improvement where we no longer need to manually create and close the database connection for each method (which was error-prone, since it could silently prevent database changes from persisting).
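To make the savings concrete: with JSONRPC, the params object already has the shape the service layer wants, so extraction is a single deserialization step instead of the route-and-query reassembly shown earlier. (In DEEPWELL itself this is serde in Rust; the TypeScript and field names below are an illustrative stand-in.)

```typescript
// Hypothetical input structure for a page-creation method.
interface CreatePageInput {
  siteId: number;
  slug: string;
  wikitext: string;
}

function extractParams(rawBody: string): CreatePageInput {
  const request = JSON.parse(rawBody);
  // One step: the params object *is* the input structure.
  // No route parsing, no query-string merging.
  return request.params as CreatePageInput;
}
```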

JSONRPC has been good for DEEPWELL, though there is one pending item. Because everything is, well, JSON, it is unsuitable for large blob uploads (such as files), since the whole blob would need to be encoded to base64 or hex first. The current plan to fix this flaw is to set up a system of presigned URLs for S3, allowing simple file hashes to be passed around for processing. You can track the issue here: WJ-1196.
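The overhead is easy to quantify: base64 turns every 3 bytes of binary input into 4 bytes of text (padded to a multiple of 4), inflating blobs by roughly a third before they even enter the JSON payload, on top of the cost of materializing the whole thing in memory as a string.

```typescript
// Length of the base64 encoding of a blob of the given byte length:
// each 3-byte group becomes 4 output characters, padded to a
// multiple of 4. A 30 MB upload becomes 40 MB of JSON string.
function base64Length(byteLength: number): number {
  return Math.ceil(byteLength / 3) * 4;
}
```

With presigned URLs, the bytes go straight to S3 and only a small content hash ever travels through the JSON API.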

Author: aismallard
