However, libraries feel like a very simplistic argument against microservices. In a world where you have a single monolith, libraries may work fine, but often you have two, three, maybe four decently sized monoliths. In that situation libraries may not make much sense when you need to reference a central source of data.
I know nothing about microservices, so pardon my naivety. This sentence makes me think of a good old database.
If you need centralized data, can't you use a database from libraries?
But in all seriousness, that would make data migrations really tricky, as well as handling differing versions, etc. Never mind extending that to more apps; it feels like a dangerous venture.
The biggest problem is schema changes, because they require the library to be changed. The apps now need to match their library versions against a schema they don’t control. It’s messy to schedule and rollbacks suck really bad.
A microservice that returns data from the database can paper over a lot of schema changes, because the responses aren’t directly tied to the schema. Data can be shared without clients ever knowing, entities can be moved and split between tables without clients knowing, etc.
I try hard to only have 1 service connect to a database directly. If there’s only 1 app, it can have a direct connection. If there’s more than one, I like to chuck a micro service in front of it.
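A minimal sketch of that idea, using sqlite3 as a stand-in for a real database (table and function names here are hypothetical): the one service in front of the database owns the row-to-response mapping, so storage can be reorganized underneath a fixed wire format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO customers (full_name) VALUES ('Ada Lovelace')")

# v1: the service owns the mapping from storage to wire format.
def get_customer(customer_id):
    row = conn.execute(
        "SELECT id, full_name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1]}  # stable contract for clients

# Later the storage is reorganized (name split into its own table),
# and only the service's query changes -- clients never notice.
conn.execute("CREATE TABLE customer_names (customer_id INTEGER, first TEXT, last TEXT)")
conn.execute("INSERT INTO customer_names VALUES (1, 'Ada', 'Lovelace')")

def get_customer_v2(customer_id):
    row = conn.execute(
        "SELECT c.id, n.first || ' ' || n.last FROM customers c "
        "JOIN customer_names n ON n.customer_id = c.id WHERE c.id = ?",
        (customer_id,),
    ).fetchone()
    return {"id": row[0], "name": row[1]}  # identical wire format

assert get_customer(1) == get_customer_v2(1)
```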
This is only true if multiple applications are accessing the same tables, and even then, only if the changes aren’t backwards-compatible. For example, adding a column shouldn’t be a breaking change, because no one should be doing SELECT * (and also then assuming a fixed width when parsing).
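A quick demonstration that explicit column lists survive additive schema changes (sqlite3 standing in for any SQL database; names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# A client that names its columns explicitly...
def get_user(conn, user_id):
    row = conn.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1]}

before = get_user(conn, 1)

# ...keeps working unchanged after an additive change,
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
after = get_user(conn, 1)
assert before == after  # same shape, no breakage

# whereas SELECT * with positional parsing would now see an extra column.
```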
I think another point of common confusion is how overloaded the word “schema” is. Database schema, i.e. how tables are logically related? Table schema? A synonym for database, as MySQL has it? A logical grouping of tables, procedures, etc. within a database, as Postgres has it?
It’s entirely possible (and generally much cheaper, and with less maintenance overhead) for multiple services to share a database cluster/server/node, but have complete separation from one another, modulo indirect impact via the buffer pool.
Need to add a column or perform delicate schema alterations? You run the risk of breaking dependents.
Have parts of your database that are sensitive (e.g. PII)? Your DB platform will need to reflect that to prevent arbitrary queries against them.
Can someone write a query that recursively joins dozens of tables together for every row? You have limited control over how to prevent that.
Who wrote that breaking change? Or, did someone's service get compromised? Hope you aren't sharing credentials for access among services.
> Need to add a column or perform delicate schema alterations? You run the risk of breaking dependents.
Create a stable-interface view over your unstable-interface underlying table, and query that view. Or version your tables (my_table_v1 -> my_table_v2) and create a new table for each breaking change.
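A sketch of the view approach, with sqlite3 standing in for Postgres/MySQL (table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unstable-interface table, versioned on breaking changes:
conn.execute("CREATE TABLE orders_v2 (id INTEGER PRIMARY KEY, total_cents INTEGER)")
conn.execute("INSERT INTO orders_v2 (total_cents) VALUES (1999)")
# Stable-interface view that clients query; re-point it at the
# next versioned table when a breaking change lands:
conn.execute(
    "CREATE VIEW orders AS SELECT id, total_cents / 100.0 AS total FROM orders_v2"
)

row = conn.execute("SELECT id, total FROM orders").fetchone()
assert row == (1, 19.99)  # clients never touch the versioned table directly
```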
> Have parts of your database that are sensitive (e.g. PII)? Your DB platform will need to reflect that to prevent arbitrary queries against them.
Use views to exclude PII fields or aggregate/hash them.
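For example (sqlite3 stand-in, hypothetical names; in Postgres you would additionally GRANT the consuming service's role SELECT on the view rather than the table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users (name, email) VALUES ('alice', 'alice@example.com')")

# Expose only the non-sensitive columns through the view.
conn.execute("CREATE VIEW users_public AS SELECT id, name FROM users")

cols = [d[0] for d in conn.execute("SELECT * FROM users_public").description]
assert cols == ["id", "name"]  # email never leaves the view
```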
> Can someone write a query that recursively joins dozens of tables together for every row? You have limited control over how to prevent that.
Databases are designed for multitenancy; they have mechanisms to cap the resource usage of a query (e.g. statement_timeout in Postgres).
> Who wrote that breaking change? Or, did someone's service get compromised? Hope you aren't sharing credentials for access among services.
Not sure about MySQL, but Postgres has the pgaudit extension, which provides this visibility for DML/DDL queries out of the box, as long as you create a DB user for each use case.
> Hope you aren't sharing credentials for access among services.
Who would do this with a db and not with microservices?
Write a monolith in Golang with Postgres instead. 99% of ordinary applications would never be able to max out a modern server running this.
Microservices are an anti-pattern for most ordinary application needs.
The other tradeoff (perhaps larger, actually) is poor performance if using a non-k-sortable PK, like a UUIDv4. Unfortunately this is extremely common.
As with using many connections, a UUIDv4 PK is a case of "don't do that".
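A rough illustration in Python (the id format here is hypothetical, just a nanosecond-timestamp hex prefix plus a counter, the same idea as UUIDv7/ULID):

```python
import itertools
import time
import uuid

_seq = itertools.count()

def k_sortable_id():
    # Hypothetical format: timestamp prefix, counter suffix to break
    # same-tick ties -- roughly the UUIDv7/ULID approach.
    return f"{time.time_ns():016x}{next(_seq):08x}"

random_ids = [uuid.uuid4().hex for _ in range(100)]
sortable_ids = [k_sortable_id() for _ in range(100)]

# Insertion order matches sort order, so B-tree inserts stay append-only...
assert sorted(sortable_ids) == sortable_ids
# ...while UUIDv4s land at effectively random positions in the index.
assert sorted(random_ids) != random_ids
```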
> “don’t do that”
If I could get people to use RDBMS properly, I’d be both thrilled and potentially out of a job.
None of these ring true for me.
>> It also scales better on its own, due to a threaded connection model vs. Postgres’ process model.
My understanding is that Postgres has an excellent SMP design that gives almost linear vertical scale. Couldn't quickly find good backup docs.
>> FAR less maintenance overhead
What do you mean by "maintenance overhead"? I don't find Postgres needs much maintenance, let alone FAR less.
In what way?
> What do you mean by "maintenance overhead"? I don't find Postgres needs much maintenance let along FAR less.
At small scale, you likely won’t notice. Once your tables get up to 10s or 100s of millions of rows, you will.
Autovacuum will need tuning, both globally and per-table.
Indices will need periodic re-indexing.
Specific columns in tables may need custom statistics targets.
TXID wraparound, though this isn’t so much maintenance as it is a waking nightmare that you have to monitor for.
Don’t get me wrong, I quite like Postgres. However, IME from having run large-ish (10s of TB) clusters of both, MySQL tends to Just Work.
If you were aiming for one of the more popular options that still scale pretty well, you might pair PostgreSQL/MariaDB/MySQL with one of the JVM languages (e.g. Java) or one of the CLR languages (e.g. C#). Both runtimes and their languages are okay and have pretty big ecosystems.
Then again, Node or even PHP might be enough for most apps, maybe even Python or Ruby with a bit of vertical (and simple horizontal) scaling down the line.
Just be prepared for that monolith to grow into an eldritch nightmare of dependencies somewhere between 5 to 15 years later. Currently working on such a Java system, it makes the laptop thermal throttle regularly.
Sending multiple requests at once is possible through batching if you want to avoid HTTP overhead. This depends on your threading model, but you can make it semi-transparent in the client interface if you wish. Alternatively, use a protocol that supports multiplexing requests over the same connection.
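One way to sketch the batching idea (hypothetical client API; the transport callable stands in for a single HTTP request carrying many keys):

```python
class BatchingClient:
    """Collects lookups and sends them as one request on flush(),
    trading per-call latency for fewer round trips."""

    def __init__(self, transport):
        self._transport = transport  # one network call resolving many keys
        self._pending = {}

    def get(self, key):
        # Hand back a zero-arg "future"; it resolves after flush().
        slot = self._pending.setdefault(key, {})
        return lambda: slot["value"]

    def flush(self):
        results = self._transport(list(self._pending))  # single round trip
        for key, slot in self._pending.items():
            slot["value"] = results[key]
        self._pending = {}

def fake_transport(keys):  # stands in for one HTTP request with a body of keys
    return {k: k.upper() for k in keys}

client = BatchingClient(fake_transport)
a = client.get("alpha")
b = client.get("beta")
client.flush()  # both lookups resolved by one "request"
assert a() == "ALPHA" and b() == "BETA"
```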
Libraries don't work well if your services go past two layers. After this, you start nesting things (library in library), and ownership of an update becomes much more "fun", especially in an emergency situation.
Updates in general with libraries can be easier or harder depending on your deployment model. On one hand, with libraries any tests in your pipeline are going to be more deterministic; you aren't as likely to be surprised by behavior in production. On the other, if you have a lengthy deployment process for services and need to plan windows carefully, having a separate underlying service can decouple the process and let you focus on localized changes.
YMMV.