However, libraries feel like a very simplistic argument against microservices. In a world where you have a single monolith, libraries may work fine, but often you have two, three, maybe four decently sized monoliths. In that situation libraries may not make much sense when you need to reference a central source of data.
I know nothing about microservices, so pardon my naivety. This sentence makes me think of a good old database.
If you need centralized data, can't you use a database from libraries?
But in all seriousness, that would make data migrations really tricky, as well as handling differing versions, etc. Never mind extending that to more apps; it feels like a dangerous venture.
The biggest problem is schema changes, because they require the library to be changed. The apps now need to match their library versions against a schema they don’t control. It’s messy to schedule and rollbacks suck really bad.
A microservice that returns data from the database can paper over a lot of schema changes, because the responses aren’t directly tied to the schema. Data can be shared without clients ever knowing, entities can be moved and split between tables without clients knowing, etc.
I try hard to only have 1 service connect to a database directly. If there’s only 1 app, it can have a direct connection. If there’s more than one, I like to chuck a micro service in front of it.
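A minimal sketch of that idea, using sqlite3 as a stand-in for a real database (table and function names here are hypothetical): the one service in front of the database owns the row-to-response mapping, so storage can be reorganized underneath a fixed wire format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO customers (full_name) VALUES ('Ada Lovelace')")

# v1: the service owns the mapping from storage to wire format.
def get_customer(customer_id):
    row = conn.execute(
        "SELECT id, full_name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1]}  # stable contract for clients

# Later the storage is reorganized (name split into its own table),
# and only the service's query changes -- clients never notice.
conn.execute("CREATE TABLE customer_names (customer_id INTEGER, first TEXT, last TEXT)")
conn.execute("INSERT INTO customer_names VALUES (1, 'Ada', 'Lovelace')")

def get_customer_v2(customer_id):
    row = conn.execute(
        "SELECT c.id, n.first || ' ' || n.last FROM customers c "
        "JOIN customer_names n ON n.customer_id = c.id WHERE c.id = ?",
        (customer_id,),
    ).fetchone()
    return {"id": row[0], "name": row[1]}  # identical wire format

assert get_customer(1) == get_customer_v2(1)
```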
This is only true if multiple applications are accessing the same tables, and even then, only if the changes aren’t backwards-compatible. For example, adding a column shouldn’t be a breaking change, because no one should be doing SELECT * (and also then assuming a fixed width when parsing).
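A quick demonstration that explicit column lists survive additive schema changes (sqlite3 standing in for any SQL database; names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# A client that names its columns explicitly...
def get_user(conn, user_id):
    row = conn.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1]}

before = get_user(conn, 1)

# ...keeps working unchanged after an additive change,
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
after = get_user(conn, 1)
assert before == after  # same shape, no breakage

# whereas SELECT * with positional parsing would now see an extra column.
```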
I think another point of common confusion is how overloaded the word “schema” is. Database schema, i.e. how tables are logically related? Table schema? A synonym for database, as MySQL has it? A logical grouping of tables, procedures, etc. within a database, as Postgres has it?
It’s entirely possible (and generally much cheaper, and with less maintenance overhead) for multiple services to share a database cluster/server/node, but have complete separation from one another, modulo indirect impact via the buffer pool.
Need to add a column or perform delicate schema alterations? You run the risk of breaking dependents.
Have parts of your database that are sensitive (e.g. PII)? Your DB platform will need to reflect that to prevent arbitrary queries against them.
Can someone write a query that recursively joins dozens of tables together for every row? You have limited control over how to prevent that.
Who wrote that breaking change? Or, did someone's service get compromised? Hope you aren't sharing credentials for access among services.
> Need to add a column or perform delicate schema alterations? You run the risk of breaking dependents.
Create a stable-interface view over your unstable-interface underlying table, and query that view. Or version your tables (my_table_v1 -> my_table_v2) and create a new table for each breaking change.
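A sketch of the view approach, with sqlite3 standing in for Postgres/MySQL (table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unstable-interface table, versioned on breaking changes:
conn.execute("CREATE TABLE orders_v2 (id INTEGER PRIMARY KEY, total_cents INTEGER)")
conn.execute("INSERT INTO orders_v2 (total_cents) VALUES (1999)")
# Stable-interface view that clients query; re-point it at the
# next versioned table when a breaking change lands:
conn.execute(
    "CREATE VIEW orders AS SELECT id, total_cents / 100.0 AS total FROM orders_v2"
)

row = conn.execute("SELECT id, total FROM orders").fetchone()
assert row == (1, 19.99)  # clients never touch the versioned table directly
```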
> Have parts of your database that are sensitive (e.g. PII)? Your DB platform will need to reflect that to prevent arbitrary queries against them.
Use views to exclude PII fields or aggregate/hash them.
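For example (sqlite3 stand-in, hypothetical names; in Postgres you would additionally GRANT the consuming service's role SELECT on the view rather than the table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users (name, email) VALUES ('alice', 'alice@example.com')")

# Expose only the non-sensitive columns through the view.
conn.execute("CREATE VIEW users_public AS SELECT id, name FROM users")

cols = [d[0] for d in conn.execute("SELECT * FROM users_public").description]
assert cols == ["id", "name"]  # email never leaves the view
```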
> Can someone write a query that recursively joins dozens of tables together for every row? You have limited control over how to prevent that.
Databases are designed for multitenancy; they have mechanisms to cap the resource usage of a query (e.g. statement_timeout in Postgres).
> Who wrote that breaking change? Or, did someone's service get compromised? Hope you aren't sharing credentials for access among services.
Not sure about MySQL, but Postgres has the pgaudit extension, which provides this visibility for DML/DDL queries out of the box, as long as you create a DB user for each use case.
> Hope you aren't sharing credentials for access among services.
Who would do this with a db and not with microservices?
Write a monolith in Golang with Postgres instead. 99% of ordinary applications would never be able to max out a modern server running this.
Microservices are an anti-pattern for most ordinary application needs.
The other tradeoff (perhaps larger, actually) is poor performance if using a non-k-sortable PK, like a UUIDv4. Unfortunately this is extremely common.
As with using many connections, a UUIDv4 PK is a case of "don't do that".
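A rough illustration in Python (the id format here is hypothetical, just a nanosecond-timestamp hex prefix plus a counter, the same idea as UUIDv7/ULID):

```python
import itertools
import time
import uuid

_seq = itertools.count()

def k_sortable_id():
    # Hypothetical format: timestamp prefix, counter suffix to break
    # same-tick ties -- roughly the UUIDv7/ULID approach.
    return f"{time.time_ns():016x}{next(_seq):08x}"

random_ids = [uuid.uuid4().hex for _ in range(100)]
sortable_ids = [k_sortable_id() for _ in range(100)]

# Insertion order matches sort order, so B-tree inserts stay append-only...
assert sorted(sortable_ids) == sortable_ids
# ...while UUIDv4s land at effectively random positions in the index.
assert sorted(random_ids) != random_ids
```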
> “don’t do that”
If I could get people to use RDBMS properly, I’d be both thrilled and potentially out of a job.
None of these ring true for me.
>> It also scales better on its own, due to a threaded connection model vs. Postgres’ process model.
My understanding is that Postgres has an excellent SMP design that gives almost linear vertical scale. Couldn't quickly find good backup docs.
>> FAR less maintenance overhead
What do you mean by "maintenance overhead"? I don't find Postgres needs much maintenance, let alone FAR less.
In what way?
> What do you mean by "maintenance overhead"? I don't find Postgres needs much maintenance let along FAR less.
At small scale, you likely won’t notice. Once your tables get up to 10s or 100s of millions of rows, you will.
Autovacuum will need tuning, both globally and per-table.
Indices will need periodic re-indexing.
Specific columns in tables may need custom statistics targets.
TXID wraparound, though this isn’t so much maintenance as it is a waking nightmare that you have to monitor for.
Don’t get me wrong, I quite like Postgres. However, IME from having run large-ish (10s of TB) clusters of both, MySQL tends to Just Work.
If you were aiming for one of the more popular options that still scale pretty well, you might pair PostgreSQL/MariaDB/MySQL with one of the JVM languages (e.g. Java) or one of the CLR languages (e.g. C#). Both runtimes and their languages are okay and have pretty big ecosystems.
Then again, Node or even PHP might be enough for most apps, maybe even Python or Ruby with a bit of vertical (and simple horizontal) scaling down the line.
Just be prepared for that monolith to grow into an eldritch nightmare of dependencies somewhere between 5 to 15 years later. Currently working on such a Java system, it makes the laptop thermal throttle regularly.
Sending multiple requests at once is possible through batching if you want to avoid HTTP overhead. This depends on your threading model, but you can make it semi-transparent in the client interface if you wish. Alternatively, use a protocol that supports multiplexing requests over the same connection.
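One way to sketch the batching idea (hypothetical client API; the transport callable stands in for a single HTTP request carrying many keys):

```python
class BatchingClient:
    """Collects lookups and sends them as one request on flush(),
    trading per-call latency for fewer round trips."""

    def __init__(self, transport):
        self._transport = transport  # one network call resolving many keys
        self._pending = {}

    def get(self, key):
        # Hand back a zero-arg "future"; it resolves after flush().
        slot = self._pending.setdefault(key, {})
        return lambda: slot["value"]

    def flush(self):
        results = self._transport(list(self._pending))  # single round trip
        for key, slot in self._pending.items():
            slot["value"] = results[key]
        self._pending = {}

def fake_transport(keys):  # stands in for one HTTP request with a body of keys
    return {k: k.upper() for k in keys}

client = BatchingClient(fake_transport)
a = client.get("alpha")
b = client.get("beta")
client.flush()  # both lookups resolved by one "request"
assert a() == "ALPHA" and b() == "BETA"
```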
Libraries don't work well if your services go past two layers. After this, you start nesting things (library in library), and ownership of an update becomes much more "fun", especially in an emergency situation.
Updates in general with libraries can be easier or harder depending on your deployment model. On one hand, with libraries any tests in your pipeline are going to be more deterministic; you aren't as likely to be surprised by behavior in production. On the other, if you have a lengthy deployment process for services and need to plan windows carefully, having a separate underlying service can decouple the process and let you focus on localized changes.
YMMV.