MitmProxy2Swagger: Automagically reverse-engineer REST APIs

590 points by AbuAssar 8 days ago | 71 comments

Gamemaster1379 8 days ago |
This is a nice tool. A game I liked to play announced end of service back in 2023. They gave enough notice to let me capture some logs from their coordinator service.
I captured them in mitmproxy and ran those through this to help me identify all the endpoints and their general structure. (A few things were inaccurate, like the examples suggesting certain values could be floats when they could only be integers.)
I was able to get a team together and we were able to stand up private servers as a result.
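The float-versus-integer mix-up mentioned above is the kind of thing you can sanity-check against the captured values yourself. A minimal sketch, assuming you have already collected every observed value for one field into a list; `infer_numeric_type` is a hypothetical helper, not part of mitmproxy2swagger:

```python
def infer_numeric_type(samples):
    """Guess an OpenAPI numeric type for a field from its observed values."""
    if not samples:
        raise ValueError("need at least one observed value")
    for value in samples:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"not a JSON number: {value!r}")
    # A value such as 3.0 still counts as whole; 3.5 does not.
    if all(float(v).is_integer() for v in samples):
        return "integer"
    return "number"
```

This only tells you what the captures contained. A field that happened to be whole in every sample may still accept fractions on the server, so treat the result as a hint rather than a guarantee.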
simonjgreen 8 days ago |
Amazing! What game was this for? I was involved in the RE efforts around UO way back in the day.
kirici 8 days ago |
Gundam Evolution, going by comment history.
ge96 8 days ago |
Different plot/game mechanics, but Armored Core 6 is great if you like mecha.
Gamemaster1379 7 days ago |
Gundam Evolution, as someone else noted from my comment history.
andrewstuart 8 days ago |
It would be easy to do a mediocre job of this, one that misses lots of edge cases and is neither thorough nor complete.
A really professional and thorough job would be extremely time consuming and hard.
matthewolfe 8 days ago |
I do this a lot for my work. A tool like this that can help get me to a nice starting point is huge. Instead of developing a mental model of the API in my head by manually looking through API requests/responses in ProxyMan, this can start me off much more quickly. From there, the edge cases can be worked out.
zython 8 days ago |
This is so cool. Thanks for sharing!
tinchox5 8 days ago |
Coool!
zebomon 8 days ago |
I looked through this earlier today when I saw it mentioned in that thread about the closed source tool for the same purpose.
Having done a good bit of this type of reverse engineering the hard way over the years, it's a very exciting find. I had been talking with my partner about building something similar for the past six months. How exciting to learn that it's already out there and open source too!
colesantiago 8 days ago |
Again, this is the very easy part of the API reverse engineering process that most tools can do, similar to API Parrot and the rest of them. This is not hard to do.
The hard part is that inevitably, all these internal APIs will just add aggressive CAPTCHAs, Device Check, fingerprinting, etc. to prevent common drive-by RE'ing. These are easy to add on the defence side, and extremely difficult to bypass on the other side.
I can imagine all developer teams now upping their security with the combination of the above mentioned to prevent this.
sebmellen 8 days ago |
Depends on the age of the tool. We work with a lot of legacy systems that actually want us to integrate with them but don’t have the dev resources to build a proper API surface. As a result, we end up doing a lot of painful reverse engineering. These tools look promising for purposes like this.
devjab 8 days ago |
I'm curious as to why people would have a public API to begin with if they wanted to protect it from people using it. Then again, why would anyone have a public undocumented API in 2024 when an LLM can give you a CLI tool to auto-generate 90% of the OpenAPI spec in a couple of hours? The last question isn't serious; I've worked in enterprise for decades and almost none of the tools organisations end up buying have good documentation for their APIs. Not that those are publicly available, but still.
lesuorac 8 days ago |
I think you have a misunderstanding here.
The API needs to be "public" because the app uses the internet to communicate back to the home server.
The API is not "public" in the sense that the app developers want anybody to use it; they just want their app to use this API. So they don't write publicly accessible documentation about it because they don't want to encourage its use.
A tool like MitmProxy2Swagger lets you run the app and record all of its API calls so that you can use this unadvertised API.
devjab 6 days ago |
Why wouldn’t you add authentication to an API you don’t want others to use?
ssdspoimdsjvv 6 days ago |
The web app probably authenticates using an API as well, in which case it's trivial to add that to your shadow client as long as you have the credentials.
lesuorac 6 days ago |
Laziness / skill issue.
How many apps have you seen only do client-side protection?
mad_vill 8 days ago |
There are many cases where users are behind a forward proxy for security/compliance reasons. Most applications need to support these types of users.
jampekka 8 days ago |
Making a mitmproxy dump from a manual browsing session is more or less unblockable, barring some TPM or similar fuckery.
Usage of the API even with the protocol known OTOH can be quite easily made really hard.
notcrazylol 8 days ago |
I was wondering how it would take in GraphQL endpoints and convert them to Swagger, since it's just a single POST API with a change in params. But that's more of a Swagger issue than the tool's. Has anyone dealt with this? Would be really helpful if you could share your ideas too :)
asabla 8 days ago |
Why would you tho?
If you're working against a GraphQL-based API, you should be able to pull a schema file and use that to implement your own API.
All you would get from mitmproxy is example queries and mutations, with the additional complexity of extra tooling to stitch together the schema file.
jampekka 8 days ago |
Pulling the schema file can be, and often is, disabled server side. And GraphQL APIs can, and often do, decline to serve anything other than persisted queries, and those can't really be inferred even with a known schema.
notcrazylol 7 days ago |
So I am working with a new company that has a ton of GraphQL queries. What I wanted to do was write an integration test for them in the fastest and easiest way possible.
I don't want to sit and read each query to identify where it is in the user flow. So I was thinking if I run this in the background and go through a happy flow, I can get the APIs in order and write an integration test.
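For the "get the APIs in order" idea, something like the following might be enough as a starting point. It is a rough sketch, assuming you have exported the captured POST bodies as a list of JSON strings in capture order and that the client sends the usual `operationName` field; `operations_in_order` is a made-up helper name:

```python
import json


def operations_in_order(bodies):
    """Return GraphQL operation names in the order they were first seen."""
    seen = []
    for raw in bodies:
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not a JSON body, skip it
        # Batched requests arrive as a JSON array of operations.
        items = payload if isinstance(payload, list) else [payload]
        for item in items:
            if not isinstance(item, dict):
                continue
            # Clients may omit operationName for unnamed queries.
            name = item.get("operationName") or "<anonymous>"
            if name not in seen:
                seen.append(name)
    return seen
```

The resulting list is only the order of first appearance during one happy-flow run, so background polling or prefetching by the client can shuffle it.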
swyx 8 days ago |
did i miss something or why are there TWO (2) "magically reverse engineer REST APIs" projects on the HN front page right now? is there some offline beef going on?
(screenshot in case this goes away https://x.com/swyx/status/1874762725383188502)
Quarrel 8 days ago |
Presumably, because the closed source one got some traction, so people are pointing out the open source alternative.
littlestymaar 8 days ago |
Likely because of this comment[1] in the other thread which made people submit this link, and when multiple independent people submit the same link in a short period of time you're very likely to end up on the front page (this exact situation happened to me once)
[1] https://news.ycombinator.com/item?id=42568121
AbuAssar 8 days ago |
Yeah, that's where I got the link from.
mylastattempt 7 days ago |
Offtopic and meta, but, you share a screenshot using Twitter/X? That's really bizarre to me. That is all, just had to say that.
swyx 5 days ago |
how is it worse than photobucket or imgur
youngNed 8 days ago |
perhaps a n00b question, but would this work, or is there something similar for apps, specifically android apps?
rhaps0dy 8 days ago |
Depends on the app. If it uses some online functionality, probably yes. You could also try decompilation; it's decent on Java apps like Android's.
whilenot-dev 8 days ago |
A MITM proxy isn't specific to any app; it's a forward proxy for your outgoing network connection. In the case of an Android app, you'd need to run mitmproxy on a machine in your network and set up the connection as a proxy in your Android's network settings. Then you'd need to follow http://mitm.it to install mitmproxy's root certificate on the Android device (to trust the connection with TLS) and off you go.
EDIT: or rather follow the docs[0]
[0]: https://docs.mitmproxy.org/stable/howto-install-system-trust...
tecleandor 8 days ago |
I've used this specific tool to help me reverse engineer the private API of an Android App.
The thing is, depending on how hardened the app is, you'll have to play with Android to allow this interception, mostly because of certificate pinning. Also I remember something about apps not using the system wide trusted certificates you install (IIRC).
I remember using a rooted device with LineageOS, and downloading the APK and modifying it with a tool so the self signed certificate for the mitm proxy works with it.
The mitm proxy docs have some links to tools that can do that [0] and you could also use an Android emulator if you don't have an extra phone to mess with it [1]
[0]: https://docs.mitmproxy.org/stable/concepts-certificates/
[1]: https://docs.mitmproxy.org/stable/howto-install-system-trusted-ca-android/
jazz9k 8 days ago |
I use Burp Suite combined with Frida (which can remove root checks and override SSL pinning).
nsteel 7 days ago |
Yes, this. The Frida tools method to remove cert pinning is the only method that has worked for me. The mitmproxy docs for Android (as referred to by another commenter) didn't work for any apps I tried.
construct0 8 days ago |
Yeah - does this get nullabilities right?
tecleandor 8 days ago |
I've used this tool in the past with success. Not perfect but it accelerates the work greatly if you can launch a mitm proxy quickly and are familiar with the tool.
I've been fighting lately with an API, though. It's not very, let's say, RESTy. It has only one endpoint, and the different "sections" of the API are defined in parameters, so MitmProxy2Swagger doesn't detect them properly :(
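One possible workaround for this single-endpoint shape is to rewrite each captured URL before analysis so the "section" shows up in the path. This is only a sketch: the parameter name `section` and the helper `to_pseudo_path` are assumptions for illustration, and whether a given tool then groups the calls the way you want is something you would have to verify:

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def to_pseudo_path(url, param="section"):
    """Move one query parameter into the path so each value looks like its own endpoint."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    section = next((v for k, v in query if k == param), None)
    if section is None:
        return url  # nothing to rewrite
    remaining = [(k, v) for k, v in query if k != param]
    # Percent-encode the value so it stays a single path segment.
    path = parts.path.rstrip("/") + "/" + quote(section, safe="")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(remaining), parts.fragment)
    )
```

The rewritten URLs are for documentation purposes only; a real client would still have to send the original query-parameter form.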
nejsjsjsbsb 8 days ago |
Nothing is RESTy
quectophoton 8 days ago |
> It's not very, let's say, RESTy. It has only one endpoint,
To be fair, from what I understand an actual(tm) REST API would only have a single defined endpoint[1]: the entry point. With every other endpoint being discovered from the responses. And also from your message I'm guessing a URI still uniquely identifies a resource (specifically through the "query" part of the URI, instead of the more common "path").
So, technically, assuming there's nothing too weird with that API, it seems like MitmProxy2Swagger is failing to detect a REST API.
[1]: Corollary: If an API is RESTful, it should be possible to rename any endpoint (except the entry point) at any moment in time without prior notice, and clients would not break as long as the response types/schemas are still supported by the clients. In-flight requests might fail with a 4xx, but after a retry they should go to the correct endpoint without any code change required.
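The corollary in [1] can be illustrated with a toy client that only knows the entry point and the names of link relations. This is a sketch under assumptions: the `_links`/`href` layout borrows the HAL convention, the two in-memory "servers" are invented, and `follow` is a hypothetical helper:

```python
def follow(fetch, entry_url, rels):
    """Walk from the entry point by link relation names, never by hard-coded URLs."""
    doc = fetch(entry_url)
    for rel in rels:
        doc = fetch(doc["_links"][rel]["href"])
    return doc


# Two fake servers exposing the same resources under different URLs.
SERVER_V1 = {
    "/": {"_links": {"orders": {"href": "/orders"}}},
    "/orders": {"_links": {"latest": {"href": "/orders/42"}}},
    "/orders/42": {"id": 42, "_links": {}},
}
SERVER_V2 = {
    "/": {"_links": {"orders": {"href": "/o"}}},
    "/o": {"_links": {"latest": {"href": "/o/latest-42"}}},
    "/o/latest-42": {"id": 42, "_links": {}},
}

# The same client code reaches the same resource on both servers.
order_v1 = follow(SERVER_V1.__getitem__, "/", ["orders", "latest"])
order_v2 = follow(SERVER_V2.__getitem__, "/", ["orders", "latest"])
```

Only the entry point `/` and the relation names are shared between the two servers, which is why renaming every other URL does not require a client change here.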
zdragnar 8 days ago |
This is HATEOAS, basically the core feature of REST that very few people actually use. Most of what the industry calls REST or RESTful is just structured and inefficient RPC.
tecleandor 7 days ago |
True, I almost never see the endpoint discovery thing, I almost forgot about it...
pests 7 days ago |
I don't think anyone has ever used REST in the way you are using it - the sibling comment points out that HATEOAS is probably what you mean - this generally embeds links to all resources, full data navigation, next/prev links, and so on. It is true that a proper HATEOAS client should be able to navigate an endpoint completely with just a starting address.
quectophoton 7 days ago |
Yeah unfortunately despite it being part of the REST definition, nowadays "REST" has become a term that means "REST but without HATEOAS". Similar to how "API" now means specifically "HTTP API that returns JSON", or "AI" now means "Generative AI specifically".
srameshc 8 days ago |
Obvious question: How to protect against this?
smallnix 8 days ago |
What specifically do you want to protect?
K0nserv 8 days ago |
Your first line of defence should be a secure API where an attacker doesn't gain anything by knowing it.
You can add obfuscation, but ultimately if the client is shipped to the user you must assume an attacker can reverse engineer it.
mathgeek 8 days ago |
Build your API assuming anything public facing will be known. This includes anything downloaded to a device.
bandrami 8 days ago |
I find this confusing because the point of an API is to be known, yes? Otherwise who's accessing it?
quesera 8 days ago |
It's a valid desire, but you have to be really dedicated to the effort to block it, in practice.
You might intend your API to be consumed only by your own clients. E.g. your published mobile apps.
A well-designed API won't allow a third-party client to do anything that your own client wouldn't allow of course. Permissions are always enforced on the back end.
But there are many cases where a user might want a custom/different client:
If your mobile apps are not awesome, or if they deprioritize a specific use case, or if they serve ads ... or even if your users want to automate some action in your service...
If your service is popular enough (or you attract a certain kind of user), you will have some people building their own clients.
bandrami 8 days ago |
Those sound like bad use cases for a client-server model with public endpoints, then? I mean, you could cert-pin yourself in the client app, I guess.
quesera 7 days ago |
Not sure what you mean here. All endpoints are equally public.
kube-system 8 days ago |
Not necessarily. A common pattern is to build a 'private API' intended to be used by one's own front-end applications. For example: most client-rendered applications, like the Airbnb example on this page.
nsonha 7 days ago |
Modern APIs are actually, most of the time, a poor man's RPC; they don't need to exist, much less be known.
tonyhart7 7 days ago |
For me, we can't 100% protect against this type of usage, but we can minimize it with good observability and monitoring tools that always check whether the user is running this in a verified way (signed app, web, etc.) or RE'ing the API.
Because guess what? We are the creators of the system, so it's easy to detect bots and similar cases when you have good analytical data, because this kind of access does not produce any "traces".
mkagenius 8 days ago |
If only someone could automate[1] the clicking and navigating part by writing in plaintext something like "Open airbnb and explore all the features as much as possible" :)
1. https://github.com/BandarLabs/clickclickclick - It does that and I am one of the authors.