zbar has great barcode reading performance! I've seen far newer software that's nowhere near as good in terms of real-world performance.
But it seems the original developer hasn't updated it since 2009 [1] - and fuzz testing only rose to prominence in ~2012 with the rise of tools like afl-fuzz.
I would be absolutely astonished if it had ever been fuzzed.
> Cut out any unnecessary features to limit attack vectors. ZBar by default scans all code types, which means that an attacker can trigger a bug in any of the scanners. If you only need to scan QR codes for instance, then ZBar can be configured to do so in the code
Absolutely sensible, yes.
Not just for security, but also because packages sometimes have extra barcodes. If you're scanning an EAN-13 on a pack of pasta, decoding a QR code for a pasta recipe website is just going to confuse things :)
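A minimal sketch of the application-side half of that idea, assuming the decoder hands back (symbology, text) pairs. The pair format, the `EXPECTED` set and the function names are hypothetical and not ZBar's API; this filter would sit behind whatever symbology configuration the decoder itself offers.

```python
EXPECTED = {"EAN13"}  # hypothetical label for the one symbology this checkout accepts


def ean13_is_valid(text: str) -> bool:
    """Check the EAN-13 check digit (weights 1,3,1,3,... over the first 12 digits)."""
    if len(text) != 13 or not (text.isascii() and text.isdigit()):
        return False
    digits = [int(c) for c in text]
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def pick_product_codes(results):
    """Keep only results of the expected symbology that also pass the checksum."""
    return [text for kind, text in results if kind in EXPECTED and ean13_is_valid(text)]


scanned = [
    ("QRCODE", "https://example.com/pasta-recipe"),  # ignored: wrong symbology
    ("EAN13", "4006381333931"),                      # kept: valid check digit
    ("EAN13", "4006381333932"),                      # dropped: bad check digit
]
print(pick_product_codes(scanned))
```

The recipe QR code never reaches the cart logic, and a misread EAN-13 is dropped instead of being looked up.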
That was the main theme of the book. Everything was well designed with failsafes, but too many of the design assumptions turned out to be wrong. Expecting only the expected led to many small mistakes that were harmless individually but together snowballed into a disaster.
(Well, not obviously dystopian, more ‘oh shit, that is how we’d be fucked isn’t it). Alien and Aliens also had a similar feel in their writing, except in real life there is rarely a Ripley there when you need one.
Kroger, for example, has an app that allows you to scan items to add them to a virtual cart as you shop and avoid scanning them at the register... however the same app is used to read QR codes on in-store coupons, which are "helpfully" placed very close to the price tags with UPC barcodes on them.
If I want to scan one of those coupon QR codes, I need to either start with the camera very close to the QR code or cover the barcode with my finger.
Modern reader firmware tends to have multiple modes and many options. Some modes are as simple as "scan whatever you see out of the many formats you support, and spit out the decoded value of something as USB Serial". Or, worse, "...as USB Keyboard".
You can imagine how easy those modes are to integrate with POS software, without implementing the proprietary protocol for that device, and you can also imagine how poorly that can work out.
If you owned a store with a POS setup with flaky reader behavior like this, and were stuck with it, you could try reconfiguring the reader (to, say, disable QR support). This reprogramming can sometimes be done via documented protocol, via sketchy Windows software, or via... barcode... Careful you don't make it worse.
(Our startup used modern readers (multiple 1D formats, QR, NFC) for a factory station, and had to do a lot of experimenting with different brands and models, to get the behavior and speed we needed. We even managed to brick a reader, just with configuration changes, not flashing firmware.)
Secondarily, there's probably also a rich vein to be mined scanning barcodes like "'); DROP TABLE Item" that would exploit systems further up the chain. That's not what this article is covering (since they're just looking at the barcode scanning library).
There would be some fun in carrying around a bunch of "edge case" barcodes ("programming" barcodes for various kinds of scanners, SQL injection attacks, etc) and feeding them to unsupervised barcode scanners "in the wild" to see what happens.
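For the "further up the chain" case, a small sketch of the usual defence: bind the scanned text as a query parameter instead of splicing it into the SQL string. The `Item` table layout and the `lookup` helper are made up for illustration; only the standard-library `sqlite3` module is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Item (code TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Item VALUES (?, ?)", ("4006381333931", "pen"))


def lookup(conn, scanned_code: str):
    """Look up a scanned code; the value is bound as a parameter, never spliced into SQL."""
    row = conn.execute("SELECT name FROM Item WHERE code = ?", (scanned_code,)).fetchone()
    return row[0] if row else None


hostile = "'); DROP TABLE Item"
print(lookup(conn, "4006381333931"))  # a normal scan
print(lookup(conn, hostile))          # hostile text is just a code that matches nothing
```

With a bound parameter the hostile barcode is treated as an odd-looking product code, and the table is still there afterwards.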
This is definitely still a problem because there might be situations where you’re allowing an end user to pass an image file in and are then passing it unmodified to this library to interpret the barcode in it, but it’s not the same as some special barcode that encodes data that crashes the library.
So for example this blog entry does not describe a situation where you can just print out a barcode and when you scan the barcode then the library crashes or has the opportunity for arbitrary code execution. That would be a very exciting exploit. They don’t actually rule out the possibility, but they didn’t get anywhere near fuzzing at that level in this blog post.
However, zbar isn't used all that widely in industry. The airport's baggage handling system is much more likely to have a self-contained scanner from Cognex or Omron or Zebra running proprietary, closed-source software.
I'd love to get to a point I could fuzz a program but the gulf of execution is vast -- I enjoyed attempting OSCP, but I can't keep paying for lab extensions.
(I also have a gut feeling there's a lot of unfuzzed apps which people don't look at because they're utilitarian and don't use the network much. So if I can phish you, then leverage some innocuous tool for RCE or whatever... useful.)
But I've struggled to find resources on this topic -- anyone know of a book, course, or wiki?
Parsing network protocols and ABIs is possible, but usually requires a fair amount of coding.
Thanks, this is useful context -- it's easy to get overwhelmed and quit early on with these sorts of things. It looks like someone else posted a set of exercises[1] using AFL that seem to be aimed at smaller programs like you describe.
Is a good course
The issue I found with a lot of fuzzing tutorials is that they're difficult to reproduce because there's a lot of work in setting up the environment and toolchain. In my tutorial, you can kick off fuzzing with one command, but I also walk through how I created the workflow step by step.
I wrote a program that takes a byte array as input and drives the library under test with it, attached that to llvm's fuzzer and left it running. You end up with a lot of files containing some bytes that did something vaguely interesting with the program. Good experience overall.
You might get some meaning out of https://github.com/JonChesterfield/bigint/tree/trunk/fuzz_bi... but ymmv, I got sidetracked by interesting stuff at work ~3 months back and don't currently remember what state that repo was in when I paused work on it.
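The "byte array in, drive the library" shape described above can be sketched roughly like this. Everything here is a toy: `parse_record` is a deliberately buggy stand-in for the library under test, and `dumb_fuzz` is a random-input loop standing in for a real engine such as LLVM's fuzzer, which would add coverage feedback and corpus management.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy target: [length byte][payload][checksum byte]. Deliberately buggy."""
    if not data:
        raise ValueError("empty input")
    n = data[0]
    payload = data[1:1 + n]
    checksum = data[1 + n]  # BUG: no bounds check, IndexError on short input
    if checksum != sum(payload) % 256:
        raise ValueError("bad checksum")
    return payload


def harness(data: bytes) -> None:
    """Entry point a fuzzer would call: bytes in, drive the target, ignore clean rejections."""
    try:
        parse_record(data)
    except ValueError:
        pass  # documented rejection, not a finding


def dumb_fuzz(iterations: int, seed: int = 0):
    """Stand-in for a real fuzzing engine: random inputs, no coverage feedback."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 9)))
        try:
            harness(data)
        except Exception as exc:  # anything unexpected is "vaguely interesting"
            findings.append((data, type(exc).__name__))
    return findings


findings = dumb_fuzz(200)
print(len(findings), "inputs raised something unexpected")
```

The harness is the part that carries over to a real fuzzer; the loop around it is what the engine replaces.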
Thanks, this kind of social stuff can be useful -- it looks like all the resources folks shared seem to favor AFL.
Toying with barcodes - https://www.youtube.com/watch?v=QCtdEYnlykA
"But it crashed. That's bad. I can't stop people scanning bad barcodes."
Sad. What a poor understanding of our field.
The number one rule of them all is: "Never trust (user) input".
A slightly more powerful variation being: "assume all input is malicious until proven otherwise".
I mean: on one hand there are people who fuzz, who test, who think about edge cases, who think about security, who think about uptime, etc. And OTOH you have people saying "such input shouldn't happen". It's just really pathetic.
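As a rough illustration of "assume all input is malicious until proven otherwise" for scanner output, here is a sketch that checks a decoded payload before anything downstream sees it. The length limit, the ASCII-only rule and the `accept_payload` name are assumptions picked for the example, not requirements of any barcode format.

```python
MAX_LEN = 64  # hypothetical limit; pick whatever the downstream field actually holds


def accept_payload(raw: bytes) -> str:
    """Return the payload as text, or raise ValueError if it fails any check."""
    if not 0 < len(raw) <= MAX_LEN:
        raise ValueError("unexpected length")
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        raise ValueError("not ASCII") from None
    if not text.isprintable():  # rejects NUL and other control characters
        raise ValueError("control characters")
    return text


for sample in (b"ABC-123", b"AB\x00CD", b"\xff\xfe", b"A" * 65):
    try:
        print("accepted:", accept_payload(sample))
    except ValueError as exc:
        print("rejected:", exc)
```

A bad barcode then becomes a rejected scan with a reason attached, not a crash.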
In computing, the robustness principle is a design guideline for software that states: "be conservative in what you do, be liberal in what you accept from others". It is often reworded as: "be conservative in what you send, be liberal in what you accept". The principle is also known as Postel's law, after Jon Postel, who used the wording in an early specification of TCP.
Also, that law gets quoted, and IMO is a rather large design mistake.
Because if we just didn’t do that, then it would all work.
In particular, see folks talking about Self Driving hah.
I was trying random barcodes I had lying around to test my own component. The one with the zero byte happened to be a large one they had added to my passport when I visited the USA. It had "US-VISIT" printed next to it in big letters.
The device was a rugged industrial handheld device with a screen and a camera, designed for mailrooms and warehouses. This was around 20 years ago and I remember the OS (including the barcode component) was completely bespoke and it ran without any process protections. This meant that the barcode would crash the whole device and you had to perform a hard reset.
Is this surprising? Does libFuzzer support Redqueen or laf-intel like AFL++ [0][1], which will pick up on any comparisons (like a comparison to size=1024) and fuzz with the intention of changing that comparison to become true or false (to put it overly simply)?
0: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instr...
1: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instr...
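To show why comparison splitting helps, here is a toy model, not AFL++ code. It treats the `size == 1024` check as a 4-byte little-endian compare and contrasts a single all-or-nothing branch with a byte-by-byte compare whose partial progress is visible as feedback. The `climb` search and the "number of matched bytes" feedback are simplifications invented for the sketch.

```python
import struct

MAGIC = struct.pack("<I", 1024)  # the hypothetical `size == 1024` check as 4 bytes


def whole_compare(data: bytes) -> int:
    """Unsplit comparison: one branch, so feedback is only 0 or 1."""
    return int(data[:4] == MAGIC)


def split_compare(data: bytes) -> int:
    """Byte-wise comparison: returns how many leading bytes matched (0..4)."""
    matched = 0
    for got, want in zip(data[:4], MAGIC):
        if got != want:
            break
        matched += 1
    return matched


def climb(feedback):
    """Greedy byte-at-a-time search, keeping any mutation that improves feedback."""
    best = bytearray(b"\xff" * 4)
    tries = 0
    for pos in range(4):
        for value in range(256):
            tries += 1
            candidate = bytearray(best)
            candidate[pos] = value
            if feedback(bytes(candidate)) > feedback(bytes(best)):
                best = candidate
                break
    return bytes(best), tries


print(climb(split_compare))  # reaches MAGIC after a handful of tries
print(climb(whole_compare))  # exhausts its tries without ever seeing progress
```

With only the single branch as feedback, a one-byte mutation never looks like progress, so the search is effectively a blind guess over all four bytes at once.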
My boss keeps telling me "it's not that difficult". I keep telling him "it's more difficult than you believe".
> 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
They're not required to fix anything, and by including that disclaimer they imply that they don't necessarily even intend to fix anything. They disclaim liability, and you, the user, "ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION".
Proprietary software pretty much always has similar clauses too. It's not an issue with open-source, it's an issue with software in general.