• the_hoser 4 days ago |
    The degree of hold-my-beer here is off the charts.
    • koverstreet 4 days ago |
      It's not quite as dangerous as you'd think.

      The standard technique is to reserve a big file on the old filesystem for the new filesystem metadata, and then walk all files on the old filesystem and use fiemap() to create new extents that point to the existing data - only writing to the space you reserved.

      You only overwrite the superblock at the very end, and you can verify that the old and new filesystems have the same contents before you do.
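
      If you're curious what that fiemap() mapping looks like, filefrag from e2fsprogs prints each extent's logical and physical offsets via the same ioctl (just an illustration of the mapping, not the conversion tool itself):

          # print a file's extent map (logical offset, physical offset,
          # length of each extent) using the FIEMAP ioctl
          filefrag -v /path/to/some/file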

      • jeroenhd 4 days ago |
        I believe that is also the method [btrfs-convert](https://btrfs.readthedocs.io/en/latest/Convert.html) uses. A neat trick is that it keeps the ext4 structures on disk (as a subvolume), which allows reverting to ext4 if the conversion didn't go as planned (as long as you don't do anything that disturbs the ext4 extents, such as defragmenting or balancing the filesystem, and you can't revert after deleting the subvolume, of course).
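
        Roughly, going by the btrfs-convert docs (run against the unmounted filesystem; the device name is a placeholder):

            # convert in place; the original fs is preserved as a subvolume
            # named ext2_saved (the name is used for ext3/4 as well)
            btrfs-convert /dev/sdX1

            # roll back to ext4, possible while ext2_saved is intact
            btrfs-convert -r /dev/sdX1

            # or, once satisfied, make it permanent and reclaim the space
            mount /dev/sdX1 /mnt
            btrfs subvolume delete /mnt/ext2_saved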
        • Joe_Cool 4 days ago |
          I believe you are right. You can only roll back to the metadata from before the conversion, so any files that are new or changed (with different extents) will be lost or corrupted.

          So it's best to mount read-only while you're still considering a rollback. Otherwise it's pretty risky.

          • heftig 4 days ago |
            No, it also covers the data. As long as you don't delete the rollback subvolume, all the original data should still be there, uncorrupted.

            Even if you disable copy-on-write, as long as the rollback subvolume is there to lay claim to the old data, it's considered immutable and any modification will still have to copy it.

            • Joe_Cool 3 days ago |
              I understood it as "it doesn't touch/ignore the data". But I guess we mean the same thing.

              You are right. All of the old files will be in areas btrfs should consider used. So it should correctly restore the state from before the migration. Thanks!

        • doublepg23 4 days ago |
          I tried that on a system in 2020 and it just corrupted my new FS. Cool idea though.
          • ahartmetz 3 days ago |
            You don't understand. You did get a btrfs that worked normally. /s
          • jorvi 3 days ago |
            My conversion went fine, but there were so many misaligned sectors and constant strange checksum errors (on files written after the conversion). The cherry on top: if there’s more than X% of checksum errors, btrfs refuses to mount and you have to do multiple arcane incantations to get it to clear all its errors. Real fun if you need your laptop for a high-priority problem.

            Lesson learned: despite whatever “hard” promises a conversion tool (and its creators) make, just back up, check the backup, then format and create your new filesystem.

          • jeroenhd 3 days ago |
            I've never had the conversion corrupt a filesystem for me (plenty of segfaults halfway through, though). It's a neat trick for when you want to convert a filesystem that doesn't have much on it, but I wouldn't use it for anything critical. Better to format the drive and copy files back from a backup, and you probably want that anyway if you're planning on using filesystem features like snapshots.

            Windows used to feature a similar tool to transition from FAT32 to NTFS. I'd have the same reservations about that tool, though. Apple also did something like this with an even weirder conversion step (source and target filesystem didn't have the same handling for case sensitivity!) and I've only read one or two articles about people losing data because of it. It can definitely be done safely, if given enough attention, but I don't think anyone cares enough to write a conversion tool with production grade quality.

      • bongodongobob 4 days ago |
        This is a weird level of pedantry induced by holding many beers tonight, but I've always thought of "Hold my beer" as in "Holy shit the sonofabitch actually pulled it off, brilliant". I think it's perfectly fitting. Jumping a riding lawnmower over a car with a beer in hand but they actually did the math first. I love it.
        • Gigachad 4 days ago |
          It’s referring to a comment a drunk person would make before doing something extremely risky. They need someone to hold the beer so it isn’t spilled during what’s coming next.
          • bongodongobob 3 days ago |
            Right, but in those situations they succeed, kind of like "the cameraman never dies".
            • bmacho 3 days ago |
              I think they used "hold my beer" correctly. It can be used for any weird idea that a drunk person would actually try (usually with a stretch), regardless of whether they succeed or not. I don't think that "the SOAB actually pulled it off" is part of the usage.
    • boricj 4 days ago |
      A couple of years ago it was more like juggling chainsaws: https://github.com/maharmstone/ntfs2btrfs/issues/9

      I tracked down a couple of nasty bugs at the time while playing around with it; hopefully it's more stable now.

    • chasil 4 days ago |
      Note this is not the Linux btrfs:

      "WinBtrfs is a Windows driver for the next-generation Linux filesystem Btrfs. A reimplementation from scratch, it contains no code from the Linux kernel, and should work on any version from Windows XP onwards. It is also included as part of the free operating system ReactOS."

      This is from the ntfs2btrfs maintainer's page.

      https://github.com/maharmstone/btrfs

      • chungy 4 days ago |
        It's the same file system, with two different drivers for two different operating systems.
        • chasil 4 days ago |
          The metadata is adjusted for Windows in a way that is foreign to Linux.

          Do Linux NTFS drivers deal with alternate streams?

          "Getting and setting of Access Control Lists (ACLs), using the xattr security.NTACL"

          "Alternate Data Streams (e.g. :Zone.Identifier is stored as the xattr user.Zone.Identifier)"

          • biorach 4 days ago |
            Not sure what point you're making here. WinBtrfs is a driver for the same btrfs filesystem that Linux uses. Its most common use case is reading Linux partitions in Windows on machines that dual-boot both operating systems.
          • dark-star 4 days ago |
            What? Why would you need a Linux NTFS driver to read a btrfs filesystem? That makes no sense.

            Storing Windows ACLs in xattrs is also pretty common (Samba does the same)

            • chasil 4 days ago |
              I'd delete my comment if I could at this point.
      • cwillu 4 days ago |
        Yes it is?
    • MisterTea 4 days ago |
    As someone who has witnessed Windows explode twice from in-place upgrades, I would just buy a new disk or computer and start over. I get that this is different, but the time that went into that data is worth way more than a new disk. It's just not worth the risk IMO. Maybe if you don't care about the data, or have good backups and wish to help shake bugs out - go for it I guess.
      • Joker_vD 4 days ago |
        And the new disk is also likely to have more longevity left in it, isn't it?
      • rini17 4 days ago |
        If only it had a native filesystem with snapshotting capability...
        • ssl-3 3 days ago |
          Which "it"?

          Both btrfs and NTFS have snapshot capabilities.

          • rini17 3 days ago |
            It's so well hidden from users it might as well not exist. And you can't snapshot only a part of the NTFS filesystem.
            • ssl-3 3 days ago |
              The userland tools included with Windows are very lacking, but that's more of a distro problem than a filesystem problem. VSS works fine -- boringly, even -- and people take advantage of it all the time even if they don't know it.
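
              For instance (a sketch; vssadmin's create verb is only available on Server SKUs, client Windows goes through WMI or the API instead):

                  rem take a one-off shadow copy of C: (Server SKUs)
                  vssadmin create shadow /for=C:

                  rem list existing shadow copies (works on client Windows too)
                  vssadmin list shadows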
    • tgma 4 days ago |
      Apple did something like this with a billion live OS X/iOS deployments (HFS+ -> APFS). It can be done methodically at scale, as other commenters point out, but it obviously needs care.
      • SanjayMehta 4 days ago |
        When this first showed up I took 3 backups: two on networked drives and one on an external drive which was then disconnected from the system.

        The second time I just went “meh” and let it run.

        • tgma 4 days ago |
          Craig Federighi on some podcast once said they conducted dry-runs of the process in previous iOS updates (presumably building the new APFS filesystem metadata in a file without promoting it to the superblock) and checking its integrity and submitting telemetry data to ensure success.
          • 3eb7988a1663 4 days ago |
            You can do all the testing in the world, but clicking deploy on that update must have been nerve wracking.
            • Gigachad 4 days ago |
              Apple doesn’t just deploy to the whole world in an instant though.

              First it goes to the private beta users, then the public beta users, and then it slowly rolls out globally. Presumably they could slow down the roll out even more for a risky change to monitor it.

              • tgma 3 days ago |
                Sure, but still, whoever wrote the patch had his ass on the line even shipping to a batch of beta users. Remember, this is Apple, not Google, where the dude would likely have gotten promoted and left the team right after clicking deploy :)
      • ComputerGuru 3 days ago |
        You don’t need to look that far. Many of us here lived through the introduction of NTFS and did live migrations from FAT32 to NTFS in the days of Windows 2000 and Windows XP.

        I still remember the syntax: convert C: /fs:ntfs

        • badsectoracula 3 days ago |
          IIRC there was a similar conversion tool in Windows 98 for FAT16 -> FAT32.
        • tgma 3 days ago |
          Yeah, thanks for recalling this. I totally forgot about that, because I never trusted Windows upgrades, let alone filesystem conversions. Always back up and clean install. It's Microsoft software after all.
  • Dwedit 4 days ago |
    I would be very surprised if it supported files that are under LZX compression.

    (Not to be confused with Windows 2000-era file compression, this is something you need to activate with "compact.exe /C /EXE:LZX (filename)")
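
    For reference, something like this (flags per the compact docs; /U needs /EXE to undo this kind of compression):

        rem compress with the stronger LZX algorithm
        compact.exe /C /EXE:LZX file.exe

        rem uncompress it again (plain /U won't touch /EXE-compressed files)
        compact.exe /U /EXE file.exe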

  • cryptonector 4 days ago |
    Thinking of how I'd do this for ZFS... I think I'd do something like: add a layer that can read other filesystem types and synthesize ZFS block pointers; then ZFS could read other filesystems, and as it writes it could rewrite the whole thing slowly. If ZFS had block pointer rewrite (and I've explained here before why it does not and cannot have BP rewrite capabilities, not being a proper CAS filesystem), one could just make it rewrite the whole thing to finish the conversion.
  • rurban 4 days ago |
    Why would someone do that? NTFS is stable, faster than btrfs and has all the same features.
    • rnd0 3 days ago |
      The only reason I can think of is so that they can use the same FS in both Windows and Linux - but with NTFS, they already can.

      Mind you, with OpenZFS (https://openzfsonwindows.org/) you get Windows (flaky), FreeBSD, NetBSD and Linux, but - as I said - I'm not sure ZFS is super reliable on Windows at this point.

      Mind you, I just stick with NTFS - Linux can see it, Windows can see it, and if there are extra features btrfs provides, they're not ones I am missing.

      • mistaken 3 days ago |
        With ntfs you have to create a separate partition though. With btrfs you could create a subvolume and just have one big partition for both linux and windows.
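
        Something like this, with hypothetical subvolume names:

            # one big btrfs partition, one subvolume per OS
            mount /dev/sdX1 /mnt
            btrfs subvolume create /mnt/@linux
            btrfs subvolume create /mnt/@windows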
      • ComputerGuru 3 days ago |
        I’m a die-hard ZFS fan and heavy user since the Solaris days (and counting), but I believe the WinBtrfs project is in better (more usable) shape than the OpenZFS for Windows project.
    • paines 3 days ago |
      what ?!?! NTFS has no case sensitivity no compression. And I guess a couple of more things I do not want to miss.
      • rurban 3 days ago |
        NTFS does have case sensitivity, just nobody dares to activate it. Compression is a big one, but I thought I'd read that WinBtrfs doesn't support it either.
        • inetknght 3 days ago |
          I activated it back in mid-2010 or so. I had the most amazing Pikachu face when random things stopped working because software could no longer find the file it wanted to load with an all-lowercase string, even though the project built it with CapitalCase. Sigh...
      • cesarb 3 days ago |
        > what ?!?! NTFS has no case sensitivity no compression.

        As the sibling comment mentioned, NTFS does have a case-sensitive mode, for instance for the POSIX subsystem (which no longer exists, but it existed back when NTFS was new); I think it's also used for WSL1. And NTFS does have per-file compression, I've used it myself back in the early 2000s (as it was a good way to free a bit of space on the small disks from back then); there was even a setting you could enable on Windows Explorer which made compressed files in its listing blue-colored.

      • ComputerGuru 3 days ago |
        NTFS has per-folder case sensitivity flag. You could set it online at anytime prior to Windows 11, but as of 11 you can now only change it on an empty folder (probably due to latent bugs they didn’t want to fix).

        NTFS has had mediocre compression support from the very start, which could be enabled on a volume or directory basis, but it gained modern LZ-based compression (extensible to whatever algorithm you wanted) in Windows 10; unfortunately that's a per-file process that must be done post-write.
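
        For reference, the per-directory flag is toggled with fsutil (Windows 10 1803 and later; the path is hypothetical):

            rem enable case sensitivity for one directory
            fsutil file setCaseSensitiveInfo C:\src\project enable

            rem check the flag
            fsutil file queryCaseSensitiveInfo C:\src\project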

    • WhyNotHugo 3 days ago |
      For fun? To prove that it is possible? As a learning activity?

      There are millions of reasons to write software other than "faster" or "more features".

      I can imagine this being convenient (albeit risky) when migrating from Windows to Linux, if you really can't afford a spare disk to back up all your data.

  • casey2 4 days ago |
    Very cool, but nobody will hear about this until at least a week after they format their ntfs drives that they have been putting off formatting for 2 years
  • npn 3 days ago |
    I tried this one a while back; it resulted in a read-only disk. I hope it has improved since then.
    • quasarj 3 days ago |
      That's 50% better than losing all your data!
  • oDot 3 days ago |
    Is anyone here using BTRFS and can comment on its current-day stability? I used to read horror stories about it
    • quotemstr 3 days ago |
      I've used it for my personal machines and backups (via btrbk) for years without any issues
    • einsteinx2 3 days ago |
      I’ve been using it for a few years on my NAS for all the data drives (with SnapRAID for parity and data validation), and as the boot drive on a few SBCs that run various services. I also use it as the boot drive for my Linux desktop PC. So far no problems at all, and I make heavy use of snapshots; I have also had things like power outages that shut down the various machines multiple times.

      I’ve never used BTRFS raid so can’t speak to that, but in my personal experience I’ve found BTRFS and the snapshot system to be reliable.

      Seems like most (all?) stories I hear about corruption and other problems are all from years ago when it was less stable (years before I started using it). Or maybe I just got lucky ¯\_(ツ)_/¯

      • AtlasBarfed 3 days ago |
          Why use btrfs RAID rather than good old mdadm RAID?
        • einsteinx2 3 days ago |
          I don’t use btrfs RAID; I don’t actually use any RAID at all. I use SnapRAID, which is really more of a parity system than real RAID.

          I have a bunch of data disks that are formatted BTRFS, then 2 parity disks formatted using ext4 since they don’t require any BTRFS features. Then I use snapraid-btrfs which is a wrapper around SnapRAID to automatically generate BTRFS snapshots on the data disks when doing a SnapRAID sync.

          Since the parity is file based, it’s best to use it with snapshots, so that’s the solution I went with. I’m sure you could also use LVM snapshots with ext4 or ZFS snapshots, but BTRFS with SnapRAID is well supported and I like how BTRFS snapshots/subvolumes works so I went with that. Also BTRFS has some nice features over ext4 like CoW and checksumming.

          I considered regular RAID but I don’t need the bandwidth increase over single disks and I didn’t ever want the chance of losing a whole RAID pool. With my SnapRAID setup I can lose any 2 drives and not lose any data, and if I lose 3 drives, I only lose the data on any lost data drives, not all the data. Also it’s easy to add a single drive at a time as I need more space. That was my thought process when choosing it anyway and it’s worked for my use case (I don’t need much IOPS or bandwidth, just lots of cheap fairly resilient and easy to expand storage).
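
          If it helps, a minimal sketch of what such a snapraid.conf can look like (mount points are hypothetical):

              # two parity disks (ext4) protecting any number of data disks
              parity /mnt/parity1/snapraid.parity
              2-parity /mnt/parity2/snapraid.parity
              content /var/snapraid/snapraid.content
              data d1 /mnt/disk1/
              data d2 /mnt/disk2/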

        • tremon 3 days ago |
          BTRFS raid is usage-aware, so a rebuild will not need to do a bit-for-bit copy of the entire disk, but only the parts that are actually in use. Also, because btrfs has data checksumming, it can detect read errors even when the disk reports a successful read (checksums are verified as data is read; a scrub additionally verifies data that is never otherwise read).
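
          A scrub is just an online command against the mounted filesystem, e.g.:

              # read and verify every checksum in the background
              btrfs scrub start /mnt/data
              btrfs scrub status /mnt/data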
        • gmokki 3 days ago |
          BTRFS RAID10 can seamlessly combine multiple raw disks without trying to match capacity.

          Next time I'll just replace the 4T disk in my 5-disk RAID10 with a 20T one (the replace command is sketched at the end of this comment). Currently I have 4+8+8+16+20 TB disks.

          MD RAID does not do checksumming, although I believe XFS is about to add support for it in the future.

          I have had my BTRFS raid filesystem survive a lot during the past 14 years:

          - burned power supply: no loss of data

          - failed RAM that started corrupting memory: after a little hack (1), BTRFS scrub saved most of the data, even though the situation got so bad the kernel would crash within 10 minutes

          - buggy PCIe SATA extension card: I tried to add a 6th disk, but noticed after a few million write errors to one disk that the card just randomly stopped passing data through: no data corruption, although the btrfs write error counters are in the tens of millions now

          - 4 disk failures: I have only one original disk still running and it is showing a lot of bad sectors

          (1) one of the corrupted sectors was in the btrfs tree that contains the checksums for the rest of the filesystem, and both copies were broken. It prevented access to some 200 files. I patched the kernel to log the exact sector in addition to the expected and actual values. Turns out it was just a single bit flip, so I used a hex editor to flip it back to the correct value and got the files back.
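
          (For reference, that kind of disk swap is an online operation; the devid and device name below are placeholders:)

              # replace the device with devid 1 while the fs stays mounted
              btrfs replace start 1 /dev/sdf /mnt/data
              btrfs replace status /mnt/data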

        • ThatPlayer 3 days ago |
          More flexibility in drives. Btrfs's RAID1 isn't actually RAID1, where everything is written to all the drives; it's closer to RAID10, in that it writes every piece of data to copies on 2 drives. So you can have 1+2+3 TB drives in an array and still get 3TB of usable storage, or even 1+1+1+1+4. And you can add/remove single drives easily.

          You can also set different RAID levels for metadata versus data, because the raid knows the difference. At some point in the future you might be able to set it per-file too.
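
          For instance, a sketch of mixed profiles at mkfs time (raid1c3 keeps three copies and needs at least three devices):

              # data mirrored twice, metadata mirrored three times,
              # across devices of different sizes
              mkfs.btrfs -d raid1 -m raid1c3 /dev/sdb /dev/sdc /dev/sdd

              # after mounting, show what the profile mix yields
              btrfs filesystem usage /mnt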

    • jcalvinowens 3 days ago |
      I've used BTRFS exclusively for over a decade now on all my personal laptops, servers, and embedded devices. I've never had a single problem.

      It's the flagship Linux filesystem: outside of database workloads, I don't understand why anybody uses anything else.

      • KennyBlanken 3 days ago |
        "Flagship"? I don't know a single person who uses it in production systems. It's the only filesystem I've lost data to. Ditto for friends.

        Please go look up survivor bias. That's what all you btrfs fanboys don't seem to understand. It doesn't matter how well it has worked for 99.9% of you. Filesystems have to be the most reliable component in an operating system.

        It's a flagship whose fsck requires you to contact developers to seek advice on how to use it because otherwise it might destroy your filesystem.

        It's a flagship whose userspace tools, fifteen years in, are still seeing major changes.

        It's a flagship whose design is so poor that fifteen years in, the developers are making major changes to its structure and deprecating old features in ways that do not trigger an automatic upgrade or an informative error telling you to upgrade, but instead cause the filesystem to panic with error messages for which there is no documentation and little clue what the problem is.

        No other filesystem has these issues.

        • jchw 3 days ago |
          Btrfs is in production all over the damn place, at big corporations and all kinds of different deployments. Synology has their own btrfs setup that they ship to customers with their NAS software for example.

          I found it incredibly annoying the first time I ran out of disk space on btrfs, but many of these points are hyperbolic and honestly just silly. For example, btrfs doesn't really do offline fsck. fsck.btrfs has a zero percent chance of destroying your volume because it does nothing. As for the user space utilities changing... I'm not sure how that demonstrates the filesystem is not production ready.

          Personally I usually use either XFS or btrfs as my root filesystem. While I've caught some snags with btrfs, I've never lost any data. I don't actually know anyone who has, I've merely just heard about it.

          And it's not like other well-regarded filesystems have never run into data loss situations: even OpenZFS recently (about a year ago) uncovered a data-eating bug that called its reliability into question.

          I'm sure some people will angrily tell me that actually btrfs is shit and the worst thing to ever be created, and honestly, whatever. I am not passionate about filesystems. Wake me up when there's a better one and it's mainlined. Maybe it will eventually be bcachefs. (Edit: and just to be clear, I do realize bcachefs is mainline and Kent Overstreet considers it to be stable and safe. However, it's still young and its upstream future has been called into question. For non-technical reasons, but still; it does make me less confident.)

          • aaronmdjones 3 days ago |

                For example, btrfs doesn't really do offline fsck. fsck.btrfs has a
                zero percent chance of destroying your volume because it does nothing.
            
            fsck.btrfs does indeed do nothing, but that's not the tool they were complaining about. From the btrfs-check(8) manpage:

                Warning
            
                Do not use --repair unless you are advised to do so by a
                developer or an experienced user, and then only after having
                accepted that no fsck can successfully repair all types of
                filesystem corruption. E.g. some other software or hardware
                bugs can fatally damage a volume.
                
                [...]
                
                DANGEROUS OPTIONS
                
                --repair
                    enable the repair mode and attempt to fix problems where possible
            
                    Note there’s a warning and 10 second delay when this option is
                    run without --force to give users a chance to think twice
                    before running repair, the warnings in documentation have
                    shown to be insufficient
            • jchw 3 days ago |
              Yes, but that doesn't do the job that a fsck implementation does. fsck is something you stuff into your initrd to do some quick checks/repairs prior to mounting, but btrfs intentionally doesn't need those.

              If you need btrfs-check, you have probably hit either a catastrophic bug or hardware failure. This is not the same as fsck for some other filesystems. However, ZFS is designed the same way and also has no fsck utility.

              So whatever point was intended to be made was not, in any case.

          • shiroiushi 3 days ago |
            >I don't actually know anyone who has, I've merely just heard about it.

            Well "yarg", a few comments up in this conversation, says he lost all his data to it with the last year.

            I've seen enough comments like that that I don't see it as a trustworthy filesystem. I never see comments like that about ext4 or ZFS.

            • jchw 3 days ago |
              Contrary to popular belief, people on a forum you happen to participate in are still just strangers. In line with popular belief, anecdotal evidence is not a good basis to form an opinion.
              • shiroiushi 3 days ago |
                Exactly how do you propose to form an opinion on filesystem reliability then? Do my own testing with thousands of computers over the course of 15 years?
                • jchw 3 days ago |
                  You don't determine what CPUs are fast or reliable by reading forum comments and guessing, why would filesystems be any different?

                  That said, you make a good point. It's actually pretty hard to quantify how "stable" a filesystem is meaningfully. It's not like anyone is doing Jepsen-style analysis of filesystems right now, so the best thing we can go off of is testimony. And right now for btrfs, the two types of data points are, essentially, companies that have been using it in production successfully, and people on the internet saying it sucks. I'm not saying either of those is great, and I am not trying to tell anyone that btrfs meets some subjective measure of good. I'm just here to tell people it's apparently stable enough to be used in production... because, well, it's being used in production.

                  Would I argue it is a particularly stable filesystem? No, in large part because it's huge. It's a filesystem with an integrated volume manager, snapshots, transparent compression and much more. Something vastly simpler with a lower surface area and more time in the oven is simply less likely to run into bugs.

                  Would I argue it is perfectly reasonable to use btrfs for your PC? Without question. A home use case with a simple volume setup is exceedingly unlikely to be challenging for btrfs. It has some rough edges, but I don't expect to be any more likely to lose data to btrfs bugs as I expect to lose data from hardware failures. The bottom line is, if you absolutely must not lose data, having proper redundancy and backups is probably a much bigger concern than btrfs bugs for most people.

                  • shiroiushi 3 days ago |
                    >You don't determine what CPUs are fast or reliable by reading forum comments and guessing, why would filesystems be any different?

                    Your premise is entirely wrong. How else would I determine what CPUs are fast or reliable? Buy dozens of them and stress-test them all? No, I use online sites like cpu-monkey.com that compare different CPUs' features and performance according to various benchmarks, for the performance part at least. For reliability, what way can you possibly think of other than simply aggregating user ratings (i.e. anecdotes)? If you aren't running a datacenter or something, you have no practical alternative.

                    At least for spinning-rust HDDs, the helpful folks at Backblaze have made a treasure trove of long-term data available to us. But this isn't available for most other things.

                    > It's not like anyone is doing Jepsen-style analysis of filesystems right now, so the best thing we can go off of is testimony.

                    This is exactly my point. We have nothing better, for most of this stuff.

                    >companies that have been using it in production successfully, and people on the internet saying it sucks

                    Companies using something doesn't always mean it's any good, especially for individual/consumer use. Companies can afford teams of professionals to manage stuff, and they can also make their own custom versions of things (esp. true with OSS code). They're also using things in ways that aren't comparable to individuals. These companies may be using btrfs in a highly feature-restricted way that they've found, through testing, is safe and reliable for their use case.

                    > It's a filesystem with an integrated volume manager, snapshots, transparent compression and much more. Something vastly simpler with a lower surface area and more time in the oven is simply less likely to run into bugs.

                    This is all true, but ZFS has generally all the same features, yet I don't see remotely as many testimonials from people saying "ZFS ate my data!" as I have with btrfs over the years. Maybe btrfs has gotten better over time, but as the American car manufacturers found out, it takes very little time to ruin your reputation for reliability, and a very long time to repair that reputation.

                    • jchw 2 days ago |
                      > Your premise is entirely wrong. How else would I determine what CPUs are fast or reliable? Buy dozens of them and stress-test them all? No, I use online sites like cpu-monkey.com that compare different CPUs' features and performance according to various benchmarks, for the performance part at least. For reliability, what way can you possibly think of other than simply aggregating user ratings (i.e. anecdotes)? If you aren't running a datacenter or something, you have no practical alternative.

                      My point is just that anecdotes alone don't tell you much. I'm not suggesting that everyone needs to conduct studies on how reliable something is, but if nobody has done the groundwork then the best thing we can really say is we're not sure how stable it is because the best evidence is not very good and it conflicts.

                      > Companies using something doesn't always mean it's any good, especially for individual/consumer use. Companies can afford teams of professionals to manage stuff, and they can also make their own custom versions of things (esp. true with OSS code). They're also using things in ways that aren't comparable to individuals. These companies may be using btrfs in a highly feature-restricted way that they've found, through testing, is safe and reliable for their use case.

                      For Synology you can take a look at what they're shipping, since they're shipping it to consumers. It does seem like they're not using many of the volume management features, instead using some proprietary volume management scheme at the block layer. Otherwise, however, there's nothing particularly special that I can see; it's just btrfs. Other advanced features like transparent compression are available and exposed in the UI.

                      (edit: Small correction. While I'm still pretty sure Synology has custom volume management for RAID which works on the block level, as it turns out, they are actually using btrfs subvolumes as well.)

                      I think the Synology case is an especially interesting bit of evidence because it's gotta be one of the worst cases of shipping a filesystem, since you're shipping it to customer machines you don't control and can't easily inspect later. It's not the only case of shipping btrfs to the customer either, I believe ChromeOS does this and even uses subvolumes, though I didn't actually look for myself when I was using it so I'm not actually 100% sure on that one.

                      > This is all true, but ZFS has generally all the same features, yet I don't see remotely as many testimonials from people saying "ZFS ate my data!" as I have with btrfs over the years. Maybe btrfs has gotten better over time, but as the American car manufacturers found out, it takes very little time to ruin your reputation for reliability, and a very long time to repair that reputation.

                      In my opinion, ZFS and other Solaris technologies that came out around that time period set a very high bar for reliable, genuinely innovative system features. I think we're going to have to live with the fact that just having a production-ready filesystem dropped onto the world is not going to be the common case, especially in the open source world: the filesystem will need to go through its growing pains in the open.

                      Btrfs has earned a reputation as the perpetually-unfinished filesystem. Maybe it's tainted and it will simply never approach the degree of stability that ZFS has. Or, maybe it already has, and it will just take a while for people to acknowledge it. It's hard to be sure.

                      My favorite option would be if I just simply don't have to find out, because an option arrives that quickly proves itself to be much better. bcachefs is a prime contender since it not only seems to have better bones but it's also faster than btrfs in benchmarks anyways (which is not saying much because btrfs is actually quite slow.) But for me, I'm still waiting. And until then, ZFS is not in mainline Linux, and it never will be. So for now, I'm using btrfs and generally OK recommending it for users that want more advanced features than ext4 can offer, with the simple caveat that you should always keep sufficient backups of your important data at all times.

                      I only joined in on this discussion because I think that the btrfs hysteria train has gone off the rails. Btrfs is a flawed filesystem, but it continues to be vastly overstated every time it comes up. It's just, simply put, not that bad. It does generally work as expected.

          • anonfordays 2 days ago |
            >Synology has their own btrfs setup that they ship to customers with their NAS software for example.

            Synology infamously/hilariously does not use btrfs as the underlying file system because even they don't trust btrfs's RAID subsystem. Synology uses LVM RAID that is presented to btrfs as a single drive. btrfs isn't managing any of the volumes/disks.

            • jchw 2 days ago |
              Their reason for not using btrfs as a multi-device volume manager is not specified, though it's reasonable to infer that it is because btrfs's own built-in volume management/RAID wasn't suitable. That's not really very surprising: back in ~2016 when Synology started using btrfs, these features were still somewhat nascent even though other parts of the filesystem were starting to become more mature. To this day, btrfs RAID is still pretty limited, and I wouldn't recommend it. (As far as I know, btrfs RAID5/6 is even still considered incomplete upstream.) On the other hand, btrfs subvolumes as a whole are relatively stable, and that and other features are used in Synology DSM and ChromeOS.

              That said, there's really nothing particularly wrong with using btrfs with another block-level volume manager. I'm sure it seems silly since it's something btrfs ostensibly supports, but filesystem-level redundancy is still one of those things that I think I would generally be afraid to lean on too hard. More traditional RAID at the block level is simply going to be less susceptible to bugs, and it might even be a bit easier to manage. (I've used ZFS raidz before and ran into issues/confusion when trying to manage the zpool. I have nothing but respect for the developers of ZFS but I think the degree to which people portray ZFS as an impeccable specimen of filesystem perfection is a little bit unrealistic, it can be confusing, limited, and even, at least very occasionally, buggy too.)

              • anonfordays 2 days ago |
                >That's not really very surprising: back in ~2016 when Synology started using btrfs, these features were still somewhat nascent even though other parts of the filesystem were starting to become more mature.

                btrfs was seven years old at that point and declared "stable" three years before that.

                ZFS is an example of amazingly written code by awesome engineers. It's simple to manage, scales well, and is easy to grok. btrfs will sadly go by the wayside once bcachefs reaches maturity. I wouldn't trust btrfs with important data, and neither should you. If you experience data loss on a Synology box, the answer you'll get from them is "tough shit, hope you have backups, and here's a coupon for a new Synology unit."

                • jchw a day ago |
                  > btrfs was seven years old at that point and declared "stable" three years before that.

                  The on-disk format was declared stable in 2013[1]. That just meant that barring an act of God, they were not going to break the on-disk format, e.g. a filesystem created at that point would continue to be mountable for the foreseeable future. It was not a declaration that the filesystem was itself now stable necessarily, but especially was not suggesting that all of the features were stable. (As far as I know, many features still carried warning labels.)

                  Furthermore, the "it's been X years!" thing referring to open source projects has to stop. This is the same nonsense that happens with every other thing that is developed in the open. Who cares? What matters isn't how long it took to get here. What matters is where it's at. I know there's going to be some attempt at rationalizing this bit, but it's wasted on me because I'm tired of hearing this.

                  > ZFS is an example of amazingly written code by awesome engineers. It's simple to manage, scales well, and easy to grok.

                  Agreed. But ZFS was written by developers at Sun Microsystems for their commercial UNIX. We should all be grateful to live in a world where Sun Microsystems existed. We should also accept that Sun Microsystems is not the standard, any more than Bell Labs was the standard; they are extreme outliers. If we measure everything based on whether it's as good as what Sun Microsystems was doing in the 2000s, we're going to have a bad time.

                  As an example, DTrace is still better than LTTng is right now. I hope that sinks in for everyone.

                  However, OpenZFS is not backed by Sun Microsystems, because Sun Microsystems is dead. Thankfully, and graciously at that, it has been maintained for many years by volunteers, including at least one person who worked on ZFS at Sun. (Probably more, but I only know of one.)

                  Now if OpenZFS eats your data, there is no big entity to go to, any more than there is for btrfs. As far as I know, there's no big entity funding development, improvements, or maintenance. That's fine; that's how many filesystems are. But still, that's not what propelled ZFS to where it stood when Sun was murdered.

                  > btrfs sadly will go the wayside once bcachefs reaches maturity.

                  I doubt it will disappear quickly: it will probably continue to see ongoing development. Open Source is generally pretty good at keeping things alive in a zombie state. That's pretty important since it is typically non-trivial to do online conversion of filesystems. (Of course, we're in a thread about a tool that does seamless offline conversion of filesystems, which is pretty awesome and impressive in and of itself.)

                  But for what it's worth, I am fine with bcachefs supplanting btrfs eventually. It seems like it had a better start, it benchmarks faster, and it's maturing nicely. Is it safer today? Depends on who you ask. But it seems like the point at which bcachefs will be considered stable by most is no more than a year or two away, tops, assuming kernel drama doesn't hold back upstream.

                  Should users trust bcachefs with their data? I think you probably can right now with decent safety, if you're using mainline kernels, but bcachefs is still pretty new. Not aware of anyone using it in production yet. It really could use a bit more time before recommending people jump over to it.

                  > I wouldn't trust btrfs for important data, and neither should you.

                  I stand by my statement: you should always ensure you have sufficient backups for important data, but most users should absolutely fear hardware failures more than btrfs bugs. Hardware failure is a when, not an if: hardware will always fail eventually. Data-eating btrfs bugs have certainly existed, but it's not like they just appear left and right. When such a bug appears, it is often newsworthy, and it usually has to do with some unforeseen case that you are not likely to run into by accident.

                  Rather than lose data, btrfs is instead more likely to just piss you off by being weird. There are known quirks that probably won't lose you any data, but that are horribly annoying. It is still possible, to my knowledge, to get stuck in a state where the filesystem is too full to delete files and the only way out is in recovery. This is pretty stupid.

                  It's also not particularly fast, so if someone isn't looking for a feature-rich CoW filesystem with checksums, I strongly recommend just going with XFS instead. But if you run Linux and you do want that, btrfs is the only mainline game in town. ZFS is out-of-tree and holds back your kernel version, not to mention you can never really ship products using it (with Linux) because of silly licensing issues.

                  > If you experience data loss on a Synology box, the answer you'll get from them is "tough shit, hope you have backups, and here's a coupon for a new Synology unit."

                  That suggests that their brand image somewhat depends on the rarity of btrfs bugs in their implementation, but Synology has a somewhat good reputation actually. If anything really hurts their reputation, it's mainly the usual stuff (enshittification.) The fact that DSM defaults to using btrfs is one of the more boring things at this point.

                  [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

        • stuaxo 3 days ago |
          I lost data on btrfs on a Raspberry Pi with a slightly dodgy PSU.

          We need more testing of filesystems and pulling the power.

          I switched to a NAS with battery backup and it's been better.

          So that was inconclusive; before that, the last time I lost data like that was to ReiserFS in the early 2000s.

        • danudey 2 days ago |
          I agree with what you say, and I would never trust btrfs with my data because of issues that I've seen in the past. At my last job I installed my Ubuntu desktop with btrfs, and within three days it had been corrupted so badly by a power outage that I had to completely wipe and reinstall the system.

          That said:

          > but cause the filesystem to panic with error messages for which there is no documentation and little clue what the problem is.

          The one and only time I experimented with ZFS as a root filesystem I got bit in the ass because the zfs tools one day added a new feature flag to the filesystem that the boot loader (grub) didn't understand and therefore it refused to read the filesystem, even read-only. Real kick in the teeth, that one, especially since the feature flag was completely irrelevant to just reading enough of the filesystem for the boot loader to load the kernel and there was no way to override it without patching grub's zfs module on another system then porting it over.

          Aside from that, ZFS has been fantastic, and now that we're all using UEFI and our kernels and initrds are on FAT32 filesystems I'm much less worried, but I'm still a bit gunshy. Not as much as with BTRFS, mind you, but somewhat.

        • cmurf 2 days ago |
          Meta (Facebook) has millions of instances of Btrfs in production. More than any other filesystem by far. A few years ago when Fedora desktop variants started using Btrfs by default, Meta’s experience showed it was no less reliable than ext4 or XFS.
        • eru a day ago |
          > Please go look up survivor bias. That's what all you btrfs fanboys don't seem to understand. It doesn't matter how well it has worked for 99.9% of you. Filesystems have to be the most reliable component in an operating system.

          Not sure. It's useful if they are reliable, but they only need to be roughly as reliable as your storage media. If your storage media breaks down once in a thousand years (or once a year for a thousand disks), then it doesn't matter much if your filesystem breaks down once in a million years or once in a trillion years.

          That being said, I had some trouble with BTRFS.

    • badsectoracula 3 days ago |
      I've been using it for a few years now on my main PC (which has a couple of SSDs and a large HDD) and my laptop; it was the default on openSUSE and I just used that. Then I realized that snapshots are a feature I didn't know I wanted :-P

      Never had a problem, though it is annoying that whatever BTRFS thinks is free space and what the rest of the OS thinks is free space do not always align. It has rarely been a problem in practice though.

    • KennyBlanken 3 days ago |
      The documentation describes 'btrfs check' as being dangerous to run without consulting the mailing list first.

      That sums up btrfs pretty well.

      Fifteen years in, and the filesystem's design is still so half-baked that their "check" program can't reliably identify problems and fix them correctly. You have to have a developer look at the errors and then tell you what to do. Fifteen years in.

      Nobody cares about btrfs anymore because everyone knows someone who has been burned by it. Which is a shame, because it can do both metadata and data rebalancing and defragmentation, as well as things like spreading N copies of data across X drives (though this feature is almost entirely negated by metadata not having the same capability; again, fifteen years in, why is this still a thing?), and one can add/remove drives from a btrfs volume without consequence. But it's not able to do things like build a volume out of mirrored pairs, and RAID5/6 are (still) unstable (fifteen years in, why is this still a thing?).

      Do yourself a favor and just stick with ext4 for smaller/simple filesystem needs, XFS where you need the best possible speed or for anything big with lots of files (on md if necessary), or OpenZFS.
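
      (For the md route, it's the classic two-step; device names are hypothetical:)

          # mirrored pair with XFS on top
          mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
          mkfs.xfs /dev/md0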

      Now that the BSD and Linux folks have combined forces and are developing OpenZFS together it keeps getting better and better; btrfs's advantages over ZFS just aren't worth the headaches.

      ZFS's major failing is that it offers no way to address inevitable filesystem data and free space fragmentation, and while you can remove devices from a ZFS pool, it incurs a permanent performance penalty, because they work around ZFS's architectural inflexibilities by adding a mapping table so it can find the moved chunks of data. That mapping table never goes away unless you erase and re-create the file. Which I suppose isn't the end of the world; technically, you could have a script that walked the filesystem re-creating files, but that brings its own problems.
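
      (That removal path, for the curious; pool and device names are placeholders. The mapping table shows up afterwards as an "indirect" vdev:)

          # evacuate and remove a top-level vdev
          zpool remove tank sdb
          # status now lists an indirect vdev holding the mapping table
          zpool status tank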

      That the fs can't address this stuff internally is a particular bummer considering that ZFS is intended to be used in massive (petabyte to exabyte) filesystems, where it would be completely impractical to "just" move data to a fresh ZFS filesystem and back again (the main suggestion for fragmentation).

      But...btrfs doesn't offer external (and mirrored!) transaction logging devices, SSD cache, or concepts like pairs of mirrored drives being used in stripes or contiguous chunks.

      If ZFS ever manages to add maintenance to the list of things it excels at, there will be few arguments against it except for situations where its memory use isn't practical.

      • oDot 3 days ago |
        Thank you for the thorough explanation
      • witrak 3 days ago |
        >ZFS's major failing is that it offers no way to address inevitable filesystem data and free space fragmentation, and while you can remove devices from a ZFS pool, it incurs a permanent performance penalty, because they work around ZFS's architectural inflexibilities by adding a mapping table so it can find the moved chunks of data.

        I'm not an FS specialist, but by chance a couple of days ago I found an interesting discussion about the reliability of SSDs in which there was a strong warning about ZFS wearing out consumer SSDs at an extreme rate (up to a suggestion to never use ZFS unless you have heavy-duty/RAID-grade SSDs). So ZFS also has unfinished work, and not only the software improvements you mentioned.

        BTW, from my - nonspecialist - point of view, it is easier to resist the urge to use the unreliable features of Btrfs than to replace a bunch of SSD drives. At least if you pay for them from your own pocket.

    • anonymousiam 3 days ago |
      I tried btrfs for the first time a few weeks ago. I had been looking for mature r/w filesystems that support realtime compression, and btrfs seemed like a good choice.

      My use case is this: I normally make full disk images of my systems and store them on a (100TB) NAS. As the number of systems grows, the space available for multiple backup generations shrinks. So compression of disk images is good, until you want to recover something from a compressed disk image without doing a full restore. If I put an uncompressed disk image in a compressed (zstd) btrfs filesystem, I can mount volumes and access specific files without waiting days to uncompress.

      So I gave it a try and did a backup of an 8TB SSD image to a btrfs filesystem, and it consumed less than 4TB, which was great. I was able to mount partitions and access individual files within the compressed image.

      The next thing I tried, was refreshing the backup of a specific partition within the disk image. That did not go well.

      Here's what I did to make the initial backup:

      (This was done on an up-to-date Ubuntu 24.04 desktop.)

          cd /btrfs.backups
          truncate -s 7696581394432 8TB-Thinkpad.btrfs
          mkfs.btrfs -L btrfs.backup 8TB-Thinkpad.btrfs
          mount -o compress=zstd 8TB-Thinkpad.btrfs /mnt/1
          pv < /dev/nvme0n1 > /mnt/1/8TB-Thinkpad.nvme0n1

      All good so far. The backup took about three hours, but would probably go twice as fast if I had used my TB4 dock instead of a regular USB-C port for the backup media.

      Things went bad when I tried to update one of the backed up partitions:

          kpartx -a /mnt/1/8TB-Thinkpad.nvme0n1
          pv < /dev/nvme0n1p5 > /dev/mapper/loop20p5

      This sort of thing works just fine on a normal, uncompressed ext4 filesystem. I did not really expect this to work here, and my expectations were met.

      The result was a bunch of kernel errors, the backup device being remounted r/o, and a corrupt btrfs filesystem with a corrupt backup file.

      So for this use case, btrfs is a big improvement for reading compressed disk images on the fly, but it is not suitable for re-writing sections of disk images.
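
      One thing that might have helped here (untested for this exact setup): marking the image file NOCOW before filling it, which gives in-place rewrite semantics but gives up the compression and checksums that motivated this setup. The attribute only sticks if set while the file is empty:

          touch /mnt/1/8TB-Thinkpad.nvme0n1
          chattr +C /mnt/1/8TB-Thinkpad.nvme0n1   # disable CoW for this file
          pv < /dev/nvme0n1 > /mnt/1/8TB-Thinkpad.nvme0n1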

    • gavinsyancey 3 days ago |
      Btrfs has been slowly eating my data; random small files, or sectors of larger files, get replaced with all nulls.
    • yarg 3 days ago |
      I had it go boom on Tumbleweed (when the drive filled up) less than a year ago.

      I tried accessing and fixing the fubar partition from a parallel install, but to no avail.

    • seanw444 2 days ago |
      Haven't had any issues with it after using it for years on my work and home PCs. I use transparent compression, snapshots, and send/receive, and they all work great.

      The main complaint was always about parity RAID, which I still wouldn't recommend running from what I've heard. But RAID 1 and 10 have been stable.

  • ComputerGuru 3 days ago |
    I found the link to Quibble, an open and extensible reverse engineering of the Windows kernel bootloader to be much more intriguing: https://github.com/maharmstone/quibble
  • SomeoneOnTheWeb 3 days ago |
    I would have needed that like 2 months ago, when I had to reformat a hard drive with more than 10TB of data on it from NTFS... ^^

    Nice project!