If it's reasonable to assume that a given 'copy' of the code would be indistinguishable from one written by another person who never looked at the original code... then it's probably not similar enough to be infringing.
I would even go so far as to say that if such a case _were_ ever brought, then unless someone paid to hire subject matter experts to testify, it would likely be thrown out because the judge could not make an informed decision with so little evidence.
Presuming the PR in question is similar, one would likely be able to successfully argue that the code in the PR is trivial enough to not be covered by copyright.
https://en.wikipedia.org/wiki/Copyright_law_of_the_United_St...
I think it's an entirely valid argument: the variable names are defined by the original code and the style is defined by a style guide, so the only addition here is an intrinsically utilitarian function without any artistic expression.
That said, I am not a lawyer, so who knows how it would actually play out in court if it went that far.
If you can write a full specification of the code without any code snippets, or write a full TDD test set, and hand it off to someone who can swear they’ve never looked at the source material, you can still pull off a clean-room copy.
I had to do that for a small lib due to European copyright laws. They don’t like Public Domain. There’s some precedent where the author can change their mind and sue because you can’t actually consent to not consenting to people using your stuff. MIT is great, PD is the Bog of Stench.
Couldn’t you get an LLM to do that? “Here’s the code for this function, add conditionals to fix any null pointer bugs”
Or: “Here’s a function and a unit test that exposes a bug in it, modify the function so the unit test passes”. With that approach, the LLM could even (autonomously) try multiple times until the test passed.
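A minimal sketch of that retry loop, assuming a hypothetical `propose_fix` callable that stands in for whatever LLM call you would use (no real API is implied). The buggy function, the test, and the `fake_llm` stub are all made up for illustration.

```python
from typing import Callable, Optional


def buggy_mean(values):
    # Original function: raises ZeroDivisionError on an empty list.
    return sum(values) / len(values)


def unit_test(candidate: Callable) -> bool:
    """Return True if the candidate passes the test that exposes the bug."""
    try:
        return candidate([]) == 0 and candidate([2, 4]) == 3
    except Exception:
        return False


def repair(
    propose_fix: Callable[[int], Callable],
    test: Callable[[Callable], bool],
    max_attempts: int = 5,
) -> Optional[Callable]:
    """Ask for candidates until one passes the test or attempts run out."""
    for attempt in range(max_attempts):
        candidate = propose_fix(attempt)
        if test(candidate):
            return candidate
    return None


def fake_llm(attempt: int) -> Callable:
    # Hypothetical stand-in for a model: first answer is unchanged,
    # later answers add the guard.
    if attempt == 0:
        return buggy_mean

    def fixed_mean(values):
        if not values:
            return 0
        return sum(values) / len(values)

    return fixed_mean


fixed = repair(fake_llm, unit_test)
```

Whether output produced this way is "clean" in the copyright sense is exactly the open question; the loop only shows that the mechanics are simple.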
> I had to do that for a small lib due to European copyright laws. They don’t like Public Domain.
This shouldn’t be an issue for public domain dedications which contain a fallback copyright license such as CC0. [0] People say the Unlicense also falls in that category, but (unfortunately) its wording is less than completely clear, so it is debatable. Another option is “PD-equivalent licenses” such as 0BSD or MIT-0, which are technically copyright licenses but designed to give you the same rights as PD (e.g. reuse without requiring attribution). Now, what some random German judge is going to make of them, who knows.
[0] Although some people, e.g. Fedora, don’t like its clauses around patents
though, it's dealing with a zero input rather than null
This may be an interesting discussion for the author to have more directly with Foundation leadership and legal on what the expectations are.
There's also, yes, the larger discussion of whether Foundations such as this are too conservative in their FLOSS bureaucracy/red tape for smaller contributions to smaller projects. Under the good-for-the-goose, good-for-the-gander assumption it's easy to add the same bots to every project and assume that's good enough, but does it stifle innovation or bug fixes on projects with fewer eyes?
1. https://en.m.wikipedia.org/wiki/Contributor_License_Agreemen...
I have this same concern with GNU. I can imagine a future where some key figures have died or retired and the new org sells out and changes the license to something RMS never would have agreed to.
Text from their contributor agreement:
> PSF understands and agrees that Contributor retains copyright in its Contributions.
But that's not really a risk I care about. The PSF is a nonprofit that's clearly aimed at being a nonprofit, and a Future Evil Board is beyond what I'm going to worry about.
[1]: https://en.m.wikipedia.org/wiki/Developer_Certificate_of_Ori...
The DCO is also very patch oriented, rather than contributor oriented, which only works if your workflow has more contributors than patches (which isn't how most FOSS projects are organized; you usually have only a few contributors, who submit patches.)
Finally, the DCO was put in place to resolve someone being annoying about the licenses rather than existing to unify the copyright of the Kernel behind one entity.
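For reference, the DCO is recorded per commit as a `Signed-off-by:` trailer (what `git commit -s` adds), which is why it is patch oriented. A rough sketch of how a bot might check for one, assuming a simplified trailer pattern; the function name and the sample message are hypothetical:

```python
import re
from typing import List, Tuple

# Simplified pattern for a sign-off trailer line; real tooling may be stricter.
SIGNOFF = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>\s]+@[^<>\s]+)>$",
    re.MULTILINE,
)


def signoffs(commit_message: str) -> List[Tuple[str, str]]:
    """Return (name, email) pairs for every sign-off trailer in the message."""
    return [(m["name"], m["email"]) for m in SIGNOFF.finditer(commit_message)]


message = "Fix crash on zero input\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
found = signoffs(message)
```

A check like this runs on every commit, whereas a CLA is signed once per contributor.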
Unifying the copyright behind one entity is the problem with a CLA. Especially if that entity is a company that might have pressures to change the license to a proprietary license.
The reality is that without a CLA, copyright enforcement tends to turn into a complete mess. To be clear - that can absolutely be the point; a completely unenforceable copyright that's still enough of a mess to scare off violators can have its uses; the Kernel jumps to mind. Linus and Greg have both been open about the fact that the license is there to encourage people to contribute as a carrot, not there as a stick to beat them over the head with. Explaining how the license works and why they'd really appreciate cooperation is much more useful for the LKML than it would be to keep a bunch of lawyers on standby, and the fractured license helps achieve that goal.
They're often used by corporations to rugpull a license change, but the original purpose of a CLA is just to ensure that there's one entity in control of the licenses, which is more useful if an entity prefers the stick approach to compliance. (Which the FSF I would say absolutely lands under by-the-by.)
https://docs.github.com/en/site-policy/github-terms/github-t...
> The repository is MIT-licensed, and clearly advertised as such, so it’s reasonable to expect all contributions are made under that license
You don't have to assume anything, given the way pull requests work. It's not like it's a code snippet extracted from one of their comments on the bugtracker and then subsequently integrated upstream. They published something: their fork.
Look at the repo the pull request is coming from—the one the requestor published. What is the license they published it under? Did they just dump a bunch of stuff online that says it's licensed under MIT? Yup. So if they have the rights to grant it to you, then you can use it under the MIT license.
The only time this doesn't apply is when the contributor deletes their repo. The pull request turns into a patch merge request. But the repo doesn't have to remain available indefinitely. The mere fact that it was published under such-and-such license at some time and was available to you/whoever is sufficient.
That seems rather the crux of the problem: did they have the right to upload that patch with the given license, or did they commit fraud first? Being able to see the LICENSE file still intact (which GitHub has promised you can do indefinitely for any PR branch even if the contributing repo got deleted) would not protect against that. The CLA doesn’t protect against it either, but apparently some companies think it is at least a useful additional legal barrier. IANAL and so not qualified to comment on whether such a CLA is actually useful for the intended purpose.
I find it interesting to consider that open source existed for decades mailing patches (usually sans any license info) to a mailing list without legal trouble, and now that GitHub offers easy and complete traceability of the whole patch context this makes it to HN as a concern
https://en.wikipedia.org/wiki/Developer_Certificate_of_Origi...
That said, the playing field is unequal between proprietary and open source projects. If I contribute open source code to a proprietary project, the odds of this being discovered and rectified are low, since the public doesn't get any right to audit closed-source software for the presence of copyleft code.
I'm not a lawyer, either, but that doesn't mean I'm not qualified to comment about whether it's useful. It's not. It's stupid, and they're wrong, whether they have a lawyer endorsing it or not. Don't let the Gell-Mann amnesia take root. There are just as many* cargo cult lawyers as there are cargo cult programmers.
* this is a conservative estimate
But copyright law isn't always logical.
Don't you mean patent-free? Or maybe you are asking for copyright assignment?
Not sure what "copyright-free" means... Like do you only accept public domain code?
Unfortunately, this would be intentional copyright infringement (assuming the code is copyrightable, blah blah blah), since you are doing it on purpose with knowledge that it is copyrighted.
In a number of countries, copyright infringement is also strict liability - it doesn't matter whether you had any intent to commit it, but if you did, the damages often start much, much higher. So in the former case you'd probably have some nominal statutory damages, assuming you can't prove any actual loss. But in the latter case, those damages get quite high.
In the US, for example, statutory damages for intentional copyright infringement (i.e. you don't have to prove any actual damage) run up to $150k per work infringed.
I make no claims any of this makes sense, or someone will actually sue you, or that you should do anything different than "nothing".
My only claim is that "and I find it hard to see how damages could be levied in this situation." is totally the wrong view in a lot of countries - you should expect, if it did get to that point, you would have plenty of damages levied against you.
The author appears to be in the UK, where statutory damages for infringement were historically not available, but post-Brexit they were actually doing consultation/blah blah blah on making them available. I have no idea what happened.
But even if they have no statutory damages, it won't prevent you from being sued wherever the contributor is, and having that law apply rather than your home law :)
It just makes it harder to collect.
I was under the possibly wrong impression that statutory damages require that the work have been registered with the Library of Congress, which seems unlikely for a small patch that might not even be copyrightable anyway.
You are somewhat correct on the statutory damages part.
You can also register after learning of infringement there, as long as it is within 3 months of publication: https://www.law.cornell.edu/uscode/text/17/412
This is just the statutory damages side. The author is also wrong in the "no loss" thing anyway because what courts consider loss is much greater than the average expectation of people :)
I also think they severely underestimate the ease with which they can cut off contributory/etc infringement liability claims.
They seem to think they just get to remove the code and go about their life, but in practice, it's not that simple.
A confirmation is simply unnecessary. Can't it work by writing somewhere that, by creating a pull request, you agree that all your code and the discussions around the pull request are now copyright free? That saves everybody time and avoids hassles like this.
The other side of this is I get very annoyed by CLAs -- there have been a number of times I want to contribute to Google and Microsoft's open source projects, but they all require CLAs which require me to get explicit permissions from my employer to contribute to those projects. It is possible, but is a slow and complicated process that nobody wants to go through at my company. So instead of creating a pull request to address the problem, I open an issue and mention how it can be addressed. Which may or may not be picked up by someone else who wants to work on this. This is just frustrating.
It already does, implicitly. If Author A has a repository with an MIT LICENSE and Author B forks that to their own GitHub account (which they must do in order to open the PR), that fork already has a LICENSE file (usually) together with their change, so the change has already been made copyright free.
I think it is perfectly reasonable to say “you contributed to the project knowing that the project is licensed X, therefore we can assume that you are ok with your contribution being under the same license and we just merge it”. Not saying that it is wise legally, but it feels to be a coherent theory at least. (Again not a lawyer.)
But if you have a “copyright-assignment” nagging robot, that kinda reveals that you think one needs to jump through that extra hoop. After that, if you ignore that the robot’s question went unanswered, it is harder to argue that you just went with the default assumption: if you feel it is worth asking the question, that means you did not believe the implied agreement is enough.
I do not think that getting rid of intellectual property entirely is the best solution, but it would be an improvement.
Surely we can do better?
You actually don't use their code. Understand what is being fixed and write it yourself.
It's already part of GitHub's ToS, as quoted in another comment, so you're just creating repetitive hurdles (which won't save you anyway in case of real trouble).
Maybe there is a way to do it without one, but I couldn't figure it out.
The barrier for that sort of thing is really low, but copyright does have an exemption for stuff like that.
(Upd: Looked at the PR in other comments; yeah I don't think the null guard meets copyrightability, but I Am Not A Lawyer. The attached test case might be more dubious.)
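To illustrate how little room for expression a guard like this leaves, here is a hypothetical example (not the code from the PR; the function and its names are made up). Once the surrounding code fixes the names and a style guide fixes the layout, two people working independently would likely write nearly the same lines:

```python
def average(total: float, count: int) -> float:
    # The guard: with the names and style already dictated by the
    # surrounding project, there are very few other ways to write this.
    if count == 0:
        return 0.0
    return total / count
```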
So I'd get rid of the bot asking people to confirm the change is copyright-free (since it's already implicitly copyright-free, they've pushed it to GitHub already), and merge the PRs without making contributors jump through additional hoops.
But it seems that when corps like Microsoft et al. do open source, they like to sprinkle a bit of bureaucracy into the process for the sake of bureaucracy, which should hardly come as a surprise to anyone.
> The contribution bot asks for confirmation the code change is copyright-free, but the contributor doesn’t respond.
Maybe the author is summarizing though and the bot asks the author to confirm it's copyright free and also to sign the CLA/DCO. If so, unlikely their corporate overlord would be OK with them just merging the PR without explicitly signed CLA/DCO.
This issue of "how can we ensure this code is under copyright terms XYZ" seems to have been popularly solved by CLAs. And thus a pall has been cast over the traditional open source dynamic, where there'd normally be no question that a fork with a license in the repo is obviously licensed under the same terms.
I've heard this on legal podcasts as well, where lawyers spend 30+ minutes talking about whether a license in the project root counts. To the traditional free software / open source movement, this seems really silly, because that's how software projects worked for almost the entire existence of free software / open source software until the last 5-ish years.