Missing open-source contributor presents a dilemma when accepting their PR

90 points by FrankRay78 4 days ago | 68 comments

bhouston 4 days ago |
Write your own if it is very simple. If he is gone, it is best to just write your own version. The contributor agreement that requires a signature is there for a reason.
stavros 4 days ago |
Write your own what? Code? If so, it could be argued in court that you're still violating copyright, because you looked at the code beforehand. At least, that's what happens with anti-reverse-engineering clauses.
ranger_danger 4 days ago |
I think it depends on how different it is, as to whether it will be a violation. And just how different is going to be a subjective decision made by a judge on a case by case basis... if anyone ever bothered to take the issue that far to begin with.
If it's found reasonable to assume that a certain 'copy' of the code could be indistinguishable from one made by another person who didn't look at the original code... then it's probably not similar enough to be infringing.
I would even go so far as to say that I think if such a case _was_ ever brought, that unless someone paid enough money to hire subject matter experts to testify, the case may likely be thrown out because the judge is not able to make an informed decision with such lack of evidence.
unsnap_biceps 4 days ago |
I tried finding the PR they referenced but wasn't able to in a minute of looking. I did find https://github.com/spectreconsole/spectre.console/pull/1403 which is a null fix.
Presuming the PR in question is similar, one would likely be able to successfully argue that the code in the PR is trivial enough to not be covered by copyright.
https://en.wikipedia.org/wiki/Copyright_law_of_the_United_St...
I think it's an entirely valid argument: the variable names are defined by the original code and the style is defined by a style guide, so the only addition here is an intrinsic utilitarian function without any artistic expression.
That said, I am not a lawyer, so who knows how it would actually play out in court if it went that far.
stavros 4 days ago |
You're right, for something that small it probably wouldn't be an issue, but something larger or non-obvious would be risky. I'm cautioning people against thinking "if I rewrite it, it's fine".
hinkley 4 days ago |
You’re allowed to describe how it works to someone else and have them write it.
If you can write a full specification of the code without any code snippets, or write a full TDD test set, and hand it off to someone who can swear they’ve never looked at the source material, you can still pull off a clean-room copy.
I had to do that for a small lib due to European copyright laws. They don’t like Public Domain. There’s some precedent where the author can change their mind and sue because you can’t actually consent to not consenting to people using your stuff. MIT is great, PD is the Bog of Stench.
stavros 4 days ago |
Yep, this is correct, you probably should document the process of description and implementation.
skissane 4 days ago |
> You’re allowed to describe how it works to someone else and have them write it.
Couldn’t you get an LLM to do that? “Here’s the code for this function, add conditionals to fix any null pointer bugs”
Or: “Here’s a function and a unit test that exposes a bug in it, modify the function so the unit test passes”. With that approach, the LLM could even (autonomously) try multiple times until the test passed.
> I had to do that for a small lib due to European copyright laws. They don’t like Public Domain.
This shouldn’t be an issue for public domain dedications which contain a fallback copyright license such as CC0. [0] People say the Unlicense also falls in that category, but (unfortunately) its wording is less than completely clear, so it is debatable. Another option is “PD-equivalent licenses” such as 0BSD or MIT-0, which are technically copyright licenses but designed to give you the same rights as PD (e.g. reuse without requiring attribution). Now, what some random German judge is going to make of them, who knows.
[0] Although some people, e.g. Fedora, don’t like its clauses around patents
hinkley 4 days ago |
When people pay $100 million for your product they expect to be fully indemnified. It was a pain in my ass but I can’t really blame them.
EDEdDNEdDYFaN 4 days ago |
thought this would be a mystery about a coder who disappeared that surfaced via pull request
bigiain 4 days ago |
Anyone seen Jia Tan around lately?
tonygiorgio 4 days ago |
I was looking forward to the story actually
ranger_danger 4 days ago |
Nice article, written by someone who IMO clearly has some first-hand experience in law, carefully considering multiple angles of what might be considered "reasonable" actions to make and their possible consequences.
sowbug 4 days ago |
It's interesting that the workflow would allow submitting a PR without consenting to terms. Nearly every website or app today makes you agree to terms right at the start.
paulgb 4 days ago |
Technically, to get that far you have to accept GitHub’s ToS, which does have terms (linked elsewhere here) that contributions are assumed to have the license of the repo unless otherwise noted.
ndiddy 4 days ago |
The author didn't link to the actual PR so I can't see the full context, but I don't see the point in setting up a bot to make contributors agree to copyright terms if the maintainers just ignore it when someone does a PR and then doesn't engage with the bot. It seems like a waste of time for all parties.
politelemon 4 days ago |
I think it might be this one: https://github.com/spectreconsole/spectre.console/pull/991
though, it's dealing with a zero input rather than null
FrankRay78 3 days ago |
It was this one. I glossed over the description of the exact change above, given the issue is broader than just the PR in question.
WorldMaker 3 days ago |
The extra useful context I spotted at the top of the blog post was that the project falls under the auspices of the .NET Foundation [0]. The .NET Foundation, like several of the other FLOSS foundations/conservancies/archives/consortiums, requires a CLA as a CYA, in large part because of legal concerns: for a project in the Foundation, they want to make sure you understand that you are contributing not just to that specific project, but in general to a collective effort under the Foundation.
This may be an interesting discussion for the author to have more directly with Foundation leadership and legal on what the expectations are.
There's also yes, the larger discussion on if Foundations such as this are possibly too conservative in their FLOSS bureaucracy/red-tape for smaller contributions to smaller projects. Under the good for the goose/gander assumption it's easy to add the same bots to every project and assume that's good enough, but does it stifle innovation or bug fixes on projects with fewer eyes?
[0] https://dotnetfoundation.org/
deadbunny 4 days ago |
I don't like them and won't contribute to projects with them but isn't this the exact point of a CLA[1]? A textfile in the repo seems a lot easier to track and audit than PR comments and a bot to chase people.
1. https://en.m.wikipedia.org/wiki/Contributor_License_Agreemen...
thayne 4 days ago |
No. The purpose of a CLA is so that the owner of the project can use the code in a commercial product that might not comply with the OSS license (particularly if that license is a copyleft licence such as GPL, AGPL, or MPL) and/or they can change the license more easily.
NegativeK 4 days ago |
Python has a CLA that allows the PSF board to relicense the code to "any other open source license approved by unanimous vote".
lupusreal 3 days ago |
Legally speaking, once they own the copyright, is there anything stopping the PSF from selling out and changing their policy to permit proprietary licensing?
I have this same concern with GNU. I can imagine a future where some key figures have died or retired and the new org sells out and changes the license to something RMS never would have agreed to.
cowsandmilk 3 days ago |
The PSF Contributor Agreement doesn’t assign copyright to the PSF, so your question doesn’t apply.
Text from their contributor agreement:
> PSF understands and agrees that Contributor retains copyright in its Contributions.
NegativeK 3 days ago |
To go along with the other response, they don't own your copyright -- but I don't know if the language in the Python CLA actually holds them to the "we can only relicense to open source licenses" if challenged in a court.
But that's not really a risk I care about. The PSF is a nonprofit that's clearly aimed at being a nonprofit, and a Future Evil Board is beyond what I'm going to worry about.
bawolff 4 days ago |
That's more the risk than the purpose. Some people do CLA's for that purpose, but sometimes it really is about having a paper trail that the software is open source, or to make it easier to sue people who violate the license.
thayne 4 days ago |
A DCO [1] would serve that purpose better. It's possible that the desire to have the flexibility to change the license at some point in the future is initially well-intentioned (for example you may start out as GPL, but want the option to change to Apache 2.0 later), so having a CLA doesn't necessarily imply they plan on doing a rug pull. And of course there is probably also some cargo cult of using a CLA because that is what other projects do, but there are other ways of ensuring everything is open source that don't require the contributor to give you unlimited rights.
[1]: https://en.m.wikipedia.org/wiki/Developer_Certificate_of_Ori...
noirscape 3 days ago |
Keep in mind that a DCO is not the same thing as a CLA; kemitchell has written about the subject[0]. In short - the DCO is specifically written to meet the needs of the Kernel and some of its expectations assume the workflow of the LKML and the code style of the kernel. He lists 6 conditions that you'd need to meet before the DCO is useful for your project. The most notable ones are that or-later licenses aren't a good idea with a DCO, that you must put the license text in a file header and that there's a Signed-Off-By element in your commits.
The DCO is also very patch oriented, rather than contributor oriented, which only works if your workflow has more contributors than patches (which isn't how most FOSS projects are organized; you usually have only a few contributors, who submit patches.)
Finally, the DCO was put in place to resolve someone being annoying about the licenses rather than existing to unify the copyright of the Kernel behind one entity.
[0]: https://writing.kemitchell.com/2021/07/02/DCO-Not-CLA
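To make the "Signed-Off-By element" concrete: `git commit -s` appends a `Signed-off-by: Name <email>` line to the commit message, and a project relying on a DCO would typically check for it. A minimal sketch of such a check, assuming the usual trailer format; the function names `signoffs` and `has_signoff` are made up for illustration and are not any real bot's API:

```python
import re

# Assumed trailer format, e.g. "Signed-off-by: Jane Doe <jane@example.com>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>[^<>]+) <(?P<email>[^<>\s]+@[^<>\s]+)>$"
)


def signoffs(commit_message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for every Signed-off-by line in the message."""
    found = []
    for line in commit_message.splitlines():
        match = SIGNOFF_RE.match(line.strip())
        if match:
            found.append((match.group("name").strip(), match.group("email")))
    return found


def has_signoff(commit_message: str) -> bool:
    """True if the message carries at least one well-formed sign-off line."""
    return bool(signoffs(commit_message))
```

This only checks that the line is present and well-formed. Whether the sign-off is legally sufficient for a given project is the separate question discussed above.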
thayne 3 days ago |
> rather than existing to unify the copyright of the Kernel behind one entity
Unifying the copyright behind one entity is the problem with a CLA. Especially if that entity is a company that might have pressures to change the license to a proprietary license.
noirscape 3 days ago |
...and that's why the Free Software Foundation requires signing CLAs[0], those evil commercial, proprietary product making rapscallions!
The reality is that without a CLA, copyright enforcement tends to turn into a complete mess. To be clear - that can absolutely be the point; a completely unenforceable copyright that's still enough of a mess to scare off violators can have its uses; the Kernel jumps to mind. Linus and Greg have both been open about the fact that the license is there to encourage people to contribute as a carrot, not there as a stick to beat them over the head with. Explaining how the license works and why they'd really appreciate cooperation is much more useful for the LKML than it would be to keep a bunch of lawyers on standby and the fractured license helps achieve that goal.
They're often used by corporations to rugpull a license change, but the original purpose of a CLA is just to ensure that there's one entity in control of the licenses, which is more useful if an entity prefers the stick approach to compliance. (Which the FSF I would say absolutely lands under by-the-by.)
[0]: https://www.gnu.org/licenses/why-assign.en.html
sgentle 4 days ago |
"Whenever you add Content to a repository containing notice of a license, you license that Content under the same terms, and you agree that you have the right to license that Content under those terms. If you have a separate agreement to license that Content under different terms, such as a contributor license agreement, that agreement will supersede."
https://docs.github.com/en/site-policy/github-terms/github-t...
geenat 4 days ago |
Really good to see Github being pro-active to the benefit of the open source community.
cxr 4 days ago |
It's a stupid synthesis, though, just like the one from the article:
> The repository is MIT-licensed, and clearly advertised as such, so it’s reasonable to expect all contributions are made under that license
You don't have to assume anything, given the way pull requests work. It's not like it's a code snippet extracted from one of their comments on the bugtracker and then subsequently integrated upstream. They published something: their fork.
Look at the repo the pull request is coming from—the one the requestor published. What is the license they published it under? Did they just dump a bunch of stuff online that says it's licensed under MIT? Yup. So if they have the rights to grant it to you, then you can use it under the MIT license.
The only time this doesn't apply is when the contributor deletes their repo. The pull request turns into a patch merge request. But the repo doesn't have to remain available indefinitely. The mere fact that it was published under such-and-such license at some time and was available to you/whoever is sufficient.
manwe150 4 days ago |
> So if they have the rights to grant it to you, then you can use it under the MIT license
That seems rather the crux of the problem: did they have the right to upload that patch with the given license, or did they commit fraud first? Being able to see the LICENSE file still intact (which GitHub has promised you can do indefinitely for any PR branch even if the contributing repo got deleted) would not protect against that. The CLA doesn’t protect against it either, but apparently some companies think it is at least a useful additional legal barrier. IANAL and so not qualified to comment on whether such CLA is actually useful for the intended purpose
I find it interesting to consider that open source existed for decades mailing patches (usually sans any license info) to a mailing list without legal trouble, and now that GitHub offers easy and complete traceability of the whole patch context this makes it to HN as a concern
anticorporate 3 days ago |
It's a legitimate question. I'm generally opposed to CLAs and prefer DCOs (Developer Certificates of Origin) as that's the only thing I want an open source project to do when validating an individual's contribution - that is, to ensure that they have a right to make it, as opposed to forcing them to consent to other terms like potential future relicensings.
https://en.wikipedia.org/wiki/Developer_Certificate_of_Origi...
That said, the playing field is unequal between proprietary and open source projects. If I contribute open source code to a proprietary project, the odds of this being discovered and rectified are low, since the public doesn't get any right to audit closed-source software for the presence of copyleft code.
cxr 3 days ago |
> The CLA doesn’t protect against it either, but apparently some companies think it is at least a useful additional legal barrier. IANAL and so not qualified to comment on whether such CLA is actually useful
I'm not a lawyer, either, but that doesn't mean I'm not qualified to comment about whether it's useful. It's not. It's stupid, and they're wrong, whether they have a lawyer endorsing it or not. Don't let the Gell-Mann amnesia take root. There are just as many* cargo cult lawyers as there are cargo cult programmers.
* this is a conservative estimate
Xylakant 3 days ago |
The problem with this is that it's an agreement between Github and the Contributor which the Maintainer cannot directly enforce. The maintainer can point to it when they get sued, but they would likely need to get Github to enforce it.
thayne 4 days ago |
From a logical standpoint, if someone makes a pull request to an open source project, it should be safe to assume they are ok with it being distributed under the current license of the project they are contributing to.
But copyright law isn't always logical.
fourthark 4 days ago |
I think the concern is more that the person did not have the right to contribute, i.e. their employer owns the copyright to their work.
3np 4 days ago |
To add on to the advice in TFA: Perhaps that bot is exactly the reason the contributor didn't want to bother anymore. It's just unnecessary. Why not remove it? Terms and licenses can be put in the PR template or something.
nialv7 4 days ago |
> ... asks for confirmation the code change is copyright-free
Don't you mean patent-free? Or maybe you are asking for copyright assignment?
Not sure what "copyright-free" means... Like do you only accept public domain code?
ClassyJacket 4 days ago |
Yeah. In most countries copyright is automatic as soon as you create the work. Surely they mean they want confirmation they have a licence to use the copyrighted work?
DannyBee 4 days ago |
" and I find it hard to see how damages could be levied in this situation."
Unfortunately, this would be intentional copyright infringement (assuming the code is copyrightable, blah blah blah), since you are doing it on purpose with knowledge that it is copyrighted.
In a number of countries, copyright infringement is also strict liability - it doesn't matter if you had any intent to commit it, but if you did, the damages often start much much higher. So in the former case you'd probably have some nominal statutory damages, assuming you can't prove any actual loss. But in the latter case, those damages get quite high.
In the US, for example, statutory damages for intentional copyright infringement (IE you don't have to prove any actual damage) are 150k per infringement.
I make no claims any of this makes sense, or someone will actually sue you, or that you should do anything different than "nothing".
My only claim is that "and I find it hard to see how damages could be levied in this situation." is totally the wrong view in a lot of countries - you should expect, if it did get to that point, you would have plenty of damages levied against you.
The author appears to be in the UK, where statutory damages for infringement were historically not available, but post-Brexit, they were actually doing consultation/blah blah blah on making them available. I have no idea what happened.
But even if they have no statutory damages, it won't prevent you from being sued wherever the contributor is, and having that law apply rather than your home law :)
It just makes it harder to collect.
toast0 4 days ago |
> In the US, for example, statutory damages for intentional copyright infringement (IE you don't have to prove any actual damage) are 150k per infringement.
I was of the possibly wrong impression that statutory damages require that the work have been registered with the Library of Congress. Which seems unlikely for a small patch that might not even be copyrightable anyway.
DannyBee 4 days ago |
It's also required to file suit anyway, but for that purpose does not have to be done before the infringement (IE you can register after infringement) - see Fourth Estate Public Benefit Corp. v. Wall-Street.com LLC et al
You are somewhat correct on the statutory damages part.
You can also register after learning of infringement there, as long as it is within 3 months of publication: https://www.law.cornell.edu/uscode/text/17/412
This is just the statutory damages side. The author is also wrong in the "no loss" thing anyway because what courts consider loss is much greater than the average expectation of people :)
I also think they severely underestimate the ease with which they can cut off contributory/etc infringement liability claims.
They seem to think they just get to remove the code and go about their life, but in practice, it's not that simple.
rty32 4 days ago |
> The contribution bot asks for confirmation the code change is copyright-free
A confirmation is simply unnecessary. Can't it work like, writing this somewhere that says, by creating a pull request, you agree all your code and the discussions around the pull request are now copyright free? Saves everybody time and avoids hassles like this.
The other side of this is I get very annoyed by CLAs -- there have been a number of times I want to contribute to Google and Microsoft's open source projects, but they all require CLAs which require me to get explicit permissions from my employer to contribute to those projects. It is possible, but is a slow and complicated process that nobody wants to go through at my company. So instead of creating a pull request to address the problem, I open an issue and mention how it can be addressed. Which may or may not be picked up by someone else who wants to work on this. This is just frustrating.
joelhaasnoot 3 days ago |
Time for an "open source contributor"-by proxy service ;)
diggan 3 days ago |
> Can't it work like, writing this somewhere that says, by creating a pull request, you agree all your code and the discussions around the pull request is now copyright free?
It already does, implicitly. If Author A has a repository with an MIT LICENSE and Author B forks that to their own GitHub account (which they must do in order to open the PR), that fork already has a LICENSE file (usually) together with their change, so the change has already been made copyright free.
krisoft 4 days ago |
I’m not a lawyer at all. What I feel is that the existence of the copyright assignment bot makes this decision worse.
I think it is perfectly reasonable to say “you contributed to the project knowing that the project is licensed X, therefore we can assume that you are ok with your contribution being under the same license and we just merge it”. Not saying that it is wise legally, but it feels to be a coherent theory at least. (Again not a lawyer.)
But if you have a “copyright-assignment” nagging robot that kinda reveals that you think one needs to jump that extra hoop. After that if you ignore that the robot’s question went unanswered it is harder to argue that you just went with the default assumption. Since if you feel it is worth asking the question that means you did not believe that the implied agreement is enough.
fourthark 4 days ago |
It's not copyright assignment in this case. The bot is nagging about the code being copyright-free and license-compatible. So it's in the gray area that you describe in your second paragraph.
worik 4 days ago |
The actual problem here is the existence of "intellectual property". It is corrupting all sorts of things and making plenty of simple things very hard, as in this case.
I do not think that getting rid of intellectual property entirely is the best solution, but it would be an improvement.
Surely we can do better?
matt3210 4 days ago |
The USA is overly litigious so I can understand why people might be worried about this, especially a large firm like Microsoft, which has a lot to lose.
incompatible 4 days ago |
If the change is small enough, it wouldn't contain any original literary expression, as required to be copyrightable, and would be automatically in the public domain. Especially if there's only one way to make the change, you can't rewrite it some other way while preserving the intent.
dgellow 4 days ago |
3 is the solution. Close, then rewrite it yourself (not an exact copy). It’s fairly common; I've had this happen to me relatively often.
kazinator 4 days ago |
The proper thing to do in this situation is this: treat it as a bug report which was accompanied by a patch that was not used. Credit the bug finder, and acknowledge that the fix is very closely based on their proposed solution.
You actually don't use their code. Understand what is being fixed and write it yourself.
eviks 4 days ago |
5. Remove the annoying bot and merge it anyway.
It's already part of the GitHub ToS as quoted in another comment, so you're just creating repetitive hurdles (which won't save you anyway in case of real trouble).
wordofx 3 days ago |
Idiot joined the DNF. A completely worthless foundation. First thing to do would be to leave.
max-privatevoid 3 days ago |
This problem is entirely self-imposed from the CLA bullshit they're trying to pull off, which also defeats the point of FOSS. I bet they "love Open Source".
camel-cdr 3 days ago |
This is why I can't contribute to google projects, you need to sign their CLA, which requires a google account.
Maybe there is a way to do it without one, but I couldn't figure it out.
noirscape 3 days ago |
In this particular case, I don't think adding a null input guard would meet the minimum level of copyrightability (the same way you can't actually copyright "hello world" or basic shapes).
The barrier for that sort of thing is really low, but copyright does have an exemption for stuff like that.
(Upd: Looked at the PR in other comments; yeah I don't think the null guard meets copyrightability, but I Am Not A Lawyer. The attached test case might be more dubious.)
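For a sense of scale, a guard like that is usually a line or two. A hypothetical sketch, not the code from the PR (`render_width` is a made-up name, and the real project is C#, not Python):

```python
from typing import Optional


def render_width(text: Optional[str]) -> int:
    """Return the display width of text, treating None as empty."""
    # The entire "fix" is this guard in front of the existing logic.
    if text is None:
        return 0
    return len(text)
```

There are very few other ways to write that guard, which is the point being made about originality.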
kjs3 3 days ago |
5. Digest/understand the contribution, explain it to another dev who hasn't seen the contribution and have them implement the fix clean-room style.
diggan 3 days ago |
In order to open a PR to a repository, you need to push the commit somewhere, usually your own fork. And since that fork already contains the same LICENSE as the upstream project (your project), the author of the PR has essentially already licensed the code under the same LICENSE you use.
So I'd get rid of the bot asking people to confirm the change is copyright-free (since it's already implicitly copyright-free, they've pushed it to GitHub already), and merge the PRs without making contributors jump through additional hoops.
But it seems like when corps like Microsoft et al do open source, they like to sprinkle in a bit of bureaucracy to the process for the sake of bureaucracy, which should hardly come as a surprise to anyone.
veggieroll 3 days ago |
The problem is that licensing under the same terms is not enough for most corporate open source stewards. They want copyright assignment, meaning you give them all of your rights.
diggan 3 days ago |
My understanding is that if that was what the bot was asking for, they'd ask for a CLA/DCO or similar, which doesn't seem to be what it currently asks for
> The contribution bot asks for confirmation the code change is copyright-free, but the contributor doesn’t respond.
Maybe the author is summarizing though and the bot asks the author to confirm it's copyright free and also to sign the CLA/DCO. If so, unlikely their corporate overlord would be OK with them just merging the PR without explicitly signed CLA/DCO.
veggieroll 3 days ago |
Absolutely. In this case, a CLA isn't the issue. But I bring it up because it seems to me that this article is a consequence of the prevalence of CLAs in projects advised by professional legal counsel.
This issue of "how can we ensure this code is under copyright terms XYZ" seems to have been popularly solved by CLA's. And thus a pall has been cast over the traditional open source dynamic, where there'd normally be no question that a fork with a license in the repo is obviously licensed under the same terms.
I've heard this on legal podcasts as well. Where lawyers spend like 30+ minutes talking about whether a license in the project root counts. To the traditional free software / open source movement, this seems really silly, because that's how software projects worked for almost the entire existence of free software / open source software until the last 5-ish years.
ramdac 3 days ago |
this seems to be the most logical course of action taken.