defer is cleaner, though.
The goto Dijkstra is talking about is dead. It lives only in assembler. Even BASIC doesn't have it anymore in any modern variant. Modern programmers do not need to live in fear of modern goto because of how awful it was literally 50 years ago.
(Free Hero Mesh uses setjmp in the execute_turn function. This function calls several other functions, some of which are recursive, and sometimes an error occurs, or WinLevel or LoseLevel occurs (even though those aren't errors), in which case it has to return immediately. I did try to ensure that this use will not result in memory leaks or other problems; e.g. v_set_popup allocates a string and will not call longjmp while the string is allocated, until it has been assigned to a global variable (in which case the cleanup functions will handle it). Furthermore, the error messages are always static, so their memory management does not need to be handled either.)
Parent is correct that this doesn't really exist outside of assembly language anymore. There is no modern analogue, because Dijkstra's critique was so successful.
He should have said "correct code", not "modern code", because the times I remember seeing goto, the code was horribly incorrect and unclear.
(With break and continue, someone has to be doing something extra funky to need goto. And even those were warning signs to me, as often they were added as Hail Marys to try to make something work.)
{I typically reviewed for clarity, correctness, and consistency. In that order}
His paper [1] clearly talks about goto semantics that are still present in modern languages and not just unrestricted jmp instructions (that may take you from one function into the middle of another or some such). I'd urge everyone to give it a skim, it's very short and on point.
[1] https://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.p...
I'm way less worried about uses of goto that are rigidly confined within some structured programming scope. As long as they stay confined to specific functions, they are literally orders of magnitude less consequential than the arbitrary goto, and being an engineer rather than an academic, I take note of such things.
I don't ask Alan Kay about the exact right way to do OO, I don't ask Fielding about the exact right way to do REST interfaces, and I don't necessarily sit here and worry about every last detail of what Dijkstra felt about structured programming. He may be smarter than me, but I've written a lot more code in structured paradigms than he ever did. (This is not a special claim to me; you almost certainly have too. You should not ignore your own experiences.)
The problem with GOTO, to Dijkstra, is that it violates that principle. A block can arbitrarily go somewhere else--in the same function, in a different function (which doesn't exist so much anymore)--and that makes it hard to reason about. Banning GOTO means you get the fully structured program that he needs.
(It's also worth remembering here that Dijkstra was writing in an era where describing algorithms via flowcharts was common place, and the use of if statements or loops was far from universal. In essence, this makes a lot of analysis of his letter difficult, because modern programmers just aren't exposed to the kind of code that Dijkstra was complaining about.)
Since that letter, modern programming has embraced the basic structured programming model--we think of code almost exclusively in terms of if statements and loops. And, in many languages, goto exists only in extremely restricted forms (break, continue, and return as anything other than the last statement of a function). It should be noted that Dijkstra's argument actually carries through to railing against the modern versions of break et al, but the general program of structured programming has accepted that "early return" is a deviation from the desired strictly-single-entry-single-exit form that doesn't produce undue cognitive overhead. Even where full-blown goto exists today (e.g., in C), it's largely used in ways that are similar to "early return"-like concepts, not the flowchart-transcribed-to-code-with-goto-as-sole-control-flow style that Dijkstra is talking about.
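The "early return"-like goto usage that survives in C typically looks like the following sketch (the function and file names here are made up for illustration): forward jumps only, to a single exit label that performs cleanup.

```c
#include <stdio.h>

/* Hypothetical sketch of "early return"-style goto in C: all jumps go
   forward to one exit label, so control flow stays easy to follow. */
static int sum_file_bytes(const char *path, long *result) {
    int rc = -1;
    FILE *f = fopen(path, "r");
    if (f == NULL)
        goto done;          /* bail out; nothing to clean up yet */
    long sum = 0;
    int c;
    while ((c = fgetc(f)) != EOF)
        sum += c;
    *result = sum;
    rc = 0;
    fclose(f);
done:
    return rc;
}
```

This stays well inside the structured programming discipline: a single entry, a single exit, and no backward jumps.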
And, personally, when I work with assembly or LLVM IR (which is really just portable assembly), I find that the number one thing I want when looking at a very large listing is something that converts all the conditional/unconditional jumps into if statements and loops. That's really the main useful thing I want from a decompiler; everything else as often as not turns out to be more annoying to work with than the original assembly.
The modern goto is not the one he wrote about. It is tamed and fits into the structured programming paradigm. Thus, ranting about goto as if it is still the 1960s is a pointless waste of time. Moreover, even if it does let you occasionally violate structured programming in the highly restricted function... so what? Wrecking one function is no big deal, and it is generally used when it is desirable that a given function not be structured programming. Structured programming, as nice as it is, is not the only useful paradigm. In particular, state machines and goto go together very nicely, where the "state machine" provides the binding paradigm for the function rather than structured programming. That is arguably the distinctive use case for which goto lives on in modern languages.
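The state-machine style can look like this sketch (a made-up word-counting example, not from any real project), where each label is a state and each goto is a transition:

```c
#include <ctype.h>

/* Hypothetical sketch: a two-state machine written with goto.
   Labels are states; gotos are transitions. */
static int count_words(const char *s) {
    int words = 0;
between:                                  /* state: between words */
    if (*s == '\0') return words;
    if (isspace((unsigned char)*s)) { s++; goto between; }
    words++;
    goto inword;
inword:                                   /* state: inside a word */
    if (*s == '\0') return words;
    if (isspace((unsigned char)*s)) { s++; goto between; }
    s++;
    goto inword;
}
```

Here the goto graph mirrors the state diagram directly, which a nest of loops and flags would obscure.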
No, it doesn't, not the goto of C or C++ (which tames it a smidge because it has to). That's the disconnect you have. It's not fine just because you can't go too crazy and smash other functions with it anymore. You can still go crazy and jump into the middle of scopes with uninitialized variables. You can still write irreducible loops with it, which I would argue ought to be grounds for the compiler to rm -rf your code for you.
There are tame versions of goto--we call them break, continue, and return. And when the C committee discussed adding labeled break, and people asked why it was necessary because it's just another flavor of goto, I made some quite voluminous defense of labeled break because it was a tame goto, and taming it adds more possibility.
And yes, the tame versions of goto violate Dijkstra's vision. But I also don't think that Dijkstra's vision is some sacrosanct thing that must be defended to the hilt--the tame versions are useful, and you still get most of the benefits of the vision if you have them.
In summary:
a) when Dijkstra was complaining about goto, he would have included the things we call break, continue, and early return as part of that complaint structure.
b) with the benefit of decades of experience, we can conclude that Dijkstra was only partially right, and there are tame goto-like constructs that can exist
c) the version of goto present today in C is still too untamed, and so Dijkstra's injunction against goto can apply to some uses of it (although, I will note, most actual uses of it are not something that would fall in that category.)
d) your analysis, by implying that it's only the cross-function insanity he was complaining about, is wrong in that implication.
It is difficult when speaking across languages, but in many cases, no, you can't.
https://go.dev/play/p/v8vljT91Rkr
C isn't a modern language by this standard, and to the extent that C++ maintains compatibility with it (smoothing over a lot of details of what that means), neither is it. Modern languages with goto do not generally let you skip into blocks or jump over initializations (depending on the degree to which it cares about them).
The more modern the language, generally the more thoroughly tamed the goto is.
BEGIN
    INT x := 1;
    print(("x is", x, newline));
    GOTO later;
    INT y := 2;
later:
    print(("y is", y, newline))
END
for which we have:

$ a68g goto.a68
x is +1
11 print(("y is", y, newline))
1
a68g: runtime error: 1: attempt to use an uninitialised REF INT value (detected in [] "SIMPLOUT" collateral-clause starting at "(" in this line).
Although admittedly it is a runtime error. However, if y is changed to 'INT y = x + 2;', essentially a "constant", then there is no runtime error:
$ a68g goto.a68
x is +1
y is +0
(1) For simpler cases, wrap in do {} while (0) and break from the loop.
(2) For multiple cleanups, use the same technique combined with checks to see if the cleanup is required. E.g. if (f != NULL) fclose(f);
(3) Put the rest of the stuff in another function so that the exit code must run on the way out.
In 35 years of coding C/C++, I've literally never resorted to goto. While convenient, this new defer command looks like the kind of accidental complexity that templates brought to C++. That is, it provides a simple feature meant to solve simple problems in a simple way, but accidentally gives architecture astronauts the ability to build elaborate footguns.
There's probably something wrong if a substantial project in C does NOT use gotos.
for (…) {
    for (…) {
        if (…) {
            …
            goto found;
        }
    }
}
found:
This is straightforward with goto and may even be vectorizable. I guess you could move the loop to a separate function or add additional flags to each loop, but neither of these seems like an improvement. In C++ we have something pretty similar already in the form of Folly ScopeGuard (SCOPE_EXIT {}).
__attribute__((cleanup(…))) is purely a scope-local mechanism, it has absolutely nothing to do with exceptions.
While it's true that -fexceptions is disabled for C by default, some C libraries need to enable it anyway if they want to interact with C++ exceptions this way. For example, C++ requires that qsort and bsearch propagate an exception thrown by the comparison callback normally, so libc implementations that are also used from C++ do enable it.
[1] https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attribute...
> some C libraries need to enable it anyway if they want to interact with C++ exceptions this way. For example C++ requires that qsort and bsearch propagate the exception thrown by the comparison callback normally
-fexceptions is not needed for this, C++ exceptions will just transparently bubble through C functions without its use.
I don't think I know of any project using -fexceptions… time to google…
It's a bit weird for the C++ standard to require behavior of C standard functions though, I gotta say… seems more sensible to use a C++ sort…
https://news.ycombinator.com/item?id=42532979
Interactions with stack unwinding are not considered by C, so I doubt this would be well-defined. Compilers would be free to make it work, however. This is just my guess.
After a few attempts at defer, I ended up using a cleanup macro that just takes a function and a value to pass to it: https://github.com/aws-greengrass/aws-greengrass-lite/blob/8...
Since the attribute (or a function-like macro in the attribute position) broke the C parsing in some tooling, I made the macro look like a statement.
https://thephd.dev/lambdas-nested-functions-block-expression...
(And also on -O1 and higher the entire call is optimized out and the nested function inlined instead.)
It's unfortunate a lot of the standards guys are horrified by anything that isn't C89. Because if the executable stack is an issue it's worth fixing.
Side note: 20 years ago people thought that if they made the stack non-executable it would totally fix stack-smashing attacks, and unfortunately it only slows down script kiddies.
Slowing them down is good. And: separating data and code helps simplify managing the caches.
It's like GOTOs, but worse, because it's not as visible.
C++'s destructors feel like a better/more explicit way to handle these sorts of problems.
Exactly. Both C++ RAII (constructors/destructors) and C23 defer are awful. They make the behavior implicit; in effect they tie an indefinitely large (and growing) baggage to scope termination without the source code reflecting that baggage.
Cascading gotos (or the arrow pattern) are much better. Cleanup code execution is clearly reflected by location in the source code; the exit path can be easily stepped through in a debugger.
> It's like GOTOs, but worse, because it's not as visible.
> C++'s destructors feel like a better/more explicit way to handle these sorts of problems.
But what C++ gives you is the same thing:
> It's like GOTOs, but worse, because it's not as visible.
!
The whole point of syntactic sugar is for the machinery to be hidden, and generated assembly will generally look like goto spaghetti even when your code doesn't.
What this implementation of defer does under the covers is not interesting unless you're trying to make it portable (e.g., to older MSVC versions that don't even support C99 let alone C23 or GCC local function extensions) or efficient (if this one isn't).
And when the machinery fails, you'll not only have the machinery to debug, but the syntactic sugar too.
If we only need permanent sub-objects, then we set those up gradually, and build an error path in reverse order (with gotos or with the arrow pattern); upon success, the sub-objects' ownership is transferred to the new super-object, and the new super-object is returned, just before the error path is reached. Otherwise, the suffix of the error path that corresponds to the successful prefix of the construction path is executed (rollback). This approach cannot be rewritten with "defer" usefully. Normally you'd defer the rollback step for a sub-object immediately after its successful construction step, but this rollback step (= all rollback steps) will run upon successful exit too. So you need a separate flag for neutering all the deferred actions (all deferred actions will have to check the flag).
If we only need temporaries (and no permanent sub-objects), then (first without defer, again) we build the same code structure (using cascading gotos or the arrow pattern), but upon success, we don't return out of the middle of the function; instead, we store the new object outwards, and fall through to the rollback path. IOW, the rollback steps are used (and needed) for the successfully constructed temporaries regardless of function success or failure. This can be directly expressed with "defer"s. The problem is of course that the actual rollback execution order will not be visible in the source code; the compiler will organize it for you. I dislike that.
If we need both temporaries and permanent sub-objects, then we need the same method as with temporaries, except the rollback steps of the permanent sub-objects need to be restricted to the failure case of the function. This means that with either the cascading gotos or the arrow pattern, some teardown steps will be protected with "if"s, dependent on function return value. Not great, but still quite readable. With defer, you'll get a mix of deferred actions some of which are gated with ifs, and some others of which aren't. I find that terrible.
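A minimal sketch of the first variant described above (permanent sub-objects only; the types and allocation sizes are made up): the error path is built in reverse order of construction, and on success ownership is transferred to the caller just before the error path is reached.

```c
#include <stdlib.h>

/* Hypothetical super-object with two permanent sub-objects. */
struct pair {
    int *first;
    int *second;
};

/* Cascading-gotos constructor: each failed step jumps to the suffix of
   the error path that unwinds the successfully constructed prefix. */
struct pair *pair_new(void) {
    struct pair *p = malloc(sizeof *p);
    if (p == NULL)
        goto fail;
    p->first = malloc(sizeof *p->first);
    if (p->first == NULL)
        goto free_pair;
    p->second = malloc(sizeof *p->second);
    if (p->second == NULL)
        goto free_first;
    return p;             /* success: ownership transferred to caller */

    /* error path, in reverse order of construction */
free_first:
    free(p->first);
free_pair:
    free(p);
fail:
    return NULL;
}
```

Note how a defer-based version would need a "succeeded" flag checked by every deferred rollback, whereas here the success return simply bypasses the rollback suffix.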
There are other options, but none of them is better, IMO. You can use nested functions:
char *p = malloc(10);
if (p != 0) {
    doTheRealWork(p);
}
free(p);
In gcc, doTheRealWork can be a nested function, or you can force it to be inlined. You can also (more readably than that first alternative, IMO) wrap code in a single-iteration for or while loop and then use break to exit it:
char *p = NULL;
for (int i = 0; i == 0; i = 1) {
    p = malloc(10);
    if (p == 0) break;
    …
}
free(p);
Both will get uglier if you need to free multiple resources, but will work. C++ destructors are great for this, but are not possible in C. Destructors require an object model that C does not have.
As for the n3434 proposal, given that the listed implementation experiences are all macro-based, wouldn't it be more easily adopted if proposed as a standard macro like <stdarg.h>?
returns_struct actually looks correct to me, i.e. it is what I expect. Golang's defer works this way.
Do both examples follow standard? Or is it common misinterpretation by all compilers?
#define __DEFER__(V) __df_st const V = [&](void)->void
#define defer __DEFER(__COUNTER__)
#define __DEFER(N) __DEFER_(N)
#define __DEFER_(N) __DEFER__(__DEFER_VARIABLE_ ## N)
#include <stdio.h>
struct S { int r; ~S(){} };
(i know hn will mangle this but i won't indent this on mobile...)
Do people really write C++ like this, or is it an intentionally obscure example?
Cleaner version is like: https://github.com/llvm/llvm-project/issues/100869#issue-243...
Do languages need to grow in this way?
The overriding virtue of C is simplicity.
Yes.
> Do languages need to grow in this way?
Yes.
> The overriding virtue of C is simplicity.
C is not simple. It used to be, but it's not been for a long time.
No.
> Do languages need to grow in this way?
No.
> The overriding virtue of C is simplicity.
It's no longer simple, especially not with the C11 memory model (which got retrofitted from C++11, and is totally incomprehensible without reading multiple hundreds of pages of PhD dissertations); however, gratuitously complicating C shouldn't be done.
I agree that a feature like this could be useful, but then there are other useful features which should be added too. Where do you stop? I hope the C standards committee does not succumb to the feature bloat trend we see in many other languages (hand on heart: how many people fully understand/master all of C++/23's features?).
Need proper arrays instead of pointers to contiguous parts of memory which are interpreted as arrays? Proper strings (with unicode)? Etc. - use a different language suitable for the job.
We really need to let go of the notion that complex software must be written in a single language. Use C for the low level stuff where you do not need/want an assembler or where you want full control. For everything else there are more suitable tools.
Note that C++ destructors are also not ideal because they run solely based on scope. Unless RVO happens, returning a value from a function involves returning a new value (created via copy ctor or move ctor) and the dtor still runs on the value in the function scope. If the new value was created via copy ctor, that means it unnecessarily had to create a copy and then destroy the original instead of just using the original. If the new value was created via move ctor, that means the type has to be designed in such a way that a "moved-out" value is still valid to run the dtor on. It works much better in Rust where moving is not only the default but also does not leave "moved-out" husks behind, so your type does not need to encode a "moved-out" state or implement a copy ctor if it doesn't want to, and the dtor will run the fewest number of times it needs to.
Defer is a way better solution than having to cleanup in every single failure case.
You're only tempted to clean up [fully] in every single failure case if you don't know how to implement cascading gotos or the arrow pattern.
goto is widely considered an anti-pattern and so is the arrow pattern.
Yes, I am.
Agreed; they're terrible. Implicit sucks, explicit rules.
Really fix, not mitigations with their own gotchas.
That's because you've become complacent; you've gotten used to the false comfort of destructors. C++ destructors promise that you can ignore object destruction in business logic, and that's a false promise.
Assume you have a function that already has a correct traditional (cascading gotos, or arrow pattern) exit path / error path. Assume that you have the (awkward) defer-based implementation of the same function. Assume you need to insert a new construction step somewhere in the middle. In the defer-based implementation, you insert both the new action and its matching "defer" in the same spot. In the traditional implementation, you locate the existing construction steps between which the new construction step goes; you locate the corresponding existing destruction steps (which are in reverse order), and insert the new destruction step between them. The thinking process is more or less the same, but the result differs: without defer, your resultant source code shows what will actually happen on the exit path, and in precisely what order, and you can read it.
I think defer is awful.
Without defer-like mechanisms, objects get leaked, mutexes stay held after return, etc...
In a perfect world everything could be perfectly explicit as infinite time and energy has gone into ensuring nothing is forgotten and everything / every code path is well exercised.
Even then, scoped based resource acquisition / releasing still feels more ergonomic to me.
https://github.com/google/honggfuzz/blob/c549b4c31815e170d3b...
#include <boost/scope/defer.hpp>

BOOST_SCOPE_DEFER [&] {
    close(fd);
};

Implementation here: https://github.com/boostorg/scope/blob/develop/include/boost...
We were using C++ and essentially instrumented code review processes, despite being a student group, to ensure nothing was ever called in a way where the destructor wouldn't be called - and broke out vision processing into a separate process so if other processes crashed we'd still be okay. In retrospect it was amazing training for a software engineering career.
But I always look at the words "simple" and "defer" and shudder when I see them next to each other!
Just like you'd have a "threat model" for cybersecurity, make sure you understand the consequences of a defer/destructor implementation not functioning properly.
This wasn't as bad as what you saw, as a restart would fix it (for a few minutes at least), but would disable us for the rest of a match if it ever happened at a competition.
At the end of the day, we worked around this by implementing a null pointer check. It was something along the lines if:
if (camera.toString().contains("(null)")) return;
I assume there was still a race condition where the garbage collector would trigger whatever FFI destructor nulled out the relevant pointers, but we never ran into it. To your point, this looks to be an alternative to the 'goto end' pattern in C. In my experience that is the least bug-prone pattern of destructors for C code.
¹The author says it works in clang using Apple's Blocks feature, but Blocks should not be required for defer and the variable semantics are wrong so it's a non-starter.
I don't know what GCC is doing, but functional programming languages usually 'compile away' the nesting of functions fairly early, see https://en.wikipedia.org/wiki/Lambda_lifting
Update: Oh, I see, it's because GCC doesn't want to change how function are passed around ('function pointers' in C speak), so they need to get creative.
Functional languages use something closer to what C people would call a 'fat pointer' to fix this.
Wouldn't you need an executable heap, though?
(That's not as outlandish, because e.g. a JIT would need that too. But most straightforward C programs don't need to create any executable memory contents at runtime.)
defer fclose(f);
Not a serious problem, just inelegant. I do think this matters in practice. If I have a `func whatever() error`, IME it's common to accidentally do something like `defer whatever()` without handling the error. To work around that you'd need to do something like the following.
var err error
defer func() {
    err = whatever()
}()
For me personally: ugh. I understand the "received wisdom" is to just structure your code differently. I don't think that's reasonable, because cleanup code is often complicated, often can fail, and is generally vastly less well-tested/exercised. YMMV; my experience is that, because of these things, the dispose-path bugs tend to be disproportionately bad. Error handling facilities that treat the dispose path as by-default not worth insuring are IMO setting users up to have a very bad time later. While we're fixing up the error handling, I really think it's not an accident that "`defer` in a loop does not terminate in the loop, but at the end of the function" is such a frequent misunderstanding. Yes the docs are clear, but all languages spend a lot of time training you on lexical scoping, and lexically-defined disposal lifetimes (e.g., C#'s `using` or Python's `with`) are a natural extension of that training. Those implementations to varying degrees have the same error handling problem, but with some effort I really think a better world is possible, and I don't think we have to expand this to full-on RAII to provide such semantics.
Like many other things in Go's approach to language design, still better than using plain old C, though.
The function signature is `void(MyType*)` if it is used like:
__attribute__((cleanup(MyTypeCleanupFunction)))
MyType myType;
If it's a pointer, it'll call the cleanup as a double ptr, and etc.
Note that the specified cleanup function can do anything you want, though the intended purpose is to clean up the variable when it goes out of scope. I know I hate one and love the other, but I still don't know which.
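For reference, a minimal self-contained sketch of the cleanup attribute (a GCC/Clang extension; the names here are illustrative): because the cleanup function receives a pointer to the annotated variable, a `char *` variable gets a `char **` argument.

```c
#include <stdlib.h>

static int cleanups_run;   /* visible side effect, for demonstration only */

/* Receives &buf (a char **) when buf leaves scope. */
static void free_ptr(char **pp) {
    free(*pp);
    *pp = NULL;
    cleanups_run++;
}

static int demo(void) {
    __attribute__((cleanup(free_ptr))) char *buf = malloc(16);
    if (buf == NULL)
        return 1;
    /* ... use buf ... */
    return 0;              /* free_ptr(&buf) runs automatically here */
}
```

The cleanup fires on every exit from the scope, including early returns, which is what makes it a rough C analogue of a destructor.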
https://github.com/apple-oss-distributions/Libc/blob/Libc-16...
The explicit alternative to defer is an unreadable mess. And guess what? If you really need to see it, we should have tooling that desugars the defer to gotos. That's the best of both worlds!
- The "master" source code is high level "say what I mean"
- Low level view that is still much higher than dropping down into assembly still exists!
You really should be able to express the required cleanup semantics with "defer", and if you cannot, you can always just replace the original with the low level view and edit it instead --- good luck getting that past code review, however :).
I'm perfectly able to do that, I just don't want it, because the result is terrible. The syntax no longer shows what happens when and where.
And, if you look at n3434 <https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3434.htm>, you can see the following gem:
> the deferred block may itself contain other defer blocks, that are then executed in chain according to the same rules
This is the worst possible outcome. It means that a continuation passing style sub-language gets embedded in C.
{
    defer { a(); defer { b(); }; c(); };
    defer { d(); };
}
At the final closing brace shown, d() will be invoked first, then a(), then c(), then finally b(). Syntactically, the invocations appear in a-b-c-d order in the source code, but the actual execution is neither that nor the inverse d-c-b-a nor the single-level inverse order d-a-b-c. THAT is an unreadable mess. Good luck debugging that. Not dissimilar to chaining futures in modern async C++ (lambdas deeply nested in lambdas). Which is an abomination.
Your source code syntax is now completely detached from the execution order within a single thread.
Good luck getting that past code review.
Your reference to "no take only throw" makes no sense to me. The gist of that meme is "various characters making contradictory demands". Nobody is making demands here. There's no need for an "explicit alternative". Just don't defer at all, regardless of style. Write the error path / exit path explicitly, using gotos or the arrow pattern.
There is a style of grey-beard that never wants to reason in parts. He simply loads the whole program into his big brain and works from there: trees through the eyes, forest (maybe, certainly subjective) in the mind.
I too am big-brained, but I wholly reject this approach --- it makes for code that is sorely lacking in motivation --- all "what", no "why".
With "goto" I need to reconstruct what the arbitrary control flow means. It might be cleanup, it might be something else entirely! With "defer", we have the essence of RIAA without any bad OOP nonsense. The cleanup obligations themselves are literal in the code.
You don't need to know the order d, a, c, b run in 99% of the time. The reverse-order semantics indicate LIFO ordering, which should be enough, and which ties into e.g. deadlock avoidance or dependency management between resources that need cleanup. And again, whenever you do want to see what the actual order is, the desugarer should always be a click away, ready to answer your questions.