Removing global state from LLD, the LLVM linker

37 points by ingve 3 days ago | 9 comments

beeforpork 3 hours ago |
Why not use thread_local instead of passing a param everywhere? What's the drawback there?
mrkeen 3 hours ago |
Thread-local is way too magical for me. I wouldn't want to debug a system that made use of it.
Also, if you pass a param, then it can be shared.
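A minimal sketch of the difference, assuming a toy error counter. `LinkContext`, `reportErrorTls`, `reportError` and `countSharedErrors` are hypothetical names for illustration, not anything from LLD:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for linker-wide state; not LLD's actual type.
struct LinkContext {
    std::atomic<int> errorCount{0};
};

// Variant A: hidden thread-local state. Every thread sees its own copy.
thread_local int tlsErrorCount = 0;
void reportErrorTls() { ++tlsErrorCount; }

// Variant B: explicit parameter. The caller decides who shares the context.
void reportError(LinkContext &ctx) { ++ctx.errorCount; }

// Starts `workers` threads that each report one error through both variants
// and returns the count visible through the shared context.
int countSharedErrors(int workers) {
    assert(workers >= 0);
    LinkContext ctx;
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&ctx] {
            reportErrorTls();  // lands in this worker's private counter
            reportError(ctx);  // lands in the one context all workers share
        });
    }
    for (std::thread &t : pool) t.join();
    return ctx.errorCount.load();
}
```

With four workers, `countSharedErrors(4)` sees all four errors through the context, while the calling thread's `tlsErrorCount` stays at zero because each increment went into a worker's private copy.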
geocar 2 hours ago |
> Thread-local is way too magical for me. I wouldn't want to debug a system that made use of it.
There's a perfectly cromulent register just begging to be used; the circuitry has already been paid for, generating heat whether you like it or not, what magic are you afraid of here?
> Also, if you pass a param, then it can be shared.
Maybe, but if you design for sharing that you'll never use, your program might be bigger and slower as a result. Sometimes that matters.
cesarb 44 minutes ago |
> > Thread-local is way too magical for me.
> There's a perfectly cromulent register just begging to be used; [...] what magic are you afraid of here?
Most of the magic is not when using the thread-local variable, but when allocating it. When you declare a "static __thread char *p", how do you know that, for instance, it is located at the 123rd word of the per-thread area? What if that declaration is in a dynamic library which was loaded late (dlopen) into the process? What about threads which were started before that dynamic library was loaded, and therefore did not have enough space in their per-thread area for that thread-local variable, when they call into code which references it? What happens if the thread-local variable has an initializer?
The documentation at https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html links to an 81-page document describing four TLS access models, and that's just for Unix-style ELF; Windows platforms have their own complexities (which IIRC include a per-process maximum of 64 or 1088 TLS slots, with slots above the first 64 being handled in a slightly different way).
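For the initializer question, the language-level answer for C++11 `thread_local` (as opposed to `__thread`, which GCC restricts to constant initializers) is that every thread gets its own copy, initialized before that thread first uses it. A small sketch of that observable behavior follows; `makeInitialValue`, `counter`, `initializerRuns` and `bumpOnNewThread` are made-up names, and the example says nothing about how a given platform implements it underneath:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Counts how often the dynamic initializer below has run, across all threads.
std::atomic<int> initializerRuns{0};

int makeInitialValue() {
    ++initializerRuns;
    return 100;
}

// A thread-local with a non-constant initializer: each thread that uses
// `counter` gets a fresh copy initialized by its own call to makeInitialValue().
thread_local int counter = makeInitialValue();

// Increments `counter` `times` times on a brand-new thread and returns the
// value that thread ended up with.
int bumpOnNewThread(int times) {
    assert(times >= 0);
    int seen = 0;
    std::thread t([&seen, times] {
        for (int i = 0; i < times; ++i) ++counter;
        seen = counter;  // this thread's copy, never another thread's
    });
    t.join();
    return seen;
}
```

Each call to `bumpOnNewThread` starts again from 100, because the new thread's copy is initialized independently of every other thread's copy.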
rwmj 2 hours ago |
Certain linker operations can be multi-threaded (not sure if this is specifically true for LLD). Particularly LTO in the GNU toolchain, but also there's been a lot of effort recently to make linking faster by actually having it use all available cores.
ComputerGuru 8 minutes ago |
thread_local is usually considered the hack to make unthreaded code littered with static variables useable from multiple thread contexts. It has overhead and reduces the compiler’s ability to optimize the code as compared to when parameters are used.
Also, until very recently, a lot of compilers/platforms were unable to handle thread_local variables larger than a pointer, making it difficult to retrofit a lot of old code.
high_na_euv 2 hours ago |
I've always struggled to understand the need to have a linker.
Like, you could easily write your compiler so that it does not have to rely on such machinery.
Meanwhile, linkers add complexity and decrease the quality of error messages (in C++).
mschuster91 2 hours ago |
> Like, you could easily write your compiler so that it does not have to rely on such machinery
You need a linker as soon as you are dealing with either multiple languages in one project (say, C++ and ASM) or if you include other libraries.
Joker_vD 2 hours ago |
Separate compilation. Of course, if your compiler is fast enough to rebuild the whole universe in 6 seconds and then rest on the seventh (an approach Wirth advocated in one of his papers about an implementation of a Pascal system), you won't need a linker. But most compilers are not that fast.
Besides, there is more than one programming language, so that's something we have to deal with somehow.
And to be fair, merging modules in the compiler as you go, while not that difficult, is just annoying. If you link them properly together, into big amalgamated text/rodata/data sections, then you need to apply relocations (and have them in the first place). If you just place them next to each other, then you have to organize the inter-module calls via some moral equivalent of GOT/PLT. In any case, all this logic really doesn't have much to do with code generation proper; it's administrativia, and the logic for dealing with it has already been written for you and packed into the so-called "link editor".
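The "amalgamate, then relocate" option can be sketched with a toy model. Everything here is an assumption made for illustration: `Module`, `Reloc` and `linkModules` are invented names, addresses are single bytes, and none of it corresponds to real ELF structures or to LLD's code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy relocation: "write the final address of `symbol` into the byte at `offset`".
struct Reloc {
    std::size_t offset;
    std::string symbol;
};

// Toy object file: one text section, the symbols it defines, and its relocations.
struct Module {
    std::vector<std::uint8_t> text;
    std::map<std::string, std::size_t> defs;  // symbol -> offset within `text`
    std::vector<Reloc> relocs;
};

std::vector<std::uint8_t> linkModules(const std::vector<Module> &mods) {
    std::vector<std::uint8_t> out;
    std::map<std::string, std::size_t> symtab;
    std::vector<Reloc> pending;

    // Pass 1: concatenate the text sections and rebase every symbol
    // definition and relocation site by the module's starting offset.
    for (const Module &m : mods) {
        const std::size_t base = out.size();
        out.insert(out.end(), m.text.begin(), m.text.end());
        for (const auto &def : m.defs) symtab[def.first] = base + def.second;
        for (const Reloc &r : m.relocs) pending.push_back({base + r.offset, r.symbol});
    }

    // Pass 2: every symbol now has a final address, so patch each site.
    for (const Reloc &r : pending) {
        const std::size_t addr = symtab.at(r.symbol);  // throws if undefined
        assert(r.offset < out.size() && addr <= 0xFF); // toy 1-byte addresses
        out[r.offset] = static_cast<std::uint8_t>(addr);
    }
    return out;
}
```

For example, if module `a` has text `{0xE8, 0x00, 0xC3}` with a relocation at offset 1 against `helper`, and module `b` defines `helper` at its offset 0, then linking `{a, b}` patches byte 1 with 3, the position where `b` landed in the merged section. The two passes are needed because a module may reference a symbol defined by a later module.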