What do you mean by that?
C++ adding the same features means it'll be possible, but it has a much larger intersection of alternative features to introduce edge cases with, and _many_ implementations that will have this feature with varying quirks: all of this means C++ will have to do a _much_ better job than Zig does here to achieve nearly the same result.
I think both will be nice.
Hmm. Actually, now that makes me want to learn Zig.
I bet a Fortran 77 developer will think the same of Fortran 2023, a COBOL 60 developer of COBOL 2023, a K&R C developer of C23, a 1975 Scheme developer of R7RS, a Python 1.0 developer of Python 3.13... even Go 1.0 developers of 1.24 with generics, generators...
This lets the language throw away bad ideas, without throwing away the code people wrote in the era when we didn't realise that's a bad idea.
It doesn't cover standard library, scenarios with binary libraries doing cross calls across epochs and several others.
I still don't see it as any different from a language version switch like in many other languages, especially in the JVM and .NET ecosystems.
Languages with good macro systems have the upper hand in that regard.
Which is what people always forget when comparing language grammars.
Rust has had an almost usable implementation of affine types, and its position as the second coming of Ada, to win the hearts of the industry, including all major OS vendors and hyperscalers.
Go got lucky with Docker and Kubernetes rewrites, and their adoption across the industry.
So far Zig is basically Modula-2 with C like syntax, and compile time execution, relies on the same tooling that C and C++ have had for decades for use-after-free, doesn't support a binary libraries ecosystem by design, and really Bun isn't going to be the killer project that triggers a Rewrite in Zig movement.
It remains to be seen if Zig 1.0 happens, and what its adoption story at scale will be like.
But people really, REALLY want to get off C and C++ for all the numerous reasons everybody knows.
Any language that's older than 10 years is going to make questionable life choices. It's very easy to be Captain Hindsight, and ask why didn't you do X, 5 years ago? But adding feature X also makes another feature or property impossible, either via opportunity cost or features/properties being at odds.
That said, what do you mean by questionable life choices?
Yes, true. Every language has to make some early foundational choices, as few as possible, and try to carefully think about any new addition to the core because of the extra cognitive load that comes with it.
Go is an extreme example here, leaning towards the conservative side. C as well. Zig. Not a fan of Java but it also is kinda slow to add things. Python used to be very careful as well but that epoch is gone.
C++ is the opposite example. It tries to add as much as possible, and that was always the case. C compatibility! And classes! Templates! RAII! Metaprogramming! More of everything! Until it reached a point where it's unforgivingly hard to add things. Or even learn it properly.
Now, Rust feels like a C++ reimplementation, complete with a culture of adding as much as possible as quickly as possible, and ignoring the resulting cognitive load.
I mean, it's a choice. Rust definitely has some great, even amazing, ideas to it. But I am afraid of thinking what the language will feel like in 10 years.
Ok, but I did ask for what specifically do you mean by questionable life choices? I feel Java is moving at a fast pace (and adding everything and the kitchen sink). Hence, why I wanted specific examples. Can you separate your feelings from facts, and see from where the feelings come from? I'm not saying you're wrong, I'm saying I want to understand your basis for that.
> Go is an extreme example here, leaning towards the conservative side.
Is it? Didn't it also start adding features that it swore not to add (generics)?
I mean, there are things I like: bits taken from the ML language family, tooling, sane approach to OOP, error handling.
But the culture of trying to pull everything in... It is a road to hell.
The difference between Rust and Go here is that Go took 10 years to come up with a generics proposal, and it does solve a massive problem with the language.
> But the culture of trying to pull everything in... It is a road to hell.
What part do you mean? Is it async? I'm very critical of it as well.
Or do you mean const? The compile-time reflection?
I've already mentioned things that are universally praised.
I don't like having 2 macro systems, async, convoluted syntax and also am not sure that the (unsafe) consequences of using a borrow checker are worth it.
To me Rust sounds like the story of C++ (or Common Lisp) all over again: what if we add all the cool stuff? Like, ALL THE THINGS. And Rust is a single implementation language. Nothing limits the speed of thought!
Additionally, extensions from Turbo C, Borland C, MSVC C, xlC, aC, Green Hills C, TI C, ARM C, Microbit C, clang C, GCC C, Intel C, CUDA C, ISPC, OpenCL C, Renderscript C...
But in all honesty, starting with ANSI C and onwards the language itself didn't change all that much, at least not in the same way C++ did. C99 was a serious cleanup but that's about it.
Go the language evolves at a similar pace, most things happen outside of the core.
Now what's interesting is that C++ is (almost) all C's quirks multiplied by the latest things WG21 decides to pull into the language. And WG21 does indeed add a lot.
At some point I just lost the point of it all.
It might be enough to make the Zig ecosystem viable. This along with TigerBeetle (they have raised tens of millions).
I think a lot of time is spent right now on the tooling, I hope that in the near future the Zig team will be able to switch to the event loop / standard library topics which really need love.
Also it is quite telling that outside IDE friendly languages, debuggers are kind of stuck in the 80s, so no wonder that many think 80% of Visual Studio and friends is good enough.
Mein Führer, this feature... this feature will not be implemented by MVCC.
Is there a language that does generics in such a way that doesn't send compile times to the moon?
The upside is that if you only call a generic function with a u32, you don't instantiate an f32 as well. The downside is that when you do decide to call that function with an f32, all the comptime stuff suddenly gets compiled for the f32 and might have an error.
In practice, I feel that I gain way more from the fast compile than I lose from having a path that accidentally never got compiled as my unit tests almost always force those paths to be compiled at least once.
It's been a long time since I've dealt with templated C++, but I thought this was how C++ does it too.
C++ will only generate functions for template parameters that are actually used, because it compiles a version of the templated function for each unique set of template parameters.
C++ compile times are due to headers, which in the case of templates result in a lot of redundant work that is then deduplicated by the linker.
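A minimal C++ sketch of that behaviour (the names `Wrapper`, `doubled`, `length`, `use_int` and `use_string` are made up for illustration): members of a class template are only instantiated when they are used, so a body that is invalid for some `T` goes unnoticed until somebody actually calls it with that `T`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example type; none of these names come from the thread.
template <typename T>
struct Wrapper {
    T value;

    // Valid for any T that supports operator+.
    T doubled() const { return value + value; }

    // Only valid when T has a size() member. For T = int this body would not
    // compile, but it is only checked if length() is actually used with int.
    std::size_t length() const { return value.size(); }
};

inline int use_int() {
    Wrapper<int> w{21};
    // Calling w.length() here would be a compile error; not calling it is fine,
    // because Wrapper<int>::length is never instantiated.
    return w.doubled();
}

inline std::size_t use_string() {
    Wrapper<std::string> w{"abc"};
    // Both members are instantiated (and type checked) for std::string.
    return w.doubled().size() + w.length();
}
```

This is the same trade-off as the Zig case above: nothing is paid for instantiations you never use, but an error for a given type only shows up once that type is actually tried.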
Rust suffers because they compile everything from source, and the frontend sends piles of unprocessed LLVM IR to the traditional slow backend.
This can be improved with better tooling, one example is the Cranelift backend, there could be an interpreter, and so on.
Examples of languages with similar polymorphic power that don't send compile times to the moon: Standard ML, OCaml, Haskell, D, Ada.
This was initially done so that Rust could compile things in parallel between crates by spawning more rustc processes, which is obviously much easier than building a parallel compiler directly, but in the end it's suboptimal for performance.
It's actually a DSL, tuned to the specific use-case it's doing.
Looking at the language design, I really prefer Zig to Rust, but as an incompetent, amateur programmer, I couldn't write anything in Zig that's actually useful (or reliable), at least for now.
On the other hand, I do like the concept of comptime vs Rust macros.
[1]: https://github.com/nim-lang/Nim/wiki/Organizations-using-Nim
Alternatively, there is also a Nim effects tracking system that lets the compiler help you track the hidden control flow for you. So, at the top of a module, you can say {.push raises: [].} to make sure that you handled all exceptions somewhere. So, it may not be as "Wild West" as other exceptions systems that you are used to.
As with so many aspects, Nim is Choice. For many choice is good. For others they want language designers to constrain choice a lot (Go is probably a big recent example, because fresh out of school kids need to be kept away from sharper tools or similar rationales). A lot of these prog.lang. battles mirror bigger societal debates/divides between centralized controls and more laissez-faire arrangements. Nim is more in the Voltaire/Spiderman's Uncle Ben "With great power comes great responsibility" camp, but how much power you use is usually "up to you" (well, and things you choose to depend upon).
Will this transitively enforce exception handling? i.e. if a 3rd-party dependency that I am using calls into another dependency that raises exceptions, but doesn't handle them in any way (including not using that pragma), will Nim assert that? Otherwise, that's precisely the function coloring problem I mentioned: if you can't statically assert that a callee, or its descendant callees, doesn't throw an exception then you have to assume that it will.
Yes. The module you put the pragma in is about the "roots" of the call graph, but it covers the whole call stack from those roots down wherever the code is defined. Sorry I didn't make that clear enough { but in my defense I was trying to address the problem you raised. :-) }
Some of the issues that come to mind:
* Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.
* It's not clear to me how Zig handles recursive generic types. Generally, type systems are lazy to allow recursion. So I can write something like
type Example = Something[Example]
(Yes, this is useful.)
* Type checking and compile-time computation can interact in interesting ways. Does type checking take place before compile-time code runs, after it runs, or can they be interleaved? Different choices give different trade-offs. It's not clear to me what Zig does and hence what tradeoffs it makes.
* The article suggests that compile-time code can generate code (not just values) but doesn't discuss hygiene.
There is a good discussion of some issues here: https://typesanitizer.com/blog/zig-generics.html
And some features in your list are of questionable value IMHO (e.g. the "reasoning over a function type signature" - Rust could be a much more ergonomic language if the compiler wouldn't have to rely on function signatures alone but instead could peek into called function bodies).
There are definitely some tradeoffs in Zig's comptime system, but I think the more important point is that nothing about it is surprising when working with it, it's only when coming from languages like Rust or C++ where Zig's comptime, generics and reflection might look 'weird'.
This path leads to unbounded runtime for the type checker/borrow checker. We’re not happy about build times as is.
> nothing about it is surprising when working with it
I think there's a contradiction here - when you get deep into using this kind of feature in a complex way is precisely when you most need it to behave consistently, and tends to be where this kind of ad-hoc approach breaks down.
> Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.
Do you mean reasoning about a function in the sense of just understanding what a function does (or can do), i.e. in the view of the practical programmer, or reasoning about the function in a typed theoretical system (e.g. typed lambda calculus or maybe even more exotic)? Or maybe a bit of both? There is certainly a concern from the theoretical viewpoint but how important is that for a practical programming language?
For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation. While there are many mundane issues with C++ templates, breaking parametricity is not a very big deal in practice. In contrast, it enables optimisations that are not otherwise possible (for templates). Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool normally is represented using an int or char). Maybe this is even required by the standard, I don't recall. My point is that it makes sense for C++ to allow this, I think.
C++ breaks parametricity even with normal templates, since you can e.g. call a method that exists/is valid only on some instantiations of the template.
The issue is that the compiler can't help you check whether your template type checks or not, you will only figure out when you instantiate it with a concrete type. Things get worse when you call a templated function from within another templated function, since the error can then be arbitrarily many levels deep.
> My point is that it makes sense for C++ to allow this, I think.
Whether it makes sense or not it's a big pain point and some are trying to move away from it (see e.g. Carbon's approach to generics)
I might be wrong here, but as I understand it "parametricity" means loosely that all instantiations use the same function body. To quote wikipedia:
"parametricity is an abstract uniformity property enjoyed by parametrically polymorphic functions, which captures the intuition that all instances of a polymorphic function act the same way"
In this view, C++ does not break parametricity with "normal" (i.e. non-specialised) templates. Of course, C++ does not type check a template body against its parameters (unless concepts/traits are used), leading to the problems you describe, but it's a different thing as far as I understand.
e.g.
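#include <type_traits>
template<typename T>
int f() {
    // The body branches on the concrete T, so the signature alone doesn't tell you what f does.
    if constexpr (std::is_same_v<T, int>) { return 0; }
    else { /* ... something else for every other T; 1 is just a placeholder ... */ return 1; }
}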
Your example is considered a misfeature and demonstrates why breaking parametricity is a problem: the specialized vector<bool> is not a standard STL container even though vector<anythingelse> is. That's at best confusing -- and can lead to very confusing problems in generic code. (In this specific case, C++11's "auto" and AAA lessens some of the issues, but even then it can cause hard-to-diagnose performance problems even when the code compiles)
See https://stackoverflow.com/a/17797560 for more details.
(The Hysterical Raisins are understandable, esp. given that it wasn't even possible to specify Concepts in C++ until 2020...)
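One observable difference of the vector<bool> specialization can be checked at compile time. This is only a sketch, and the helper name `index_yields_reference` is made up for illustration: indexing a `std::vector<T>` normally gives back a `T&`, but for `bool` it gives back a proxy object, which is what trips up generic code that assumes a real reference.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// True when indexing a std::vector<T> yields a genuine T&, which is what
// generic code usually assumes. (Hypothetical helper, not a standard trait.)
template <typename T>
constexpr bool index_yields_reference() {
    return std::is_same<decltype(std::declval<std::vector<T>&>()[0]), T&>::value;
}

static_assert(index_yields_reference<int>(), "vector<int>[i] is an int&");
static_assert(index_yields_reference<char>(), "vector<char>[i] is a char&");
// The bool specialization hands back a proxy object instead of a bool&.
static_assert(!index_yields_reference<bool>(), "vector<bool>[i] is not a bool&");
```

So a template that writes `T& x = v[0];` compiles for every element type except `bool`, even though nothing in the signature of `std::vector<T>` hints at that.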
Proper parametricity is only a starting point: types that specify alignment, equality and lifetimes would be needed to make it useful.
This means you cannot write
fn sort<A>(elts: Vec<A>): Vec<A>
because you cannot compare values of type A within the implementation of sort with this definition. You can write
fn sort<A>(elts: Vec<A>, lessThan: (A, A) -> Bool): Vec<A>
because a comparison function is now a parameter to sort.
This helps both the programmer and the compiler. The practical upshot is that functions are modular: they specify everything they require. It follows from this that if you can compile a call to a function there is a subset of errors that cannot occur.
In a language without parametricity, functions can work with only a subset of possible calls. If we take the first definition of sort, it means a call to sort could fail at compile-time, or worse, at run-time, because the body of the function doesn't have a case that knows how to compare elements of that particular type. This leads to a language that is full of special cases and arbitrary decisions.
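For what it's worth, the second signature can be sketched in C++ too. Big caveat: C++ templates do not enforce parametricity, so the "body only uses what it was handed" part is a convention here, not something the compiler guarantees. The names `sort_by`, `User` and `youngest_first` are made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Sketch of the second signature above: everything the function needs to know
// about A arrives as a parameter, here the comparison function less_than.
template <typename A, typename LessThan>
std::vector<A> sort_by(std::vector<A> elts, LessThan less_than) {
    std::sort(elts.begin(), elts.end(), less_than);
    return elts;
}

// An element type with no built-in ordering.
struct User {
    std::string name;
    int age;
};

// The caller decides what "less than" means for User.
inline std::vector<User> youngest_first(std::vector<User> users) {
    return sort_by(std::move(users),
                   [](const User& a, const User& b) { return a.age < b.age; });
}
```

Because the ordering is an explicit argument, `sort_by` never has to guess how to compare a `User`, which is the modularity point being made above.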
JavaScript / TypeScript is an example of a language without parametricity. sort in JavaScript has what are, to me, insane semantics: converting values to strings and comparing them lexicographically. (See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) This in turn can lead to amazing bugs, which are only prevented by the programmer remembering to do the right thing. Remembering to do the right thing is fine in the small but it doesn't scale.
Breaking parametricity definitely has uses. The question becomes one about the tradeoffs one makes. That's why I'd rather have a discussion about those tradeoffs than just "comptime good" or "parametricity good". Better yet are neat ideas that capture the good parts of both. (E.g. type classes / implicit parameters reduce the notational overhead of calling functions with constrained generic types, but these bring their own tradeoffs around modularity and coherence.)
- https://scalawithcats.com/ is a book I'm writing. There is an associated newsletter to which I post blog articles and the like.
- https://noelwelsh.com/ is my personal site, which hosts my blog.
sort<A> where A implements Comparable
Simpler explanation IMO.
So a language _with_ that is superior. All zig needs to do is add some way to allow for constraints.
Your point still stands though. Modern programming languages don’t constrain you much at all with their type systems.
I spent a little time in Haskell a few years ago and the kind of reasoning you can do about functions is wild. Eg, if a function has the generic type signature a -> a (for any type a), we know the function has to be the identity function because that’s the only function that matches the type signature. (Well that or the “bottom function”, which loops forever). Lots of functions are like that - where the type definitions alone are enough to reason about a lot of code.
Ada takes this approach.
SBCL, which is a very popular Common Lisp implementation, is indeed strongly typed. Coalton, which is an addon, is even statically typed.
(I "vouched" for your comment to which this is a reply, and that seems to have been sufficient to un-dead it. Unless the system's set up so that vouching for a shadowbanned comment un-deads it only for the person who does the vouching, I guess...)
You can't use the binding early like this, but inside of the type definition you can use the @This() builtin to get a value that's the type you're in, and you can presumably do whatever you like with it.
The type system barely does anything, so it's not very interesting when type checking runs. comptime code is type checked and executed. Normal code is typechecked and not executed.
comptime is not a macro system. It doesn't have the ability to be unhygienic. It can cleverly monomorphize code, or it can unroll code, or it can omit code, but I don't think it can generate code.
[0] https://ziglang.org/download/0.12.0/release-notes.html#Compt...
So at least address your points here:
* I do agree this is a direct trade-off with Zig style comptime, versus more statically defined function signatures. I don't think this affects all code, only code which does such reasoning with types, so it's a trade-off between reasoning and expressivity that you can make depending on your needs. On the other hand, per the post's view 0, I have found that just going in and reading the source code easily answers the questions I have when the type signature doesn't. I don't think I've ever been confused about how to use something for more than the time it takes to read a few dozen lines of code.
* Your specific example for recursive generic types poses a problem because a name being used in the declaration causes a "dependency loop detected" error. There are ways around this. The generics example in the post for example references itself. If you had a concrete example showing a case where this does something, I could perhaps show you the zig code that does it.
* Type checking happens during comptime. Eg, this code:
pub fn main() void {
@compileLog("Hi");
const a: u32 = "42";
_ = a;
@compileLog("Bye");
}
Gives this error: when_typecheck.zig:3:17: error: expected type 'u32', found '*const [2:0]u8'
const a: u32 = "42";
^~~~
Compile Log Output:
@as(*const [2:0]u8, "Hi")
So the first @compileLog statement was run by comptime, but then the type check error stopped it from continuing to the second @compileLog statement. If you dig into the Zig issues, there are some subtle ways the type checking between comptime and runtime can cause problems. However it takes some pretty esoteric code to hit them, and they're easily resolved. Also, they're well known by the core team and I expect them to be addressed before 1.0.
* I'm not sure what you mean by hygiene, can you elaborate?
It’s possible but tedious and error-prone to avoid this problem by hand by generating unique identifier names for all macro-defined runtime variables (this usually goes by the Lisp name GENSYM). But what you actually want, arguably, is an extended notion of lexical scope where it also applies to the macro’s text and macro user’s program as written instead of the macroexpanded output, so the macro’s and user’s variables can’t interfere with each other simply because they appear in completely different places of the program—again, as written, not as macroexpanded. That’s possible to implement, and many Scheme implementations do it for example, but it’s tricky. And it becomes less clear-cut what this even means when the macro is allowed to peer into the user’s code and change pieces inside.
(Sorry for the lack of examples; I don’t know enough to write one in Zig, and I’m not sure giving one in Scheme would be helpful.)
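Not Zig or Scheme, but a plain C preprocessor macro (also valid C++) is probably the shortest way to show the capture problem. This is a sketch; `SCALE_BY_TEN` and both function names are made up for illustration.

```cpp
// Unhygienic macro: after expansion, the helper variable `factor` lives in
// the same scope as whatever expressions the caller passed in.
#define SCALE_BY_TEN(out, value)   \
    do {                           \
        int factor = 10;           \
        (out) = factor * (value);  \
    } while (0)

inline int scale_with_other_name() {
    int amount = 3;
    int result = 0;
    SCALE_BY_TEN(result, amount);  // expands to: (result) = factor * (amount);
    return result;                 // 30, as intended
}

inline int scale_with_clashing_name() {
    int factor = 3;                // the caller happens to use the same name
    int result = 0;
    SCALE_BY_TEN(result, factor);  // expands to: (result) = factor * (factor);
    return result;                 // 100: the macro's `factor` captured the argument
}
```

A hygienic macro system would keep the macro's `factor` and the caller's `factor` apart automatically. With the preprocessor you are left with the GENSYM-style workaround of picking names nobody else is likely to use.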
Generating a text file via a writer with the intent to compile it as source code is no worse in Zig than it is in any other language out there. If that's what you want to do with your life, go ahead.
That's still powerful, you could probably build a compile time ABNF parser, for example.
Doing it this way is more verbose but sidesteps all hygiene issues.
Though, if you really wanted to do stupid things, you could use @embedFile to load a Zig source file, then use the Zig compiler's tokenizer/ast parser (which are in the standard library) to parse that file into an AST. Don't do that, but you could.
Based on my experience with Rust, a lot of what people want to do with its "constant generics" probably would be easier to do with a feature like comptime. Letting you do math on constant generics while maintaining parametricity is hard to implement, and when all you really want is "a trait for a hash function with an output size of N," probably giving up parametricity for that purpose and generating the trait from N as an earlier codegen step is fine for you, but Rust's macros are too flexible and annoying for doing it that way. But as soon as you replace parametric polymorphism with a naive code generation feature, you're in for a world of hurt.
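As a point of comparison, here is roughly what "a type parameterised by an output size N, with arithmetic on N" looks like in C++, which takes the non-parametric route described above. This is a sketch only; `Digest` and `concat` are hypothetical names, not from any real hashing library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical digest type whose size is a compile-time parameter N.
template <std::size_t N>
struct Digest {
    std::array<std::uint8_t, N> bytes{};
};

// "Math on N": the result type is computed from both input sizes.
template <std::size_t A, std::size_t B>
Digest<A + B> concat(const Digest<A>& a, const Digest<B>& b) {
    Digest<A + B> out;
    for (std::size_t i = 0; i < A; ++i) out.bytes[i] = a.bytes[i];
    for (std::size_t i = 0; i < B; ++i) out.bytes[A + i] = b.bytes[i];
    return out;
}
```

The `A + B` in the return type is trivial to write here precisely because the compiler just substitutes the concrete numbers at each call site and checks the result afterwards, instead of checking the generic definition once up front.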
* Documentation. In a sufficiently-powerful comptime system, you can write a function that takes in a path to a .proto file and returns the types defined in that file. How should this function be documented? What happens when you click a reference to such a generated type in the documentation viewer?
* IDE autocompletions, go to definition, type hinting etc. A similar problem, especially when you're working on some half-written code and actual compilation isn't possible yet.
The language doesn't see wide adoption in industry, so maybe its most important lessons have yet to be learned, but one problem with meta-programming is that it turns part of your program into a compiler.
This happens to an extent in every language. When you're writing a library, you're solving the problem "I want users to be able to write THIS and have it be the same as if they had written THAT." A compiler. Meta-programming facilities just expand how different THIS and THAT can be.
Understanding compilers is hard. So, that's at least one potential issue with compile-time programming.
"Understanding compilers is hard."
I think this is just unnecessarily pessimistic or embracing incompetence as the norm. It's really not hard to understand the concept of an "inline" loop. And so what if I do write a compiler so that when I do `print("%d", x)` it just gives me a piece of code that converts `x` to a "digit" number and doesn't include float handling? That's not hard to understand.
This has nothing to do with compile-time execution, though. You can reason about a function from its declaration if it has a clear logical purpose, is well named, and has well named parameters. You can consider any part of a parameter the programmer can specify as part of the name, including label, type name, etc.
> There is a good discussion of some issues here: https://typesanitizer.com/blog/zig-generics.html
That's actually not a great article. While I agree with the conclusion stated in the title, it's a kind of "debate team" approach to argumentation which tries to win points rather than make meaningful arguments.
The better way to frame the debate is flexibility vs complexity. A fixed function generics system in a language is simpler (if well designed) than a programmable one, but less flexible. The more flexibility you give a generics system, the more complex it becomes, and the closer it becomes to a programming language in its own right. The nice thing about zig's approach is that the meta-programming language is practically the same thing as the regular programming language (which, itself, is a simple language). That minimizes the incremental complexity cost.
It does introduce an extra complexity though: it's harder for the programmer to keep straight what code is executing at compile time vs runtime because the code is interleaved and the context clues are minimal. I wonder if a "comptime shader" could be added to the language server/editor plugin that puts a different background color on comptime code.
I think "reason" in gp's context is "compile-time reasoning" as in the compiler's deterministic algorithm to parse the code and assign properties etc. This has downstream effects with generating compiler errors, etc.
It's not about the human programmer's ability to reason so any "improved naming" of function names or parameters still won't help the compiler out because it's still just an arbitrary "symbol" in the eyes of the parser.
The compiler doesn't do anything you, the programmer, don't tell it to do. You tell it what to do by writing code using a certain syntax, connecting identifiers, keywords, and symbols. That's it. If the meaning isn't in the identifiers you provide and how you connect them together with keywords and symbols, it isn't in there at all. The compiler doesn't care what identifier names you use, but that's true whether the identifier is for a parameter label, type name, function name or any other kind of name. The programmer gives those meaning to human readers by choosing meaningful names.
Anyway, zig's compile errors seem OK to me so far.
Actually, the zig comptime programmer can do better than a non-programmable compiler when it comes to error messages. You can detect arbitrary logical errors and provide your own compiler error messages.
There are many ways one can reason about functions, and I think all of us use multiple methods. Parametricity provides one way to do so. One nice feature is that it's supported by the programming language, unlike, say, names.
zig generates a compile error when you try to pass a non-conforming type to a generic function that places conditions/restrictions on that type (such as by calling a certain predicate on instances of that type).
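The closest C++ analogue I can think of is a `static_assert` inside the generic function, sketched below. This is not Zig, and `halfway` is a made-up name; it only illustrates a generic function stating its own restriction on the type and choosing the error text.

```cpp
#include <type_traits>

// Hypothetical generic function that states its own restriction on T and
// supplies the message the compiler prints when the restriction is violated.
template <typename T>
T halfway(T lo, T hi) {
    static_assert(std::is_arithmetic<T>::value,
                  "halfway<T>: T must be an arithmetic type (int, double, ...)");
    return lo + (hi - lo) / 2;
}
```

Calling `halfway(2, 10)` compiles and returns 6, while calling it with a non-arithmetic type is rejected at compile time with the custom message among the diagnostics. Note the check still only fires when the function is instantiated with the offending type.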
It's probably important to note that parametricity is a property of specific solution spaces, and not really in the ultimate problem domain (writing correct and reliable software for specific contexts), so isn't necessarily meaningful here.
Sure, but only after it's fully expanded, which is much harder to debug. And if a generic function doesn't fail to compile but rather silently behaves differently (e.g. if it calls a function that behaves unexpectedly, but still exists, on the type in question) then you don't get an error at all.
> parametricity is a property of specific solution spaces, and not really in the ultimate problem domain (writing correct and reliable software for specific contexts)
Nonsense. Without parametricity your software is not compositional and it becomes impossible to write correct software to solve complex problems.
Code goes into the compiler. Either compiled code or errors come out. There's no partial expansion step to cause confusion.
You're probably referring to something about the flexibility zig's comptime allows, but it's important to note a zig programmer can be as picky as they want about what types a generic function will accept. People are really just talking about what the syntax for expressing type restrictions is.
> Without parametricity your software is not compositional and it becomes impossible to write correct software to solve complex problems.
You can hold that opinion, but it's not a fact. The overall question isn't binary. It's one of balancing complexity and flexibility. A fixed system for specifying type restrictions is simpler and provides fewer opportunities for mistakes (assuming it's well designed), and may have parametricity. However, the lack of flexibility can just push the complexity elsewhere, e.g., leading to convoluted usage patterns, which could lead to more mistakes. A programmable system for specifying type restrictions offers more flexibility at the cost of more up-front complexity, but in a well-designed system the flexibility could lead to less overall complexity, and fewer mistakes. A nice thing about zig's approach is that the generics metaprogramming language is pretty much the same as the regular language, which mitigates the increase in complexity. I actually think it should be possible to create some kind of generics system that could credibly be said to be programmable and have parametricity, though I don't think there's any point to doing so.
You could make the same argument against having a separate compilation step at all - code goes into the language, it gets executed, any other step would just add confusion. But most of us tend to find that having a compilation step that catches errors earlier is helpful and makes it easier to produce correct code. Similarly, being able to build and check generic code as-is (in the simplest case, because generic code really is just parametric code and isn't getting monomorphised) is a lot nicer than only being able to build and check individual expansions of it.
> the lack of flexibility can just push the complexity elsewhere, e.g., leading to convoluted usage patterns, which could lead to more mistakes. A programmable system for specifying type restrictions offers more flexibility at the cost of more up-front complexity, but in a well-designed system the flexibility could lead to less overall complexity, and fewer mistakes
Some way of doing ad-hoc polymorphism is probably desirable, but only if it's set up so that people don't default to it. Generic things should be parametric most of the time, and non-parametricity should be visible, so that people don't do it accidentally. It's similar to e.g. nullability - yes, you probably do want to have some way to represent absent/missing/error states, but if you just make every value nullable and say that any function that wants non-nullable inputs can check itself, that "flexibility" ends up being mostly ways to shoot yourself in the foot.
Economy of mechanism is powerful though, it's one of the reasons C is still so popular. The comptime approach that provides both parametric and ad-hoc polymorphism using a single mechanism seems to fit Zig quite well. I'm still a bit of a typaholic, but I've really come to appreciate economy of mechanism instead of deeply inscrutable types.
I think a good language would take something like Zig's approach to comptime, where the template/macro language is the same as the value language, with a deep consideration of TURNSTILE from "Type Systems as Macros":
https://www.khoury.northeastern.edu/home/stchang/pubs/ckg-po...
You can even get to dependent type systems as macros:
https://www.williamjbowman.com/resources/wjb2019-depmacros.p...
As far as I can see from a quick skim your links are one meta level up, about using macros to implement a type system for a DSL (which may have polymorphism and the like within that DSL) rather than using them to implement typing itself. There's still a distinction between types and macros, it's just that macros are being used to process types.
In theory there is no difference between theory and practice. In practice there is.
However, in other situations seeing "comptime" in Zig code makes me go "oh no" because, like Lisp macros, it's very easy to use comptime to avoid a problem that doesn't exist or wouldn't exist if you structured other parts of your code better. For example, the OP's example of iterating the fields of a struct to sum the values is unfortunately characteristic of how people use comptime in the wild, when they would often be better served by using a data structure that is actually iterable (e.g. std.enums.EnumArray).
Maybe the real WTF is the friends we made along the way. <3 <3 <3
Can only be fixed by fixing humans.
Then I start thinking things like: "if he was using Clojure he wouldn't be having the problems with nconc that he talks about" and "I can work most of the examples in Python because the magic is mostly in functions, not in the macros" and "I'm disappointed that he doesn't do anything that really transforms the tree"
(It's still a great book that's worth reading but anything about Lisp has to be seen in the context the world has moved on... Almost every example in https://www.amazon.com/Paradigms-Artificial-Intelligence-Pro... can be easily coded up in Python because it was the garbage collection, hashtables on your fingertips, first class functions that changed the world, not the parens)
Lately I've been thinking about the gradient from the various tricks such as internal DSLs and simple forms of metaprogramming which are weak beer compared to what you can do if you know how compilers work.
Yeah, one would write the implementation in Java.
Common Lisp (and Lisp in general) often aspires to be written in itself, efficiently. Thus it has all the operations, which a hosted language may get from the imperative/mutable/object-oriented language underneath. That's why CL implementations may have type declarations, type inference, various optimizations, stack allocation, TCO and other features - directly in the language implementation. See for example the SBCL manual. https://sbcl.org/manual/index.html
For example the SBCL implementation is largely written in itself, whereas Clojure runs on top of a virtual machine written in a few zillion lines of C/C++ and Java. Even the core compiler is written in 10KLOC of Java code. https://github.com/clojure/clojure/blob/master/src/jvm/cloju...
Whereas the SBCL compiler is largely written in Common Lisp, incl. the machine code backends for various platforms. https://github.com/sbcl/sbcl/tree/master/src/compiler
The original Clojure developer made the conscious decision to inherit the JIT compiler from the JVM, write the Clojure compiler in Java and reuse the JVM in general -> this reuses a lot of technology maintained by others and makes integration into the Java ecosystem easier.
The language implementations differ: Lots of CL + C and Assembler compared to a much smaller amount of Clojure with lots of Java and C/C++.
CL has a lot of low-level, mutable and imperative features for a reason. It was designed for that, so that people could write efficient software largely in Lisp itself.
But yeah, CL was one of the first languages specified by adults and set the standard for other specs like Java that read cleanly and don't have the strange circularity that you notice in K&R. So many people have this abstract view that a specification should be written without implementation in mind, but really the people behind CL weren't going to write something that couldn't be implemented, and clearly they'd given up on the "Lisp machine" idea and made something that the mainstream 32 bit machine could handle efficiently. It's quite beautiful and had a larger impact on the industry than most people admit.
(I think of how C is transitional between failures like PL/I and modern languages like CL and Java that can be specced out in a way that works consistently)
In normal programming, functions "return" their values. In Continuation Passing Style (CPS), functions never return. Instead, they take another function as input; and instead of returning, they call that function (the "continuation"). Instead of returning their output, they pass their output as input to the continuation.
(Some optimizations are used such that this style of call, the "tail call", does not cause the stack to grow endlessly.)
Why would you write code in this style? Generally, you wouldn't. It's typically used as an internal transformation in some types of interpreters or compilers. But conceptualizing control flow in this way has certain advantages.
Then there are terms like the "continuation" of a program at a certain point in the code, which just means "whatever the program is going to do next, after it returns (or would return) from the code that it's about to execute". That's what "call with current continuation" (call/cc) is about. It captures (or reifies) "what will the program do next after this?" as a function that can be called to do, well, do that thing. If your code is about to call `f();`, then the 'continuation' at that point is whatever the code will do next after `f()` returns with its return value.
Thus if you had some code `g(f())`, then the continuation just as you call `f()` is to call `g()`. CPS restructures this so that `f()` takes the "thing to do next" as input, which is `g()` in this case. The CPS transformation of this code would be `f(g)`, where `g` is the continuation that `f` will invoke when it's done. Instead of returning a value, `f` invokes `g` passing that value as input.
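A minimal Python sketch of the `g(f())` to `f(g)` rewrite described above. The names `f_direct`, `f_cps` and `g` are hypothetical stand-ins, and Python is used only for illustration: it does not eliminate tail calls, so long CPS chains would still grow the stack here.

```python
def f_direct():
    # Direct style: f returns its result to whoever called it.
    return 21


def g(value):
    # g is "what happens next" with f's result.
    return value * 2


def f_cps(k):
    # Continuation-passing style: instead of handing its result back to
    # the caller, f passes the result to the continuation k.
    return k(21)


direct_result = g(f_direct())  # direct style: g(f())
cps_result = f_cps(g)          # CPS: f(g), where g is the continuation
```

Both forms compute the same value; the only difference is who decides what runs after `f` finishes.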
You can use continuations to implement concepts like coroutines. With continuations, functions never need to "return". It's possible to create structures like two functions where the control flow directly jumps between them, back and forth (almost like "goto", but much more structured than that). Neither one is "calling" the other, per se, because neither one is returning. The control flow jumps directly between them as appropriate, when one function invokes a continuation that resumes the other. The functions are peers, where both can essentially call into the other's code using continuations.
That's probably a little muddy as a first exposure to continuations, but I'm curious what you think. I generally think of continuations as a niche thing that will likely only be used by language or library implementors. Most languages don't support them.
Also, I'd probably argue that regular asynchronous code is a better way to structure similar program logic in modern programming languages. Or at least, it's likely just as good in most ways that matter, and may be easier to reason about than code that uses continuations.
For example, one use-case for coroutines is a reader paired with a writer. It can be elegant because the reader can wait until it has input, and then invoke the continuation for the writer to do something with it (in a direct, blocking fashion, with no "context switch"). But you can model this with asynchronous tasks pretty easily and clearly too. It might have a little more overhead, to handle context switching between the asynchronous tasks, but unlikely enough to matter.
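As a rough sketch of the asynchronous-task version of that reader/writer pairing, assuming Python's `asyncio` and a bounded queue as the hand-off point. The function names and the `None` end-of-input sentinel are choices made for this example, not part of the original discussion.

```python
import asyncio


async def reader(queue, lines):
    # Produce input items, then a sentinel to signal end of input.
    for line in lines:
        await queue.put(line)
    await queue.put(None)


async def writer(queue, out):
    # Wait for input and process each item until the sentinel arrives.
    while True:
        item = await queue.get()
        if item is None:
            break
        out.append(item.upper())


async def run_pipeline(lines):
    # A queue of size 1 keeps the two tasks roughly in lockstep.
    queue = asyncio.Queue(maxsize=1)
    out = []
    await asyncio.gather(reader(queue, lines), writer(queue, out))
    return out


result = asyncio.run(run_pipeline(["a", "b", "c"]))
```

The event loop does the switching that the continuation-based version would do by hand, which is the extra overhead mentioned above.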
That feels like the wrong word for the thing you're describing. Linguistic arguments aside, yes, you're absolutely right.
In Zig though, that issue is completely orthogonal to generics. The first implementation `foo` is the "only" option available for "truly arbitrary" `T` if you don't magic up some extra information from somewhere. The second implementation `bar` uses an extra language feature unrelated to generics to return a different valid value (it's valid so long as the result of `bar(T, x)` is never accessed). The third option `baz` works on any type with non-zero width and just clobbers some data for fun (you could golf it some more, but I think the 5-line implementation makes it easier to read for non-Zig programmers).
Notice that we haven't performed a computation with `T` and were still able to do things that particular definition of parametricity would not approve of.
fn foo(T: type, x: T) T {
    return x;
}
fn bar(T: type, x: T) T {
    _ = x;
    return undefined;
}
fn baz(T: type, x: T) T {
    var result: T = x;
    const result_ptr: *T = &result;
    const dangerous_shenanigans_ptr: *u8 = @ptrCast(result_ptr);
    dangerous_shenanigans_ptr.* = 42;
    return result;
}
Zig does give up that particular property (being able to rely on just a type signature to understand what's going on). Its model is closer to "compile-time duck-typing." The constraints on `T` aren't an explicitly enumerated list of constraints; they're an in vivo set of properties the code using `T` actually requires.
That fact is extremely annoying from time to time (e.g., for one or two major releases the reference Reader/Writer didn't include the full set of methods, but all functions using readers and writers just took in an `anytype`, so implementers either had to read a lot of source or play a game of whack-a-mole with the compiler errors to find the true interface), but for most code it's really not hard to handle.
E.g., if you've seen the `Iterator` pattern once, the following isn't all that hard to understand. Your constraints on `It` are that it tell you what the return type is, that return type ought to be some sort of non-comptime numeric, and it should have a `fn next(self: *It) ?T` method whose return values after the first `null` you're allowed to ignore. If you violate any of those constraints (except, perhaps, the last one -- maybe your iterator chooses to return null and then a few more values) then the code will fail at comptime. If you're afraid of excessive compiler error message lengths, you can use `@compileError()` to create a friendlier message documenting your constraints.
It's a different pattern from what you're describing, but it's absolutely not hard to use correctly.
fn sum(It: type, it: *It) It.T {
    var total: It.T = 0;
    while (it.next()) |item|
        total += item;
    return total;
}
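For readers who don't know Zig, here is a loose Python analogue of the same duck-typed pattern. The important difference is that Python only discovers a violated constraint at runtime, whereas the Zig version fails at comptime. `total` and `Countdown` are hypothetical names for this sketch, and the explicit check plays roughly the role of a friendlier `@compileError()` message.

```python
def total(it):
    # Duck typing: the only real constraint is that `it` has a next()
    # method returning a number, or None once it is exhausted.
    if not callable(getattr(it, "next", None)):
        raise TypeError("total() needs an object with a next() method")
    result = 0
    while (item := it.next()) is not None:
        result += item
    return result


class Countdown:
    # Hypothetical iterator: yields n, n-1, ..., 1, then None.
    def __init__(self, n):
        self.n = n

    def next(self):
        if self.n <= 0:
            return None
        self.n -= 1
        return self.n + 1
```

Nothing in the signature of `total` states the interface; it is implied by how `it` is used, which is the tradeoff being discussed.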
> recursive generics
A decent mental model (most of which follows from "view 4" in TFA, where the runtime code is the residue after the interpreter resolves everything it can at comptime) is treating types as immutable and treating comptime evaluation like an interpreted language.
With that view, `type Example = Something[Example]` can't work because `Example` must be fully defined before you can pass it into `Something`. The laziness you see in ordinary non-generic type instantiations doesn't cross function boundaries. I'm not sure if there's a feature request for that (nothing obvious is standing out), but I'd be a fan @AndyKelley if you're interested.
In terms of that causing problems IRL, it's only been annoying a few times in the last few years for me. The most recent one involved some comptime parser combinators, and there was a recursive message structure I wanted to handle. I worked around it by creating a concrete `FooParser` type with its associated manually implemented `parse` function (which itself was able to mostly call into rather than re-implement other parsers) instead of building up `FooParser` using combinators, so that the normal type instantiation laziness would work without issues.
> when does type checking run
Type inference is simplistic enough that this is almost a non-issue in Zig, aside from the normal tradeoffs from limited type inference (last I checked, they plan to keep it that way because it's not very important to them, it actively hinders the goal of being able to understand code by looking at a local snapshot, and that sort of complexity and constraint might keep the project from hitting more important goals like incremental compilation and binary editing). They are interleaved though (at least in the observable behavior, if you treat comptime execution as an interpreter).
I've run experiments where a neural net is implemented by creating a json file from pytorch, reading it in using @embedFile, and generating the subsequent struct with a specific “run” method.
This in theory allows the compiler to optimize the neural network directly (I haven't proven a great benefit from this though). Also the whole network lived on the stack, which means not having any dynamic allocation (not sure if this is good?).
I guess in theory you could compile once into a static library and just link that into a main program. Also there will be incremental compilation in zig I believe, maybe that helps? Not sure on the details there.
How do you people debug and test these meta programs? Mine are just regular C programs that uses the exact same debuggers and tools as anything else.
> Arbitrary compile-time execution in C:
> cl /nologo /Zi metaprogram.c && metaprogram.exe
> cl /nologo /Zi program.c
> Compile-time code runs at native speed, can be debugged, and is completely procedural & arbitrary
> You do not need your compiler to execute code for you
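A minimal sketch of the "metaprogram as a separate build step" idea from the quote, written in Python rather than C for brevity. The function name `generate_table_source` and the table contents are made up for illustration; a real setup would write the string to a file that the main program includes.

```python
def generate_table_source(name, count):
    # Compute the table at "build time" with ordinary, debuggable code...
    values = [i * i for i in range(count)]
    body = ", ".join(str(v) for v in values)
    # ...and emit it as C source text for the real compiler to consume.
    return f"static const int {name}[{count}] = {{ {body} }};\n"


source = generate_table_source("squares", 4)
```

The generator is a normal program, so it can be stepped through in a debugger or unit-tested like anything else, which is the advantage the quoted comment claims.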
Debugging .. well, you have to do a bit more work to set up a nice test framework, but you can then run the compiler with your plugin from inside your standard unit test framework, inside the interactive debugger.
Cool, you have now invented your own DSL and half-baked meta-programming macro language for something that should have been in the language to begin with.
In addition, any complex interaction between your "own made template engine" and the native code is now a pile of hacks. E.g., write a generic function: good luck interpreting any error based on the typing.
Code generation is almost consistently the worst solution to a meta-programming problem.
I'm not sure I'm following your statement. What he said was to use a C program to parse C code and emit additional C code. There is no mention of DSL.
> for something that shall have been in the language to begin with.
The whole point of this discussion is to debate on that.
I have no strong opinion (yet) but the meta program looks easy to understand compared to the Pandora's box of metaprogramming within the language (since it requires standardization, limitations etc.).
Because it is always the same story:
- You start by writing a little meta-compiler to solve one specific codegen problem in a specific portion of your code.
- Then you realize there are many slight variations of this problem in other areas of your project or other projects... because it is exactly what *Genericity* is all about. And we have known that since literally the 1970s and freaking LISP.
- To keep your meta-compiler from becoming a Frankenstein of options with endless hardcoded logic, you make it interpret some annotations in C comments, some preprocessor directives, or some template files somewhere.
-> Congratulations: you invented your own half-baked DSL.
I have seen that many times, in many places. Again and again. Often because there is a category of C programmers that would prefer to swim in their own shit instead of using a few C++ templates.
Codegen is consistently a terrible solution to a well-studied problem: meta-programming. The fear of feature creep (templates, macros) should never be a justification to create some half-baked complexity monster that will always end up worse than the problem it tries to avoid.
If you doubt that, just use a lexer or parser generator. Or better, the quintessence of codegen: Autotools. They are a perfect illustration of how terrible and how fucking unmaintainable Codegen is.
I can think of a single use case where a meta-compiler and a DSL are appropriate solutions: *Serialization* (Protobuf, Thrift, CapNProto, ...). Because in this precise case, you actually do want a language neutral way to express your interface: you want an IDL.
Here, Zig currently does the right thing: comptime execution for meta-programming is an order of magnitude better than anything available in C or C++ before C++20.
As for DSL, any DSL is a separate language that needs to be learned, so it is not very different from creating your own processor.
I couldn't find any other answer than using @compileLog to print-debug [1]. In Lisp, apparently some implementations allow tracing macros [2]. Couldn't find anything about Nim's macro debugging capabilities.
This whole thing looks like a severe limitation that is not balanced by the benefit of having all code in the same place. Do you know other languages that provide sensible meta-programming facilities?
[1] https://www.reddit.com/r/Zig/comments/jkol30/is_there_a_way_... [2] https://stackoverflow.com/questions/44872280/macros-and-how-...
https://nim-lang.org/docs/macros.html#toStrLit%2CNimNode
https://nim-lang.org/docs/macros.html#astGenRepr%2CNimNode
https://nim-lang.org/docs/macros.html#dumpAstGen.m%2Cuntyped
https://nim-lang.org/docs/macros.html#treeRepr%2CNimNode
It's a fragile horrible mess, and the need to do this was a major reason for me to switch away from Python. It's a bit like asking why we don't just pass all arguments to functions as strings. Yeah, people write stringly typed code, but it should rarely be necessary, and your language should provide means to avoid it.
This describes exactly what people don't want to do.
Furthermore, you want some sort of AST representation, at one level of convenience or another (I include this compgen-style "being 'in' the AST" to be included in that, even if it doesn't necessarily directly manipulate AST nodes), and C isn't particularly great at manipulating those, either, in a lot of different ways.
A consequence of C being the definitive language that pretty much every other language has had to react to, one way or another through however many layers of indirection, for the past 40+ years, is that pretty much every language created since then is better than C at these things. C's pretty long in the tooth now, even with the various polishings it has received over the years.
I don't know a lot about debugging zig comptime, though. I use printf-style debugging and the built-in unit test blocks. That's all I've needed so far. (Perhaps that's all there is.)
This needs to become a standard feature of programming languages IMHO.
It’s actually one of the biggest things I find lacking in Rust which is limited to non-typed macros (last I tried). It’s so limiting not to have it. You just have to hope serde is implemented on the structs in a crate. You can’t even make your own structs with the same fields in Rust programmatically.
I haven’t written a line of either. I could see using Zig, but there’s no plausible scenario where I’d ever write Mojo. Weird proprietary languages tend to be a career pigeonhole: “you’ve been doing what for the last 5 years?”
Like I always say, most languages start off closed, incubated for some years by a tiny group, before being opened. Mojo is no different, in fact, Modular have given a pretty solid timeline about when they plan to open source the compiler - https://youtu.be/XYzp5rzlXqM?si=nmvghH3KWX6SrDzz&t=1025
Looks like I was wrong about having to pay to use Mojo itself. It's their "MAX" product you have to pay for, at least today. The language is currently free of charge, although proprietary.
In this case the operators should be unsurprising, so they do what one would expect based on the source domain. Multiplying a vector and a scalar for example should return the scaled vector, but one should most likely not implement multiplication between vectors as that would likely cause confusion.
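A small Python sketch of that guideline, assuming a hypothetical `Vec3` type: addition between vectors and scaling by a number are defined, while vector-times-vector is deliberately left undefined so that it fails loudly instead of silently picking a dot, cross or element-wise product.

```python
from dataclasses import dataclass
from numbers import Real


@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float

    def __add__(self, other):
        if not isinstance(other, Vec3):
            return NotImplemented
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __mul__(self, scalar):
        # Only vector * scalar is defined; vector * vector is left out
        # on purpose because its meaning would be ambiguous.
        if not isinstance(scalar, Real):
            return NotImplemented
        return Vec3(self.x * scalar, self.y * scalar, self.z * scalar)

    # Allow scalar * vector as well.
    __rmul__ = __mul__
```

With this definition `Vec3(1, 2, 3) * 2` scales the vector, while `Vec3(1, 2, 3) * Vec3(1, 1, 1)` raises `TypeError`.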
"+ can do anything!" As you said, so can plus().
"Hidden function calls?" Have they never programmed a soft float or microcontroller without a div instruction? Function calls for every floating point op.
I suspect zig might be similar.
In ocaml you can redefine operators... but only in the context of another module.
So if I re-define + in some module Vec3, I can do:
Vec3.(a + b + c + d)
Or even:
let open Vec3 in
a + b + c + d
So there you go, no "where did the + operator come from?" questions when reading the source, and still much nicer than: a.add(b).add(c).add(d)
I doubt zig will change though. The language is starting to crystallize and anything that solved this challenge would be massive.
My ideal solution would be for the language to introduce custom operators that clearly indicate an overload. Something like a prefix/postfix (e.g. `let c = a |+| b`). That way it is clear to the person viewing the code that the |+| operation is actually a function call.
This is still open to abuse but I think it at least removes one of the major concerns.
[1] https://ziglang.org/documentation/master/#Vector
[2] https://clang.llvm.org/docs/LanguageExtensions.html#vectors-...
> Here the comptime keyword indicates that the block it precedes will run during the compile.
D doesn't use a keyword to trigger it. What triggers it is being a "const expression". Naturally, const expressions must be evaluatable at compile time. For example:
int sum(int a, int b) => a + b;
void test()
{
    int s = sum(3, 4); // runs at run time
    enum e = sum(3, 4); // runs at compile time
}
By avoiding use of non-constant globals, I/O and calling system functions like malloc(), quite a large percentage of functions can be run at compile time without any changes.
Even memory can be allocated with it (using D's automatic memory management).
__gshared uint[256] tytab = tytab_init;
extern (D) private enum tytab_init =
() {
    uint[256] tab;
    foreach (i; TXptr) { tab[i] |= TYFLptr; }
    foreach (i; TXptr_nflat) { tab[i] |= TYFLptr; }
    foreach (i; TXreal) { tab[i] |= TYFLreal; }
    /* more lines removed for brevity */
    return tab;
} ();
The initializer for the array `tytab` is returned by a lambda that computes the array and then returns it.
A link to the full glory of it:
https://github.com/dlang/dmd/blob/master/compiler/src/dmd/ba...
Another common use for CTFE is to use it to create a DSL.
How does the D compiler ensure correctness if the machine the compiler runs on is different from the machine the program will execute on?
For example, how does the compiler know that "int s = sum(100000, 1000000)" is the same value on every x86 machine?
I'm thinking there could be subtle differences between generations of CPU, how can a compiler guarantee that a computation on the host machine will result in the same value on the target machine in practice, or is it assuming that host and target are sufficiently similar, as long as the architecture matches? (which is fine, I'm wondering as to what approaches exist)
My pleasure!
> is the same value on every x86 machine?
It's the same value on all machines, because integer types are fixed size (not implementation dependent) and 2's complement arithmetic is mandated.
Floating point results can vary, however, due to different orders in which constants are evaluated. The x87, for example, evaluates to a higher precision and then rounds it only when writing to memory.
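To make the integer/float contrast concrete, here is a small Python model. `wrap_int32` and `sum_int32` are hypothetical helpers that emulate a fixed-width 32-bit two's complement `int`; they are not D library functions. The floating-point part only shows that grouping affects rounding for IEEE doubles, which is one way evaluation order can change a result; it does not model x87 extended precision.

```python
def wrap_int32(value):
    # Reduce an arbitrary Python int to a 32-bit two's complement value.
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def sum_int32(a, b):
    # Model of a fixed-width 32-bit addition: same answer on any host.
    return wrap_int32(a + b)


in_range = sum_int32(100000, 1000000)
overflowed = sum_int32(2**31 - 1, 1)  # wraps around to the minimum value

# Floating point: the grouping of operations changes the rounding.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
```

The integer results are fully determined by the width and the wraparound rule, while the two floating-point sums differ in the last bit even though they are mathematically equal.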
Any thoughts on adding something like Zig's setFloatMode(strict)? I have a project idea or 2 where for some of the computation I need determinism then performance. But very much still need the performance floating point can provide.
Your best bet for floating point determinism is to stick with doubles. Then, in 64 bit code, the double math will be done with the XMM registers and instructions, which will stick with 64 bit arithmetic.
int sum(int a, int b) { return a + b; }
_Static_assert(sum(3, 4) == 7, "look ma, check at compile time!");
Why doesn't the C Standard add this? It works great!
fn square(num: i32) i32 {
    return num * num;
}
pub fn main() void {
    _ = square(2);
    _ = comptime square(3);
}
...and the comptime invocation will produce a compile error if anything isn't comptime-compatible (which IMHO is an important feature, because it rings the alarm bells if code that's expected to run at comptime accidentally moves into runtime because some input args have changed from comptime- to runtime-evaluated).
We have a lot of D users that come from C++, and not one of them has ever asked for `constexpr` that I've heard.
It guarantees that this area of code will always be evaluated at comptime, and otherwise fail to compile. Compilers (or rather their optimizer passes I guess) already try to fold a lot of code into a constant if it can be evaluated at compile time, but it will silently fall back to runtime evaluation when a variable slips into the code later. With a keyword I can at least say "please fail when that happens".
There really is no purpose to adding a keyword.
There's a heap of praise thrown at zig comptime. I can certainly see why. From a programming language perspective it's an elegant and very powerful solution. It's a shame that Rust doesn't have a similar system in place. It works wonderfully if you need to precompute something or do some light reflection work.
But, from an actual user perspective it's not very fun or easy to use as soon as you try something harder. The biggest issue I see is that there's no static trait/interface/concept in the language. Any comptime type you receive as a parameter is essentially the `any` type from TypeScript or `void*` from C/C++. If you want to do something specific with it, like call a specific method on it, you have to make sure to check that the type has it. You can of course ignore it and try to call it without checking it, but you're not going to like the errors. Of course, since there are no interfaces you have to do that manually. This is done by reading the Zig stdlib source code to figure out the type enum/structures and then pattern-matching like 6 levels deep. For every field, every method, every parameter of a method. This sucks hard. Of course, once you do check for the type you still won't get any intellisense or any help at all from your IDE/editor.
Now, there are generally two solutions to this:
One would be to add static interfaces/concepts to the language. At the time this was shot down as "unnecessary". Maybe, but it does make this feature extremely difficult to use for anyone but the absolutely most experienced programmers. Honestly, it feels very similar to how Rust proc macros are impenetrable for most people.
The second one is to take a hint from TypeScript and take their relatively complex type system and type assertions. Eg. `(a: unknown): a is number => typeof a === 'number'`. This one also seems like a bust as it seems to go against the "minimal language" mantra. Also, I don't get the feeling that the language dev team particularly cares about IDEs/LSPs as the Zig LSP server was quite bad the last time I tried it.
Now, the third solution and the one the people behind the Zig LSP server went with is to just execute your comptime functions to get the required type information. Of course, this can't really make the experience of writing comptime any easier, just makes it so that your IDE knows what the result of a comptime invocation was.
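To illustrate what the first option (a declared interface) buys over checking members by hand, here is a rough Python sketch using `typing.Protocol`. This is an analogy only, not a proposal for Zig syntax; `Parser`, `DecimalParser` and `run` are hypothetical names.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Parser(Protocol):
    # The interface is written down once instead of being implied by usage.
    def parse(self, text: str) -> int: ...


class DecimalParser:
    # Satisfies Parser structurally; no explicit inheritance is needed.
    def parse(self, text: str) -> int:
        return int(text)


def run(parser: Parser, text: str) -> int:
    # A static checker can verify callers against Parser; the isinstance
    # check below only confirms that a parse attribute exists at runtime.
    if not isinstance(parser, Parser):
        raise TypeError("parser must provide parse(text)")
    return parser.parse(text)
```

Because the requirement lives in the signature of `run`, editors and type checkers can report a mismatch at the call site rather than deep inside the generic code.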
So in short it is as difficult to use as it is cool. Really, most of the language is like this. The C interop isn't that great and is severely overhyped. The docs suck. The stdlib docs are even worse. I guess I'm mostly disappointed since I was hoping Zig could be used where unsafe Rust sucks, but I walked away unsatisfied.
Whether that's a good or bad idea remains to be seen (usually I prefer syntax sugar over such a building blocks system - reminds me too much of the C++ stdlib approach), but a surprising amount of dedicated typesystem features in other languages can be done with comptime coding in Zig.
It kind of reminds me of the other side of static/dynamic polymorphism in that much of the language lives only on undocumented conventions.
At some point, the folks writing Zig decided that they really needed some runtime polymorphism for their stdlib implementation. Of course, this would be implemented in the language (using dyn traits in Rust, or OOP in C++) in every other language. Ok, but this is Zig, so maybe it's implemented as a std.meta helper in the stdlib? Of course not. Instead you have to figure out what the stdlib does, which is of course to implement it manually for every type that needs it. Making it worse is that at some point the way to do this changed, so at least at the time there were actually two different ways to do dynamic dispatch in the stdlib and other people's code/tutorials/whatever.
What a mess of a language. So many good ideas hindered by baffling decisions. Writing Zig is like grinding teeth.
"Better errors" seems like something that should be fixable in the compiler, eg. by tracking whether a variable was initialized with a static or comptime type, and if the latter, then triggering a different error path to tracing back to the comptime type variable.