Again: not throwing shade. I think this is a place where Rust is genuinely quite strong.
E.g., a context-free rule S ::= abc | aabbcc | aaabbbccc | ... can effectively parse a^N b^N c^N, which is the classic example of a context-sensitive language.
This is a simple example, but something like that can be seen in practice. One example is when language allows definition of operators.
So, how does Rust handle that?
    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import qualified Data.Text as T

    type ParseError = String

    csgParse :: T.Text -> Either ParseError Int
    csgParse = eitherResult . parse parser where
      parser = do
        as <- many' $ char 'a'
        let n = length as
        count n $ char 'b'
        count n $ char 'c'
        char '\n'
        return n

    ghci> csgParse "aaabbbccc\n"
    Right 3
You used a monadic parser; monadic parsers are known to be able to parse context-sensitive grammars. But they hide the fact that they are combinators implemented with closures underneath. For example, that `count n $ char 'b'` can be as complex as parsing a set of statements containing expressions with an operator specified (symbol, fixity, precedence) earlier in the code.
In Haskell, it is easy - parameterize your expression grammar with operators, apply them, parse text. This will work even with Applicative parsers, even unextended.
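A rough sketch of what that kind of parameterization can look like with attoparsec (the Expr type, exprWith, and the "infix" declaration syntax here are just illustrative, not from any particular grammar):

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text

    data Expr = Num Int | BinOp Char Expr Expr deriving Show

    -- Read an operator declaration first, e.g. "infix +", then parse the
    -- expression that follows using the operator we just learned about.
    program :: Parser Expr
    program = do
      _  <- string "infix "
      op <- anyChar
      _  <- char '\n'
      exprWith op

    -- The expression grammar is a function of the operator parsed above.
    exprWith :: Char -> Parser Expr
    exprWith op = foldl1 (BinOp op) <$> (Num <$> decimal) `sepBy1` char op

So parseOnly program "infix *\n1*2*3" yields BinOp '*' (BinOp '*' (Num 1) (Num 2)) (Num 3): the grammar used for the expression part is chosen by the input itself.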
But in Rust? I haven't seen how it can be done.
    use nom::{IResult, character::complete::char, multi::many_m_n, sequence::tuple};

    fn parse_abc(input: &str, n: usize) -> IResult<&str, (Vec<char>, Vec<char>, Vec<char>)> {
        let (input, result) = tuple((
            many_m_n(n, n, char('a')),
            many_m_n(n, n, char('b')),
            many_m_n(n, n, char('c')),
        ))(input)?;
        Ok((input, result))
    }
It parses (the beginning of) the input, ensuring `n` repetitions of 'a', 'b', and 'c'. Parse errors are reported through the return type, and the remaining characters are returned for the application to deal with as it sees fit. https://play.rust-lang.org/?version=stable&mode=debug&editio...
If you have to specify N, no, it doesn't
Educational and elegant approach.
Another great talk about making efficient lexers and parsers is Andrew Kelley's "Practical Data Oriented Design" [2]. Summary: "it explains various strategies one can use to reduce memory footprint of programs while also making the program cache friendly which increase throughput".
--
Coroutines for Go - https://research.swtch.com/coro
The parallelism provided by the goroutines caused races and eventually led to abandoning the design in favor of the lexer storing state in an object, which was a more faithful simulation of a coroutine. Proper coroutines would have avoided the races and been more efficient than goroutines.
https://lrparsing.sourceforge.net/doc/examples/lrparsing-sql...
The grammar contains things you won't have seen before, like Prio(). Think of them as macros. It all gets translated to LR(1) productions which you can ask it to print out. LR(1) productions are simpler than EBNF. They look like:
    symbol1 := symbol2 symbol3
    symbol1 := symbol4 symbol3
    symbol3 := token1 symbol2 token2
    ...
Documentation on what the macros do, and how to get it to spit out the LR(1) productions, is here: https://lrparsing.sourceforge.net/doc/html/
It was used to do a task similar to the one the OP is attempting.
Edit: Never mind. I see it right there under the parser. Thanks!
--
1: https://github.com/antlr/grammars-v4/tree/master/sql/sqlite
Also, I'm collecting several LALR(1) grammars here: https://mingodad.github.io/parsertl-playground/playground/ It is a Yacc/Lex-compatible online editor/interpreter that can generate EBNF for railroad diagrams, SQL, and C++ from the grammars. Select "SQLite3 parser (partially working)" from "Examples", then click "Parse" to see the parse tree for the content in "Input source".
I also created https://mingodad.github.io/plgh/json2ebnf.html to have a unified view of tree-sitter grammars, and https://mingodad.github.io/lua-wasm-playground/ where there is a Lua script to generate an alternative EBNF for writing tree-sitter grammars that can later be converted to the standard "grammar.js".
I’m imagining seeing the node! macro used, and seeing the macro definition, but still having a tough time knowing exactly what code is produced.
Do I just use the Example and see what type hints I get from it? Can I hover over it in my IDE and see an expanded version? Do I need to reference the compiled code to be sure?
(I do all my work in JS/TS so I don’t touch any macros; just curious about the workflow here!)
    $ cargo expand
And you'll see the resulting code.

Rust is really several languages: "vanilla" Rust, declarative macros, and proc macros. Each has a slightly different capability set and a different dialect. You get used to working with each in turn over time.
Also, unit tests are generally a good playground area for understanding the impact of modifying a macro.
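For a concrete feel of the workflow, here is a made-up stand-in macro (not the node! macro from the post) and what `cargo expand` does with it:

    // A hypothetical declarative macro that generates an AST node struct.
    macro_rules! node {
        ($name:ident) => {
            #[derive(Debug)]
            pub struct $name {
                pub children: Vec<usize>,
            }
        };
    }

    node!(SelectStmt);

    fn main() {
        // `cargo expand` prints the crate's source with `node!(SelectStmt)`
        // replaced by the generated `pub struct SelectStmt { ... }` item
        // (with the derive expanded as well), which is usually enough to see
        // exactly what code a macro produces.
        let stmt = SelectStmt { children: vec![1, 2, 3] };
        println!("{:?}", stmt);
    }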
it isn't too bad, although the fewer proc macros in a code base, the better. declarative macros are slightly easier to grok, but much easier to maintain and test. (i feel the same way about opaque codegen in other languages.)
[0] https://github.com/ryandv/chesskell/blob/master/src/Chess/Fa...
[1] https://en.wikipedia.org/wiki/Forsyth%E2%80%93Edwards_Notati...
Do you see an obvious reason why a similar approach won't work in Rust? E.g. winnow [1] seems to offer declarative enough style, and there are several more parser combinator libraries in Rust.
    data Color = Color
      { r :: Word8
      , g :: Word8
      , b :: Word8
      } deriving Show

    hex_primary :: Parser Word8
    hex_primary = toWord8 <$> sat isHexDigit <*> sat isHexDigit
      where toWord8 a b = read ['0', 'x', a, b]

    hex_color :: Parser Color
    hex_color = do
      _ <- char '#'
      Color <$> hex_primary <*> hex_primary <*> hex_primary
Sure, it works in Rust, but it's a pretty far cry from being as simple or legible - there's a lot of extra boilerplate in the Rust.

Haskell demonstrates the use of parser combinators very well, but I'd still use parser combinators in another language. Parser combinators are implemented in plenty of languages, including Rust, and actually doing anything with the parsed output becomes a lot easier once you leave the Haskell domain.
It’s slightly longer, but more legible.
You don't need to switch to Stack (as other commenters suggest) to have isolated builds, project sandboxes, etc. If you want to bootstrap a specific compiler version, a la nvm/pyenv/opam, use GHCup with a Cabal project setup: https://www.haskell.org/ghcup/
Has-kill
$
Has-skill
$
    {-# LANGUAGE OverloadedStrings #-}
    import Prelude hiding (putStrLn)
    import Data.Text (Text, replace)
    import Data.Text.IO (putStrLn)

    transform :: Text -> Text
    transform = replace "k" "sk" . replace "ke" "-ki"

    main :: IO ()
    main = putStrLn $ transform "Haskell"
sed: -e expression #1, char 5: unterminated `s' command
It's like this: sed 's/find/replace/'
Just a few days ago, I wrote a FEN "parser" for an experimental quad-bitboard implementation. It almost wrote itself.
P.S.: I am the author of chessIO on Hackage
Curious what the rest of the prior art looks like
I wrote it a long time ago and it's not fully implemented, though.
[0] https://github.com/salsa-rs/salsa/blob/e4d36daf2dc4a09600975...
this time it got traction. funny how HN works.
A few notes:
* The AST would, I believe, be much simpler defined as algebraic data types. It's not like the SQLite grammar is going to randomly grow new nodes that require the extensibility their convoluted encoding provides. The encoding they use looks like what someone familiar with OO, but not algebraic data types, would come up with.
* "Macros work different in most languages. However they are used for mostly the same reasons: code deduplication and less repetition." That could be said of any abstraction mechanism, e.g. functions. The defining feature of macros is that they run at compile time.
* The work on parser combinators would be a good place to start to see how to structure parsing in a clean way.
The author never claimed to be an experienced programmer. The title of the blog is "Why I love ...". Your notes look fair to me, but calling out inexperience is unnecessary IMO. I love it if someone loves programming. I think that's great. Experience will come.
I don't know the author, so it's useful for me to see in the comments that some people think they are not so experienced.
Doesn't mean I won't respect the author at all, it's great that they write about what they do!
"unnecessary" is the same. Who defines what's necessary? Is Hacker News necessary?
I wrote a compiler in school many years ago, but besides thinking "this project is only one a world class expert or an enthusiastic amateur would attempt", I wasn't immediately sure which I was dealing with.
> calling out inexperience is unnecessary IMO. I love it if someone loves programming. I think that's great.
I'll observe that the commenter did not make the value judgement about inexperience that you appear to think they did.
In the context of the blog post, he wants to generate structure definitions. This is not possible with functions.
Much as you might anticipate (although perhaps its designer Sean Baxter did not), this was not kindly looked upon by many C++ programmers and members of WG21 (the C++ committee).
The larger thing that "Safe C++" and the reaction to it misses is that Rust's boon is its Culture. The "Safe C++" proposal gives C++ a potential safety technology but does not and cannot gift it the accompanying Safety Culture. Government programmes to demand safety will be most effective - just as with other types of safety - if they deliver an improved culture not just technological change.
But more importantly, Safe C++ is just not a thing yet. People seem to discount the herculean effort that was required to properly implement the borrow checker, the thousands of little problems that needed to be solved for it to be sound, not to mention a few really, really hard problems, like variance, lifetimes in higher-ranked trait bounds, generic associated types, and how lifetimes interact with a Hindley-Milner type system in general.
Not trying to discount Safe C++'s efforts of course. I really hope they, too, succeed. I also hope they manage to find a syntax that's less... what it is now.
In fact, that eBNF only produces the lexer. The parser part is not that impressive either, 120 LoC and quite repetitive https://github.com/gritzko/librdx/blob/master/JSON.c
So, I believe, a parser infrastructure evolves till it only needs eBNF to make a parser. That is the saturation point.
Though I agree that a little code generation and/or macro magic can make C significantly more workable.
Won't the code here:
https://github.com/gritzko/librdx/blob/master/JSON.lex
accept "[" as valid json?
    delimiter = OpenObject | CloseObject | OpenArray | CloseArray | Comma | Colon;
    primitive = Number | String | Literal;
    JSON = ws* ( primitive? ( ws* delimiter ws* primitive? )* ) ws*;
    Root = JSON;
(pick zero of everything in JSON except one delimiter...)

I usually begin with the RFCs:
https://datatracker.ietf.org/doc/html/rfc4627#autoid-3
I'm not sure one can implement JSON with ragel... I believe ragel can only handle regular languages and JSON is context free.
I was struggling, though, with the lack of strong typing in the returned parse tree, though I think some improvements have been made there which I did not have a chance to look into yet.
With that realisation I started looking for another more suitable language - I knew the FP aspects of Rust are what I was looking for so at first I considered something like F# but I didn't like that it's tied to microsoft/.NET. Looking a bit further I could have gone with something like Zig/C but then I lose the FP niceness I'm looking for. I also spent a fair amount of time looking at Go, but eventually decided that 1. I wanted a fair amount of syntax sugar, and 2. golang is a server side language, a lot of its features and library are geared towards this use case.
Finally I found OCaml, what really convinced me was seeing the syntax was like a friendly version of Haskell, or like Rust without lifetimes. In fact the first Rust compiler was written in OCaml, and OCaml is well known in the programming language space. I'm still learning OCaml so I'm not sure I can give a fair review yet, but so far it's exactly what I was looking for.
The rest of the golang ecosystem I found really nice actually, and imo it had a really great set of tools for reading/writing to files - and I also like that everything is a part of the go binary; it certainly is easier than juggling between opam and dune (used for OCaml, for example).
The ecosystem and tooling are great, probably the best I've worked with. But the main reason I reach for Go is that it's got tiny mental overhead. There's a handful of language features so it becomes obvious what to use, so you can focus on the actual goal of the project.
There are some warts of course. Heavy IO code can be riddled with err checks (actually, why I find it a bit awkward for servers). Similarly the stdlib is quite verbose when doing file system manipulation, I may try https://github.com/chigopher/pathlib because Python's pathlib is by far my favourite interface.
Do you know how they avoid the GC in the Go implementation of the Go compiler? If I understand correctly they need to implement the Go garbage collector in their Go implementation of the Go compiler. But Go already has a garbage collector. So how do they avoid invoking Go's garbage collector so that they can implement the garbage collector of the Go language they are implementing?
Not sure if I'm making sense but I'd like to know more about this from those who understand this more than I do.
But the shape of the question feels like you're asking about whether an interpreter (which the compiler is not) uses the GC of the host language?
I think the relevant code is https://github.com/golang/go/blob/master/src/runtime/mgc.go and adjacent files. I see some annotations like //go:systemstack, //go:nosplit, //go:nowritebarrier that are probably relevant but I wouldn't know if there's any other specific requirements for that code.
First, that is a problem only for the very first version of X. Then you use X for version X+1.
Second, building from source usually doesn't mean having to build every single dependency. Some .so or .dll files are already on the system. Only when one has to build everything from scratch does the first step have to solve the original X-from-X problem, but I think that even a Gentoo full system build doesn't start with a user setting bytes in RAM with switches (?), setting the program counter of the CPU and its registers to eventually start the bootstrap process.
All this to say: the output of a compiler is by necessity not tied to the language the compiler is written in, instead it is tied to the machine the executable should run on. A compiler "merely" translates instructions from a high level language to a machine executable one. So stuff like a GC must be coded, compiled and then "injected" into the binary so the user's code can interact with it. In an interpreted language this isn't necessary, since the host language is already running and contains these tools which would otherwise have to be injected into the binary.
The compiler executable itself is running in a compilation process P which uses memory and has its own garbage collection. (The compiler executable was itself generated by a compilation, using a compiler written in Go itself(self-hosting) or initially, in another language).
But the compilation process P is unrelated to the process Q in which the generated code, LLC, will run when first executed. The OS which runs LLC doesn't even know about the compiler - LLC is just another binary file. The garbage collection in P doesn't affect garbage collection in Q.
Indeed, it should be easy for the compiler to generate an assembly program which constantly keeps allocating more memory until the system runs out, when compiling, say, a struct allocation inside a loop running a billion times. Unless, of course, you explicitly also generate a garbage collector as part of the low-level code.
Your question does become very interesting in the realm of security, there is a famous paper called "Trusting Trust" where a compiled compiler can still have backdoors even if the compiled code is trustworthy and the compiler code is trustworthy but the code which compiled the compiler had backdoors.
    char heap[100000000];
    int heap_end;

    void *alloc(int n_bytes) {
        void *out = &heap[heap_end];
        heap_end += n_bytes;
        return out;
    }

As you can see, it doesn't need to allocate any memory to do this.

The garbage collector isn't part of the compiler, it's part of the runtime. It's worth being clear about this distinction because I think it's the root of the OP's confusion.
So at some point, someone wrote enough of a Go GC in C to support enough of Go to compile itself.
Why wouldn’t it be able to?
I don’t understand how your question specifically relates to garbage collection, or why the compiler would need to avoid it. The Go compiler is a normal Go program and garbage collection works in it the same way it does in any other Go program.
I've written parsers professionally with Rust for two companies now. I have to say the issues you had with the borrow checker are just in the beginning. After working with Rust a bit you realize it works miracles for parsers. Especially if you need to do runtime parsing in a network service serving large traffic. There are some good strategies we've found out to keep the borrow checker happy and at the same time writing the fastest possible code to serve our customers.
I highly recommend taking a look at how flat vectors for the AST and typed vector indices work. E.g. you have a vector for types as `Vec<Type>` and the fields in those types as `Vec<(TypeId, Field)>`. Keep these sorted, so you can implement lookups with a binary search, which works quite well with CPU caches and is definitely faster than a hashmap lookup.
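A rough sketch of that layout (TypeId, fields_of, etc. are illustrative names, not taken from any particular codebase):

    // Typed index: a plain Copy integer, so it's free to pass around.
    #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
    struct TypeId(u32);

    struct Type { name: String }
    struct Field { name: String }

    struct Ast {
        types: Vec<Type>,             // indexed directly by TypeId
        fields: Vec<(TypeId, Field)>, // kept sorted by TypeId
    }

    impl std::ops::Index<TypeId> for Ast {
        type Output = Type;
        fn index(&self, id: TypeId) -> &Type {
            &self.types[id.0 as usize]
        }
    }

    impl Ast {
        // Binary search for the contiguous run of fields belonging to one type.
        fn fields_of(&self, id: TypeId) -> &[(TypeId, Field)] {
            let start = self.fields.partition_point(|(t, _)| *t < id);
            let end = self.fields.partition_point(|(t, _)| *t <= id);
            &self.fields[start..end]
        }
    }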
The other cool thing with writing parsers with Rust is how there are great high level libraries for things like lexing:
https://crates.io/crates/logos
The cool thing with Logos is that it keeps the source data as a string under the surface and just refers to specific locations in it. Now use these tokens as the basis for your AST tree, which is all flat data structures and IDs. Simplify the usage with a type:
    #[derive(Clone, Copy)]
    struct Walker<'a, Id> {
        pub id: Id,
        pub ast: &'a Ast,
    }

    impl<'a, Id> Walker<'a, Id> {
        pub fn walk<T>(self, other_id: T) -> Walker<'a, T> {
            Walker { id: other_id, ast: self.ast }
        }
    }

Now you can specialize these with type aliases:

    type TypeWalker<'a> = Walker<'a, TypeId>;

And implement methods (this assumes `Ast` implements `Index<TypeId>`):

    impl<'a> TypeWalker<'a> {
        fn as_ref(&self) -> &'a Type {
            &self.ast[self.id]
        }

        fn name(&self) -> &'a str {
            &self.as_ref().name
        }
    }
From here you can introduce string interning if needed; it's easy to extend. What I like about this design is how all the IDs and Walkers are Copy, so you can pass them around as you like. There's also no reference counting needed anywhere, so you don't need to do the dance with Arc/Weak.

I understand Rust feels hard, especially in the beginning. You need to program more like you'd write C++, but Rust forces you to play it safe. I would say an amazing strategy is to first write a prototype in OCaml; it's really good for that. Then, if you need to be faster, do a rewrite in Rust.
> I've written parsers professionally with Rust for two companies now
If you don't mind me asking, which companies? Or how do you get into this industry within an industry? I'd really love to work on some programming language implementations professionally (although maybe that's just because I've only built them non-professionally until now).
> Especially if you need to do runtime parsing in a network service serving large traffic.
I almost expected something like this, it just makes sense with how the language is positioned. I'm not sure if you've been following cloudflare's pingora blogs but I've found them very interesting because of how they are able to really optimise parts of their networking without looking like a fast-inverse-sqrt.
> There's also no reference counting needed anywhere, so you don't need to play the dance with Arc/Weak.
I really like the sound of this, it wasn't necessarily confusing to work with Rc and Weak but more I had to put in a lot of extra thought up front (which is also valuable don't get me wrong).
> I would say an amazing strategy is to first write a prototype with Ocaml, it's really good for that.
Thanks! Maybe then the Rust code I have so far won't be thrown in the bin just yet.
You do not need to write programming languages to need parsers and lexers. My last company was Prisma (https://prisma.io) where we had our own schema definition language, which needed a parser. The first implementation was nested structures and reference counting, which was very buggy and hard to fix. We rewrote it with the index/walker strategy described in my previous comment and got a significant speed boost and the whole codebase became much more stable.
The company I'm working for now is called Grafbase (https://grafbase.com). We aim to be the fastest GraphQL federation platform, which we are in many cases already due to the same design principles. We need to be able to parse GraphQL schemas, and one of our devs wrote a pretty fast library for that (also uses Logos):
https://crates.io/crates/cynic-parser
And we also need to parse and plan the operation for every request. Here, again, the ID-based model works miracles. It's fast and easy to work with.
> I really like the sound of this, it wasn't necessarily confusing to work with Rc and Weak but more I had to put in a lot of extra thought up front (which is also valuable don't get me wrong).
These are suddenly _very annoying_ to work with. If you come from the `Weak` side to a model, you need to upgrade it first (and unwrap), which makes passing references either hard or impossible depending on what you want to do. It's also not great for CPU caches if your data is too nested. Keep everything flat and sorted. In the beginning it's a bit more work and thinking, but it scales much better when your project grows.
> Thanks! Maybe then the Rust code I have so far won't be thrown in the bin just yet.
You're already on the right path if you're interested in Ocaml. Keep going.
> If you come from the `Weak` side to a model, you need to upgrade it first (and unwrap), which makes passing references either hard or impossible depending on what you want to do.
You're literally describing my variable environment, eventually I just said fuggit and added a bunch of unsafe code to the core of it just to move past these issues.
Yeah, that's the focus of it, and the thing Rust is well suited for.
All the popular Rust parsing libraries aren't even focused on what most people mean by "parser". They can't support language parsing at all, but you only discover that after you've spent weeks fighting with the type system to get to the real PL problems.
Rust itself is parsed by a set of specialized libraries that won't generalize to other languages. Everything else is aimed at parsing data structures.
F# or even the latest version of C# are what I would recommend. Yes Microsoft are involved but if you're going to live in a world where you won't touch anything created by evil corporations then you're going to have a hard time. Java, Golang, Python, TypeScript/Javascript and Swift all suffer from this. That leaves you with very little choice.
I'd be interested in hearing your thoughts over OCaml after a year or so of using it. The Haskell-likes are very interesting but Haskell itself has a poor learning curve / benefit ratio for me (Rust is similar there actually; I mastered C# and made heavy use of the type system but that involved going very very deep into some holes and I don't have the time to do that with Rust).
Have I missed something GvR or his team did?
But, I can always just write Rust and be happy where I am. Or, to be honest, would not be very unhappy with F#, Haskell or Ocaml either.
What do you mean, exactly?
Anecdotally I would say where a lot of companies would have used Java in the past they are now turning to go for their server-side/backend service implementations.
I only advocate for it on the scenarios where a garbage collected C is more than enough, regardless of the anti-GC naysayers, e.g. see TamaGo Unikernel.
The term you are looking for is sum types (albeit in a gimped form in the case of Pascal). Enumerations refer to the value applied to the type, quite literally, and are the same in Pascal as in every other language with enumerations, including Go. There is only so much you can do with what is little more than a counter.
Pascal doesn't require case matching of enumerations to be exhaustive, but this can be turned on as a compiler warning in modern Pascal environments, FreePascal / Lazarus and such.
Go only has enums by convention, hence the "iota dance" referred to. I've argued before that this does qualify as "having enums" but just barely.
It wouldn't have been difficult to do a much better job of it, is the thing.
Normally in Pascal you would not match on the enumeration at all, but rather on the sum type.
    type
      Foo = (Bar, Baz);

    case foo of
      Bar: ...;
      Baz: ...;
    end;

The only reason for enumerations at all in Pascal (and other languages with similar constructs) is because under the hood the computer needs a binary representation to identify the type, and an incrementing number (an enum) is a convenient identifier.

Thus, yes, you can go around the type system and get an enumeration out with Ord(foo), but at that point it's just an integer and your chance at exhaustive matching is out the window. It is the type system that allows more flexibility in what the compiler can tell you.
Nothing wrong with that, but it will probably never work for me. Newer versions of Java are much more enjoyable to work with versus Go.
I am one of those. I grok abstractions just fine (have commercially written idiomatically obtuse Scala and C#, some Haskell for fun, etc.), but I don't enjoy them.
I use them, of course (writing everything in raw asm is unproductive for most tasks), but rather than getting that warm fuzzy feeling most programmers seem to get when they finish writing a fancy clever abstraction and it works on the first try, I get it when I look at a piece of code I've written and realize there is nothing extraneous to take away, that it is efficient and readable in the sense of being explicit and clear, rather than hiding all the complexity away in order to look pretty or maximize more abstract concerns (reusability, DRY, etc.).
This mindset is a very good fit for writing compute-heavy numerical code, GPU stuff and lots of systems level code, not so much for being a cog in a large team on enterprise web backends, so I mostly write numerical code for physics simulations. You can write many other things this way and get very fast and bloatfree websites or anything else, but it doesn't work well in large teams or people using "industry best practices". It also makes me prefer C to Rust.
"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."
- Antoine de Saint-Exupery
https://www.brainyquote.com/quotes/antoine_de_saintexupery_1...
The vast majority understand abstractions just fine, though each takes time to understand. However most people like their own abstractions best, and those of other people less. To me hell is living in a world of bad abstractions created by someone else.
Every abstraction created adds to cognitive load when reading the code and to the maintenance burden of that code. So you have an abstraction budget, which is usually in overspent IME and needs to be carefully controlled. Most of the most horrible codebases are horrible because they have too many of the wrong sort of abstraction.
Personally, I don't want to write any new code in something that doesn't have ADTs, or the moral equivalent (Java's sealed classes). I've already written a lifetime of code without them, so I suppose part of that is not wanting to write another 20 years of the same code. :)
The standard library is unimpressive (to be generous), it has plenty of footguns like C but none of its flexibility.
Also, for some reason parentheses AND \n are required. So you get the worst of C and Python there.
I really like this about go - that it formats code for you, and miss it in other languages where we have linters but not formatters, which is a terrible idea IMO.
Coming from Python, this is one of the major things that I just can't get past with golang (despite having to use it for work). The standard library has a lot of really interesting/impressive/useful things to cover niche cases, but is missing a lot of what I would consider basic functionality that I keep running into requiring me to go get an external module to solve the problem.
Then, on top of that, the documentation for external modules is extremely terrible. In many cases the best you can get is API documentation in the form of "these are the functions, this is what they take and return" with no explanation of what those values need to be, what the function does with them, and so on; a simple list of functions. In others, there is that plus example code which doesn't work because it hasn't been updated since the last time backwards-incompatible changes were made so you end up down a rabbit hole of trying to debug someone else's wrong code.
The only thing letting me write effective golang at this point is that VSCode can autocomplete a lot of method calls, API calls, and so on, and then tell me what parameters they need, but even then I'm just guessing about what function might exist and what it might be called.
The language itself is okay and the more I use it the more I understand why they implemented all the stuff I hate (like a lack of proper error handling leading to half of my lines of code being boilerplate `if err != nil` blocks), but if the tooling around it wasn't so good no one would take it remotely seriously.
Embarrassing that developers are still forgetting nil pointer checks in 2024.
That said, I discovered that Go has the ability to basically encapsulate one error inside of another with a message; for example, if you get an err because your HTTP call returned a 404, you can pass that up and say "Unable to contact login server: <404 error here>". But then the caller takes that error and says "Could not authenticate user: <login error here>", and _their_ caller returns "Could not complete upload: <authentication error here>" and you end up with a four-line string of breadcrumbs that is ostensibly useful but not very readable.
Python's `raise from` produces a much more readable output since it amounts to much more than just a bunch of strings that force you to follow the stack yourself to figure out where the error was.
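For concreteness, the wrapping being described is fmt.Errorf with the %w verb plus errors.Is/errors.Unwrap; a minimal sketch (the 404/login/upload names just mirror the example above):

    package main

    import (
        "errors"
        "fmt"
    )

    var errNotFound = errors.New("404 not found")

    func login() error {
        // Each layer wraps the lower-level error with its own context.
        return fmt.Errorf("unable to contact login server: %w", errNotFound)
    }

    func upload() error {
        if err := login(); err != nil {
            return fmt.Errorf("could not authenticate user: %w", err)
        }
        return nil
    }

    func main() {
        err := upload()
        // Prints the breadcrumb string described above.
        fmt.Println(err)
        // The chain is still inspectable programmatically, without string matching.
        fmt.Println(errors.Is(err, errNotFound)) // true
    }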
Is there a term equivalent to "armchair quarterback" in programming? Most programmers are already in armchairs.
It's the equivalent of yelling at the TV that the ultra-successful mega-athlete sucks. I can't imagine the thought process that goes into thinking Ken Thompson, Rob Pike and Robert Griesemers are complete idiots that have no clue of what they were doing.
They made a deliberate decision to design a language that did not take many developments in PL design since the 70's into account.
They had their reasons, which make sense in the context of their employer and their backgrounds.
Many people, myself included, prefer to program with languages that do not focus so much on simplicity
All they knew was C and that they wanted to create a language that compiles faster than C++. That's all.
I've really tried giving Go a go, but it's truly the only language I find physically revolting. For a language that's supposed to be easy to learn, it made sooooo many weird decisions, seemingly just to be quirky. Every single other C-ish language declares types either as "String thing" or "thing: String". For no sane reason at all, Go went with "thing String". etc. etc.
I GENUINELY believe that 80% of Gos success has nothing to do with the language itself, and everything to do with it being bankrolled and used by a huge company like Google.
It's a very simple and straightforward language, which I think is why people like it, but it's just a pain to use. It feels like it fights any attempt at using it to do things optimally or quickly.
I will say that the error propagation is a pain a lot of the time, but I can appreciate being forced to handle errors everywhere they pop up, explicitly.
Java -> Python -> C++ -> Rust -> Go
I have to say, given this progression going to Rust from C++ was wonderful, and going to Go from Rust was disappointing. I run into serious language issues almost daily. The one I ran into yesterday was that defer's function arguments are evaluated immediately (even if the underlying type is a reference!).
https://go.dev/play/p/zEQ77TIP8Iy
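A minimal reproduction of that defer behavior (not necessarily the code behind the playground link):

    package main

    import "fmt"

    func main() {
        x := 1
        // The argument is evaluated right here, when the defer statement runs,
        // not later when the deferred call executes at function exit.
        defer fmt.Println("deferred call sees:", x)
        x = 2
        fmt.Println("end of main sees:", x)
        // Output:
        //   end of main sees: 2
        //   deferred call sees: 1
    }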
Perhaps with a progression Java -> Go -> Rust moving to rust could feel slow and painful.
Turbo Pascal for education, C as the professional lingua franca in the mid-90s (manual memory management). C++ was all the rage in the late 90s (OOP, STL). Java got hot around 2003 (GC, canonical concurrency library and memory model). Scala grew in popularity around 2010-2012 (FP for the masses, much less verbosity, mainstream ADTs and pattern matching). Kotlin was cobbled together later to have the Scala syntactic sugar without the Haskell-on-the-JVM complexity.
And then they came up with golang which completely broke with any intellectual tradition and went back to before the Java heyday.
Rust feels like a Scala with pointers so the "C++ => Rust" transition looks analogous to the "Java => Scala" one.
they are all actively in-use.. if gp is earlier in their career, it could all be in last 10 years.
I remember that famous rant about how Go’s stdlib file api assumes Unix, and doesn’t handle Windows very well.
If you are against “worse is better” like the author, that’s a show stopping design flaw.
If you are for it, you would slap a windows if statement and add a unit test when your product crosses that bridge.
That's the point. It's a rejection of the keyboard jockeys who become more concerned with the code itself than the problem being solved.
"The key point here is our programmers are Googlers, they’re not researchers. They’re typically, fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt." - Rob Pike
I suppose in a sense this is rejecting the "keyboard jockeys", but probably not in the way you mean.
You cannot separate the tool used to solve a problem from the problem itself. The choice of tool is a critical and central consideration.
Really I think it's more useful to view it as a better C in the less is more tradition, compared to say C++ and Java, which at the time were pretty horrible. That's my understanding of its origin. It makes sense in that context; it doesn't have pretensions to be a super advanced state of the art language, but more of a workhorse, which as Pike pointed out here could be useful for onboarding junior programmers.
Certain things about it I think have proven really quite useful and I wish other languages would adopt them:
* It's easy to read precisely because the base language is so boring

* Programs almost never break on upgrade - this is wonderful

* Fewer dependencies, not more

* Formatters for code
Lots of little things (struct tags for example) I'm not so keen on but I think it's pretty successful in meeting its rather modest goals.
But Go is nothing at all like C, and it's completely unsuitable for most of the situations where C is used. I'm having trouble even imagining what you're getting at with this comparison. The largest areas of overlap I can think of are "vaguely similar syntax style" and "equally bad and outdated type system". Pretty much everything else of substance is different. Go is GC'd, Go has a runtime, etc.
As ugly and ad-hoc as the language feels, it’s hard to deny that what a lot of people want is just good built-in tooling.
I was going to say that maybe the initial lack of generics helped keep compile times low for go, but OCaml manages to have good compile times and generics, so maybe that depends on the implementation of generics (would love to hear from someone with a better understanding of this).
Rust is designed with the philosophy of zero-cost abstractions. (I don’t like the name, because the cost is never zero, but it is what it is.) The abstractions usually involve a lot of function calls and you need a compiler with aggressive inlining in order to get reasonable performance out of Rust. Usage of generics still results in the same non-virtual calls which can be inlined. But the compiler then has to do a lot of work to evaluate inlining for every instantiation of every generic.
Go is designed with the philosophy of simple abstractions, which may come with a cost. Generics are implemented in a way that means you are still doing a lot of dynamic dispatch. If you need speed in Go, you should be writing the monomorphic code yourself. Generics don’t get instantiated for every single type you use them with. They only get instantiated for every “shape” of type.
So when the generated asm is the same between the abstraction and the non-abstraction version, where's the cost?
A Rust project's cognitive cost budget comes out of what's left over after the language is done spending. This is true of any language, but many language designers do not discount cognitive costs to zero, which, with the "zero cost abstraction" slogan, Rust explicitly does.
Looking at Primeys' comment he actually gave some really interesting suggestions on how to manage this without needing Rc / weak pointers or copying loads of dynamic memory all over the place. Instead you have a flat structure of copy-able elements, giving you better cache locality and a really easy way to work with them.
The more your stuff is held in things that are cheap/free to clone, the less you have to fight the borrow checker… since you can get away with clones!
And for actual interpretation there’s these libraries that can help a lot for memory management with arenas and the like. It’s super specialized stuff but helps to give you perf + usability. Projects like Ruffle use this heavily and it’s nice when you figure out the patterns
Having said that, OCaml and Haskell are both languages that will do all of this "for free" with their built-in garbage collection… I just like the idea of going very fast in Rust.
But, it does still run on .NET.
At this point, isn't every major language controlled by one main corporate entity?
Except Python? But Python doesn't have algebraic types, or very complete pattern matching.
My effort has been in adding these features to a front end language that transpiles to an underlying FP language, including but not limited to Rust.
Could you explain your thought process when deciding to not use F# because it runs on top of .NET? (both of which are open-source, and .NET is what makes F# fast and usable in almost every domain)
I agree on F#. It changed my C && OO perspective in fantastic ways, but I too can't support anything Microsoft anymore.
But, seeing as OCaml was the basis for F#, I have a question, though:
Does OCaml allow the use of specifically sized integer types?
I seem to remember in my various explorations that OCaml just has a kind of "number" type. If I want a floating point variable, I want a specific 32- or 64- or 128-bit version; same with my ints. I did very much like F# having the ability to specify the size and signedness of my int vars.
Thanks in advance, OCaml folks.
Other options have worse support and weaker tooling, and often not even more open development process (e.g. you can see and contribute to ongoing F# work on Github).
This tired opinion ".net bad because microsoft bad" has zero practical relevance to actually using C# itself and even more so F# and it honestly needs to die out because it borders on mental illness. You can hate microsoft products, I do so too, and still judge a particular piece of techology and the people that work on it on their merits.
First off, the only way to express union types is with runtime reflection. You might as well be coding in Python (but without the convenient syntax sugar).
Second off, "if err != nil" is really terrible in parsers. I'm actually somewhat of a defender of Go's error handling approach in servers. Sure, it could have used a more convenient syntax. But in servers, I almost never return an error without handling it or adding additional context. The same isn't true in parsers, though. Almost half of my parser code was error checks that simply wouldn't exist in other languages.
For Rust, I think the value proposition is if you are also writing a virtual machine or an interpreter, your compiler front end can be written in the same language as your backend. Your other alternatives are C and C++, but then you don’t have sum types. You could write the front end in Ocaml, but then you would have to write the backend and runtime in some other language anyways.
The railroad diagrams are tremendously useful:
https://www.sqlite.org/syntaxdiagrams.html
I don't think the lemon parser generator gets enough credit:
https://sqlite.org/src/doc/trunk/doc/lemon.html
With respect to the choice of language, any language with algebraic data types would work great. Even TypeScript would be great for this.
FWIW I wrote a small introduction to writing parsers by hand in Rust a while ago:
https://www.nhatcher.com/post/a-rustic-invitation-to-parsing...
I can take my parser combinator library that I use for high-level compiler parsers, and use that same library in a no-std setting and compile it to a micro-controller, and deploy that as a high-performance protocol parser in an embedded environment. Exact same library! Just with fewer String and more &'static str.
So toying around with compilers translates my skill-set rather well into doing embedded protocol parsers.
Me: "How can a programming language be so damn complex? Am I just dumb?"
I do still like declarative parsing over imperative, so I wrote https://docs.rs/inpt on top of the regex crate. But Andrew Gallant gets all the credit, the regex crate is overpowered.