Homa, a transport protocol to replace TCP for low-latency RPC in data centers https://news.ycombinator.com/item?id=28204808
Linux implementation of Homa https://news.ycombinator.com/item?id=28440542
https://people.csail.mit.edu/alizadeh/papers/homa-sigcomm18....
It's easy to imagine how, in the hands of tech journalists and youtubers optimising for clicks, "Google likes QUIC" and "Some ML clusters use infiniband" could get distorted into several faangs and the complete elimination of TCP.
I think the statement is not true, in a literal sense. I do not think there are multiple FAANG companies with data centres where TCP has been entirely eliminated.
But the statement is ambiguous enough you could come up with true interpretations, if you diluted it to the point of meaninglessness.
Therefore, without a clear public statement of what is being claimed, it's very difficult for me to be impressed.
Yep. Sure; but, what happens when it becomes overloaded?
> Homa manages congestion from the receiver, not the sender. [...] but the remaining scheduled packets may only be sent in response to grants from the receiver
I hypothesize it will not be a great day when you do become "systemically" overloaded.
Ehrm. Looks like core saturation all over again.
Edit: Just popped to my mind. What prevents maliciously reducing "receive quotas" on compromised receivers to saturate an otherwise capable core? Looks like it's a very low bar for a very high impact DOS attack. Ouch.
With all the bandwidth, processing power and free cooling, a server is always a good target, and sometimes people will come equipped with 0-days or exploits which are very, very fresh.
Been there, seen and cleaned that mess. I mean, reinstallation is 10 minutes, but the event is always ugly.
I'm not a datacenter expert, but is "not enough ports" really a showstopper? It just sounds like bad planning to me. And still not a protocol issue.
A datacenter has layers of networking, and some of it is what we call "the core" or "the spine" of the network. Sometimes you need to shuffle things around, and you need to move things closer to core, and you can't because there are no ports. Or you procure the machines, and adding new machines requires some ports closer or at the core, and you are already running at full capacity there.
I mean, it can be dismissed like "planning/skill issue", but these switches are fat machines in terms of bandwidth, and they do not come cheap, or you can't get them during lunch break from the IT shop at the corner when required.
Being able to carry 1.2Tbit/sec of aggregated network from a dozen thin fibers is exciting, but scaling expensive equipment at a whim is not.
At the end of the day "network core" is more monolithic than your outer network layers and server layout. It's more viscous and evolves slowly. This inevitably corners you in some cases, but with careful planning you can postpone that, but not prevent.
I mean, I don't actually see why you would need more ports at all. You still just have a certain number of machines that want to exchange a certain amount of traffic. That number is either above or below what your core can handle (guessing, again, at what the author means by "systemically overloaded").
RPC is something of a red flag as well. RPCs will never behave like local procedure calls, so the abstraction will always leak (the pendulum of popularity keeps swinging back and forth between RPC and special purpose protocols every few years, though).
And that's okay!
But there are other sorts of computer people than website writers and business application devs, and they're some of the people this would be interesting for!
Wrong. I've experienced most of the complaints in the paper when developing multiplayer video games. These days I simply use websockets instead of raw TCP because it is not worth the effort and yet you still have to do manual heartbeats.
But it is often used for block storage in datacenters. Using it for anything else is going to be hard, as it is incompatible with TCP.
The problem with not using TCP is the same thing HOMA will face - anything already speaks TCP, nearly all potential hires know TCP and most problems you have with TCP have been solved by smart engineers already. Hardware is also easily available. Once you drop all those advantages, either your scale or your gains need to be massive to make that investment worth it, which is why TCP replacements are so rare outside of FAANG.
I guess I am old. Everytime I see new tech that wants to be hyped, completely throw out everything that is widely supported and working for 80-90% of uses cases, not battle tested and may be conceptually complex I will simply pass.
I’d love to just play with QUIC a bit because it’s pretty neat, but I always get distracted by this problem and end up reading the RFCs, which so far I haven’t had the patience to get through.
https://en.wikipedia.org/wiki/Unix_domain_socket#:~:text=The....
Puma creates a `UnixServer` which is a ruby stdlib class, using the defaults, which is extending `UnixSocket` which is also using the defaults
https://github.com/puma/puma/blob/fba741b91780224a1db1c45664...
Those defaults are creating a socket of type `SOCK_STREAM`, which is a tcp socket
> SOCK_STREAM will create a stream socket. A stream socket provides a reliable, bidirectional, and connection-oriented communication channel between two processes. Data are carried using the Transmission Control Protocol (TCP).
https://github.com/ruby/ruby/blob/5124f9ac7513eb590c37717337...
You still have the tcp overhead when using a local unix socket with puma, but you do not have any network overhead.
UDS using SOCK_STREAM does not do that; ie, it is not using IPPROTO_TCP.
> SOCK_STREAM will create a stream socket. A stream socket provides a reliable, bidirectional, and connection-oriented communication channel between two processes. Data are carried using the Transmission Control Protocol (TCP).
> SOCK_DGRAM will create a datagram socket.[b] A Datagram socket does not guarantee reliability and is connectionless. As a result, the transmission is faster. Data are carried using the User Datagram Protocol (UDP).
> SOCK_RAW will create an Internet Protocol (IP) datagram socket. A Raw socket skips the TCP/UDP transport layer and sends the packets directly to the network layer.
I don't claim to be an expert, I just have a certain confidence that I'm able to comprehend words I read. It seems you can have 3 types of sockets, raw, udp, or tcp.
TCP (in practice) runs on top of (mostly) routed IP networks and network architectures. E.g. a spine/leaf network with BGP. Fibre Channel as I understand it is mostly used in more or less point to point connections? I do see some mention of "Switched Fabric" but is that very common?
Drop the "pretending to be block devices" part and you basically end up with InfiniBand. It works well, if you ignore the small problem of "you need to either reimplement all your software to work with RDMA based IPC, or reimplement Ethernet on top of InfiniBand to remain compatible and throw away most advantages of InfiniBand again".
Back in ~2012 I was setting up a high-frequency network for a forex company and at the time we deployed Mellanox and they had some very (at the time) bleeding edge networking drivers that significantly reduced the overhead of writing to TCP/IP sockets (particularily zero-copy which TL;DR meant data didn't get shifted around in memory as much and was written to the ethernet card's buffers almost straight away) that made a huge difference.
I eventually left the firm and my successors tried to replace it with cisco gear and Intel NICs and the performance plummeted. That made me laugh as I received so much grief pushing for the Mellanox kit (to be fair, they were a scrappy unheard of Israeli company at the time).
Anything on top of Ethernet, and we no longer know where this host is located (because of software defined networking). Could be next rack server, or could be something in the cloud, could be third party service.
And that's a feature, not a bug: because everything speaks TCP: we can arbitrarily cut and slice network just by changing packet forwarding rules. We can partition network however we want.
We could have a single global IP space shared by cloud, dc, campus networks, or could have Christmas Tree of NATs.
as soon as you introduce something other than TCP to the mix, now you will have gateways: chokepoints where traffic will have to be translated TCP<->Homa and I don't want to be a person troubleshooting a problem at the intersection of TCP and Homa.
in my opinion, the lowest level Ethernet should try its best to mirror the actual physical signal flow. Anything on top becomes software-network network
Ping it and you can at least deduce where it's not.
For block storage it is great, if slower than ethernet.
FC was one of the first non-mainframe specific storage area network fabrics. One of the key differences between FC and ethernet is the collision detection/avoidance and guaranteed throughput. All that extra coordination takes effort on big switches, so it cost more to develop/produce.
You could shovel data through it at "line speed" and know that it'll get there, or you'll get an error at the fabric level. Ethernet happily takes packets, and if it can't deliver, then it'll just drop them. (well, kinda, not all ethernet does that, because ethernet is the english language of layer 2 protocols)
Similar things are happening with stuff like Infiniband, it has become far too expensive and Ethernet/ROCE is making inroads in lower- to medium-end installations. Availability is also an issue, Nvidia is the only Infiniband vendor left.
FC seems to work nicely in a single-vendor stack, or at least among specific sets of big-name vendors. that's OK for the "enterprise" market, where prices are expected to be high, and where some integrator is getting a handsome profit for making sure the vendors match.
besides consumer, the original non-enterprise market was HPC, and we want no vendor lock-in. hyperscale is really just HPC-for-VM-hosting - more or less culturally compatible.
besides these vendor/price/interop reasons, FC has never done a good job of keeping up. 100/200/400/800 Gb is getting to be common, and is FC there?
resolving congestion is not unique to FC. even IB has more, probably better solutions, but these days, datacenter ethernet is pretty powerful.
https://fibrechannel.org/preview-the-new-fibre-channel-speed...
Granted I'm also trying to find a switch that supports ROCm and rdma. Not easy to find a high bandwidth switch that supports this stuff without breaking the bank.
It make sense only if you have an websocket based stack and don't want to maintain a second protocol.
Your second point is very dismissive. You're inserting random application requirements that the vast majority of application developers don't care about and then you claim that only in this situation do WebSockets make sense when in reality the vast majority of developers only use WebSockets and your suggestion involves the second unwanted protocol (e.g. the horror that is protobuffers and gRPC).
In what way?
> This is adding complexity without additional benefit
I'm not sure that's a given. For example, WebSockets actually implement a message protocol. You have gauarantees that you sent or received the whole message. That may not be the case for TCP/IP, which is a byte streaming protocol.
WebSockets are long lived socket connection designed specifically for use on the 'web'. TCP is data sent wrapped in packets that is ordered and guaranteed delivery. This causes a massive overhead cost. This is different from UDP which doesn't guarantee order and delivery. However, a packet sent over UDP might arrive tomorrow after it goes around the world a few times.
With fetch() or XMLHttpRequest, the client has to use energy and time to open a new HTTP connection while a WebSocket opens a long lived connection. When sending lots of bi directional messages it makes sense to have a WebSocket. However, a simple fetch() request is easier to develop. A developer needs to good reason to use the more complicated WebSocket.
Regardless, they both send messages using TCP which ensures the order of packets and guaranteed delivery which features have a lot to do with why TCP is the first choice.
There is UDP which is used by WebRTC which is good for information like voice or video which can have missing and unordered packets occasionally.
If two different processes on the same machine want to communicate, they can use a Unix socket. A Unix socket creates a special file (socket file) in the filesystem, which both processes can use to establish a connection and exchange data directly through the socket, not by reading and writing to the file itself. But the Unix Socket doesn't have to deal with routing data packets.
(ChatGPT says "Overall, you have a solid grasp of the concepts, and your statements are largely accurate with some minor clarifications needed.")
Performance. TCP over TCP is pretty bad.
OpenVPN can do that (tcp-based vpn session over a tcp connection) and the documentation strongly advices against that.
Networking Engineering is already convoluted and troublesome as it is right now, using only tcp stack.
When you start using homa inside, but TCP from outside things will break, because a lot of DC requests are created as a response for an inbound request from outside DC (like a client trying to send RPC request).
I cannot imagine trying to troubleshoot hybrid problems at the intersection of tcp and homa, its gonna be a nightmare.
Plus I don't understand why create a a new L4 transport protocol for a specific L7 application (RPC)? This seems like a suboptimal choice, because RPC of today could be replaced with something completely different, like RDMA over Ethernet for AI workloads or transfer of large streams like training data/AI model state.
I think tuning TCP stack in the kernel, adding more configuration knobs for TCP, switching from stream(tcp) to packet (udp) protocols where it is warranted, will give more incremental benefits.
One major thing author missed is security applications, these are considered table stakes: 1. encryption in transit: handshake/negotiation 2. ability to intercept and do traffic inspection for enterprise security purposes 3. resistance to attacks like flood 4. security of sockets in containerized Linux environment
2. term "external" is really vague as modern networks have blended boundaries. Things like availability zone, region make dc-dc connection irrelevant, because at any point of time you will be required to failover to another AZ/DC/region.
3. when I think of inter-Datacenter, I can only think of Ethernet. That's really it. Even in Ethernet, what you think of a peer and existing in your same subnet, could be a different DC, again due to software-defined network.
I could see the former being an issue (if that's even implied by "inside the data center") and I just don't see how it's a problem for the latter.
There is a lot of work going on in the userland, like filling up TCP buffer, parsing HTTP stream, applying bunch of business logic, creating a downstream connection, sending data, getting response, etc.
This is a lot of work in the userland and because of that a default nginx config is like 1024 concurrent connections per core, so not a lot.
L4 load balance on the other hand works purely in a packet switching mode or NAT mode. So the work consists in just replacing IP header fields (src.ip, src.port, dst.ip, dst.port, proto), it can use various frameworks like intel vectorized packet processing or Intel dpdk for accelerated packet switching.
Because of that, L4 load balancer can work perform very very close to the line rate speed, meaning it can load balance connections as fast as packets arrive to the network interface card. Line rate is the theoretical maximum of packet processing.
In case of stateless L4 load balancing there is no upper bound in number of concurrent sessions to balance, it will almost as fast as core router that feeds the data.
As you can see L4 is clearly superior in performance, but the reason L4 LB is possible is because it has TCP inbound and TCP outbound, so the only work required is replace IP header and recalculate CRC.
With Homa, you would need to fully process TCP stream, before you initiate Homa connection, meaning you will waste a lot of RAM on keeping TCP buffers and rebuilding the stream according to the TCP sequence. Homa will lose all its benefits in the load balancing scenario.
Author pitches only one use case for homa: East-West traffic, but again - these days the software is really agnostic of this East-West direction. What your software thinks is running in the server in next rack, could as well be a server in a different Availability Zone or read replica in different geo region.
And that's the beauty of modern infra: everything is a software, everything is ephemeral, and we don't really care if we running this in a single DC or multiple DCs.
Because of that, I think we will still stick to TCP as a proven protocol that will seamlessly interop when crossing different WAN/LAN/VPN networks
I am not even talking about software defined networks, like SD-WAN where transport&signaling is done by the vendor-specfic underlay network, and overlay network is really just abstraction for users that hides a lot network discovery and network management underneath
Another protocol, something completely new? Good luck with that, i would rather bet on global warming to put us out of our misery (/s)...
(SNA was before my time so I have no idea if Homa is similar or not.)
Though it doesn't really replace TCP, it's just that the predominant requirements have changed (as Ousterhout points out). Bruce Davie has a series of articles on this: https://systemsapproach.substack.com/p/quic-is-not-a-tcp-rep...
Also see Ivan Pepelnjak's commentary (he disagrees with Ousterhout): https://blog.ipspace.net/2023/01/data-center-tcp-replacement...
ECN isn't a necessity unless you need truly lossless network but the rest should get you pretty far as long as you are reasonably careful about communication patterns and blocking ratio of spine/core.
For anything that really needs the lowest possible latency at the cost of all other considerations there is still always Infiniband.
QUIC is more focused on the global web applications, but most datacentres also leverage the web protocols (REST on HTTP 1.1 or HTTP/2, gRPC HTTP/2) for their inter-process communication, just with a a lot more east-west traffic (arguably 10x for every 1x N/S flow). There's also a fair amount app-specific messaging stacks (usually L7 over TCP) like Kafka, NATS or AMQP which have their own L7 facilities for dealing with TCP drawbacks that might benefit from a retrofit like Homa, but it's not clear if it's worth the effort.
They are design approaches for solving similar requirements. Yes, homa deals with other things (makes ECMP load balancing easier) but also has blindspots on datacenter requirements like security: a lot of data centre traffic requires hop by hop TLS for authentication, integrity and privacy, QUIC explicitly focuses on improving latency of TLS handshakes.
And as usual, hardware gets faster, better and cheaper over the next years and suddenly the problem isn't a problem anymore - if it even ever was for the vast majority of applications. We only recently got a new fleet of compute nodes with 100gbit NICs. The previous one only had 10, plus omnipath. We're going ethernet only this time.
I remember when saturating 10gbit/s was a challenge. This time around, reaching line speed with tcp, the server didn't even break a sweat. No jumbo frames, no fiddling with tunables. And that actually was while testing with 4 years old xeon boxes, not even the final hw.
Again, I can see how there are use cases that benefit from even lower latency, but thats a niche compared to all DC business, and I'd assume you might just want rdma in that case, instead of optimizing on top of ethernet or IP.
A massively parallel task? Sounds like something doable with GPGPU.
Don't mess with another IP -> UDP -> something
IPv6 anyone? People must start to understand that "Because this is the way it is" is a valid, actually extremely valid, answer to any question like "Why don't we just switch technology A with technology B?"
Despite all the shortcomings of the old technology, and the advantages of the new one, inertia _is_ a factor, and you must accept that most users will simply even refuse to acknowledge the problem you want to describe.
For you your solution to get any traction, it must deliver value right now, in the current ecosystem. Otherwise, it's doomed to fail by being ignored over and over.
If you don't feel much pain, you can and should stay with your current solution. If it's not broken, or not broken badly enough, don't fix it by radical surgery.
why would inertia be a factor? If I want to support protocol ABC in my data center, then I buy hardware that supports protocol ABC, including the ability to down shift to TCP when data leaves the data center. We aren't talking about the internet at large so there is no need to coordinate support with different organizations with different needs.
Google, could mandate you can't buy a router or firewall that doesn't support IPv6. Then their entire datacenter would be IPv6 internally. The only time to convert to IPv4, would be if the local ISP doesn't support v6.
But yeah if you are a large bloated enterprise like amazon or microsoft that owns large amounts of ipv4 address space and expensive ipv4 routing equipment there is not a ton of value in switching
At any time, the receiver could lose power. Or a burst of interference could disrupt the radio link. Or a backhoe could slice through the cable. Or many other things.
IP merely reflects this physical reality.
The Ultra Ethernet Transport specs aren't public yet so I can only quote the public whitepaper [0]:
"The UEC transport protocol advances beyond the status quo by providing the following:
● An open protocol specification designed from the start to run over IP and Ethernet
● Multipath, packet-spraying delivery that fully utilizes the AI network without causing congestion or head-of-line blocking, eliminating the need for centralized load-balancing algorithms and route controllers
● Incast management mechanisms that control fan-in on the final link to the destination host with minimal drop
● Efficient rate control algorithms that allow the transport to quickly ramp to wire-rate while not causing performance loss for competing flows
● APIs for out-of-order packet delivery with optional in-order completion of messages, maximizing concurrency in the network and application, and minimizing message latency
● Scale for networks of the future, with support for 1,000,000 endpoints
● Performance and optimal network utilization without requiring congestion algorithm parameter tuning specific to the network and workloads
● Designed to achieve wire-rate performance on commodity hardware at 800G, 1.6T and faster Ethernet networks of the future"
You can think of it as the love-child of NDP [2] (including support for packet trimming in Ethernet switches [1]) and something similar to Swift [3] (also see [1]).
I don't know if UET itself will be what wins, but my point is the industry is taking the problems seriously and innovating pretty rapidly right now.
Disclaimer: in a previous life I was the editor of the UEC Congestion Control spec.
[0] https://ultraethernet.org/wp-content/uploads/sites/20/2023/1...
[1] https://ultraethernet.org/ultra-ethernet-specification-updat...
[2] https://ccronline.sigcomm.org/wp-content/uploads/2019/10/acm...
[3] https://research.google/pubs/swift-delay-is-simple-and-effec...
In theory for shortlived TCP connections, fastopen ought to be a win. It's very easy to implement in Linux (just a couple of lines of code in each client & server, and a sysctl knob). And the main concern about fastopen is middleboxes, but in a data center you can control what middleboxes are used.
In practice I found in my testing that it caused strange issues, especially where one side was using older Linux kernels. The issues included not being able to make a TCP connection, and hangs. And when I got it working and benchmarked it, I didn't notice any performance difference at all.
But the new law doesn't simply negate the assertion. It comes back with: "Or else, what?"
If this somehow catches on, I recommend the moniker "Valor's Law".
HOMA is great, but not good enough to justify a wholesale ripout of TCP in the "datacentre"
Sure a lot of traffic is message oriented, but TCP is just a medium to transport those messages. Moreover its trivial to do external requests with TCP because its supported. There is not a need to have HOMA terminators at the edge of each datacentre to make sure that external RPC can be done.
The author assumes that the main bottleneck to performance is TCP in a datacentre. Thats just not the case, in my datacentre, the main bottleneck is that 100gigs point to point isnt enough.
Story time: I worked on Google Fiber years ago. One of the things I worked on was on services to support the TV product. Now if you know anything about video delivery over IP you know you have lots of choices. There are also layers like the protocls, the container format and the transport protocol. The TV product, for whatever reason, used a transport protocol called MPEG2-TS (Transport Streams).
What is that? It's a CBR (constant bit rate) protocol that stuffs 7 188 byte payloads into a single UDP packet that was (IPv4) multicast. Why 7? Well because 7 payloads (plus headers) was under 1500 bytes and you start to run into problems with any IP network once you have larger packets than that (ie an MTU of 1500 or 1536 is pretty standard). This is a big issue with high bandwidth NICs such that you have things like Jumbo frames to increase throughput and decrease CPU overhead but support is sketchy on a hetergenous network.
Why 188 byte payloads? For compatibility with Asynchronous Transfer Mode ("ATM"), a long-dead fixed-packet size protocol (53 byte packets including 48 bytes of payload IIRC; I'm not sure how you get from 48 to 188 because 4x48=192) designed for fiber networks. I kind of thought of it was Fiber Channel 2.0. I'm not sure that's correct however.
But my point is that this was an entirely owned and operated Google network and it still had 20-30+ year old decisions impacting its architecture.
Back to Homa, three thoughts:
1. Focusing on at-least once delivery instead of at-most once delivery seems like a good goal. It allows you to send the same packet twice. Plus you're worried about data offset, not ACKing each specific packet;
2. Priority never seems to work out. Like this has been tried. IP has an urgent bit. You have QoS on even consumer routers. If you're saying it's fine to discard a packet then what happens to that data if the receiver is still expecting it? It's well-intentioned but I suspect it just won't work in practice, like it never has previously;
3. Lack of connections also means lack of a standard model for encryption (ie SSL). Yes, encryption still matters inside a data center on purely internal connections;
4. QUIC (HTTP3) has become the de-facto standard for this sort of thing, although it's largely implementing your own connections in userspace over UDP; and
5. A ton of hardware has been built to optimize TCP and offload as much as possible from the CPU (eg checksumming packets). You see this effect with QUIC. It has significantly higher CPU overhad per payload byte than TCP does. Now maybe it'll catch up over time. It may also change as QUIC gets migrated into the Linux kernel (which is an ongoing project) and other OSs.
By the time all of the proverbial planets align, all but the most niche or cutting edge customer is looking at a project the total cost of which could fund 400G endpoint bandwidth with the associated backhaul and infrastructure to support it - twice over. It’s the problem of diminishing returns against the problem of entrenchment: nobody is saying modern TCP is great for the kinds of datacenter workloads we’re building today, but the cost of solving those problems is prohibitively expensive for all but the most entrenched public cloud providers out there, and they’re not likely to “share their work” as it were. Even if they do (e.g., Google with QUIC), the broad vibe I get is that folks aren’t likely to trust those offerings as lacking in ulterior motives.
TCP isn’t the most efficient protocol, sure, but it survives and thrives because of its flexibility, cost, and broad adoption. For everything else, there’s undoubtedly something that already exists to solve your specific gripe about it.
https://www.allthingsdistributed.com/2024/08/continuous-rein...
It's pretty unfortunate that we've landed here. Hordes of venture-backed companies building shareware-like software with an "open source" label has done some severe damage.
Many of these came from companies who created the protocol solely to push products, which meant the protocols themselves had to compete outside of the vacuum chamber of software alone and instead operate in real world scenarios and product lines. This also meant that as we engineers and SysAdmins deployed them in our enterprises, we quite literally voted with our wallets where able on the gear and protocols that met our needs. Unsurprisingly, TCP/IP won out for general use because of its low cost of deployment and ongoing support compared to alternatives, and that point is lost on the modern engineer that’s just looking at this stuff as “paper problems”.
Good times. But it didn't matter because ATM was the future. /s
Many of these came from companies who created the protocol solely to push products
Like 100Base-VG? That was a good laugh.
TCP/IP won out for general use because of its low cost of deployment and ongoing support compared to alternatives, and that point is lost on the modern engineer that’s just looking at this stuff as “paper problems”
Welcome to my world...
Joking aside, I've seen this first thang when using things like ethernet/tcp to transfer huge amounts of data in hardware. The final encapsulation of the data is simple, but there are so many layers on top, and it adds huge overhead. Then stanrds have modes, and even if you use a subset the hardware must usually support all to be compliant, adding much most cost in hardware. A clean room design could save a lot of hardware power and area, but the loss of compatibility and interop would cost later in software.. hard problem to solve for sure.
See the rfc here: https://www.rfc-editor.org/rfc/rfc8257
The DCTCP seems like its not a silver bullet to the issue, but it does seem to address some of the pitfalls of TCP in a HPC or data center environment. Iv even spun up some vm's and used some old hardware to play with it to see how smooth it is and what hurdles might exist, but that was so long ago and stuff has undoubtedly changed.
What's the advantage of making it a kernel level protocol?