In this case the JEP is scoped to just the hotspot runtime. Other implementations are free to represent objects however they want.
https://docs.oracle.com/javase/specs/jvms/se23/html/jvms-2.h...
It would have been better for Object to have a toDebugString method, and to restrict implicit string conversion (concatenation) to classes implementing a StringConvertible interface with a corresponding separate toString method.
The approach I'm most a fan of is functional languages where everything has a fixed canonical string representation (even cooler when you can convert the string directly back to code), and everything else you must explicitly create a function for.
This is a basic feature of inheritance in an object oriented language, you can take an interface that guarantees "this returns some string" and offer a more concrete guarantee "this returns the objects value as decimal" in the implementation.
> and to restrict implicit string conversion (concatenation) to classes implementing a StringConvertible interface with a corresponding separate toString method.
So anyone wanting to make their code trivially loggable now has to implement StringConvertible by copy pasting String toString(){ return toDebugString(); } into every single class they are implementing? You managed to make Java more verbose for no gain at all, please collect your AbstractAwardInstanceFactoryBuilder on your way out.
If you actually want to output a debug representation, you’ll explicitly call toDebugString(). (And a debugger would call it by default.) This would also make the purpose explicit in the code. And you would’t accidentally output a random debug representation (like the default "@xxxxxxxx") as part of regular string concatenation/interpolation, like on a user-level message, or as a JSON value or whatever. This is why it would be wrong to have a toString() forward to toDebugString().
Currently, for most classes I have to add javadoc for the toString() method saying something like: “Returns a debug representation of this object. WARNING: The exact representation may change in future versions, do not rely on it.” For some of these classes a reliable non-debug string representation would conceivably make sense, but I chose not to have one because there is no immediate need. However, callers need to know which it is, and therefore the documentation is needed.
Conversely, whenever I want to use the toString() of a third-party class, I have to check what kind of output it generates, but unfortunately it’s often not documented. And if testing it (or looking at the source) seems to produce a stable value representation, one has no choice but to hope that it will remain stable, instead of that being part of the contract.
Furthermore, for classes with a value representation it often makes sense to have a different debug representation (for example, with safely escaped control characters or additional meta-data). In current Java, it’s safer to have those in a different, non-standard method than toString() (because users expect the latter to provide the value representation), but then there’s the inconvenience that the debug representation won’t be picked up by debuggers by default, due to the non-standard method.
This is all a symptom of the same method being used for different purposes. And a debug representation makes always sense (as evidenced by the default implementation), while a value representation only sometimes makes sense, and might be absent even when it would make sense. But you generally can’t tell from the method.
Having different methods would solve those issues. With a toDebugString() method, one wouldn’t have to document anything, because the javadoc I paraphrased above would already be contained in the Object class. And the toString() method would only be present for classes that do provide a defined value representation that makes sense on the business/domain level of the class.
But a debugger is far from the only time you want to output a debug representation. A properly formatted log message is the most common case I deal with and one where you can generally use String manipulation freely without fear of breaking anything important.
> This would also make the purpose explicit in the code.
So you made the debug output explicit and verbose, but left toString() a mess that according to you will at least be used for "user-level" messages, JSON and "whatever", which are completely distinct use cases that have nothing in common other than that they output a string as a result. Worse, trying random string operations on a JSON output can break the JSON, so your list of example use cases for implicit string conversion already favors brittle stringly typed code.
> (for example, with safely escaped control characters or additional meta-data).
That assumes the class knows exactly how it is going to be displayed. Which only works if you have a well defined debugger interface for that. At which point you are probably better of dropping the stringly typed code and provide the debugger with a type safe interface for its output, especially since you mention "meta-data".
> This is all a symptom of the same method being used for different purposes.
And your "solution" does nothing to fix that. You might be better of killing toDebugString and toString entirely instead of turning one not so great API into two completely arbitrary and just as bad APIs and force people to roll their own. At which point people trying to output an object for anything will have to chain dozens of instanceof checks to see which kind of String representation the object supports, making the language worse to use.
Equality is computed (and the default is trivial, it just compares addresses), hashcode is computed (and the default is trivial, it just returns the object's address), string conversion is computed (and the default is trivial, it just prints the class name and the hashcode IIRC).
This was trivial a long time ago. Now, all Java GCs move objects in memory.
Also, would an optional "Lockable" class interface be a good way to drastically reduce the classes (and objects) that need to maintain lock information?
I am religiously averse to unused but implemented "Positive-Cost Abstractions". I dream of a root class with no methods except "new", and instance methods delete() and isSelf(x).
Even "Polymorphism" could be an opt-in interface. Subclasses of non-polymorphic classes would inherit functionality (yay, reusability), but not be a subtype of their root class.
(With commutativity between subclassing & the polymorphic interface. I.e. If A is a non-polymorphic class, and Ap is a polymorphic version of class A, then all subclasses of Ap and all polymorphic subclasses of A, would be subtypes of Ap.)
Since System.identityHashCode() can be invoked (for example by IdentityHashMap) even for objects of classes that implement a custom hash code, it also can’t be optimized away even for such classes.
Conceivably it could be optimized away for an object unless or until System.identityHashCode() is invoked for it. It could thus be allocated on demand similarly to how the object locks are. Of course, this has all kinds of performance trade-offs.
I've half expected the Java/JVM team to change Object to extend a new "NakedObject", and implementing new interfaces Equalable, Hashable, Finalizable, and Waitable. (Current interface Clonable is a goof, so maybe deprecate it and replace with Copyable.)
Then "NakedObject" would only need getClass method, right?
Then values and records could also extend NakedObject, right?
Then equals and clone/copy could be generic, right?
--
Alternately, maybe prevent the gotchas with missing equals, hashCode, and toString by having the runtime autogenerate something reasonable.
There's no actual gotcha to them not existing. It works perfectly fine in haskell or rust for instance.
Although it's not a fundamentally useful change to make objects for which a sensible equals/hashcode is trivial not have it, and still have it for objects for which it's not. So without the ability to reach back and remove those properties being universal I fail to see what the point would be.
For example, you couldn’t add a NakedObject-but-not-Object to a java.util.List, because what Object would List::get(index) then return for it? (Note that the List’s type parameter doesn’t exist at runtime and may also not exist at compile time.)
IIRC project valhalla includes the latter, or it did at one point.
Really? I thought that PyObject_HEAD only contains two machine words.
- the base requirement is a class pointer and a refcount, that's PyObject_HEAD (and PyObject)
- then unless you have disabled this at compilation time, they have two pointers for the cycle-breaking part of the GC
- a dict pointer (or as many instance value pointers as there are object members when using slots)
- and a weakrefs pointer (except for slotted classes, unless you added it back)
That is however only 6 words (48 bytes), or 4 (32) for slotted classes with no members or weakrefs.
I believe the header is larger when running without GIL on 3.13 because PyObject has 2 more words (a local refcount, a gc bitset, and a mutex packed in a word with some padding, and a tid pointer).
Still nowhere near 308 though, I've no idea where they got that. Maybe whoever wrote that article included the instance dict in their calculations? That would add 200~300 bytes. Or maybe they got mixed up between bits and bytes, calculated 308 bits somehow then wrote that up as bytes.
Comparing the class count of some of today’s Java projects (including dependencies) to two decades ago, I wonder if we won’t risk hitting that limit in another two decades or so, and then revert back to the bigger header size again. ;).
You are forgetting one important use case: defining classes dynamically. You have to count not only every class (including inner classes) of every dependency, but also all classes created at runtime through direct bytecode manipulation.
The other thought I had is that if AI-generated code takes off, this could explode class count. On the other hand, AI could then also be instructed to refactor to minimal class count.