• cerebra 3 days ago |
    This is really interesting and a challenge for folks who love to solve puzzles like this. Can't wait to see what folks are able to uncover.

    I wonder if any of the techniques used on other similarly decoded scrolls can work here.

  • prashp 3 days ago |
    Glad to see there are more developments in the Vesuvius challenge - it has been one of the most interesting things I've discovered on HN.
    • ACS_Solver 2 days ago |
      It's definitely one of the most fascinating things I've seen here.

      The challenge is an amazing example of what today's technology can do. We have rolled up papyrus scrolls that were carbonized two thousand years ago. They look just like bits of charcoal and they're so fragile that any attempt to physically unroll them would be quite destructive.

      Yet our technology allows us to read their contents. That's fascinating, and gets even better considering how many of those scrolls are likely new texts to us.

  • NKosmatos 3 days ago |
    Being Greek and having bad handwriting, I’m sure I could help with deciphering some letters. Heck, I can even see a capital lambda “Λ” in the low-res image :-) This would be a good case for a crowdsourced science project with Greek universities/high schools.
    • rrr_oh_man 3 days ago |
      IIRC the challenge is not so much recognizing the letters as calibrating all the systems to flesh them out in the first place.

      The scrolls are stuck in their rolled state, very fragile, and, well, paper-thin.

  • brabel 3 days ago |
    They say they have Python and C APIs that can be used to explore the scroll. I had a look and they have a "tutorial" in a Python notebook: https://colab.research.google.com/github/ScrollPrize/vesuviu...

    But I can't make any sense of that, unfortunately :( can someone perhaps explain in terms a programmer would understand, how would I go about using this API to find the text? As far as I can see the dataset just contains a bunch of vertical and horizontal slices of the scroll and I have a hard time understanding how that can provide anything about what's written in them.

    • Doxin 3 days ago |
      This scroll has never been unrolled and never can be. It's too fragile for that. The only thing we DO have is a high-resolution scan. The trick, then, is how to "unroll" the scan. Furthermore, the scroll is heavily damaged: it has carbonized entirely. This means that even _when_ unrolled, it'd be very hard to read. Some progress has been made reading bits and pieces of the scroll using only the scan, but it's tough going.

      Long story short: there's no API for the text, because the text is as of yet unknown. Those slices are all we have to go on.

      • brabel 2 days ago |
        I think you misunderstood my question.

        > because the text is as of yet unknown. Those slices are all we have to go on.

        Yes, which is why I asked: "how would I go about using this API to find the text?"

        That seems to be the crux of the challenge? Given a bunch of scroll cross-sections, find the text? My question is just how you would go about doing that! What techniques are used, algorithms etc.

        • Doxin 2 days ago |
          I mean, that's largely an unanswered question. Apparently there's currently a semi-automated algorithm that works but needs a LOT of supervision, making it much too expensive to use to "unroll" all the scrolls.

          As to how you would even go about using the slices, keep in mind that a stack of slices gives you a volumetric model. The scan basically provides a high resolution volumetric model of the density of the scrolls. You'd need to somehow use this density information to trace a path along the curl of the scroll. This then gets you a density map of the scroll as if it was unrolled. Then the next trick is to somehow turn that into legible text.

          The details of how to do the above are currently unknown. There's a prize for whoever figures it out.
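          As a rough sketch of the idea above, assuming (unrealistically) that the scroll is a perfect Archimedean spiral and that the scan is just a NumPy array of densities, "unrolling" could look like the following. All names, parameters and the toy data here are illustrative assumptions, not challenge code:

```python
import numpy as np

def make_volume(slices):
    """Stack the 2D cross-section images into a 3D density volume (z, y, x)."""
    return np.stack(slices, axis=0)

def unroll_spiral(volume, center, pitch, n_turns, samples_per_turn=360):
    """Sample density along an idealized Archimedean spiral in every slice.

    A real scroll is warped, torn and irregular, so this perfect-spiral
    assumption only illustrates the idea: trace the winding, read the
    densities along it, and lay them out flat.
    """
    thetas = np.linspace(0.0, 2 * np.pi * n_turns, n_turns * samples_per_turn)
    radii = pitch * thetas / (2 * np.pi)      # radius grows linearly with angle
    xs = np.round(center[0] + radii * np.cos(thetas)).astype(int)
    ys = np.round(center[1] + radii * np.sin(thetas)).astype(int)
    nz, ny, nx = volume.shape
    ok = (xs >= 0) & (xs < nx) & (ys >= 0) & (ys < ny)
    # One row per slice: the surface "flattened" along the spiral path.
    return volume[:, ys[ok], xs[ok]]

# Toy example: 10 slices of random density, unrolled over 6 windings.
vol = make_volume([np.random.rand(64, 64) for _ in range(10)])
flat = unroll_spiral(vol, center=(32, 32), pitch=4.0, n_turns=6)
print(flat.shape)   # (10, 2160): one row per slice, one column per sample
```

          The hard part the comment describes is exactly what this sketch assumes away: real segmentation must discover the (deformed, torn) path of the sheet rather than take a clean spiral as given.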

  • ggm 3 days ago |
    Given this is a join over image analysis, text recognition, data science and a huge complex 3D analytical model of scans which has to be mapped to the surface states, unrolled, and then subjected to edge and other discrimination, I think the application of ML and other novel techniques is great.

    The potential for applying lessons learned to other problems in complex surface/manifold scanning, "reading" states from disparate imaging systems, it's got big upsides.

    I'm not particularly sure anyone is claiming this is an LLM demonstrator or proves AGI is coming so if you permit me to float a strawman: it isn't.

    It's great science. Very impressive work.

    • GistNoesis 2 days ago |
      The whole thing is kind of a trolling exercise aimed at data-science people.

      One of the classic tutorial datasets in data science is the swiss roll [1]. Here the exercise looks similar, but it's totally different because of the quantity of data available.

      In the typical swiss-roll dataset the goal is to make the structure emerge from the data, whereas in the Vesuvius challenge we presumably know the structure: the papyrus has been rolled, and we want to extract the data.

      All the fancy techniques like manifold learning are therefore irrelevant for this problem. So it's back to basics: statistical modelling.

      You build a probabilistic model with some unknown parameters and you maximize the likelihood of the observed data with strong regularization and handcrafted priors.

      So your observed data is a 3D voxel volume [2], and your desired output is a 2D image of the unrolled scroll.

      Intuitively, you might want to define your model as an unroll function parametrized by theta, unroll_theta(x, y, z) -> (u, v), which maps a voxel to a pixel position in the unrolled scroll.

      You would then apply it to the voxels, obtain mapped pixels, and group-by-sum to get a 2D image, which you pass through a neural network to evaluate whether it looks like what you want (a prior built on other papyri, from what you expect to see).

      Instead, what you want to do is the reverse: define your model as a roll function parametrized by theta, roll_theta(u, v) -> (x, y, z), which maps a pixel position in the unrolled scroll to a voxel in space: from 2D to 3D (i.e., from the lower-dimensional space to the higher-dimensional one).

      In 3D the data lies on a 2D manifold, which lets you score candidate functions by how well they align with the slices: from your roll_theta function you can generate a 3D voxel volume of air and papyrus, which you align with your scan. Then you can unroll your scan by "gathering" the corresponding voxel for each pixel of the unrolled scroll, and finally apply the same neural-network prior that evaluates "scrolliness" (if you don't have enough data to build that prior, it's not absolutely necessary, because we have other regularization terms, namely the alignment).

      What remains is to exploit the geometrical properties of a rolled scroll: presumably the papyrus didn't cross over itself, which constrains the roll_theta function with a prior like in repulsive surfaces [3] [4], plus priors based on the stretchiness (or tearing points) of the papyrus.

      Once you have your likelihood loss function, it's brute-force time: grab your global optimizer, sample the solution, unroll the papyrus, read the treasure map, find the gold and recoup your investment. (What?! Are you really saying there is no gold? How are we going to pay for the work? Let's make it an open-data challenge.)

      [1] https://scikit-learn.org/dev/auto_examples/manifold/plot_swi... [2] https://colab.research.google.com/github/ScrollPrize/vesuviu... [3] https://www.cs.cmu.edu/~kmcrane/Projects/RepulsiveSurfaces/i... [4] https://www.cs.cmu.edu/~kmcrane/Projects/RepulsiveCurves/ind...
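      A minimal sketch of this roll_theta parametrization and its alignment term might look like the following. The Archimedean-spiral model, the parameter names, and the toy volume are all illustrative assumptions, not an actual challenge implementation:

```python
import numpy as np

def roll_theta(u, v, theta):
    """Hypothetical forward model mapping unrolled coords (u, v) to 3D.

    Here theta is just (center_x, center_y, pitch) of an idealized
    Archimedean spiral; a realistic model would add local deformation
    parameters for warping and tears.
    """
    cx, cy, pitch = theta
    r = pitch * u / (2 * np.pi)        # radius grows by `pitch` per winding
    x = cx + r * np.cos(u)             # u is the unrolled arc angle
    y = cy + r * np.sin(u)
    z = v                              # v runs along the scroll's axis
    return x, y, z

def alignment_loss(volume, theta, us, vs):
    """Negative mean scan density along the modeled sheet.

    Minimizing this pulls the modeled surface onto high-density
    (papyrus) voxels -- the alignment regularization term.
    """
    x, y, z = roll_theta(us, vs, theta)
    nz, ny, nx = volume.shape
    xi = np.clip(np.round(x).astype(int), 0, nx - 1)
    yi = np.clip(np.round(y).astype(int), 0, ny - 1)
    zi = np.clip(np.round(z).astype(int), 0, nz - 1)
    return -volume[zi, yi, xi].mean()

# Toy check: paint a spiral of known pitch into an empty volume, then
# confirm the loss prefers the true pitch over a wrong one.
us = np.linspace(0.0, 8 * np.pi, 2000)
vs = np.linspace(0.0, 19.0, 2000)
true_theta = (32.0, 32.0, 3.0)
vol = np.zeros((20, 64, 64))
x, y, z = roll_theta(us, vs, true_theta)
vol[np.round(z).astype(int), np.round(y).astype(int), np.round(x).astype(int)] = 1.0
loss_true = alignment_loss(vol, true_theta, us, vs)
loss_wrong = alignment_loss(vol, (32.0, 32.0, 5.0), us, vs)
assert loss_true < loss_wrong   # the true parameters align better
```

      In practice theta would have thousands of deformation parameters rather than three, and the full loss would add the non-self-intersection and stretchiness priors described above.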

  • kgeist 3 days ago |
    >the sequence τυγχαν may be the beginning of the verb τυγχάνω: “to happen,” or perhaps “not to happen.”

    It says "me tunkhan...", which would indeed mean "let it not happen". The particle "me" is the negative "not" used with imperatives. Or it can be part of a condition: "in case it does not happen..."

    >there might be the beginning of διατροπή, a word found in other Herculaneum papyri that would mean something like “confusion, agitation, or disgust.”

    But it could also be just diatropos - "various, diverse". "Diatrope" can also mean "shame" and "attack (of a disease)" (for example, diatropai nautiodeis - "nausea").

  • akie 2 days ago |
    This is amazing! I couldn't have imagined such a thing was possible; it's basically sci-fi if you think about it. However, I couldn't help but chortle and think of my GP when I read the words "yet more text tantalizingly close to legibility".
  • Simon_ORourke 2 days ago |
    I would defund many police forces belonging to "constitutional sheriffs" just to put together a fund to translate a few of these scrolls. Granted, it's probably "just" an Epicurean library, but all the same it's a good investment.
  • moregon 2 days ago |
    This project is a gem, I invite everybody to read their landing page, especially the page announcing the Grand Prize winner of last year, where they also quickly describe the project [1], and the Master Plan [2], where they talk about their goals.

    As a recap:

    - The real, narrative part of ancient Roman and Greek history comes from the tiny minority of texts that survived by being copied through the centuries by medieval monks. We know a lot through archaeology, epigraphy (engraved stones) etc., but the meat comes from the few ancient historians, philosophers, poets and so on we can read because medieval clerics thought them worthwhile to preserve.

    - An exception to this are papyri, ancient "paper", on which they wrote both high literature and grocery lists. They were used all over the ancient world, but most of them survived only in Egypt and other arid areas, for obvious reasons. They represent the one direct link to the texts as they were written at the time, apart from engraved stones (which, though, tend to be mostly gravestones, with some laws and political stuff thrown in). Unfortunately, the great majority of papyri are fragments, and most of them concern bureaucratic stuff like receipts, contracts and the like, with sometimes a private letter or half a page from a literary work. Precious for historians, but not the kind of thing that changes our knowledge of history.

    - But here comes the Villa dei Papiri in Herculaneum, the town that shared the fate of Pompeii and was covered by volcanic ash from the Vesuvius eruption of 79 A.D. The Villa was the home of a Greek philosopher, and there, at the end of the 18th century, people found 300 carbonized scrolls in the studio of the guy. These scrolls represent an absolute rarity: hundreds of complete works, most likely never seen before, from the heyday of the Roman Empire. They're probably mostly philosophical books in Greek, but they could also contain lost plays, unknown great poets, or histories of periods for which we have few or no sources (we know that there were whole histories of the career of Alexander the Great that are now lost, we have decades-wide holes in our knowledge of most of classical history, etc.).

    - Unfortunately, these 300 scrolls are just lumps of coal. They've been cooked by the volcano's ash and fused shut. Any attempt to open them in the past destroyed most of the scroll, and for hundreds of years they've been considered lost.

    - Until today! A breakthrough in CT scanning technology (brought by one of the founding teams of this project) has made it possible to scan this kind of ancient scroll with X-rays, accessing the internal "pages" without destroying them.

    - Having a scan of the internal volume of the scrolls was all well and good, but you still couldn't read anything! The scan doesn't pick up the ink, and it wasn't at all certain that there was a way to do it. That was the objective of last year's challenge: gathering a community of competitors and teammates to use computer vision and machine learning to virtually unwrap the scan and detect the ink inside, using AI's ability to find patterns invisible to the human eye.

    - In only 8-9 months last year's challenge was completed successfully, earning the winning team a big prize (almost a million, if I remember correctly?). We were able to read some pages from inside a sample scroll, showing once and for all that the task is possible!

    - The goal of 2024 was to expand this PoC to read 5 whole scrolls and to improve the scanning process. At the moment we don't know if the model developed for last year's Grand Prize can be applied to the text of other scrolls, and in any case the whole scanning-and-virtual-unwrapping pipeline is incredibly time-consuming and expensive and requires extensive optimization. I don't think there's been any major breakthrough so far, but of course many teams could be waiting for the end-of-year deadline to publish, since it's still a competition with money involved.

    - If the project is successful, the long-term gains could be astounding. It's not only the 300 scrolls we already possess, but the possibility that a whole library could exist, yet to be excavated, in the still-buried part of the Villa. Consider that its owner was a rich magnate hosting Greek philosophers for the heck of it. It's probable that he owned a big library, far bigger than the comparatively small one found in the philosopher's studio. If we can develop a method to reliably read carbonized scrolls, the political impetus to dig out the rest of the site would be difficult to resist. I'm Italian; I'd personally go to Rome to protest against the government if they didn't allow it :D

    - Finding this hypothetical library would be like finding a mini Library of Alexandria, a revolution in our knowledge of the ancient world. If you're even just a little bit interested in this kind of stuff, this is the Holy Grail!

    As a programmer (boring CRUD stuff) with a master's degree in ancient history (though I've forgotten most of my Greek and Latin), this project tickles both sides of my life: my old academic aspirations and my current career. Unfortunately I'm not advanced enough in either to really contribute, since the tech part is super-advanced CV and ML stuff I can't even pronounce, and deciphering papyri is a whole new ball game compared with the tame texts I was translating at university. That's why I'm trying to evangelize about it, to at least contribute a little to its success!

    [1] https://scrollprize.org/grandprize [2] https://scrollprize.org/master_plan

    • moregon 2 days ago |
      I'm sorry for the formatting of the post, I didn't know you needed to put a newline after each point in a bullet list. I've tried to edit it but it seems like it's not displaying my modifications. I hope you'll find the topic interesting enough to endure the wall of text!
    • asimpletune 2 days ago |
      Thank you, Moregon, for writing such a wonderful and complete synopsis of these exciting events. I'm very excited about them too. Would you mind writing me a message to say hello? I looked at your profile, but I couldn't find any contact information.
  • jaythekiwi 2 days ago |
    What an interesting technical challenge and puzzle. The fact these were traded for a few kangaroos is hilarious - I wonder who decided on that exchange rate!
  • bambax 2 days ago |
    > The autosegmentation jumps frequently between adjacent sheets, so is not yet precise enough to reveal contiguous texts, but it coarsely follows the entire scroll.

    Maybe a stupid idea, but has anyone tried to make a new scroll with known content and markers at known coordinates, cook it so as to bring it to a state close to the scrolls we're trying to unroll, and then scan it and use that to fine-tune the software?

    There are probably simple insights that are extremely difficult to discover when looking at an entirely new problem, that would become more obvious when one already knows the original inside out.

    • moregon 2 days ago |
      I know they have instructions on where to buy papyrus and how to cook it to resemble the condition of the original scrolls, but from what I understand nobody has done what you suggest. It sounds like a good idea to me too, but here are a few guesses as to why they haven't done it yet:

      1) Scanning a scroll costs around $40k, between the trip to London, renting the equipment, paying the staff etc.

      2) I'm not sure that just cooking the scroll is enough to reproduce the exact condition of the originals, which were also buried underground for almost two thousand years. Time, soil pressure and so on could have had a big impact on the final composition of the sheets.

      3) To actually reproduce a realistic sample, you need a professional papyrologist. It's not enough to copy an Ancient Greek text from an online database; you need to know all the conventions of the handwriting of the time (they didn't use spaces, they didn't use the diacritics and accent marks we use in modern editions, letters were often written in idiosyncratic ways depending on the period, etc.). Considering how few papyrologists there are, how busy they are, and how long it would take one of them to recreate a decent replica, I think this is maybe the biggest obstacle.

      • xandrius 2 days ago |
        I'd say this is a case of the perfect being the enemy of the good.

        I'm sure a whole scroll is expensive to create, cook and scan but sections of a scroll could be done for a fraction.

        Also, the realism of the papyrus is less crucial than the initial training on the uncooked -> cooked -> recovered pipeline.

        So, OP's suggestion sounds like a great first step to get more insights on what's possible and what's not relatively quickly.

      • bambax 2 days ago |
        1/ and 2/ are of course good objections; I wasn't aware of the cost of a scan (but this kind of experiment could be done by the organizers, saving on trip costs).

        But I don't think step 3 is strictly necessary. The main point would be to improve software unrolling using information about the structure of the roll, so it may be enough to simply put printer's marks at regular intervals, with references.

        • moregon 2 days ago |
          I see what you mean; I had assumed you meant an exact replica. If you "just" want to write reference marks to help the segmentation models, then you don't need a papyrologist. It's still something only the organizers could do, since I don't see a volunteer team being able to afford the expense. I don't know whether the reason they haven't done it yet is simply that they're a small outfit juggling different priorities, or whether they've judged it not worth it technically!
    • pvaldes 2 days ago |
      > The autosegmentation jumps frequently between adjacent sheets

      The remnants of fibers in the cut are exactly like a barcode. You would need a database of each edge and then something to match the barcodes. Easier said than done, of course. Another possibility would be to use fiber angles.

      • pvaldes 2 days ago |
        Thinking about it, the whole surface isn't necessary. A partial read of just the first upper centimeter could make the process much faster by discarding the obviously different parts. If this could somehow be converted into a barcode, printed, and then read with a barcode scanner, we could have a way to pre-filter most of the pieces that are too different to be neighbors.
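        As a sketch of this barcode idea (purely illustrative, not an existing challenge tool), the fiber pattern along two fragment edges could be treated as 1D intensity profiles and compared with normalized cross-correlation; the profile names and toy data below are assumptions:

```python
import numpy as np

def match_score(profile_a, profile_b):
    """Peak normalized cross-correlation of two 1D fiber-intensity profiles.

    Treats the fiber pattern along a sheet edge as a "barcode"; a high
    peak suggests the two fragments could be neighbors.
    """
    a = (profile_a - profile_a.mean()) / (profile_a.std() + 1e-9)
    b = (profile_b - profile_b.mean()) / (profile_b.std() + 1e-9)
    # Slide one profile across the other and take the best alignment.
    corr = np.correlate(a, b, mode="full") / len(a)
    return corr.max()

# Toy data: a neighbor's edge is nearly identical, a stranger's is not.
rng = np.random.default_rng(0)
edge = rng.random(200)                      # fiber pattern on one fragment
neighbor = edge + 0.05 * rng.random(200)    # nearly identical adjacent edge
stranger = rng.random(200)                  # unrelated fragment
assert match_score(edge, neighbor) > match_score(edge, stranger)
```

        Real fiber profiles would of course be noisier and possibly stretched, so dynamic time warping or a learned similarity might be needed instead of plain cross-correlation.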
    • Eiim 2 days ago |
      They already have human segmenters working on the existing scrolls, and presumably that data is used to train the program in much the same way.