There are bad things that we humans say and do. But let's stop using AIs to do our police work. Nothing good lies down that path.
I think this is a big step forward compared to policing by moody humans[1]. LLMs are a knife, let's use them for good instead of demonizing them.
I don't find this very satisfying. It implies that the "bias of the model" is something we can quantify, understand, and control for during training. I'm sure you can come up with canned, pat answers about coarse-grained biases that have a lot of social valence, but how do you know whether the model is weighted towards or against subtle word choices in millions of different scenarios?
People seem to have enough trouble understanding the nuances of texts that come from another cultural perspective or from merely 20 years ago. It's a common enough experience to see young people cringing at the popular media of 30 years ago, because cultural expectations change, words shift in meaning, and the "biases" of society move. LLMs are trained on a mountain of old books, media, internet forums, Twitter and Facebook posts, etc. How could you ever get a handle on just the "biases" of the training data from contemporary internet sources, let alone pick apart the baked-in cultural assumptions of a bunch of books spanning decades?
When it comes to giving a model a deliberate bias I agree with you.
But what if controlling bias is impossible because the very format of the data demands its emergence?
I always thought both structuralism and poststructuralism were the worst types of in(s/n)ane navel gazing. I think I even wrote on a course feedback section in a final many, many years ago that these topics were massive wastes of my time. (We didn't have "evals" as such, but some faculty would elicit feedback on a separate sheet of paper you could leave with the final exam.)
Observation: the more time I spend just letting the algorithms converge under a huge variety of treatments, the more convinced I become that the most basic structures of our language cast a spell on human understanding of the world, and that the interactions between the primitives of our language and human meaning-making is FAR more deterministic than we would like. Especially in terms of how they direct certain emergent phenomena in human groups.
Implication: maybe it's reasonable to hypothesize that bias isn't "in the data". Maybe bias emerges, nearly deterministically, from the underlying grammatical structure of the data, at least in certain semantic contexts. Maybe certain utterances, and the persuasive power of those utterances, correspond to a sort of default hard-wired path for a majority subset of the population. A co-occurrence of certain neural pathways and certain grammar/semantics that eventually must arise in a given language and socially situated use of that language.
FWIW: I have almost no social science training, the training I did get was received by me with extreme skepticism, and this sounds like bunk to me even as I type it. So, for me, this hypothesis would have sounded stupid -- or at least inane/pointless -- before 2019. But I'm becoming increasingly convinced that the languages we've converged on encode certain types of group preference/bias at an extremely fundamental level.
...either that, or I'm far, FAR worse at filtering/generating data than I give myself credit for. But in at least a few cases I have literally reviewed every token within a certain delta of influencing the model's behavior on an input sequence and seen the same results.
The way you're framing it is "let's use limited humans to moderate an infinite number of LLM bots" - how do you go about it?
What a stupid world.
But, I'm surprised you conclude what you do, because the high-level findings seem to indicate that Russian war bloggers are more prone to the more extreme forms of "otherization" per the 4 degrees laid out in the model -- e.g. figure 7 shows Russian war bloggers significantly (but not massively) overindex on the 3rd (Villainization) and 4th (Dehumanization) degrees, the worst two of the four, compared to Ukrainian war bloggers, who only overindex on the 2nd degree (Survival or Security).
Additionally, Table 4 shows Telegram channels that are more prone to otherization are more central (more influential) in the Russian blogger network than they are in the Ukrainian blogger network, as summarized by the line: "The results, shown in Table 4, reveal statistically significant correlations for both degree centrality and eigenvector centrality and the use of othering language by both groups of war bloggers, with a stronger correlation among Russian war bloggers."
On the other hand, I often see Russians describe Ukrainians as "Nazis". The main difference is that "Nazi" carries a political connotation and usually refers to Ukrainian politicians and military members. Russian propaganda carefully describes the Ukrainian population as relatives held captive by a far-right dictatorship. Ukrainian propaganda, by contrast, frequently dehumanizes the Russian population.
On the other hand, if your town was flattened, your family was killed, and you've been living in your basement for the last year, maybe that's just how you're going to be. Maybe no other response is reasonable.
(I still think "orc" is weak though. Something real would be stronger. Even just "beasts". Though, animals are frequently nice. "Monster" is a little metaphorical, but, monsters are also real (in that metaphorical sense). That might be stronger.)
Hordes of people running and driving through open fields against artillery strikes and FPV drones and suffering 90% casualties is the canonical mental image these days.
See, the problem is that this is incorrect. The term has been in use since long before 2022. It was originally coined in 2015 and was meant to apply to both sides, given how miserable and incompetent the entire thing looked (and was).
That is the problem with the paper in question as well -- the authors don't seem to be familiar with the topic they're trying to research, treating it as a single event. The timeframe in the dataset is 2015 to mid-2023, which makes very little sense. The use of Telegram for war reporting, and the language itself, have been completely different at various points across this timeframe.
To add insult to injury, they label various channels as pro-R or pro-U based on recent messages, but certain channels literally switched sides. They (and many others as well) wiped their message history multiple times and came back with slightly or completely different narratives; their actual history can only be found in one of the Telegram-related cache services, if at all, as some of those services are long dead or the info didn't survive. Some people who have been trying to profit from the war started multiple pro-R and pro-U media outlets, including Telegram channels, though 2022 quickly made them choose sides.
So much happened in the 8 years they tried to shove into an LLM for a primitive sentiment analysis. Gathering the full picture across this timeframe should have been their main task, as it's not trivial -- just like with anything spanning 8 years on the internet and in real life, especially if you don't speak any of the languages. These results are not going to be accurate.
It precludes any legitimate pursuit of an understanding of the situation.
It's great if all you want is to motivate violence against a certain group; I won't deny that.
I occasionally call them orcs too. It’s an apt description.
With regard to Israelis, any sort of othering will be perceived as antisemitism. But, what Israel is doing in Gaza is on par with what Russia is doing in Ukraine.
Rethink[0]. It just takes some basic math to understand that the USA has done much worse, even if we just consider civilian deaths estimates. Hundreds of thousands of civilians were killed in Vietnam, Cambodia, and Laos. And for what, to stop communism? Is that a valid reason to devastate three countries?
> But, I guess we were orcs when we invaded and occupied Iraq
Orcs once, orcs forever. Isn't that how it works? If you can call an entire population "orcs", then there must be some intrinsic evil rooted in their ethnicity, culture, or whatever it is.
- [0] https://en.wikipedia.org/wiki/Vietnam_War_casualties#Deaths_...
I'll happily choose communism over such an incredibly unjust system, even if it caused "100 million" deaths worldwide, as you say. That number is complete BS, by the way.
- [0]: https://www.un.org/en/chronicle/article/losing-25000-hunger-...
https://en.wikipedia.org/wiki/Siege_of_Mariupol
In this example, one among countless others, russia killed tens of thousands of civilians, in a couple of months. russia caused more suffering in a few months of war than the USA in a decade in Vietnam. Comparing the two is utterly dishonest.
Absolutely and obviously false, as you will quickly reveal to yourself by spending a few seconds looking up the toll of civilian deaths and maimings during the US-driven conflict in Vietnam.
Comparing the two is utterly dishonest.
The comparison is in any case completely vacuous. There's no indication that you're being dishonest here, however. Most likely it's a case of simple willful ignorance.
In this example, one among countless others,
There are exactly 3 others since WW2: Chechnya, Syria, Afghanistan.
Their number is not, by any stretch, "countless".
Of course russia committed many more in the wars you cited. And we could add https://en.wikipedia.org/wiki/Hungarian_Revolution_of_1956 https://en.wikipedia.org/wiki/Prague_Spring https://en.wikipedia.org/wiki/Transnistria_War https://en.wikipedia.org/wiki/Black_January https://en.wikipedia.org/wiki/Abkhazia_conflict https://en.wikipedia.org/wiki/Russo-Georgian_War
PS.
War is much, much simpler than peace. If you have good words for the invaders, then you're a traitor.
Not just since the war, but for as long as Russians have oppressed Ukrainians. It's quite a normalized and promoted slur, online and offline.
It's a culturally derogatory term, like the common slurs that were used to designate certain ethnicities or races, such as Chechens. Those are cultural slurs, unlike "orc", an online slur that is a Western term from a Western reference.
I think you should look more into how Russia has dehumanized some of the ethnic minorities within the Federation and its neighbors over the years, and how it still does today.
So you're saying that some cultural symbol used in a derogatory manner to address the "Little Russians", inferiors to Russia, is humanizing and a show of equal brotherly love?
You chose to empty the word of its meaning, reducing it to a simple hairstyle, much like the Nazis made use of cultural symbols to address the Jewish or Polish people.
It doesn't look like you're being honest.
It's funny because that's one of the Russian twists in their propaganda, "let's focus on the subjective meaning of words... and not the actions!".
Here's my take on it: if someone goes into someone else's land to erase their culture and kill as many people as possible, terrorizing them and trying to make their lives unbearable while addressing them with an ethnic slur, I'd say that's enough of a sign of dehumanization.
> You chose to empty the word of its meaning, reducing it to a simple hairstyle, much like the Nazis made use of cultural symbols to address the Jewish or Polish people, for example.
Using your assessment, the Nazi German slur "Schlitzauge" was a "simple" ethnic slur to address Slavic people, or "Polacke" was "just" a slur to address people from Poland. If you add the context of propaganda and war, and the actions toward those people, I think it's pretty clear it was dehumanizing.
You don't need to be literal to dehumanize a group of people; it's the actions taken alongside a given label that put meaning into a slur.
Before somebody like this jumps down my throat: yes, I'm sure Russians are doing it too, and yes, I know Russia started the war and Ukraine has a right to defend itself. Spare me the tedious accusations of being a shill for the ""orcs"" and ""ziggers"" (really?); the with-us-or-against-us mentality is part of the extremist radicalization process. Both sides are using it, and both can be criticized for it.
Then again I wouldn't know from discussions on Reddit, as I can barely stand to look for more than a few seconds at any of the political content there.
(None of this is in reference to the paper, which of course talks about much broader issues, and doesn't refer to the "orc" term at all).
There is also a subgroup of the population that seems very susceptible to falling for dehumanization. It's like they go mad as soon as there is a somewhat socially accepted victim of it.
As if the people writing that paper, were they to wake up one morning and find their house being shelled for no reason, wouldn't start to "otherize" the responsible party.
That being said, I think the theoretical framework presented here is flawed. Many political scholars would say that the definition of politics is the process of defining ingroups and outgroups. Fundamental to understanding political group dynamics is analyzing power, a major component of which is the capacity for violence. This research presents a "Model of Othering" which, properly understood, is really just a synonym for political rhetoric. There are a lot of subtle moral presuppositions made throughout this research that don't seem appropriate for a serious scholarly analysis.
This might also be getting a bit too far off topic, but it also strikes me as fairly tone-deaf to publish research comparing Russian war rhetoric to Nazi Germany's when politicians from both of America's own political parties are currently calling Iran a "terrorist country".
The tone of the introduction is a bit preachy and the findings as stated are underwhelming and beyond obvious [1].
But when reading this type of paper -- and academic literature in general -- don't fixate on the application and research questions, which were chosen by an academic and/or a young graduate student with minimal resources. Look at the actual innovation. In this case, the methods. Then ask what else you can do with that increment of innovation.
It is now possible to methodologically analyze rhetorical patterns in open source communications on a shoestring budget. Because producers are also consumers, you can begin to understand at a very granular level which sequences of words elicit the desired effect in subsets of a population, and how. Work like this used to be laborious, expensive, and required a huge amount of socio-cultural training/expertise/judgement. By comparison, all of the technical work described in the paper has a relatively low barrier to entry.
Also: the obvious next step is to treat this as an optimization problem instead of a categorization problem.
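The jump from categorization to optimization is easier to see with the pipeline sketched out. Below is a deliberately crude stand-in for the classification step, using tiny invented keyword lexicons in place of the paper's LLM-based labeling (the lexicons, function names, and example words are all placeholders of my own, not the paper's method):

```python
# Toy sketch: score each message against four "degrees of othering"
# using tiny made-up keyword lexicons. The paper uses LLM-based
# classification; this stand-in only illustrates the pipeline shape.

DEGREES = {
    1: {"them", "those people"},            # basic in/out-group framing
    2: {"threat", "survival", "security"},  # survival/security framing
    3: {"evil", "criminal", "villain"},     # villainization
    4: {"vermin", "orcs", "subhuman"},      # dehumanization
}

def degree_of_othering(message: str) -> int:
    """Return the highest degree whose lexicon matches, else 0."""
    text = message.lower()
    matched = [d for d, words in DEGREES.items()
               if any(w in text for w in words)]
    return max(matched, default=0)

def channel_profile(messages):
    """Fraction of messages at each degree -- the kind of per-channel
    distribution that figure-7-style comparisons are built from."""
    counts = {d: 0 for d in range(5)}
    for m in messages:
        counts[degree_of_othering(m)] += 1
    total = len(messages) or 1
    return {d: c / total for d, c in counts.items()}
```

Inverting this -- searching the space of wordings for the one that maximizes such a score against a model of a target audience -- is exactly what turns the categorization problem into the optimization problem.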
--
[1] "othering intensifies during crises" is framed as a finding in this paper, but that's like using nukes to fish for trout. It's a fact we already know and have known for thousands of years. Therefore, the fact that the method reached this conclusion is best understood as validating a proposed method for automating intelligence collection and analysis in the context of open source war propaganda.
It'll take time to understand and integrate, but I imagine it should make theory that used to depend on small numbers of examples (glued together with rhetoric and charisma) richer and more sophisticated. Exciting.
Fast forward 20 years and we have propaganda machines that have been optimized on a nearly individual level, available to the highest bidder (or, more likely, available only to the people who happen to control the attention platforms at that particular point in history).
But, I suppose it's probably accurate to say that, within the walls of the academy at least, the potential for research on nuclear technology was exciting in the 1930s-1940s.
And now I have to ask: what are the beneficial applications of this research? My gut reaction is to unplug, to go offline, to seek out in-person communication and to shun online media.
But I have watched all this play out gradually over time. What about the younger generations growing up now? Will they be more vulnerable to the ever-increasing sophistication of these techniques? Or will they somehow develop a resistance to them?
Again, my gut instinct here is towards pessimism. I work with young people as a volunteer tutor. Many of them seem to keep shutting down due to the overwhelming burden of information thrown at them. At the same time, their attention is sucked away by media which draws them into a frenetic loop of scrolling behaviour, like a mouse on a treadwheel. It's extremely disturbing to watch.
Keep in mind this is rank speculation 20 years into the future.
If this were 2004 OP would have you convinced 90% of your day would be spent culling spam from your inbox.
In short, they have no idea what they're talking about.
It's a bit amusing that the comment right next to yours (albeit a cousin, not a sibling) is insisting that what I'm describing is already happening.
The (naive) weaponization of this tech is already underway. What's more, the tools and methods to turn the categorization work in this paper into a goal-oriented optimization procedure already exist -- see for example the Diplomacy work from Meta; again, focus on methods, not the application.
The tech is here today, the barrier to use is low, and the incentive structure to weaponize exists.
What will take 10-20 years is the trickling of that impact into social processes and then the retrospective reflection of what happened.
Note, for example, that Facebook was founded in 2004, but it took 10-15 years for the impact on social processes to reach a point of significant material impact on global society and politics.
> If this were 2004 OP would have you convinced 90% of your day would be spent culling spam from your inbox.
It's not about the amount of time spent; it's about what happens in the time that is spent.
Spam, and online attention markets more generally, is an incredibly apt analogy.
Even under the strictest definition of "spam", the market for prevention is huge. Anti-email-spam tech alone accounts for ~$5B/yr in spend with a 21% CAGR. Going all-in on anti-spam tech in 2004 and capturing even a sliver of that market would have made you very wealthy today.
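Those figures compound quickly; a two-line sketch makes the arithmetic concrete (the dollar amounts are the comment's own rough estimates, not audited data):

```python
# Sanity-check the compounding implied by the rough figures above:
# a $5B/yr market growing at a 21% CAGR. At that rate the market
# roughly doubles every ~3.6 years (log 2 / log 1.21) and grows
# about 6.7x over a decade.

def project(base_billion: float, cagr: float, years: int) -> float:
    """Compound base_billion at annual rate cagr for years years."""
    return base_billion * (1 + cagr) ** years

decade = project(5.0, 0.21, 10)  # ~33.6 ($B/yr after 10 years)
```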
If you include spam-like content on social media platforms, that number certainly more than doubles. On headcount alone, Meta told Congress that they have 40K employees directly working on trust and safety, and TikTok says they have about the same number. That's just headcount, and at just two companies.
Beyond that, the following question will prove only more prescient as time goes on: where does "spam" stop and "algorithmically curated content designed to part consumers from their dollar" start? Even today, the average person in places like the US spends ~2.5 hours a day consuming algorithmically curated feeds on social media (in service to online advertising revenue). And much non-social media now involves some degree of quantitative optimization as well.
Contrary to your take here, with the benefit of hindsight, I think the "Eternal September" doomers of the early aughts significantly underestimated the impact of information technology on how people spend their time and how society spends its resources. (And, anyway, your strawman is a bit too hyperbolic -- no one was seriously arguing that 90% of anyone's day would be spent culling spam from their inbox, least of all me...)
For example, if usernames had an automatically displayed country flag, it would at least stop those pretending to be of other nationalities.
Japanese soldiers murdered civilians and raped women. They took “comfort women”. They murdered POWs. They didn’t observe the rules of warfare. They (by mistaken timing) attacked the U.S. before declaring war in what was perceived as a “sneak attack”.
The scale of the war crimes perpetrated by Japanese soldiers no doubt fed the desire to other them - to portray them in the most disgustingly racist ways.
I think it’s important to remember that context.
Unfortunately, the racism continued for a long time after the war.
My distant recollection of 1920s-1930s "pulp era" science fiction and comic books is that, even by the (for today) relatively lenient standards of the 1960s-1970s, non-white characters tended toward obviously racist tropes.
My father once told me about two classmates of his, who had promptly been christened "Chinee 1" and "Chinee 2" by the PE coach and remained thus thereafter ... despite the fact that their families had originally come from Korea.
The DoD (as it is now) didn't end segregation in the military until after WWII, with the 1948 https://en.wikipedia.org/wiki/Executive_Order_9981 ; in so doing they were of course well in advance* of the remainder of their society.
For context, a comic cover (of a sympathetically portrayed character!) from 1947, one year before: https://upload.wikimedia.org/wikipedia/en/9/9d/The_Spirit_10...
* consider a judge in 1965: https://en.wikipedia.org/wiki/Racial_segregation_in_the_Unit...
EDIT: damn, that wasn't just said in an interview, the judge wrote that down in his legal opinion: https://discoveryvirginia.org/opinion-judge-leon-m-bazile-ja...