From Twitter/X:
Today we're premiering Meta Movie Gen: the most advanced media foundation models to date.
Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.
More details and examples of what Movie Gen can do: https://go.fb.me/kx1nqm
Movie Gen models and capabilities

Movie Gen Video: A 30B parameter transformer model that can generate high-quality, high-definition images and videos from a single text prompt.
Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound — delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.
Precise video editing: Using a generated or existing video and accompanying text instructions as input, it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes.
Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video.
We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.
It being 30B gives me hope.
[0] https://ai.meta.com/blog/generative-ai-text-images-cm3leon/
Considering that Facebook/Meta releases blog posts titled "Open Source AI Is the Path Forward" but then refuses to actually release any Open Source AI, I'm guessing the answer is a hard "No".
They might release it under usage restrictions though, like they did with Llama, although probably only the smaller versions, to limit the output quality.
Just because pytorch is Open Source doesn't mean everything Meta AI releases is Open Source, not sure how that would make sense.
The dataset for Llama 3 is described only as "A new mix of publicly available online data," which is not exactly open or even very descriptive. That could be anything.
And no, the training code for Llama 3 isn't available; the response from a Meta employee was: "However, at the moment-we haven't open sourced the pre-training scripts".
Here is the Llama source code, you can start training more epochs with it today if you like: https://github.com/meta-llama/llama3/blob/main/llama/model.p...
It's rumored Llama 3 used FineWeb, but you're right that they at least haven't been transparent about that: https://huggingface.co/datasets/HuggingFaceFW/fineweb
For models I prefer the term "open weight", but to assert they haven't open sourced models at all is plainly incorrect.
Correct me if I'm wrong, but that's the code for doing inference?
A Meta employee told me just the other day: "However, at the moment-we haven't open sourced the pre-training scripts". Can't imagine they would be wrong about it?
https://github.com/meta-llama/llama-recipes/issues/693
> For models I prefer the term "open weight"
Personally, "Open" implies that I can download them without signing a license agreement with Meta, and that I can do whatever I want with them. But I understand the community seems to think otherwise, especially considering the messaging Meta has around Llama and how little the community is pushing back on it.
So Meta doesn't allow downloading the Llama weights without accepting the terms from them, doesn't allow unrestricted usage of those weights, doesn't share the training scripts nor the training data for creating the model.
The only thing that could be considered "open" would be that I can download the weights after signing the terms. Personally I wouldn't make the case that that's "open" as much as "possible to download", but again, I understand others understand it differently.
Apologies - you are right and I was wrong. I would edit my comments but they're past the edit window, will leave a comment accordingly.
In fact, the more realistic the deepfakes become, the less harmful actual revenge porn and stolen sex videos can be, because of plausible deniability.
This "good deepfakes will prevent harm because of plausible deniability" is absurd copium, and utterly divorced from reality.
Speak to victims some time. You are not helping them.
That's something that can be fixed in a future release or you can fix it right now with some filters in post in your pipeline.
This has been the default excuse for the last 5+ years. I won't hold my breath.
AI in general is from 1950, or more generally from whenever the abacus was invented. This very website runs on AI, and always has. I would implore us to speak more precisely if we're criticizing stuff: "LLMs" came around (in force) in 2023, both for coherent language use (ChatGPT 3.5) and image use (DALL-E 2). The predecessors were an order of magnitude less capable, and going back 5 years puts us back in the era of "chatbots", aka dumb toys that can barely string together a Reddit comment on /r/subredditsimulator.
> AI so far has given us ability to mass produce shit content of no use to anybody
Good AI goes largely undetected, for the simple reason that it closely matches the distribution of non-AI content.
Controversial aside: This is same bias that results in non-passing trans people being representative of the whole. Passing trans folk simply blend in.
Today we have these realistic videos that are still in the uncanny valley. That's insane progress in the span of a year. Who knows what it will be like in another year.
Let'em cook.
2 years ago we had decent video generation for clips
7 months ago we got Sora https://news.ycombinator.com/item?id=39393252 (still silence since then)
With these things, like DALL-E 1 and GPT-3, the original release of the game changer often comes ca. 2 years before people can actually use it. I think that's what we're looking at.
I.e. it's not as fast as you think.
And isn’t this model open source…? So we get access to it, like, momentarily? Or did I miss something?
The only interesting questions now have nothing to do with capability but with economics and raw resources.
In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.
And if we can spend $100 of compute and get something I described above, why wouldn't Disney et al throw $500m at something to get even more out of it, and charge everyone $50? Or maybe we'll all just be zoo animals soon (Or the zoo animals will have neuralink implants and human level intelligence, then what?)
That would be a boring movie.
I don't think so at all. You're thinking a movie is just the end result that we watch in theaters. Good directing is not a text prompt, good editing is not a text prompt, good acting is not a text prompt. What you'll see in a few years is more ads. Lots of ads. People who make movies aren't salivating at this stuff but advertising agencies are because it's just bullshit content meant to distract and be replaced by more distractions.
But at the same time, while it's true that the end result is far more than simply making good images, LLMs are weird interns at everything, with all the negatives that implies as well as the positives. They're not likely to produce genuinely award-winning content all by themselves, even if they do better when you ask them for something "award winning". Still, it's certainly conceivable that we'll see AI do all these things competently at some point.
I'm also expecting, before 2030, that video game pipelines will be replaced entirely. No more polygons and textures, not as we understand the concepts now, just directly rendering any style you want, perfectly, on top of whatever the gameplay logic provided.
I might even get that photorealistic re-imagining of Marathon 2 that I've been wanting since 1997 or so.
This isn't an intentional "feature" of these models; rather, it's kind of an inherent part of how such models work — they learn associations between tokens and structural details of images. Artists' names are tokens like any other, and artists' styles are structural details like any other.
So, unless the architecture and training of this model are very unusual, it's gonna at least be able to give you something that looks like e.g. a "pencil illustration."
There were movies with horrible VFX that still sold perfectly well at the time.
But then, a lot of people have financial reasons to ignore the problem. Which is too bad, because it's hindering the creation of stuff that's actually useful.
These models do seem like they could be great photorealism/stylization shaders. And they are also pretty good at stuff like realistic explosions, fluid renders etc. That stuff is really hard with CG.
I won't argue whether text-to-speech qualifies as AI, but I agree they must be making bank.
Might also be an AI voice-changer (i.e. speech2speech) model.
These models are most well-known for being used to create "if [famous singer] performed [famous song not by them]" covers — you sing the song yourself, then run your recording through the model to convert the recording into an equivalent performance in the singer's voice; and then you composite that onto a vocal-less version of the track.
But you can just as well use such a model to have overseas workers read a script, and then convert that recording into an "equivalent performance" in a fluent English speaker's voice.
Such models just slip up when they hit input phonemes they can't quite understand the meaning of.
(If you were setting this up for your own personal use, you could fine-tune the speech2speech model like a translation model, so it understands how your specific accent should map to the target. [I.e., take a bunch of known sample outputs, and create paired inputs by recording your own performances of them.] This wouldn't be tenable for a big low-cost operation, of course, as the recordings would come from temp workers all over the world with high churn.)
But the people who position themselves to profit from the energy consumption of the hardware will profit from all of it: the LLMs, the image generators, the video generators, etc. See discussion yesterday: https://news.ycombinator.com/item?id=41733311
Imagine the number of worthless images being generated as people try to find one they like. Slop content creators iterate on a prompt, or maybe create hundreds of video clips hoping to find one that gets views. This is a compute-intensive process that consumes an enormous amount of energy.
The market for chips will fragment, margins will shrink. It's just matrix multiplication and the user interface is PyTorch or similar. Nvidia will keep some of its business, Google's TPUs will capture some, other players like Tenstorrent (https://tenstorrent.com/hardware/grayskull) and Groq and Cerebras will capture some, etc.
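To illustrate the "it's just matrix multiplication" point, here is a minimal sketch in plain Python of the one primitive all of these chips compete on (a naive stand-in for what PyTorch dispatches to cuBLAS, TPU matrix units, or any other vendor's backend; names and shapes here are purely illustrative):

```python
def matmul(a, b):
    # Naive matrix multiply: the core operation behind every GPU,
    # TPU, and AI-accelerator pitch. Vendors compete on doing this
    # faster and with less energy, not on doing something different.
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```

Because the workload reduces to this single primitive behind a common interface (PyTorch or similar picks the backend), switching hardware vendors becomes largely a question of price and supply, which is why margins are likely to shrink as the market fragments.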
But at the root of it all is the electricity demand. That's where the money will be made. Data centers need baseload power, preferably clean baseload power.
Unless hydro is available, the only clean baseload power source is nuclear fission. As we emerge from the Fukushima bear market where many uranium mining companies went out of business, the bottleneck is the fuel: uranium.
That said, this channel has been producing videos since well before ChatGPT 3.5/4, so at the very least they probably started with human-written scripts.
The image had all the telltale signs of being AI generated (too much detail, the lights & shadows were the wrong scale, the focus of the lens was odd for the kind of photo, etc). I checked that other group, and sure enough, they claim to be about sharing "miniature dioramas" but all they share is AI-generated crap.
And in the original group, which I'm a member of and is full of people who actually create dioramas -- let's say they are "subject matter experts" -- nobody suspected anything! To them, who are unfamiliar with AI art, the photo was of a real hand-made diorama.
At this point, looking at a big tech social media feed, I would expect that everything is, or at least could be, gen AI content.
I could see it being pretty shocking if I hadn't, but I honestly can't imagine how I'd miss that.
> I could see it being pretty shocking if I hadn't, but I honestly can't imagine how I'd miss that.
The point of the video wasn't to count correctly, but to see the gorilla
Essentially, send me a video of something I care about and I will only look for that thing. Most people are not detectives, and even most would-be detectives aren’t yet experts.
Who even needs AI generated videos when you can just act out absurdity and pretend it's real?
There's a morbid path from the grainy Iraq war and earlier shaky footage, through IS propaganda (which at the time was basically the most intense combat footage ever released), to the Ukraine war, which took it to its morbid conclusion: endless drone video deaths and edited clips 30+ minutes long covering day-long engagements and defenses.
And yes, to answer your belief that there is none: there is loads of "cinematic body cam footage" out there now.
Fake images play into that, but they don't need to be AI generated for that to be true, it's been true forever.
The only regret I have about that is losing video as a form of evidence. CCTV footage and the like are a valuable tool for solving crimes. That’s going to be out the window soon.
I have yet to find examples of this
* https://www.reddit.com/gallery/1fvs0e1
* https://old.reddit.com/r/StableDiffusion/comments/1fak0jl/fi...
- The monkey in hotspring video, if not for its weird beard...
- The koala video I would have mistaken for hollywood-quality studio CGI (although I would know it's not real because koalas don't surf... do they?)
- The pumpkin video if played at 1/4 resolution and 2x speed
- The dog-at-Versailles video edit
If the videos are that good, I'm sure I already can't distinguish between real photos and the best AI images. For example, ThisPersonDoesNotExist isn't even very recent, but I wouldn't be able to tell whether most of its output is real or not, although it's limited to a certain style of close-up portrait photography.
Not to take away from your point but it's more limited than one might think from this phrase. As an exercise, open that page and scroll so the full image is on your screen, then hover your mouse cursor within the iris of one of the eyes, refresh and scroll again. (Edit: I just noticed there's a delayed refresh button on the page, so one can click that and then move their mouse over the eye to skip a full page refresh.) I've yet to see a case where my mouse cursor is not still in line with the iris of the next not-person.
There's an entire pattern of reels that are basically just ripped-off-content with enough added noise to (I presume) avoid content detection filters. Then the comments have links to scam sites (but are labelled as "the IMDB page for this content").
Always important to bear in mind that the examples they show are likely the best examples they were able to produce.
Many times over the past few years a new AI release has "wowed" me, but none of them resulted in any sudden overnight changes to the world as we know it.
VFX artists: You can sleep well tonight, just keep an eye on things!
Everyone should sleep well tonight, but only because we’ll look out for each other and fight for just distribution of resources, not because the current job market is stable. IMO :)
Here's an example thread: https://www.reddit.com/r/vfx/comments/1e4zdj7/in_the_climate...
I am not trying to be negative, but it is the reality that ML/LLMs have eliminated entire industries. Medical transcription, for example, is essentially gone.
What is the required amount of labor humans should have to do?
that’s quite a productive thing. art has tremendous value to society.
why don’t we automate the washing machine more instead of automating the artist?
Automating more in the real world is much (much) harder than grabbing the low-hanging fruits in the digital world.
As someone who's worked in the industry previously and is still quite involved, very few studios are using it because of the lack of direction it can take and the copyright quagmire. There are lots of uses of ML in VFX, but those aren't necessarily GenAI.
GenAI hasn’t had an effect on the industry yet. It’s unlikely it will for a while longer. Bad business moves from clients are the bigger drain, including not negotiating with unions and a marked decline in streaming to cover lost profits.
Pumpkin patch - Not sitting on the grass, not wearing a scarf, no rows of pumpkins the way most people would imagine.
Sloth - that's not really a tropical drink, and we can't see enough of the background to call it a "tropical world".
Fire spinner - not wearing a green cloth around his waist
Ghost - Not facing the mirror, obviously not reflected the way the prompter intended. No old beams, no cloth-covered furniture, not what I would call "cool and natural light". This is probably the most impressively realistic-looking example, but it almost certainly doesn't come close to matching what the prompter was imagining.
Monkey - boat doesn't have a rudder, no trees or lush greenery
Science lab - no rainbow wallpaper
This seems like nitpicking, and again I can't overstate how unbelievable the technology is, but the process of making any kind of video or movie involves translating a very specific vision from your brain to reality. I can't think of many applications where "anything that looks good and vaguely matches the assignment" is the goal. I guess stock footage videographers should be concerned.
This all matches my experience using any kind of AI tool. Once I get past my astonishment at the quality of the results, I find it's almost always impossible to get the output I'm looking for. The details matter, and in most cases they are the only thing that matters.
But better training can't fix the more general problem that I'm describing. Perfect-looking videos aren't useful if you can't get the model to follow your instructions.
But this limits promotion where actors do interviews and sell the movie to the public. It also limits an actor doing something crazy that tanks a movie like a tweet.
But for plenty of other masters - think Cassavetes, Mike Leigh, even PTA - the actor's outstanding talent and instincts bring something to the script and vision that is outside of their prescriptive powers. Their focus is essentially setting up a framework for magic to happen inside of.
I avoid updating to each new Firefox version, because from time to time they break some features important for me.
I use Firefox on my personal device: the website worked fine though took an extra "hiccup" to load compared to Edge (version 131.0 on Windows).
Are you ready to become a penguin in all of your posts to maximise aquatic engagement? I am.
The clothing changes also have pretty rough edges, or just look like they're floating over the original model. The 3D glasses one looked atrocious. The lighting changes are also pretty lacking.
Are you still impressed though?
It's only going to get better, faster, cheaper, easier.[a]
Sooner than anyone could have expected, we'll be able to ask the machines: "Turn this book into a two-hour movie with the likeness of [your favorite actor/actress] in the lead role."
Sooner than anyone could have expected, we'll be able to have immersive VR experiences that are crafted to each person.
Sooner than anyone could have expected, we won't be able to identify deepfakes anymore.
We sure live in interesting times!
---
[a] With apologies to Daft Punk: https://www.youtube.com/watch?v=gAjR4_CbPpQ
I don't think you get "The Green Mile" from something like this.
Local art, local actors, local animations telling stories about local culture. A netflix for every city, even neighborhoods. That's going to be crazy fun.
There is so much that has to be conveyed in making a film, if you want it to say something particular.
And? What's the problem with that? You seem to be locked in a "prompt to get a movie" mindset.
...yet
The limitations of reality seem to have a positive effect on the overall process of film making, for whatever reason. I expect generative AI film will be at least as bad. Gonna be hard to get an entire well-crafted film out of them.
The same AI can still raise the minimum bar for quality. Or replace YouTubers and similar while they're still learning how to be good in the first place.
No idea where we are in this whole process yet, but it's a continuum not a boolean.
Pinnacle is not the word I'd use. Race to the bottom, least possible effort, plausibly deniable quality, gross exploitation, capitalist bottom line - those are all things I'd use to describe current "art" awards like Grammy, Oscars, Cannes, etc.
The media industry is run by exploiting artists for licensing rights. The middle men and publishers add absolutely nothing to the mix. Google or Spotify or platforms arguably add value by surfacing, searching, categorizing, and so on, but not anywhere near the level of revenue capture they rationalize as their due.
When anyone and everyone can produce a film series or set of stories or song or artistic image that matches their inner artistic vision, and they're given the tools to do so without restriction or being beholden to anyone, then we're going to see high quality art and media that couldn't possibly be made in the grotesquely commercial environment we have now. These tools are as raw and rough and bad performing as they ever will be, and are only going to get better.
Shared universes of prompts and storylines and media styles and things that bring generative art and storytelling together to allow coherent social sharing and interactive media will be a thing. Kids in 10 years will be able to click and create their own cartoons and stories. Parents will be able to engage by establishing cultural parameters and maybe sneak in educational, ethical, and moral content designed around what they think is important. Artists are going to be able to produce every form of digital media and tune and tweak their vision using sophisticated tools and processes, and they're not going to be limited by budgets, politics, studio constraints, State Department limitations, wink/nod geopolitical agreements with nation states, and so on.
Art's going to get weird, and censorship will be nigh on impossible. People will create a lot of garbage, a lot of spam, low effort gifs and video memes, but more artists will be empowered than ever before, and I'm here for it.
Any accolades, be they from professional groups, people's-choice awards, Rotten Tomatoes, or IMDb ratings.
> Race to the bottom, least possible effort, plausibly deniable quality, gross exploitation, capitalist bottom line - those are all things I'd use to describe current "art" awards like Grammy, Oscars, Cannes, etc.
I find them ridiculous in many ways, but no, one thing they're definitely not is a race to the bottom.
If you want to see what a race to the bottom looks like, The Room has a reputation for being generally terrible, "bad movie nights" are a thing, and Mystery Science Theater 3000's schtick is to poke fun at bad movies.
> The media industry is run by exploiting artists for licensing rights.
Yes
> The middle men and publishers add absolutely nothing to the mix. Google or Spotify or platforms arguably add value by surfacing, searching, categorizing, and so on, but not anywhere near the level of revenue capture they rationalize as their due.
I disagree. I think that every tech since a medium became subject to mass reproduction (different for video and audio, as early films were famously silent) has pushed things from a position close to egalitarianism towards a winner-takes-all. This includes Google: already-popular things become more popular, because Google knows you're more likely to engage with the more popular thing than the less popular thing. This dynamic also means that while anyone will be free to make their own personal vision (although most of us will have all the artistic talent of an inexperienced Tommy Wiseau), almost everyone will still only watch a handful of them.
> Art's going to get weird, and censorship will be nigh on impossible.
Bad news there, I'm afraid. AI you can run on your personal device is quite capable of being used by the state to drive censorship at the level of your screen or your headphones.
Nearly all the movies that go to theaters aren't "meaningful art". Not only that but what's meaningful to you isn't necessarily what's meaningful to others.
If someone can get their own personal "Godzilla VS The Iron Giant" crossover made into a feature-length film it will be meaningful to them.
No but what they are is expensive, flashy, impressive productions which is the only reason people are comfortable paying upwards of $25 each to see them. And there's no way in the world that an AI movie is going to come anywhere close to the production quality of Godzilla vs Kong.
And like, yeah, their example videos at the posted link are impressive. How many attempts did those take? Are they going to be able to maintain continuity of a character's appearance from one shot to the next to form a coherent visual structure? How long can these shots be before the AI starts tripping over itself and forgetting how arms work?
We will be the old ones going "back in my day, you had to actually shoot movies on a camera! And background objects had perfect continuity!" And they will roll their eyes at us and retort that nobody pays attention to background objects anyway.
But I have faith that people will notice the difference. The current generation may not care about autotune, but that doesn't mean another generation won't. People rediscover differences and decide what matters to them.
When superhero movies were new, almost everyone loved them. I was entranced. After being saturated with them... the audience dropped off. We started being dissatisfied with witty one-liners and meaningless action. Can you still sell a superhero movie? Sure. Like all action movies, they internationalize well. But the domestic audiences are declining. It makes me think of Westerns. At one time, they were a Hollywood staple. Now, not so much. Yes, they still make them, and a good one will do fine, but a mediocre one... maybe not.
The previous generation's concern about autotune was also flatly wrong. Autotune was used by a few prominent artists then, and is more widely used now, as an aesthetic choice, for the sound it creates, which is distinctly not natural singing: the effect was produced by running the autotune plugin at a much, much higher setting than was intended for regular use.
Tone correction occurs in basically every song production now, and you never hear it. Hell, newer tech can perform tone correction on the fly for live performances, and the actual singing being done on the stage can be swapped out on the fly with pre-recorded singing to let the performer rest, or even just lipsync the entire thing but still allow the performer to jump in when they want to and ad-lib or tweak delivery of certain parts of songs.
The autotune controversy was just wrong from end to end. When audio engineers don't want you to hear them correcting vocals, you don't hear it. I'd be willing to buy another engineer being able to hear tone correction in music, but if a layman says they can, sorry, I assume that person's full of shit.
And maybe they won't have a problem with it, like you say, maybe that'll just be their "normal" but that seems so fucking sad to me.
For newer in-copyright works, public libraries commonly offer Libby:
https://company.overdrive.com/2023/01/25/public-libraries-le...
It gives anyone with a participating-system library card free electronic access to books and magazines. And it's unlikely that librarians themselves will be adding AI book-slop to the title selection.
To be clear, I'm not talking great literature. I'm talking Clifford the Big Red Dog type stuff.
That said I still have a number of problems with this assertion:
It will absolutely be down, in part, to economic necessity. Amazon's platform is already dealing with a glut of shitty AI books and the key way they get ahead in rankings is being cheaper than human-created alternatives, and they can be cheaper because having an AI slop something out is way less expensive and time consuming than someone writing/illustrating a kid's book.
Moreover, our economy runs on the notion that the easier something is to do, the more likely people are to do it at scale, and vetting your kids media is hard and annoying as a parent at the best of times: if you come home from working your second job and are ready to collapse, are you going to prepare a nutritious meal for your child and set them up with insightful, interesting media? No you're going to heat up chicken nuggets and put them in front of the iPad. That's not good, but like, what do you expect poor parents to do here? Invent more time in the day so they can better raise their child while they're in the societal fuckbarrel?
And yes, before it goes into that direction, yes this is all down to the choices of these parents, both to have children they don't really have the resources to raise (though recent changes to US law complicates that choice but that's a whole other can of worms) and them not taking the time to do it and all the rest, yes, all of these parents could and arguably should be making better choices. But ALSO, I do not see how it is a positive for our society to let people be fucked over like this constantly. What do we GAIN from this? As far as I can tell, the only people who gain anything from the exhausted-lower-classes-industrial-complex are the same rich assholes who gain from everything else being terrible, and I dunno, maybe they could just take one for the team? Maybe we build a society focused on helping people instead of giving the rich yet another leg up they don't need?
This is what I mean by "complicated factors of culture and habit." An iPad costs more than an assortment of paper books. Frozen chicken nuggets cost more than basic ingredients. But the iPad and nuggets are faster and more convenient. The kids-get-iPad-and-nuggets habit is popular with middle-income American families too, not just poor families where parents work two jobs. The economic explanation is too reductive.
I'm not trying to say that this is the "fault" of parents or of anyone in particular. When the iPad came out I doubt that Apple engineers or executives thought "now parents can spend less time engaging with children" or that parents thought of it as "a way to keep the kids quiet while I browse Pinterest" but here we are.
Is it too much to ask for a hint of caution with regard to our most vulnerable populations brains?
Circa 2020 a huge number of fed-up good-intentioned engineers and designers quit. It had no effect, at all.
To be clear: I am not saying that engineers need to be better at preventing this stuff. I am saying regulators need to demand that companies be careful, and study how this stuff is going to affect people, not just yeet it into the culture and see what happens.
There is a vast difference between a formulaic hollywood movie and some guy with a camera. If I say "Godzilla vs. The Iron Giant" what is the plot? Who is the good guy? Why does the conflict take place?
AI will come up with something. Will it be compelling even to the audience of one?
As a toy, maybe. As an artistic experience... not convinced.
You still aren't getting it. Movie directors aren't making these decisions either.
What they are doing, is listening to market focus groups and checking off boxes based on the data from that.
Market-focus-group-driven decisions for a movie are just as much of an "algorithm", if not more so, than when a literal computer makes the decision.
That's not art. It's the same as if a human ran an algorithm by hand and used that to make a movie.
If it were really all just market decisions, directors would have no influence. This is not remotely the case. Nor are they paid as though that were true.
I assure you, they don't do surveys on the punchiness and strategy used by foley artists; the slope and toe of the film stock chosen for cut scenes by the DP or that those cut scenes should be shot like cut scenes instead of dream sequences; the kind of cars they use; how energetic the explosions are; clothing selection and how the costumes change situationally or throughout the film; indescribably nuanced changes in the actor's delivery; what fonts go on the signs; which props they use in all of the sets and the strategies they use to weather things; what specific locations they shoot at within an area and which direction they point the camera, how the grading might change the mood and imply thematic connections, subtle symbolism used, the specifics of camera movements, focus, and depth of field, and then there's the deeeep world of lighting... All of those things and a million others are contributions from individual artists contributing their own art in one big collaborative project.
You're imagining "pls write film" but the case of being able to film something and then adjust and tweak it, easily change backdrops etc could lead to much higher polish on creations from smaller producers.
Would the green mile be any less hard hitting if the lights flickering were caused by an AI alteration to a scene? If the mouse was created purely by a machine?
The printing press led to publishing works being reachable by more people so we got tons of garbage but we also got those few individual geniuses that previously wouldn't have been able to get their works out.
I see similarities in indie video/PC games recently too. Once the tech got to the point that an individual or small group could create a game, we got tons of absolute garbage but also games like Cave Story and Stardew Valley (both single creators IIRC).
Anything that pushes the bar down on the money and effort needed to make something will result in way more of it being made. It also hopefully makes it possible for those rare geniuses to give us their output without the dilution of having to go through bigger groups first.
I'm also excited from the perspective that this decouples skills in the creative process. There have to be people out there with tremendous story telling and movie making skills who don't have the resources/connections to produce what they're capable of.
To do something similar, this has to allow the director (or whoever is prompting the AI) to control all meaningful choices so that they get more or less the movie they intend. That seems far away from what is demonstrated.
So now a director with a limited budget but with a good vision and understanding of the tools available has a better chance to realize their vision. There will be tons of crap put out by this tool as well. But I think/hope that at least one person uses it well.
But because it will make shooting a movie more accessible to people with limited budgets, the movie studios, who literally gatekeep access to their sets and moviemaking equipment, are going to have a smaller moat. The distribution channels will still need to select good films to show in theaters, TV, and streaming, but the industry will probably be changing in a few years if this development keeps pace.
I'm not against tools for directors, but the thing is, directors tell actors things and get results. Directors hire cinematographers and work with them to get the shots they want. Etc. How does that happen here?
Also, as someone else mentioned, there is the general problem that heavily CG movies tend to look... fake and uncompelling. The real world is somehow just realer than CG. So that also has to be factored into this.
It will start out with more believable green screen backgrounds and b roll. Used judiciously, it will improve immersion and cost <$10 instead of thousands. The actors and normal shots will still be the focus, but the elements that make things more believable will be cheaper to add.
Have you ever noticed that explosions look good? Even in hobby films? At some point it became easy to add a surprisingly good looking explosion in post. The same thing will happen here, but for an increasing amount of stuff.
Which doesn't mean it won't keep happening (economics), but it doesn't necessarily mean any improvement in movie quality.
https://www.vulture.com/article/movies-fire-computer-generat...
But maybe you do get Deadpool & Wolverine 3
Guess where the money is?
But, in reality, even making that kind of film is miles away from these examples.
Well, given the studios still hold the copyright, they can severely constrain supply to keep profits up.
My suspicion is that this kind of stuff gradually reduces some of the labor involved in making films and allows studios to continue padding their margins.
It's a powerful tool. A painting isn't better because the artist made their own paint. A movie made with an IRL camera may not be better than one made with an AI camera.
However, I see an interesting middle ground appear: a talented writer could utilize the AI tooling to produce a movie based upon their own works without having to involve Hollywood, both giving more writers a chance to put their works in front of an audience as well as ensuring what's produced more closely matches their material (for better or worse).
The algorithms and people making content for the algorithm were trends that have dominated for years already.
None of that is "real" art, when you are just making something optimized for an algorithm.
Similarly, a film director "just" gives guidance to a bunch of people: actors, camera operators, etc. Do you consider the movie his creation, even if he didn't directly perform any action? A photographer just has to push a button and the camera captures an image. Is the output still considered his creation? Yes and yes, so I think we should consider the same with AI-assisted art forms. Maybe the real topic is the level of depth and sophistication in the art (just like the difference between your iPhone pictures and a professional photographer's) but in my opinion this is orthogonal to it being human or AI generated.
To be honest so far we have mostly seen AI video demos which were indeed quite uninteresting and shallow, but now filmmakers are busy learning how to harness these tools, so my prediction is that in no time you will see high quality and captivating AI generated films.
(1) https://artefact-ai-film-festival.com/golden-hours-66f869b36... Please consider liking it!
Excellent work!
If yall needed evidence of these tools giving everyday people the ability to make emotion-tugging creations, I'll send you a picture of the tears!
Now I'm thinking I can finally make the (IMO) dope music videos that come to me sometimes when I'm listening to a song I really love.
[1] The first minutes of UP
I liked it.
What happened to the mom? Why does the kid get older and younger looking? why does the city flicker in the beginning? which kid is his in the ballet performance? why do they randomly have "lazy eye"? i could keep going but i think we all get my point.
I can intuit the tropes used by the AI to convey meaning, and i'd be willing to list them all with relevant links for the paltry sum of $50. Be warned, it will be a very large list. Tropes and "memes" are doing 100% of the heavy lifting of this "art".
Sorry, human. As someone who stopped creating art on a daily basis due to market dilution (read: it's too hard to build a fanbase that i care about), i am very critical of most "art" produced anyhow.
this is dogshit.
Not that cyber-bullying and usurpation schemes escalating to a whole new level are any less of a concern in the aftermath, to be clear.
Most people will not notice if the soundtrack to a new TV show is made by a 5 word AI prompt of "exciting build-up suspense scene music" while they're pouring money into their mobile gacha game to get the "cute girl, anime, {color} {outfit}" prompt picture that is SSS rank.
You or I might not care for AI slop, but it's a lot cheaper to produce for Netflix or Zynga or Spotify or whatever, and if they go this route, they don't have to pay for writers, actors, illustrators, songwriters, or licensing for someone else's product. They'll just put their own AI content on autoplay after what you're currently watching, and hope most people don't care enough to stop it and choose something else.
Ever notice how they never show anyone moving quickly in these clips?
Counterpoint: home "studio" recording has been feasible for decades, but music execs are not ruffled. Sure, you get a Billie Eilish debut album once a generation, but the other 99.99% of charting music is from the old guard. The media/entertainment machine is so much bigger than just creating raw material.
As they say, porn is always the leading spear of technology. It's something to keep an eye on (no pun intended) to understand how society will accept/integrate generated content.
However, commercially it seems like a niche within the existing structures of porn. Mostly competing with the market for animated stuff. At least that's where it is right now, and it's already at photorealistic parity with human content creators.
We have not had the ability to create interesting ai porn vids yet. How would we? Meta just showed movie gen.
But I'm pretty sure the short, subtly moving clips of women have already gotten one or the other going. Just wait a little until you can really create what you are looking for.
I'm not sure exactly what models the account owners use, but I think it's a mix of Stable Diffusion video touched up with Adobe tooling.
What scares me most is that in my opinion, by far the best prompt writers are the ones who are deeply "motivated" and "experienced" with prompting. Often the best prompters have only one hand on the keyboard at a time.
I can trivially fine-tune and create more art from certain artists in an hour than they have produced themselves in their whole careers. This makes a lot of people very upset.
This sounds marginally above fanfiction, so I do think it'll be very easy to tell them apart. "Terminator, except with Adam Sandler and set on Mars" is a cute, gimmicky idea, not a competitor to serious work.
I'm sure this can be used to create entertaining movies that are fun and wacky. I don't think it can create impactful movies.
It's no more ridiculous than saying what makes a painting impactful is the brush strokes. But if I copy Picasso's work stroke for stroke, why am I not Picasso? After all, the dumbass paints like a child, admittedly! How could someone like him ever be considered a great painter?
However, merely describing something is not doing the thing. Otherwise, the business analysts at my company would be software engineers. No, I make the software, and they describe it.
The end-goal here is humanless automation, no? Then I'm not sure your assumption holds up. If there's no human, I question the value.
You may question the value but if it's anything like rugs you won't be in the majority. People pay a significant premium for artisanal handmade rugs, but more than 95% of the rugs people use are machine made, because they're essentially indistinguishable from handmade ones, much cheaper, and just as functional.
I think you can do this with video too, just more challenging right now.
On social media platforms, typically the most popular content triggers the strongest emotions. It's rage-bait however, or sadness bait, or any other kind of emotional manipulation. It tricks the human mind and drives up engagement, but I don't think that is indicative of its value.
To be clear, I'm certain that's not what you're doing, and the music is good. But I think it's complicated enough that triggering emotions isn't enough data to ascertain value.
I don't know, exactly, what combination of measurements are needed to ascertain value. But I'm confident human-ness is part of the equation. I think if people are even aware of the fact a human didn't make something they lose interest. That makes the future of AI in entertainment dicey, and I think that's what fuels the constant dishonesty around AI we're seeing right now. Art is funny in how it works because, I think, intention does matter. And knowledge about the intention matters, too. It maybe doesn't make much sense, but that's how I see it.
Meanwhile, as someone who has been engaged with the AI art community for years, and spent years volunteering part-time as a content moderator for Midjourney, the process of creating art via AI with intentionality is deeply human.
As an MJ mod, I have seeeeeeen things.... It's like browsing through people's psyche. Even in public portfolios people bare their souls because they assume no one will bother to look. People use AI to process the world, their lives, their desires, their trauma. So much of it is straight-up self-directed art therapy. Pages and pages, thousands of images stretching over weeks, sometimes months, of digging into the depths of their selves.
Now go through that process to make something you intend to speak publicly from the depth of your own soul. You don't see much of that day to day because it is difficult. It's risky at a deeply personal level to expose yourself like that.
But, be honest: How much deeply personal art do you see day to day? You see tons of ads and memes. But, to find "real art" you have to explicitly dig for it. Shitposting AI images is as fun and easy as shitposting images from meme generators. So, no surprise you see floods of shitposts everywhere. But, when was the last time you explicitly searched out meaningful AI art?
You bring up a good point - very little. But, to be fair, those people aren't necessarily trying to convince me it's art.
I think you're mostly right but I am a little caught up on the details. I think it's mostly a thing of where the process is so different, and involves no physical strokes or manipulation, that I doubt it. And maybe that's incorrect.
However, I also see a lot of people who don't know how to do art pretending like they've figured it all out. I see the problem with that too. It wouldn't be such a problem if people didn't take such an overly-confident stance in their abilities. I mean, it's a little offensive for that guy mucking around for an hour to act like he's da Vinci. And maybe he's in the minority, I wouldn't know, I don't have that kind of visibility into the space.
I think a lot of the friction comes from that. Shitposts are shitposts, but I mean... we call them shitposts, you know? They, the people that make them, call them shitposts. There's a level of humility there I haven't necessarily seen with "AI Bros".
I think, if you really love art, AI can be a means to create a product but it can also be a starting point to explore the space. Explore styles, explore technique, explore the history. And I think that might be missing in some cases.
For a personal example, I'm really into fashion and style. I love clothes and always have. But it's really been an inspiration to me to create clothes, to sew. I've done hand sewing, many machine stitches too. And I don't need to - I could explore this in a more "high-level" context, and just curate clothing. But I think there's value in learning the smaller actions, including the obsolete ones.
https://www.clairesilver.com/collections
from the POV of fashion illustration. Her "corpo|real" collection took something like 9 months to create and was published nine months ago.
To use a real-world example: if the Renaissance-era patrons had merely written down their preferences and had work made to match those preferences, it's highly unlikely that you'd have gotten the Mona Lisa or David.
Which is to say that, there will definitely be some interesting and compelling art made with AI tools. But it will be made by a specific person with an artistic vision in mind, and not merely an algorithm checking boxes.
At best, you'll get something like a generic sitcom. The idea that "all visual stimulus will be satisfied by hypertuned AI models" doesn't line up with how people experience the arts, at all.
I don't want to carry mechanical solutions labelled culture - deterministic enough, despite hallucinations - into the next generation that follows my own. It's an impressive advancement for automation, sure, but just not a value worth sharing as human development.
That being said, I think GenAI could be a valuable addition in any blueprint-/prototype-/wireframing phase. But, ironically, it positions itself in stark contrast to what I would consider my standards to contemporary brainstorming, considering the current Zeitgeist:
- truthful to history and research (GenAI is marketing and propaganda)
- aware of resources (GenAI is wasteful computing)
- materialistic beyond mere capitalistic gains (GenAI produces short-lived digital data output and isn't really worth anything)
Out of curiosity, what is it that people do with these things? Do they put them on TikTok?
The demos were made by nerds (said with love) with a limited time window. Wait until the creatives get a hold of the tool.
I've been doing this with ChatGPT, except it's more of a "turn into a screenplay" then "create a graphic of each scene" and telling it how I want each character to look. It works pretty well but results in more of a graphic novel than a movie. I've definitely been waiting for the video version to be available!
Many creative works these days require the effort and input of so many people, so much time, and so much money that they can't have a specific creative vision. Mediums like books, comics, indie movies, and very low budget indie games, where the end product was created by the smallest number of people, have the most potential to be interesting and creative. They can take risks. This doesn't mean they will be good, most aren't, but it means that the range of quality is much broader, with some having a chance to shine in ways which big budget projects just can't. The issue with small teams and small budgets is that they are inherently limited in what they can create. Better tools allow smaller groups of people to make things that previously would have required an entire studio, but without diluting the creative vision.
Will this also result in a tidal wave of low effort garbage? Of course it will. But that can be ignored.
RIP Pika and ElevenLabs… tho I guess they always can offer convenience and top tier UX. Still, gotta imagine they’re panicking this morning!
Upload an image of yourself and transform it into a personalized video. Movie Gen’s cutting-edge model lets you create personalized videos that preserve human identity and motion.
Given how effective the still images of Trump saving people in floodwater and fixing electrical poles have been despite being identifiable as AI if you look closely (or think…), this is going to be nuts. 16 seconds is more than enough to convince people, I'm guessing the average video watch time is much less than that on social media. Also, YouTube shorts (and whatever Meta's version is) is about to get even worse, yet also probably more addicting! It would be hard to explain to an alien why we got so unreasonably good at optimal content to keep people scrolling. Imagine an automated YouTube channel running 24/7 A/B experiments for some set of audiences…
If one watches movies, reads books, etc. just to pass the time, maybe this would be some kind of boon. But for those of us looking for meaningful commentary on life, looking to connect with other human beings, this would be some circle of hell. It's some kind of solipsism.
If you disagree with that, you're basically saying La Jetee isn't art, which would be a hard sell.
There are a LOT of choices in making a movie, and if you just let the AI make them, you are getting "random" (uncontrolled) choices. I don't think that is going to compare favorably to the real thing.
If you can specify all that, then it's just a tool. Cool. But it's still going to take pro-level skills to use it.
But would AI be able to quote Vertigo, like La Jetee does? Doesn't art, at least to some degree, require intent (including all intentional subversions of that intent dogma, of course)?
Of course AI can never truly experience being human, it has no emotions, but it is excellent at mimicry and it can certainly provide a meaningful outside perspective.
Is there anything to say about humanity that is not in the training corpus already?
Nothing AI has yet done has demonstrated anything at the level of art or mastery. I guess I'm unconvinced that throwing a million stories into the blender and synthesizing is going to produce a compelling one.
That prediction became true for like 5% of the population, everyone else is probably stupider than they were before, thanks to social media.
Similarly, I think your prediction will apply to a small subset of humanity.
I get that this is kind of a fundamental line in the sand for most of the "AI art" going around, and it seems like most people fall on one side or the other. "I consume art for entertainment" vs "I interact with art to experience the human condition".
I also don't want to say that AI Art has no value, because I think as a tool to help artists realize their vision it can be very useful! I just don't think that art entirely made by AI is interesting.
The problem: In my limited playing with these tools they don't quite hit the mark, and I would easily be able to tweak something if I had all the layers used. I imagine in the future products could be used to tweak this to match what I think the output should be....
At least the code generation tools are providing source code. Imagine them only giving compiled bytecode.
It so happens that there are innumerable samples of prose and source code and rendered songs and videos and images to use as this training data.
But that's not so much the case for professional workflows (outside of software development).
If the tools can evolve to generating usefully detailed and coherent media projects instead of just perceptually convincing media assets, it's going to be a while before they get there.
As it stands, the only chance you have of depicting a consistent story across a series of shots is image-to-video, presuming you can use LoRAs or similar techniques to get the seed photos consistent in themselves.
Is it available for use now? Nope
When will it be available for use? On FB, IG and WhatsApp in 2025
Will it be open sourced? Maybe
What are they doing before releasing it? Working with filmmakers, improving video quality, reducing inference time
It’s already nearly impossible to find quality content on the internet if you don’t know where to look.
We will finally achieve the dream of everything being in public domain!
I wonder if it will be the same with AI. When you can have anything for nothing, it has no value. So the digital world will have little meaning.
More likely the average person will happily lap up AI generated slop.
> the technical skills necessary to make movies, draw, or pluck strings
AI will (hopefully) be an accelerator for the people still putting in the hours. At least it is for coding
The act of creating teaches you to be better at creating, in that way and in that context. This is why people with practice and expertise (e.g., professional artists, like screenwriters and musicians) can reliably create new things.
I don't agree. There's some skill, some theory, behind it. But mastering this alone is almost worthless.
There's a huge overlap between creatives and mental illness, particularly bipolar disorder. It seems perfectly mentally stable people lack that edge and insight. To me, that signals there is some magic behind it.
And it's magic because it must not be rational and it must not make sense, because the neurotypical can't see it.
I think it's sort of like how you can beat professional poker players with an algorithm that's nonsensical. They're professionals so they're only looking at rational moves; they don't consider the nonsensical.
That's the biggest edge, commitment.
To think that you _need_ to be neurodivergent to be an artist is nonsensical, and stating that mastering the craft itself is worthless indicates a lack of respect for artists' work.
I'm baffled by this type of comment here in all honesty. Really, broaden your horizons.
You will notice I never said this.
All I said, and is true, is there is a correlation between being an artist and being neurodivergent.
> stating mastering the craft itself is worthless
Where did I say this too?
It appears you're having an argument with a ghost. You're correct, that argument is baffling! I wonder then why you made it up if you're just gonna get baffled by it? Seems like a waste of time, no?
Look, art is two things: perspective and skill. One without the other is worthless.
I can have near perfect skill and recreate amazing works of art. And I will get nowhere. Or, I can have a unique and profound perspective but no skill, and then nobody will be able to decipher my perspective!
> But mastering this alone is almost worthless.
> And it's magic because then it must not be rationale and it must not make sense, because the neurotypical can't see it.
Not trying to take them out of context, but specifying them. You mention, from my understanding, that mastering is almost worthless without the magic, and the magic only being there if you're neurodivergent.
This implies one cannot be a proper artist if not neurodivergent. Now, I could be misinterpreting it, so I apologize in advance.
> There's a huge overlap between creatives and mental illness
Keyword overlap, but I don't think it's 100%
Magic is maybe not the right word here, but I do think it's indescribable. It's some sort of perspective.
But I stand by this: > that mastering is almost worthless without the magic
How, exactly, you obtain the magic is kind of unknown. But I do think you need it. Because skill alone is just not worth much outside of economics. You can make great corporate art, but you're not gonna be a great artist.
I think if you're perfectly rationally minded, you're going to struggle a lot to find that magic. I shouldn't say it's impossible, but I think it's close to.
I don’t know if neurodivergence might have any overlap, but I wouldn’t be surprised if a study revealed it to be as correlated as the fact that most rich people were born into wealthy families.
Take poaching eggs for example. Let’s say you automate that 100% so as a human you never need to do it again. Well, how good are your omelettes then? It’s a similar activity — keeping eggs at the right temperature and agitation for the right amount of time. Every new thing you learn to do with eggs — poaching, scrambling, omelettes, soft-cooking for ramen — will teach you more about eggs and how to work with them.
So the more you automate your cooking with eggs the worse you get at all egg-related things. The KitchenBot-9000 poaches and scrambles perfect eggs, so why bother? And you lose the knowledge of how to do it, how to tell the 30-second difference between “not enough” and “too much.”
The public can create vast amounts of spectacular original content right now using Dalle, MidJourney, Stable Diffusion - they have very little interest in doing so. Only a tiny fraction of the population has demonstrated that it cares whatsoever about generative media. It's a passing curiosity for a flicker of an instant for the masses.
The hilariously fantastical premise of: if we just give people massive amounts of time, they'll dedicate their brains to creativity and exploration and live exceptionally fulfilling lives - we already know that's a lie for the masses. That is not what they do at all if you give them enormous amounts of time, they sit around doing nothing much at all (and if you give them enormous amounts of money to go with it, they do really dumb things with it, mostly focused on rampant consumerism). The reason it doesn't work is because all people are not created equal, all people are not the same, all brains are not wired the same, the masses are mimics, they are unable & unwilling to originate as a prime focus (and nothing can change that).
Csound: To make a sine tone, we'll describe the oscillator in a textfile as if it were a musical instrument. You can think of this textfile as a blueprint for a kind of digital orchestra. Later we'll specify how to "play" this orchestra using another text file, called the score.
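The "hello world" of synthesis described above, a plain sine tone, can also be rendered procedurally in a few lines. A minimal sketch in Python rather than Csound (the filename `sine_a440.wav` and the amplitude/duration choices are arbitrary, not from any particular tutorial):

```python
import math
import struct
import wave

SR = 44100    # sample rate in Hz
FREQ = 440.0  # A4
DUR = 2.0     # seconds
AMP = 0.5     # peak amplitude as a fraction of full scale

# Render the sine tone sample-by-sample, the job Csound's `oscil`
# opcode does from its function-table lookup.
samples = [
    AMP * math.sin(2 * math.pi * FREQ * n / SR)
    for n in range(int(SR * DUR))
]

# Write a 16-bit mono WAV file.
with wave.open("sine_a440.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 16-bit samples
    f.setframerate(SR)
    f.writeframes(b"".join(
        struct.pack("<h", int(s * 32767)) for s in samples
    ))
```

The contrast makes the grandparent's point: Csound splits the "instrument" (orchestra file) from the "performance" (score file), so the same description can be played many ways, whereas this script bakes both into one procedure.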
Most human-created art is rather bad. I used to go to a lot of art openings, and we'd look at some works and ask "will this have been tossed in five years?"
That said, AI is probably a threat to roles in the entertainment industry. But it's also worth noting that much of the creativity was being sucked out of entertainment well before AI arrived.
And of course if you can combine skills with sculpture with graphic design you're getting more specialized and are more likely to make a living - even if the field of graphic design is decimated by AI. That's generally how I feel about my skills as a programmer. I'm not just a programmer. So even if AI does most of the work with coding I can still write code for income as long as it's not the only reason I'm getting paid.
So far AI doesn’t seem very good at the creative element.
We heard it again when electronic music started becoming a thing.
Formula 1 wouldn’t exist if the blacksmiths had their way.
The unknown scares people because they are afraid of their known paradigms being shattered. But the new things ahead are often beyond anything of which we could ever dream.
Be optimistic.
Conversations I have with people in real life almost always come back to this point. Most people find AI stuff novel, but few find it particularly interesting on an artistic level. I only really hear about people being ecstatic about AI online, by people who are, for lack of a better term, really online, and who do not have the skills, know-how, or ability, to make art themselves.
I always find the breathless joy that some people express at this stuff with confusion. To me, the very instant someone mentions "AI generated" I just instantly find it un-interesting artistically. It's not the same as photoshop or using digital art suites. It's AI generated. Insisting on the bare minimum human involvement as a feature is just a non-starter for me if something is presented as art.
I'll wait to see if the utopian vision people have for this stuff comes to fruition. But I have enough years of seeing breathless positivity for some new tech curdle into resignation that it's ended up as ad focused, bland, MBA driven, slop, that I'm not very optimistic.
You can make the guidance as superficial or detailed as you like. Input detailed descriptions, use real images as reference, you can spend a minute or a day on it. If you prompt "cute dog" you should expect generic outputs. If you write half a screen with detailed instructions, you can expect it to be mostly your contribution. It's the old "you're holding it wrong" problem.
BTW, try to input an image in chatGPT or Claude and ask for a description, you will be amazed how detailed it can get.
You want a painting of your dog. You send the painter dozens of photos of your dog. You describe your dog in rapturous, incredible, detail. You receive a painting in response. Did you make that painting? Were you the artist in any normal parlance?
When you use ChatGPT or Claude you're signing up to receive the image generated in response to your prompt, not to create that image. Your involvement is always lessened.
You might claim you made that image, but then you would be like a company claiming they made the response to their brief, or the dog owner insisting they were the painter, which everyone would consider nonsensical if not plain wrong. Are they collaborators? Maybe. But the degree of collaboration in making the image is very very small.
The symphony conductor just waves her hands reading the score, does she make music? The orchestra makes all the sounds. She just prompts them. Same for movie director.
Exactly, isn't it amazing? You can travel the latent space of human culture in any direction. It's an endless mirror house where you can explore. I find it an inspiring experience, it's like a microscope that allows zooming into anything.
How familiar are you with what is possible and how much human effort goes towards achieving it?
Photography, digital painting, 3D rendering -- these all went through a phase of being panned as "not real art" before they were accepted, but they were all eventually accepted and they all turned out to have their own type of merit. It will be the same for AI tools.
> Photography, digital painting, 3D rendering
You still make these. You sit down and form the art.
When you use AI you don't make anything, you ask someone else to make it, i.e. you've commissioned it. It doesn't really matter if I sit down for a portrait and describe in excruciating detail what I want, I'm still not a painter.
It doesn't even matter, in my eyes, how good or how shit the art is. It can be the best art ever, but the only reason art, as a whole, has value is because of the human aspect.
Picasso famously said he spent his childhood learning how to paint professionally, and then spent the rest of his life learning how to paint like a child. And I think that really encapsulates the meaning of art. It's not so much about the end product, it's about the author's intention to get there. Anybody can paint like a child, very few have the inclination and inspiration to think of that.
You can see this a lot in contemporary art. People say it looks really easy. Sure, it looks easy now, because you've already seen it and didn't come up with it. The coming up with it part is the art, not the thing.
When you use a camera you don't make anything. You press a button and the camera makes it. You haven't even described it.
When you use photoshop you don't make anything. You press buttons and the software just draws the pixels for you. It doesn't make you a painter.
When you use 3D rendering software you don't make anything. You tell the computer about the scene and the computer makes it. You've barely commissioned it.
It's easy to be super reductive. Easy but wrong.
It's the difference between making a house with wood and making a house by telling someone to make a house. One is making a house, one isn't.
The problem with AI is that it's natural language. So there's no skill there, you're describing something, you're commissioning it. When I do photoshop, I'm not describing anything, I'm modifying pixels. When I do 3D modeling, I'm not describing anything, I'm doing modeling.
You can say that those more formal specifications are the same as a description. But they're not. Because then why aren't the business folks programmers? Why aren't the people who come up with the requirements software engineers? Why are YOU the engineer and not them?
Because you made it formally, they just described it. So you're the engineer, they're the business analysts.
Also, as a side note, it's not at all reductive to say people who use AI just describe what they want. That is literally, actually, what they do. There's no more secret sauce than that - that is where the process begins and ends. If that makes it seem really uninspired then that's a clue, not an indicator that my reasoning is broken.
You can get into prompt engineering and whatever, I don't care. You can be a prompt engineer then, but not an artist. To me it seems plainly obvious nobody has any trouble applying this to everyone else, but suddenly when it's AI it's like everyone's prior human experience evaporates and they're saying novel things.
AI makes it easy to generate ten thousand random images. Making something of interest still requires a lot of digging in the tools and in your self.
Not that that isn't a skill in and of itself. I just don't think it's a creative skill. What you're creating is the description, not the product.
The better you get the closer you can get to your original vision.
> Photography, digital painting, 3D rendering
Those are not the same as AI. Using AI is akin to standing beside a great pianist and whispering into his ear that you want "something sad and slow" and then waiting for him to play your request. You might continue to give him prompts, but you're just doing that. In time, you might be called a "collaborator", but your involvement begins at the bare minimum and you have to justify that you're more involved; the pianist doesn't, the pianist is making the music.
You could record the song and do more to the recording, or improv along with your own instrument. But just taking the raw output again and again is simply getting a response to your prompt again and again.
The prompts themselves are actually more artistic, as they venture into surrealist poetry and prose, but the images are almost always much less interesting artistically than the prompts would suggest.
Ok, now I know you're watching through hate goggles. Fortunately, not everyone will bring those to the party.
> Using AI is akin... [goes on to describe a clueless iterative prompting process that wouldn't get within a mile of the front page]
You've really outed yourself here. If you think it's all just iterative prompting, you are about 3 years behind the tools and workflows that allow the level of quality and consistency you see in the best AI work.
> https://www.reddit.com/r/greentext/comments/zq91wm/anons_dis...
Which is exactly the opposite of what the artists claim to want. But god is it hilarious following the anti-AI artists on Twitter who end up having to apologize for liking an AI-generated artwork pretty much as a daily occurrence. I just grab my popcorn and enjoy the show.
Every passing day the technologies making all of this possible get a little bit better and every single day continues to be the worst it will ever be. They'll point to today's imperfections or flaws as evidence of something being AI-generated and those imperfections will be trained out with fine tuning or LoRA models until there is no longer any way to tell.
E: A lot of them also don't realize that besides text-to-image there is image-to-image for more control over composition as well as ControlNet for controlling poses. More LoRA models than you can imagine for controlling the style. Their imagination is limited to strictly text-to-image prompts with no human input afterwards.
AI is a tool not much different than Photoshop was back when "digital artists aren't real artists" was the argument. And in case anyone has forgotten: "You can't Ctrl+Z real art".
Ask any fractal artists the names they were called for "adjusting a few settings" in Apophysis.
E2:
We need more tests such as this. The vast majority of people can't identify AI content nearly as well as they think they can - even people familiar with AI who "know what to look for".
https://www.tidio.com/blog/ai-test/
Artworks (3/4) | Photos (6/7) | Texts (3/4) | Memes (2/2)
Fun excerpt by the way:
> Respondents who felt confident about their answers had worse results than those who weren’t so sure
> Survey respondents who believed they answered most questions correctly had worse results than those with doubts. Over 78% of respondents who thought their score is very likely to be high got less than half of the answers right. In comparison, those who were most pessimistic did significantly better, with the majority of them scoring above the average.
Well put. This is also my experience. And I'm no AI doom-monger or neo-Luddite.
Yes, I've noticed this. The people who are excited about it usually come off as opportunistic (hence the "breathless joy"), and not really interested in letting whatever art/craft they want to make deeply change them. They just want the recognition of being able to make the thing without the formative work. (I hesitate to point this out, anticipating allegations of elitism.)
Plus, really online people tend to dominate online discussions, giving the impression that the public will be happy to consume only AI generated things. Then again, the public is happy to consume social media engagement crap, so I'm very curious what the revealed preference is here.
The value in learning this stuff is that it changes you. I'll be forever indebted to my guitar teacher partially because he teaches me to do the work, and that evidence of doing the work is manifest readily, and to play the long, long game.
For video, it's possible AI can feed into the overall creative pipeline, but I don't see it replacing the human touch. If anything, it opens up the industry to less-technical people who can spend more time focusing on the human touch. Even if the next big film has AI generation in it, if it came from someone with a fascinating story and creative insight, I'll still likely appreciate it.
But I don't think human creativity is going anywhere. Unless there is some breakthrough that moves it far beyond anything we've seen so far, AI will always be trailing behind us. Human creativity might become a more boutique product, like heirloom tomatoes, but there will always be people who value it.
Any AI content that's good, and there are a few of them, actually has plenty of human creativity in it.
There are some AI artists beginning to emerge, and some AI-generated personas out there who are interesting, but they are interesting only because the people behind them made them interesting.
I am not fatalistic at all for the creatives. AI is going to wipe out the producers and integrators (people who specialize in putting things together, like coders who code when tasked, painters who paint when commissioned, musicians who play once provided with the score), not the creatives.
The GOTCHA, IMHO, will be people not developing skills because the machine can do it, but I guess maybe they will develop the skills that make the machine sing.
I have little faith in an optimistic view of human nature where we voluntarily turn more toward more intellectual or worthy pursuits.
On one hand, entertainment has often been the seed that drives us to make the imagined real, but the adjacent possible of rewarding adventure/discovery/invention only seems to get more unaffordable and out of reach. Intellectual revolutions are like gold rushes. They require discovery, that initial nugget in a stream, the novel idea that opens a door to new opportunities that draws in the prospectors. Without fresh opportunity, there is no enthusiasm and we stew in our juices.
I suspect the only thing that might save us from total solipsistic brain-in-vat immersion in entertainment... is something like GLP-1 type agonists. If they can help us resist a plate of Danish, maybe they can protect us from barrages of Infinite Jest brain missiles from Netflix about incestuous cat wizards or whatever. Who knows what alternatives this new permanently medicated society, Pharma-Sapiens, might pursue instead though.
Reading these threads sometimes feels like a bad idea, because you just get new sad ideas about how things will almost certainly be used to make it worse, beyond the ones you can come up with on your own.
I think the musicians that are barely hanging on at this point would prefer to create over having to slog around on tours to pay their health insurance. But nobody is paying for creation.
It's a compelling thought - we all like hope - and I think it might be realistic if all of humanity were made up of the same kind of people who read hacker news.
But is this not what the early adopters of the internet thought? I wasn't there - this is all second hand - but as far as I know people felt that, once everyone gained the ability to learn anything and talk to anyone, anywhere, humanity would be more knowledgeable, more thoughtful, and more compassionate. Once everyone could effortlessly access information, ignorance would be eliminated.
After all, that's what it was like for the early adopters.
But it wasn't so in practice.
I worry that hopeful visions of the future have an aspect of projecting ourselves onto humanity.
Seems more likely we'll just plug ourselves into ever more addicting dopamine machines. That's certainly the trend so far anyway.
AI generated works will find a place beside human generated works.
It may even improve the market for 'artsy' films and great acting by highlighting the difference a little human talent can make.
It's not the art that's at risk, it's the grunt work. What will shift is the volume of human-created drek that employed millions to AI-created drek that employs tens.
My limited understanding is that AI could generate Netflix top 10 hits that mostly recycle familiar jokes. The creators made a great product, but I expect anyone who attended film school would rather try something new; the only issue is that Netflix won't foot the bill (I know, they take a few Oscar swings a year now).
Recent examples: TV Glow, Challengers, Strange Darling. All movies with specific, unique perspectives, visuals, acting choices, scripts, shots, etc. Think about the perspective in The Wire, The Sopranos, Curb Your Enthusiasm. There is plenty of great work that is obviously nearly impossible for an AI to reproduce, and I hope that AI "art" is taxed in a way that funds human projects.
I follow a lot of the new AI gen crowd on Twitter. This community is made up of a lot of creative industry people. One guy who worked in commercials shared a recent job he was on for a name brand. They had a soundstage, actors, sound people, makeup, lighting, etc. setup for 3 days for the shoot. Something like 25 people working for 3 days. But behind that was about 3 months of effort if one includes pre-production and post-production. Think about editing, color correction, sound editing, music, etc.
Your creative children may live in a world where they can achieve a similar result themselves. Perhaps as a small team, one person working on characters, one person doing audio, one person writing a script. Instead of needing tens of thousands of dollars of rented equipment and 25 experts, they will be able to take ideas from their own head and realize them with persistence and AI generation.
I honestly believe these new tools will unlock potential beyond what we can currently imagine.
Curious if anybody has a solution or if this works for that
- Every script in Hollywood will now be submitted with a previs movie.
- Manga to anime converters.
- Online commercials for far more products.
Manga to anime already exists.
Commercials, particularly for social/online, already happening as well.
I can see myself paying a little too much to have a local setup for this.
It's going to be interesting to see how that plays out when you can make just about any kind of media you wish. (Especially when you can mix this as a form of 'embodiment' to realize relationships with virtual agents operated by LLMs.)
Yes, it is. You should try it.
Welcome to making music lol. Since there is so much of it, you have to make the absolute best to even be considered. And then, because so many people make the absolute best, people only care about the persona making the music (as great as you are, you aren’t Taylor Swift, Kendrick Lamar, Damon Albarn). Your friends will never care about your music just because you are friends, don’t fall into that trap. Also nobody cares about music without good lyrics, because again, there is just so much instrumental content out there that sounds the same, lyrics differentiate it with a human, emotional element.
Just make stuff for fun. Listen to it every now and then and feel the magic of “hehe I made that”
Well, that's an exaggeration if I've ever seen one. Firstly, so much of current chart music has atrocious lyrics. And secondly, instrumental music is very popular.
Not to sound too crass, but a parallel could be drawn to smelling one's own farts and wondering why no one else appreciates the smell.
With all due respect, how could there be when at the click of a button you can generate entire songs? You didn't come up with the chord progression, the structure, the melodic motifs, or the lyrics.
My attachment to my works is directly proportional to the amount of effort it took to create them.
It's not the craft that drives attachment in this case but the emotional resonance of something that you think should exist finally existing.
Author's attachment is to a large degree based on the false notion that they somehow contributed to the creation process.
The generic, frigid, un-interesting "product" that is produced by said AI is why no one other than the prompter is moved by the result.
My point wasn’t to debate the merit of generated music, it was simply to highlight the effect I described.
Production requires specifying very precise requirements, which the current gen AI is unable to follow. Even at the most fuzzy production level, like "a song with strings and a choir", Suno will generate something completely irrelevant. And if you try to go deeper, say a classic Moog synth line in the chorus, don't expect to generate something meaningful.
I won't argue that in the most broad sense, prompt engineering is a creative process. Picking which shoes to wear to work is also a creative process. My argument is that this has barely anything to do with the process of music composition or production. You can literally reuse the same prompt to generate an image or a poem.
Both Suno and Udio allow paid subscribers to upload their own clips to extend from. It works for setting up a beat or extending a full composition from a DAW.
Suno's is more basic than Udio's, which allows inpainting and can create intros as well as extensions, but the tools are becoming more and more powerful for existing musicians. With Udio you can remix the uploaded clip, so you can create the chord progression and melody using one set of instruments or styles (or hum it) and transform it into another.
I also use this feature all the time to move compositions from one service to the other. Suno is better at generating intros and interesting melodies while Udio is better at the editing afterwards.
It’s not a sense of pride or accomplishment. I don’t know what it is. Maybe a small amount of pride. It’s hard to say. But there is a definite connection that feels different listening to songs I requested vs those that other people have.
“I want a funny road trip movie starring Jim Carrey and Chris Farley, based in Europe, in the fall, where they have to rescue their mom, played by Lucille Ball, from making the mistake of marrying a character played by an older Steve Martin.”
10 minutes later your movie is generated.
If you like it, you save it, share it, etc.
You have a queue of movies shared by your friends that they liked.
Content will be endless and generated.
Sub one percent of people are going to be willing to put in the hours to do it.
The bulk of the spammed created content will be: the masses very briefly playing with the generative capabilities, producing low quality garbage that after five minutes nobody is interested in and then the masses will move on to the next thing to occupy a moment of their time. See: generative image media today. So few people care about the crazy image creation abilities of MidJourney or Flux, that you'd think it didn't exist at all (other than the occasional related headline about deepfakes and or politics).
The extreme majority will all watch the same things, just as they do today. High quality AI content will be difficult to produce and will be nearly as limited in the future as any type of high quality content is today. The masses will stick to the limited, high quality media and disregard the piles of garbage. Celebrity will also remain a pull for content; nothing about that will ever change (and celebrity will remain scarce, which will assist in limiting what the masses are interested in).
By and large people only want to go where other people are at. Nothing about AI will change that, it's a trait that is core to humanity. The way that applies to content is just the same as it does a restaurant: content is a mental (and sometimes physical) destination experience just as a restaurant or vacation trip is.
Though a reason we would gravitate towards common media more is if what someone brought up in the comments here comes to pass, and celebrities/actors license their likeness to studios only, and amateur tools are not licensed to use them. Though I think there will always be crafty/illegal ways around this. Also, likeness probably won't be worth much, if we can generate any type of character we like anyway. I, for one, couldn't be happier for celebrities and the cultural obsession around them to disappear.
Plebs will get the mass produced stuff, just like it has been for junk food.
In the information case, even if you wanted to sell good quality, verifiable content, how are you going to keep up with the verification costs, or pay people when someone can just dupe your content and automate its variations?
People who are poor don't have the luxury of time, and verifications cannot be automated.
Most people don't work in infosec or Trust and Safety, so this discussion won't go anywhere, but please just know - we don't have the human bandwidth to handle these outcomes.
Bad actors are more prolific and effective than good ones, because they don't have to give a shit about your rules or assumptions.
I do hope that more talented people will have more leverage to create without the traditional gatekeeping, but I also doubt this will happen as the gatekeepers are all funding AI tooling as well.
I'd rather have those people work on climate change solutions.
The argument should never be about reducing energy usage, rather it should be about how we generate that energy in a clean, renewable way.
Better to spend 10x the amount of energy on humans that will give the same result?
At least Microsoft and Google are in a race to be CO2-neutral.
And all of these clusters can also do research, and partially they already do.
It's valid criticism, but we need to stop CO2 production in a lot of other industries before we do that for datacenters. Datacenters save a lot more CO2 (just think about not having to drive to a bank to do banking).
My startup develops AI for the nuclear power industry to drive process, documentation, and regulatory efficiency. We like to say "AI needs nuclear and nuclear needs AI".
Big tech has finally realized/gone public that casually saying things like "we're building our next 1GW datacenter" is uhh, problematic[0].
For some time now there has been significant interest/activity in wiring up entire datacenters to nuclear reactors (existing Gen 2, SMRs, etc):
https://finance.yahoo.com/news/nvidia-huang-says-nuclear-pow...
https://www.ans.org/news/article-5842/amazon-buys-nuclearpow...
https://www.yahoo.com/news/microsoft-signs-groundbreaking-en...
https://www.cnbc.com/2024/09/10/oracle-is-designing-a-data-c...
https://thehill.com/policy/technology/4913714-google-ceo-eye...
[0] - https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-e...
My mind instantly assumes it's a money thing and they're just wanting to charge millions for it, therefore out of reach for the general public. But then with Meta's whole stance on open AI models, that doesn't seem to ring true.
runwayml.com
pika.art
hailuoai.com
lumalabs.ai
This and Sora are particularly annoying, though, for how they put together these huge flashy showcases like they're announcing some kind of product launch and then... nothing. Apparently there's value in just flexing your AI-making muscle now and then.
I pay for Runway right now for experiments and it works. The problem is that maybe 1 out of 10 prompts results in something usable. And when I say usable I have pretty low standards. Since the model pumps out 5 or 10 second clips you have to be pretty creative, since the models still struggle with keeping any kind of consistency between shots. Things like lighting, locations, and characters can all morph within/between clips.
The issue isn't quality exactly, it is like 80% there. When it works, it is capable of blowing your mind. You can get something that looks like it is a bonafide Hollywood shot. But that is a single 5 second or 10 second clip. So far there is no easy way to reliably piece those together to make even a 1 minute long TikTok.
The real problem is the cost. Since you have to sometimes do 10 prompts to get a single acceptable shot it is like a 10x multiplier on the cost per second of video. That can get very expensive for even short experiments.
that's probably where the quality is, but not the billions
We offer both image-to-video (same situation as Runway, need a few attempts to make something awesome) and video-to-video (under the name "Restyle 2.0") - this is our newest tool and is highly reliable, i.e. you can get complex motion (kissing, handshakes, boxing, skateboarding, etc) with controllable changes to input video (changing outfits, characters, backgrounds, styles).
Unlike Runway and Kling, we currently offer a simple UNLIMITED plan for just $10/mo. Check it out! https://alpha.nim.video
They do exist. Luma AI DreamMachine is pretty cool, as well as Kling, Minimax, etc. But they aren't anything like what Sora or this appear to be. They work, but these, while likely cherry-picked, are still a whole new breed of video generation. But who knows if we'll ever actually get to use them, or if we're just supposed to reflect on them and think about how cool and impressive Facebook and OpenAI are.
- How many clicks of Generate are budgeted for?
- How many clicks should each user’s quota be?
- How much advertising revenue will be earned per click?
- Why should they give away a million dollars?
Right now, AI costs for this are so high that offering this feature ‘for free’ would bankrupt a small country in a matter of days, if everyone on Meta used it once. It doesn’t particularly matter what the exact cost is: it’s simply not tolerable to anyone who owes payment for the services provided.
This is also why the AI industry is trying to figure out how to shift as much AI processing as possible to devices without letting users copy their models to profit off of the training research spend.
Facebook absolutely does not have a fleet of GPUs idling that could suddenly spring into action to generate a billion of these videos, nor do they have power stations on standby ready to handle the electricity load.
Those videos are a good measure for monitoring AI video improvement.
I made this with it (after training a Flux Lora on myself)
https://vm.tiktok.com/ZGdJ6uSh1/
Also interesting - blog post from someone who actually got to use Sora https://www.fxguide.com/fxfeatured/actually-using-sora/
TL;DR: it’s still quite frustrating to use
Yeah, we might get the bad killer robots. But it's more likely this will make it unnecessary to wonder where on this blue planet you can still live when we power the deserts with solar and go to space. Getting clean nutrition and environment will be within reach. I think that's great.
As with all technology: Yes a car is faster than you. And you can buy or rent one. But it's still great to be healthy and able to jog. So keep your brains folks and get some skills :)
The model is not released and probably won’t be for a while.
And it probably costs Meta-scale infra to fine-tune to your needs.
#cabincrew
#scarletjohanson
#amen
But I'm worried about this tech being used for propaganda and disinformation.
Someone with a $1K computer and enough effort can generate a video that looks real enough. Add some effects to make it look like it was captured by CCTV or another low-res camera.
This is what we know about, who knows what's behind NDAs or security clearances.
These are smooth and consistent, with no sliding (except the sloth floating in water, where the stones on the right are moving at a much higher rate than the approaching dock) and no things appearing out of nowhere. Editing seems not as high quality (the candle-to-bubble example).
To me, the fact that these didn't induce nausea while being very high quality makes this the best among current video generators.
I am grossed out by this. My instinct is to avoid AI slop. The interesting part to me is: What next? Where do we go? Will "human" forums be pushed further into the obscurity of the internet? Or will it go so far that we all start preferring to meet in person? I'm clueless here.
I was rather thinking classical cryptography baked into generative networks.
Maybe in the future?
I'm a worldcoiner but so far it's just been free money.
For a long time people have speculated about The Singularity. What happens when AI is used to improve AI in a virtuous circle of productivity? Well, that day has come. To generate videos from text you need video+text pairs to train on. They get that text from more AI. They trained a special Llama3 model that knows how to write detailed captions from images/video and used it to consistently annotate their database of approx 100M videos and 1B images. This is only one of many ways in which they deployed AI to help them train this new AI.
They do a lot of pre-filtering on the videos to ensure training on high quality inputs only. This is a big recent trend in model training: scaling up data works but you can do even better by training on less data after dumping the noise. Things they filter out: portrait videos (landscape videos tend to be higher quality, presumably because it gets rid of most low effort phone cam vids), videos without motion, videos with too much jittery motion, videos with bars, videos with too much text, video with special motion effects like slideshows, perceptual duplicates etc. Then they work out the "concepts" in the videos and re-balance the training set to ensure there are no dominant concepts.
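A filtering pass like that can be sketched in a few lines of Python. All field names and thresholds below are hypothetical, just to make the heuristics concrete; this is not Meta's actual pipeline:

```python
def keep_video(meta: dict) -> bool:
    """Return True if a clip passes the (made-up) quality heuristics."""
    # Prefer landscape: drops most low-effort phone-cam videos.
    if meta["width"] <= meta["height"]:
        return False
    # Reject static clips and overly jittery ones.
    if not (0.1 <= meta["motion_score"] <= 0.9):
        return False
    # Reject clips dominated by overlay text or letterbox bars.
    if meta["text_coverage"] > 0.05 or meta["has_bars"]:
        return False
    return True

def dedupe(clips: list[dict]) -> list[dict]:
    """Keep one clip per perceptual hash (near-duplicate removal)."""
    seen, out = set(), []
    for c in clips:
        if c["phash"] not in seen:
            seen.add(c["phash"])
            out.append(c)
    return out

clips = [
    {"width": 1920, "height": 1080, "motion_score": 0.4,
     "text_coverage": 0.0, "has_bars": False, "phash": "a1"},
    {"width": 720, "height": 1280, "motion_score": 0.4,   # portrait: dropped
     "text_coverage": 0.0, "has_bars": False, "phash": "b2"},
    {"width": 1920, "height": 1080, "motion_score": 0.4,  # near-duplicate
     "text_coverage": 0.0, "has_bars": False, "phash": "a1"},
]
kept = dedupe([c for c in clips if keep_video(c)])
print(len(kept))  # 1
```

At 100M videos each of these checks has to be a cheap, precomputed signal, which is presumably why they lean on simple metadata and dedicated classifiers rather than anything per-frame at filter time.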
You can control the camera because they trained a dedicated camera motion classifier and ran that over all the inputs, the outputs are then added to the text captions.
The text embeddings they mix in are actually a concatenation of several models. There's MetaCLIP providing the usual understanding of what's in the request, but they also mix in a model trained on character-level text so you can request specific spellings of words too.
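That concatenation step is easy to picture in code. A toy sketch, where the two encoders are random stand-ins with made-up dimensions rather than the real MetaCLIP or character-level models:

```python
import numpy as np

rng = np.random.default_rng(0)

def semantic_encoder(prompt: str) -> np.ndarray:
    # Stand-in for a CLIP-style sentence-level embedding.
    return rng.standard_normal(768)

def char_encoder(prompt: str) -> np.ndarray:
    # Stand-in for a character-level encoder, which lets the model
    # see exact spellings (useful for text that appears in the video).
    return rng.standard_normal(256)

def text_condition(prompt: str) -> np.ndarray:
    # The generator is conditioned on the concatenation of both views.
    return np.concatenate([semantic_encoder(prompt), char_encoder(prompt)])

print(text_condition('A sign reading "MOVIE GEN"').shape)  # (1024,)
```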
The AI sheen mentioned in other comments mostly isn't to do with it being AI but rather because they fine-tune the model on videos selected for being "cinematic" or "aesthetic" in some way. It looks how they want it to look. For instance they select for natural lighting, absence of too many small objects (clutter), vivid colors, interesting motion and absence of overlay text. What remains of the sheen is probably due to the AI upsampling they do, which lets them render videos at a smaller scale followed by a regular bilinear upsample + a "computer, enhance!" step.
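The "render small, then upsample" trick is worth seeing concretely. Here is a plain bilinear upsample in NumPy, the cheap first half of that two-step process (a learned enhancement model would then refine the result); purely illustrative, not Meta's code:

```python
import numpy as np

def bilinear_upsample(img: np.ndarray, scale: int) -> np.ndarray:
    """Upsample a 2D array by `scale` using bilinear interpolation."""
    h, w = img.shape
    # Sample positions in source coordinates (pixel centers), clamped
    # so edge pixels are repeated rather than extrapolated.
    ys = np.clip((np.arange(h * scale) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * scale) + 0.5) / scale - 0.5, 0, w - 1)
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    wy = (ys - y0)[:, None]  # vertical blend weights
    wx = (xs - x0)[None, :]  # horizontal blend weights
    a = img[y0][:, x0]           # top-left neighbours
    b = img[y0][:, x0 + 1]       # top-right
    c = img[y0 + 1][:, x0]       # bottom-left
    d = img[y0 + 1][:, x0 + 1]   # bottom-right
    return (a * (1 - wy) * (1 - wx) + b * (1 - wy) * wx
            + c * wy * (1 - wx) + d * wy * wx)

small = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
big = bilinear_upsample(small, 2)
print(big.shape)  # (4, 4)
```

Bilinear interpolation is smooth but blurry, which is exactly why a learned "enhance" pass afterwards is attractive: the diffusion model only pays for a small canvas while the upsampler restores detail.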
They just casually toss in some GPU cluster management improvements along the way for training.
Because Movie Gen was trained on Llama3-generated captions, it expects much more detailed and higher-effort captions than users normally provide. To bridge the gap they use a modified Llama3 to rewrite people's prompts to be more detailed and more consistent with the training set. They dedicate a few paragraphs to this step; it nonetheless involves a ton of effort, with distillation for efficiency, human evals to ensure rewrite quality, etc.
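The shape of that rewrite step can be shown with a toy template. The real system uses a fine-tuned Llama3; everything here (the template text, the camera default) is a made-up stand-in:

```python
# Toy illustration of the prompt-rewrite idea: expand a terse user
# prompt into the detailed, consistent caption style the model was
# trained on. Hypothetical template, not Meta's actual rewriter.

REWRITE_TEMPLATE = (
    "A cinematic, well-lit shot of {subject}. "
    "The camera {camera}. Natural lighting, vivid colors, "
    "no overlay text."
)

def rewrite_prompt(user_prompt: str, camera: str = "slowly pans right") -> str:
    return REWRITE_TEMPLATE.format(subject=user_prompt, camera=camera)

print(rewrite_prompt("a cute dog on a beach"))
# A cinematic, well-lit shot of a cute dog on a beach. The camera
# slowly pans right. Natural lighting, vivid colors, no overlay text.
```

An LLM rewriter does the same thing but contextually: it infers plausible camera motion, lighting, and composition details instead of pasting fixed boilerplate.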
I can't even begin to imagine how big of a project this must have been.
> Upload an image of yourself and transform it
> into a personalized video. Movie Gen’s
> cutting-edge model lets you create personalized
> videos that preserve human identity and motion.
A stalker’s dream! I’m sure my ex is going to love all the videos I’m going to make of her! Jokes aside, it’s a little bizarre to me that they treat identity preservation as a feature while competitors treat that as a bug, explicitly trying not to preserve the identity of generated content to minimize deepfake reputation risk.
Any woman could have flagged this as an issue before this hit the public.
Both unhealthy
As someone who has worked on payments infrastructure before, it's probably nice if your first thought is what great things an aunt can buy for her niece, but you're better off asking what bad actors can do with your software, or you're in for a bad surprise.
I would expect nothing less of Zuck than to imbue a culture of “tech superiority at all costs” and only focus on the responsible aspect when it can be a sales element.
Step 1. Train AI on pornographic videos
Step 2. Feed AI images of your ex
Step 3. Profit
why be extra weird and use a personal reference?
* product made without use of AI or any unnatural components. pure mountain iron
https://ai.meta.com/blog/movie-gen-media-foundation-models-g...
What I hope (since I am building a story telling front-end for AI generated video) is that they consider b2c and selling this as a bulk service over an api.
I’m here looking at users and wondering - the content pipelines are broader, but the exit points of attention and human brains are constant. How the heck are you supposed to know if your content is valid?
During a recent Apple event, someone on YT had an AI-generated video of Tim Cook announcing a crypto collaboration; it had 100k views before it was taken down.
Right now, all the videos of rockets falling on Israel can be faked. Heck, the responses on the communities are already populated by swathes of bots.
It’s simply cheaper to create content and overwhelm society level filters we inherited from an era of more expensive content creation.
Before anyone throws the sink at me for being a Luddite or raining on the parade - I’m coming from the side where you deal with the humans who consume content, and then decide to target your user base.
Yes, the vast majority of this is going to be used to create lovely cat memes and other great stuff.
At the same time, it takes just 1 post to act as a lightning rod and blow up things.
Edit:
From where I sit, there are 3 levels of issues.
1) Day to day arguments - this is organic normal human stuff
2) Bad actors - this is spammers, hate groups, hackers.
3) REALLY Bad actors - this is nation states conducting information warfare. This is countries seeding African user bases with faked stories, then using that as a basis for global interventions.
This is fake videos of war crimes, which incense their base and overshadow the harder won evidence of actual war crimes.
This doesn’t seem real, but political forces are about perception, not science and evidence.
Maybe get a job where interviewers are biased against my actual look and pedigree
Just ignore everyone else’s use of the tool
That's precisely the hard part!
https://www.goodreads.com/book/show/203092.The_Information_B...
> After the era of the atomic bomb, Virilio posits an era of genetic and information bombs which replace the apocalyptic bang of nuclear death with the whimper of a subliminally reinforced eugenics. We are entering the age of euthanasia.
Not attempting to justify their actions or the outcomes, just that media itself is and has been long known to be a powerful weapon, like the fabled story of a city besieged by a greater army, who opened their gates to the invaders knowing that the invaders were lead by a brilliant strategist.
The invader strategist, seeing the gates open, deduced that there must be a giant army laying in wait and that the gates being open were a trap, and so they turned and left.
Had they entered they would have won easily, but the medium of communication, an open gate before an advancing horde, was enough in and of itself to turn the tide of a pitched battle.
When we reach the point where we can never believe what we see or hear or think on our own, how will we ever fight?
Again, plain common sense just works, most of the times.
We've basically been living in a privileged and brief time in human history for the last 100-200 years, where you could mostly trust your eyes and ears to learn about events that you didn't directly witness. This didn't exist before photography and sound recording: if you didn't witness an event personally, you could only rely on trust in other human beings that told you about it to know if it actually happened. The same will soon start to be true again, if it isn't already: a million videos from random anonymous strangers showing something happening will mean nothing, just like a million comments describing it mean nothing today.
This is not a brave new world of post-truth such as the world has never seen before. It is going back to basically the world we had before photo, video, and sound recordings.
I think I would not like to live in a world in which democracy isn’t the predominant form of government. The ability of the typical person to understand and form their own opinions about the world is quite important to democracy, and journalism does help with that. But I guess the modern version of image and video heavy journalism wasn’t the only thing we had the whole time; even as recent as the 90’s (I’m pretty sure; I was just a kid), newspapers were a major source. And somehow America was invented before photojournalism, but of course that form of democracy would be hard for us to recognize nowadays…
It is only when we got these portable video screens that stuff like YouTube and TikTok became really important news sources (for better or worse; worse I would say). And anyway, people already manage to take misleading or out of context videos, so it isn’t like the situation is very good.
Maybe AI video will be a blessing in disguise. At some point we’ll have to give up on believing something just because we saw it. I guess we’ll have to rely on people attesting to information, that sort of thing. With modern cryptography I guess we could do that fairly well.
Edit: Another way of looking at it: basically no modern journalist or politician has a reputation better than an inanimate object, a photo or video. That’s a really bizarre situation! We’re used to consulting people on hard decisions, right? Not figuring out everything by direct observation.
So we're not going all the way back, but the era of believing strangers because they have photographic or video proof is drawing to a close.
Sure, Tim Cook can sign a video so I know he is the one who published it - though watching it on https://apple.com does more or less the same thing. But if the video is showing some rockets hitting an air base, the cryptography doesn't do anything to tell you if these were real rockets or its an AI-generated video. It's your trust in Tim Cook (or lack thereof) that determines if you believe the video or not.
Practically speaking, no one is going to check provenance when scrolling through Reddit sitting on the pot.
True endstage adtech will require attention modeling of individuals so that you can predict target response before presenting optimized material.
It's not just a step back, it's a step into black. Each person has to maintain an encrypted web of trust and hope nobody in their trust ring is compromised. Once they are, it's not even clear that in-person conversations aren't contaminated.
Just like the ability to emulate the writing style of your trusted humans was (somewhat) commonplace in the time in which you'd only talk to distant friends over letters.
> Once they are, it's not clear even in person conversations aren't contaminated.
How exactly could any current or even somewhat close technology alter my perception of what someone I'm talking to in-person is saying?
Otherwise, the points about targeting are fair - PR/propaganda has already advanced considerably compared to even 50 years ago, and more personalized propaganda will be a considerable problem, regardless of medium.
The rate of production is incomparable, no matter how apt the parallels may seem.
read simulacra and simulation: https://0ducks.wordpress.com/wp-content/uploads/2014/12/simu...
or this essay from pre-war germany https://en.wikipedia.org/wiki/The_Work_of_Art_in_the_Age_of_...
I feel that it’s not appreciated, that we are (were) part of an information ecosystem / market, and this looks like the dawn of industrial scale information pollution. Like firms just dumping fertilizer into the waterways with no care to the downstream impacts, just a concern for the bottom line.
How would you know that the British burned down the White House during the War of 1812? Anyone could fake a paper document saying so. (Except many people were illiterate.)
As far as I can see you need institutions you can trust.
Maybe I’m not well informed but there seem to be no example for the issues you describe with photos.
I believe it’s actually worse than you think. People believe in narratives, in stories, in ideas. These spread.
It has been like this forever. Text, pictures, videos are merely ways to proliferate narratives. We dismiss even clear evidence if it doesn’t fit our beliefs and we actively look for proof for what we think is the truth.
If you want to "fight" back you need to start on the narrative level, not on the artifact level.
It will not make you creative. It will not give you taste or talent. It is a technical tool that will mostly be used to produce cheap garbage unless you develop the skills to use it as a part of your creative toolkit -- which should also include many, many other things.
Impressive on the relative quality of the output. And of the productivity gains, sure.
But meh on the substance of it. It may be a dream for (financial) producers. For the direct customers as well (advertisement obviously, again). But for creators themselves (who are to be their own producers at some point, for some)?
On the maker side, art/work you don't sweat upon has little interest and emotional appeal. You shape it about as much as it shapes you.
On the viewer side, art that's not directed and produced by a human has little interest, connection and appeal as well. You can't be moved by something that's been produced by someone or something you can't relate to. Especially not a machine. It may have some accidental aesthetic interest, much like generative art had in the past. But uninhabited by someone's intent, it's just void of anything.
I know it's not the mainstream opinion, but Generative AI every day sounds more and more like cryptocurrencies and NFTs and these kinds of technologies that did not find _yet_ their defining problem to which they could be a solution.
[1] https://www.dw.com/en/whatsapp-in-india-scourge-of-violence-...
Though maybe there's hope if..
1. All deepfake image & video tech is required to add watermark labels, and all websites that publish such content are forced to label it as fake too.
2. Crazy idea, but a govt-issued Internet ID (ID.me is the closest to that now, having to use it to file taxes with the IRS) where your personal reputation and credit score are affected by publishing fake/scam/spam crap on the Internet, effectively helping to destroy it. I want good actors on the web, not ones that are out for a buck and in turn destroying it.
I can accept the premise that TikTok is trying to do this. Do we have any objective measurement on how effective it has been?
I’m not suggesting TikTok themselves is trying to do this, but it (and twitter, instagram, facebook, etc etc) is shaping people’s world views.
My default perspective is that because humans are so adaptable, every technology shapes our world views. TikTok and Instagram impact us, but so does the plow and shovel. We have research that shows IG harming self-image in some segments of teen girls; what I have not seen evaluated much is how Youtube DIY videos bring self-esteem through teaching people skills on how to make things. These platforms also connect people - my wife had a very serious but rare complication in pregnancy, and her mental health was massively improved by being able to connect with a group of women who had been through/were going through something similar.
My overall point is that it's not very interesting to me to say that technology shapes our world views. Which views? In which way, to what extent? Is it universal, or a subpopulation? Are there prior indications, or does it incept these views? Which views? How much good or harm? How do we balance that?
But what we are left with is a very small view through the keyhole of a door into a massive room that is illuminated with a flickering flashlight. We then glom onto whatever evidence supports our biases and preconceptions, ignoring that which is unstated, unpopular, or violates our sense of the world.
Like cool a movie doesn’t need to cost $200 million or whatever.
Imagine if those creative types were freed up to do something different. What would we see? Better architecture and factories? Maybe better hospitals?
At the level of image/video synthesis: Some leading companies have suggested they put watermarks in the content they create. Nice thought, but open source will always be an option, and people will always be able to build un-watermarked tools.
At the level of law: You could attempt to pass a law banning image/video generation entirely, or those without watermarks, but same issue as before– you can't stop someone from building this tech in their garage with open-source software.
At the level of social media platforms: If you know how GANs work, you already know this isn't possible. Half of image generation AI is an AI image detector itself. The detectors will always be just about as good as the generators; that's how the generators are able to improve themselves. It is, I will not mince words, IMPOSSIBLE to build an AI detector that works long-term. Because as soon as you have a great AI content classifier, it's used to make a better generator that outsmarts the classifier.
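A toy illustration of the point: if the generator gets to see the detector's decision boundary, it simply moves toward the "real" side of it. Everything below (the Gaussian "content", the midpoint classifier, the step size) is made up to show the dynamic, not how real GANs are trained:

```python
import random

random.seed(0)

REAL_MEAN = 5.0  # "real content" lives around this value

def detector_threshold(real, fake):
    # Simplest possible classifier: the midpoint between the two means.
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(real) + mean(fake)) / 2

gen_mean = 0.0  # the generator starts far from the real distribution
for step in range(50):
    real = [random.gauss(REAL_MEAN, 1.0) for _ in range(200)]
    fake = [random.gauss(gen_mean, 1.0) for _ in range(200)]
    threshold = detector_threshold(real, fake)
    # The generator treats the detector's boundary as a training signal
    # and nudges its output toward the "real" side of it.
    gen_mean += 0.2 * (threshold - gen_mean)

print(abs(gen_mean - REAL_MEAN) < 0.5)  # True: the detector's own signal closed the gap
```

The better the detector separates real from fake, the more useful its boundary is as a training signal, which is exactly why a great classifier can't stay ahead for long.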
So... smash the looms..?
I think cryptographic signing and the classic web of trust approaches are going to prove the most valuable tools in doing so, even if they're definitely not a panacea.
Downside is that large original video assets would need to be published, for such verification to work.
Are you concerned about predicting the direction or "real" state of your national economy? Videos aren't going to give you that. Largely, you can't know. Heavily curated statistical reports compiled and published by national agencies can only give you a clear view in retrospect. Are you concerned that a hurricane might be heading your way and you need to leave? Don't listen to videos on social media. Listen to your local weather authority. Are you concerned about whether X candidate for some national office really said a thing? Why? Are any of these people's characters or policy positions really that unclear that the reality or unreality of two seconds worth of words coming out of their mouths are going to sway your overall opinion one way or another?
Things you should actually care about:
- How are your family and friends doing? Ask them directly. If you can't trust the information you get back, you didn't trust them to begin with.
- How should you live your life? Stick with the classics here, man. Some combination of Aristotle, Ben Graham, and the basic AHA guidelines on diet and exercise will get you 95% of the way there.
- How do you fix or clean or operate some equipment or item X that you own? Get that information from the manufacturer.
Things you shouldn't care about:
- Is the IDF or Hamas committing more atrocities?
- Does Kamala Harris really support sex changes for convicted felons serving prison sentences funded by public money?
- Can Koalas actually surf?
Accept at some point that you can't know everything at all times and that's fine. You can know the things that matter. Get information from sources you actually trust, as in individual people or specific organizations you know and trust, not anonymous creators of text on Reddit. If you happen to be a national strategic decision maker that actually needs to know current world events, you're in luck. You have spy agencies and militaries that fully control the entire chain of custody from data collection to report compilation. If they're using AI to show you lies, you've got bigger problems anyway.
We already implicitly do this: if a news outlet we trust publishes a photo and does not state that they are unsure of its veracity we assume that it is an authentic photo. Using cryptographic signing that news outlet could explicitly state that they have determined the photo to be real. They could add any type of signed statement to any bit of information, really. Even signing something as being fake could be done, with the resulting signed information being shareable (although one would imagine that any unsigned information would be extremely suspect anyway).
The web of trust approach is to have a distributed system of trust that allows for less institutional parties to be able to earn trust and provide 'trusted' information, but there are also plenty downsides to it. A similar distributed system that determines trustworthiness in a more robust way would be preferable, but I am not aware of one.
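The signing idea is simple in code. Below is a sketch that attaches a signed statement to a hash of the media, using an HMAC as a stand-in for a real signature (a production system would use an asymmetric scheme like Ed25519 so anyone can verify without the outlet's secret key):

```python
import hashlib
import hmac
import json

OUTLET_KEY = b"newsroom-secret"  # stand-in; real systems use a key pair

def sign_statement(media_bytes, statement):
    # Bind the statement to the exact bytes of the media via its hash.
    payload = json.dumps({
        "sha256": hashlib.sha256(media_bytes).hexdigest(),
        "statement": statement,
    }, sort_keys=True).encode()
    tag = hmac.new(OUTLET_KEY, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify(payload, tag):
    expected = hmac.new(OUTLET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

payload, tag = sign_statement(b"<video bytes>", "verified authentic by our staff")
print(verify(payload, tag))         # True
print(verify(payload + b"x", tag))  # False: any tampering breaks it
```

Note this only proves who attested and that nothing changed since; it says nothing about whether the attestation itself is honest, which is where the trust problem comes back in.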
In my (historically unpopular) opinion we have two optional choices outside of but still allowing for this anonymous free-for-all:
A private company like Facebook uses a privileged system of identification and authentication based on login/password/2FA and relying on state-issued identification verification,
OR, what I feel is better, a public institution that uses a common system based on PKI and state-issued identification, eg, the DMV issuing DoD Common Access Cards.
Trusting districts and nation-states could sign each other's issuing authorities.
The benefits are multifaceted! It helps authenticate the source of deep fakes. It helps fight astroturfing, foreign or otherwise. It helps to remove private companies fueled by advertising revenue from being in a privileged position of identification, etc, etc.
I totally understand any downvotes but I would prefer if you instead engaged me in this conversation if you disagree.
I'd love to have this picked apart instead of just feeling bummed out.
Like pretty much any tool involving detection of / protection from erroneous things, it's forever a cat and mouse game. There will always be new viruses, jailbreaks, banned content, 0-days etc. AI detection is no different.
If yellow newspapers were able to push us to war despite us knowing that "the written word was never reliable to start with", what will be the impact of the combination of this technology and the internet used against a population that has been conditioned over generations to trust video.
You can go on Youtube to see charlatans peddle all sorts of convenient truths with no evidence.
You don't even need AI. The bug is in the human wetware.
I think the issue with trust is rooted elsewhere - in social relations, politics, and not in AI generated content.
Do you read the news at all? If you can't trust any of them, then why even bother?
Hard to say anything is impossible off of one point - but discrimination afaik is generally seen as the easier problem of the two, given you only need to give a binary output as opposed to a continuous one.
function detectAI() {
  return Math.random() < Math.pow(0.5, (new Date()).getFullYear() - 2023) ? "Not AI" : "AI";
}
This should increase in accuracy over time.
That's why you make it punishable by potential prison time if you create or disseminate a non-watermarked video generated in this way.
Digital minimalism is looking more and more attractive.
Before you downvote, don't get this as a belittling the effort and all the results, they are stunning, but as a sincere question.
I do plenty of photography, I do a lot of videography. I know my way around Premiere Pro, Lightroom and After Effects. I also know a decent amount about computer vision and cg.
If I look at the "edited" videos, they look fake. Immediately. And not a little bit. They look like they were put through a washing machine full of effects: too contrasty, too much gamma, too much clarity, too low levels, like a baby playing with the effect controls. Can't exactly put my finger on it, but comparing the "original" videos to the ones that simply change one element, like the "add blue pom poms to his hands", it changes the whole video, and makes the whole video a bit cartoony, for lack of a better word.
I am simply wondering why?!
Is that a change in general through the model that processes the video? Is that something that is easy to get rid of in future versions, or inherently baked into how the model transforms the video?
You can also watermark plain text by generating "invisible" patterns.
Of course, in all these cases, the watermarks are trivial to remove: just re-encode the output with an open model. Which is why I hope there will be no federal law that tries to enforce something that is categorically unenforceable.
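For a sense of how flimsy text watermarks are, here's a toy one using zero-width characters, plus the trivial "re-encoding" step that erases it (both entirely made up for illustration; real schemes bias token choices statistically, but are removed just as easily by paraphrasing):

```python
ZW = {"0": "\u200b", "1": "\u200c"}  # zero-width space / non-joiner

def watermark(text, bits):
    # Hide one bit before each word after the first, until bits run out.
    words = text.split(" ")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        mark = ZW[bits[i]] if i < len(bits) else ""
        out.append(mark + word)
    return " ".join(out)

def strip_watermark(text):
    # Any normalisation that drops non-printing characters kills the mark.
    return text.replace("\u200b", "").replace("\u200c", "")

marked = watermark("this text was written by a model", "1011")
print(marked == "this text was written by a model")                   # False: mark embedded
print(strip_watermark(marked) == "this text was written by a model")  # True: mark gone
```

The marked text renders identically to the original, and one pass of cleanup restores it exactly, which is the commenter's point about unenforceability.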
I will now review some of the standout clips.
That alien thing in the water is horrifying. The background fish look pretty convincing, except for the really flamboyant one in the dark.
I guess I should be impressed that the kite string seems to be rendered every frame and appears to be connected between the hand and the kite most of the time. The whole thing is really stressful though.
drunk sloth with weirdly crisp shadow should take the top slot from girl in danger of being stolen by kite.
man demonstrates novel chain sword fire stick with four or five dimensions might be better off in the bin...
> The camera is behind a man. The man is shirtless, wearing a green cloth around his waist. He is barefoot. With a fiery object in each hand, he creates wide circular motions. A calm sea is in the background. The atmosphere is mesmerizing, with the fire dance.
This just reads like slightly clumsy lyrics to a lost Ween song.
This is _the worst that machines will ever be at this task_, and most of the improvements that need to be made are a matter of engineering ingenuity, which can be translated to research dollars.
Seriously though. This is the company that is betting hard on VR goggles. And these are engines that can produce real time dreams, 3d, photographic quality, obedient to our commands. No 3d models needed, no physics simulations, no ray tracing, no prebuilt environments and avatars. All simply dreamed up in real time, as requested by the user in natural language. It might be one of the most addictive technologies ever invented.
Could be, but it's a bit dystopian to imagine that the government would have a say on the images you can generate, locally and in real time, and send straight to your own eyes, don't you think? Dystopian and very difficult to enforce, too.
They're not really showing signs of slowing down either. Hey, Zuck, always thought you were kind of lame in the past. But maybe you weren't a one trick pony after all.
Deepmind. Protein folding and solving math problems is just less sexy.
Especially based on the examples on this site, it's not a far reach to say that they will start to generate video ads of you (yes, YOU! your face! You've already uploaded hundreds of photos for them to reference!) using a specific product and showing how happy you are because you bought it. Imagine scrolling Instagram and seeing your own face smelling some laundry detergent or laughing because you took some prescription medicine.
Anyone able to update a dinosaur?
Anything longer than a single clip is just a bunch of these clips stitched together.