> Our method jointly refines the cornea poses, the radiance field depicting the scene, and the observer's eye iris texture. We further propose a simple regularization prior on the iris texture pattern to improve reconstruction quality. Through various experiments on synthetic and real-world captures featuring people with varied eye colors, we demonstrate the feasibility of our approach to recover 3D scenes using eye reflections.

Also, this AI-enhanced Eye-NeRF pipeline clearly is an early experiment, and we are far away from real-world applications or headlines like “Dead man's eye reflection used in murder trial” or whatever, but the tech may soon be used to refine Apple's AR applications, for instance. The tech surely is not there yet, and this reminds me of the 2014 paper The Visual Microphone, where they extracted identifiable audio from video of the vibrations on the surface of a bag of potato chips, or of the age-old zoom-and-enhance meme, of course. I'm very convinced that stuff like the potato-chip tech has been used in intelligence circles for years now - why should you get a microphone near a target when you can deploy hi-res video cameras from far away? So, I consider this stuff to be of very high interest to a bunch of people.

Meta released their open source music model MusicGen (code on Github), and while we've seen a lot of talk about AI-Drake and AI-Oasis, that was never true generative music like OpenAI's Jukebox or the infinite neural streams from the Dadabots; it was more akin to deepfakes, where people just edited their voice with AI. This is also true for that new Beatles song, in which they used AI tech to clean up bad audio and turn a well-known Lennon demo into a final Fab Four song. (There already exists an AI-made version of that song, and we'll see how it holds up against McCartney's.)

All of this is awesome (minus the shitty tunes by AI-Drake, of course), but this stuff is not generative AI. MusicGen is, and as far as I can hear, it's light years ahead of Jukebox and better than Google's MusicLM. Here's an example of a typical Jukebox output, which very quickly deteriorates into noise and compression artifacts. In comparison, here's what MusicGen can do: a cover version of A-ha's Take On Me with a stable song structure, three minutes of runtime, in good quality. Sometimes the song structure isn't stable and that leads to fun outcomes, like this AI try at Metallica's Master of Puppets.

My own experiments with MusicGen only yielded subpar but sometimes interesting results. I used the Hugging Face playground and uploaded snippets of those tracks to condition the model on a melody, which worked so-so most of the time and went very weird on The Trashmen. Here's a bunch:

Flock of Seagulls - I Ran (pop punk AI version)

While that Hugging Face demo only generates fifteen seconds of audio, there are Colabs which produce more than that (I didn't test those, though). There's also finetuning for MusicGen already: just like specialized image synthesis models that give you anime, you can train MusicGen on any artist, label, or genre you like, and you can split audio tracks from your music collection and feed them directly to MusicGen in this Hugging Face Space. You see where we are heading with AI music: very soon, you'll input all of Radiohead's music into a generative AI model and have Thom Yorke playing Metallica songs all day long in the style of OK Computer. Also: vocals are gonna be the mutant fingers of generative music, or its typography.
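If you'd rather run the melody-conditioning workflow locally instead of in the hosted demo, here's a minimal sketch using Meta's audiocraft library. The prompt text, input filename, output name, and duration are placeholder assumptions; the calls (`MusicGen.get_pretrained`, `set_generation_params`, `generate_with_chroma`, `audio_write`) are from audiocraft's Python API.

```python
# Minimal sketch: melody-conditioned generation with Meta's audiocraft library
# (pip install audiocraft). Prompt, filenames, and duration are placeholders.
import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('facebook/musicgen-melody')  # melody-conditioned checkpoint
model.set_generation_params(duration=15)  # roughly what the hosted demo gives you

# Load a short snippet of the track whose melody should steer the generation.
melody, sr = torchaudio.load('i_ran_snippet.wav')

# One text description per output; the waveform provides the melody conditioning.
wav = model.generate_with_chroma(
    ['pop punk cover, fast drums, distorted guitars'], melody[None], sr)

audio_write('i_ran_pop_punk', wav[0].cpu(), model.sample_rate, strategy='loudness')
```

Raising `duration` gets you longer clips at the cost of memory and generation time, which is presumably what the longer-form Colabs mentioned above do under the hood.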