-
TechCrunch News
- A stealth AI model beat DALL-E and Midjourney on a popular benchmark – its creator just landed $30M
-
Latest Tech News from Ars Technica
- Midjourney introduces first new image generation model in over a year
AI image generator Midjourney today released V7, its first new model in over a year. It's a ground-up rework that is available to users now in alpha.
V7 brings improvements in two areas: better images, and new tools and workflows.
Starting with the image improvements, V7 promises much higher coherence and consistency for hands, fingers, body parts, and "objects of all kinds." It also offers much more detailed and realistic textures and materials, like skin wrinkles or the subtleties of a ceramic pot.
-
Latest Tech News from Ars Technica
- New AI text diffusion models break speed barriers by pulling words from noise
On Thursday, Inception Labs released Mercury Coder, a new AI language model that uses diffusion techniques to generate text faster than conventional models. Unlike traditional models that create text word by word, such as the kind that powers ChatGPT, diffusion-based models like Mercury produce entire responses simultaneously, refining them from an initially masked state into coherent text.
Traditional large language models build text from left to right, one token at a time, using a technique called "autoregression": each word must wait for all previous words before appearing. Inspired by techniques from image-generation models like Stable Diffusion, DALL-E, and Midjourney, text diffusion language models like LLaDA (developed by researchers from Renmin University and Ant Group) and Mercury use a masking-based approach. These models begin with fully obscured content and gradually "denoise" the output, revealing all parts of the response at once.
While image diffusion models add continuous noise to pixel values, text diffusion models can't apply continuous noise to discrete tokens (chunks of text data). Instead, they replace tokens with special mask tokens as the text equivalent of noise. In LLaDA, the masking probability controls the noise level, with high masking representing high noise and low masking representing low noise. The diffusion process moves from high noise to low noise. Though LLaDA describes this using masking terminology and Mercury uses noise terminology, both apply a similar concept to text generation rooted in diffusion.
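As a concrete illustration of the high-noise-to-low-noise process described above, here is a minimal, self-contained Python sketch of a masked-diffusion unmasking loop. It is an assumption-heavy toy, not Mercury's or LLaDA's actual code: the toy_denoiser stand-in, the TARGET sentence, and the per-step unmasking schedule are all invented for demonstration, and a real system would score masked positions with a trained transformer rather than peeking at an answer key.

import random

MASK = "[MASK]"
# Hypothetical target sentence; a real model has no such answer key.
TARGET = "diffusion models denoise text by unmasking tokens".split()

def toy_denoiser(seq):
    """Stand-in for a trained model: guess a (token, confidence) pair for
    each masked position. It peeks at TARGET with random confidence purely
    to keep this sketch runnable."""
    return {i: (TARGET[i], random.random())
            for i, tok in enumerate(seq) if tok == MASK}

def generate(length, steps=4):
    # Start fully masked: the text equivalent of pure noise.
    seq = [MASK] * length
    for step in range(steps):
        guesses = toy_denoiser(seq)
        if not guesses:
            break
        # Unmask a fraction of positions each step, most confident first;
        # the shrinking mask ratio plays the role of a falling noise level.
        k = max(1, len(guesses) // (steps - step))
        ranked = sorted(guesses.items(), key=lambda kv: kv[1][1], reverse=True)
        for i, (tok, _) in ranked[:k]:
            seq[i] = tok
        print(f"step {step + 1}: {' '.join(seq)}")
    return " ".join(seq)

random.seed(0)
generate(len(TARGET))

Run as written, the loop prints the sentence emerging from an all-[MASK] state over four refinement steps: every position is revised in parallel rather than appended left to right, which is the behavior the article attributes to diffusion-based generation.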