Apple isn't one of the major players in the AI game today, but the company's new open source AI model for image editing shows what it's capable of contributing to the space. The model, called MLLM-Guided Image Editing (MGIE), uses multimodal large language models (MLLMs) to interpret text-based commands when manipulating images. In other words, the tool can edit photos based on text entered by the user. Although it's not the first tool capable of doing this, "human instructions are sometimes too brief to be captured and followed by current methods," the project's paper (PDF) notes.
The company developed MGIE with researchers at the University of California, Santa Barbara. MLLMs can transform simple or ambiguous text prompts into more detailed, clearer instructions that the photo editor itself can follow. For example, if a user wants to edit a photo of a pepperoni pizza to "make it healthier," the MLLM can interpret this as "add vegetable toppings" and edit the photo accordingly.
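The two-stage idea described above can be sketched in a few lines of Python. Note that this is an illustrative toy, not Apple's actual pipeline or API: `expand_instruction` stands in for the MLLM that derives an expressive instruction from a terse one, and `edit_image` stands in for the editing model that would apply it; both names and the tiny rewrite table are assumptions for demonstration only.

```python
# Minimal sketch of the two-stage approach behind MGIE.
# All function names and the rewrite table are hypothetical stand-ins.

def expand_instruction(instruction: str) -> str:
    """Stand-in for the MLLM stage: turn a terse, ambiguous prompt
    into an explicit editing instruction."""
    rewrites = {
        "make it healthier": "add vegetable toppings",
        "make it brighter": "increase brightness and contrast",
    }
    return rewrites.get(instruction.lower(), instruction)

def edit_image(image_path: str, instruction: str) -> str:
    """Stand-in for the editing stage: a real pipeline would condition
    an image-editing model on the expanded instruction; here we just
    report what would be applied."""
    expressive = expand_instruction(instruction)
    return f"apply '{expressive}' to {image_path}"

print(edit_image("pizza.jpg", "make it healthier"))
# → apply 'add vegetable toppings' to pizza.jpg
```

The point of the design is that the user-facing prompt stays short and natural, while the model the editor actually conditions on receives a fully specified instruction.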
In addition to making major edits to images, MGIE can also crop, resize, and rotate photos, as well as improve their brightness, contrast, and color balance, all via text prompts. It can also edit specific areas of a photo, for example changing a person's hair, eyes, and clothing, or removing elements from the background.
As VentureBeat notes, Apple released the model via GitHub, but those interested can also try a demo currently hosted on Hugging Face Spaces. Apple has not yet said whether it plans to turn what it learned from this project into a tool or feature it might integrate into one of its products.