Apple Rolled Out MGIE, Its Very Own Open-Source AI Model for MLLM-Guided Image Editing. Here’s What It Can Do

Apple recently made a splash in the AI world by releasing MGIE, an open-source image editing model that understands and executes editing commands given in plain English. MGIE (MLLM-Guided Image Editing) uses a multimodal large language model to bridge the gap between words and pixels, letting you modify images by describing the changes you want in natural language.

This release points to a potentially major change in the way we interact with image editing software. Let’s explore what MGIE is, how it works, and the possibilities it presents.

What makes MGIE stand out?

  • Intuitive Interaction: Forget complex toolbars and menus. MGIE lets you use simple everyday language to control your edits. Just describe the changes you want to see, and the AI model will do its best to interpret them and modify the image accordingly.
  • Sophisticated Understanding: Thanks to the use of Multimodal Large Language Models (MLLMs), MGIE can grasp a wide range of image editing terms. It’s not limited to basic commands; you can use creative and descriptive language to guide the process.
  • Open-Source Innovation: Apple’s decision to make MGIE open-source means developers and researchers around the world can inspect it, build on it, and contribute improvements. That openness is likely to speed up refinement of the technology.

What Types of Edits Can MGIE Make?

MGIE offers a surprising level of flexibility, allowing you to perform edits ranging from the simple to the sophisticated:

  • Basic Enhancements: Commands like “make the photo brighter,” “increase the contrast,” or “apply a black and white filter” are easily interpreted by the model.
  • Object Manipulation: You can add and remove objects from your images with instructions like “add a cat to the left corner” or “remove the person in the red shirt.”
  • Detailed Modifications: MGIE can handle more complex requests involving specific regions or characteristics; for example, “make the sky more blue” or “change the woman’s hair color to blonde.”
  • Creative Transformations: You can describe imaginative changes such as “make this photo look like a watercolor painting” or “give the scene a vintage aesthetic,” and the model will attempt to restyle the image accordingly.
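
To make the "basic enhancements" category concrete, here is a minimal sketch of what commands like "make the photo brighter" amount to at the pixel level. This is not MGIE's code or API. MGIE interprets instructions with a learned model, whereas this sketch swaps in a hand-written keyword lookup and simple arithmetic purely for illustration. Every name in it (brighten, increase_contrast, black_and_white, COMMANDS, apply_command) and the default factors are hypothetical, and an image is assumed to be a plain list of rows of (R, G, B) tuples.

```python
# Illustrative only: hand-written pixel operations, NOT MGIE's learned model.
from typing import Callable, Dict, List, Tuple

Pixel = Tuple[int, ...]
Image = List[List[Pixel]]


def _clamp(value: float) -> int:
    """Round a channel value and keep it inside the valid 0-255 range."""
    return max(0, min(255, round(value)))


def brighten(image: Image, factor: float = 1.2) -> Image:
    """Scale every channel by `factor` (values above 1 brighten)."""
    return [[tuple(_clamp(c * factor) for c in px) for px in row] for row in image]


def increase_contrast(image: Image, factor: float = 1.5) -> Image:
    """Push every channel away from mid-gray (128) by `factor`."""
    return [
        [tuple(_clamp(128 + (c - 128) * factor) for c in px) for px in row]
        for row in image
    ]


def black_and_white(image: Image) -> Image:
    """Replace each pixel with its gray level (ITU-R BT.601 luma weights)."""
    result = []
    for row in image:
        new_row = []
        for r, g, b in row:
            gray = _clamp(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((gray, gray, gray))
        result.append(new_row)
    return result


# Hypothetical keyword table: a crude stand-in for real language understanding.
COMMANDS: Dict[str, Callable[[Image], Image]] = {
    "brighter": brighten,
    "contrast": increase_contrast,
    "black and white": black_and_white,
}


def apply_command(image: Image, command: str) -> Image:
    """Apply the first operation whose keyword appears in the command text."""
    text = command.lower()
    for keyword, operation in COMMANDS.items():
        if keyword in text:
            return operation(image)
    raise ValueError(f"Unsupported command: {command!r}")
```

A keyword table like this breaks down as soon as a request goes beyond a fixed vocabulary ("add a cat to the left corner" raises an error here), which is the gap a model-guided approach such as MGIE is meant to close.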

How to Get Started

MGIE is currently available as an open-source project on GitHub rather than as a finished app, which means you’ll need some technical knowledge to run it locally.

Apple may eventually incorporate this technology into its own software, and more user-friendly versions are likely to emerge in the near future.

The Future of Image Editing

Apple’s release of MGIE signals a notable shift in image editing. In the coming years, AI-powered tools are likely to reshape the industry, streamlining routine tasks and making sophisticated photo editing accessible to anyone, regardless of skill level.

As language-guided models like MGIE become more powerful, creative control over images could become as simple as describing what you imagine in words.