SEO for New Content Formats and Multimodal Search
It’s 2026, and a host of new technologies is rapidly pushing conventional “blue link” search engine optimisation towards a more interactive paradigm. Instead of purely chasing Google’s latest algorithms and ranking logic, any reputable SEO company knows to take an integrative approach that considers new modalities like ambient voice assistants, conversational artificial intelligence, and visual lenses.
To keep your content relevant and future-proof on the modern web, it’s important to understand how to get it featured in AI-powered summaries and personalised search. This explainer breaks down everything you need to know to rank in the world of multimodal search.
AI and its Role in Generative Engine Optimisation
Today, the game has changed from who ranks at the top to who’s cited as the top source, as the top of the SERP is dominated by an AI overview courtesy of Perplexity or Google’s AI Overviews (which grew out of the Search Generative Experience). Getting cited mainly comes down to two things:
- Staying “citable” as determined by the Answer-First framework. This means a short, direct answer immediately following an H1 or H2 question. Keeping answers within the 40–60 word range significantly increases your content’s chances of being quoted as a usable snippet.
- Standing out from the fluff by scoring well on the Experience, Expertise, Authoritativeness, and Trustworthiness (EEAT 2.0) criteria. This can be done with original data, authoritative citations, and lived experience when presenting yourself as a source. Give AI a reason to recognise you as a human rather than another AI.
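The Answer-First structure above can be sketched in page markup. This is an illustrative example only; the question, product, and copy are hypothetical:

```html
<!-- Answer-first structure: a question as the H2, followed by a
     self-contained answer of roughly 40-60 words that an AI engine
     can lift verbatim. All copy here is placeholder content. -->
<article>
  <h2>How long does a standard SEO audit take?</h2>
  <p>
    A standard SEO audit typically takes one to two weeks, covering
    technical crawlability, on-page content, and backlink health.
    Smaller sites can often be audited in a few days, while large
    e-commerce catalogues may need a month or more depending on the
    number of indexed pages and templates involved.
  </p>
  <!-- Supporting detail can follow; the lead answer stays short. -->
</article>
```

The key design choice is that the opening paragraph answers the heading completely on its own, so it still makes sense when extracted out of context.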

Voice and the Conversational Assistant Model
Where the Siris and Google Nows of the 2010s could set timers and check traffic conditions for you, contemporary assistants can complete tasks independently. This development has the potential to be transformative for search — as well as for your business.
- Respect the prevalence of “local intent”, which easily accounts for over 70% of voice searches in Australia. Accommodate “What’s the nearest store where I can find x product” queries by structuring local data with schema markup, helping assistants to efficiently browse your inventory.
- Factor in long-tail queries, which are three times as common among users of virtual assistants. Full sentences starting with “Where can I find” or “What’s the best way” are the new norm for hands-free querying.
- Optimise page speed for conversational latency, as assistants need to fetch your data in milliseconds to serve users. You don’t want to be skipped over for the next source just so Alexa can maintain a conversational flow.
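The local-intent point above relies on structured local data. A minimal sketch of schema.org `LocalBusiness`/`Store` markup follows; the store name, address, coordinates, phone number, and hours are all placeholders, not real data:

```html
<!-- Hypothetical store details throughout. LocalBusiness structured
     data gives assistants the machine-readable location, hours, and
     contact info they need to answer "what's the nearest store"
     style voice queries. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Store",
  "name": "Example Outdoor Supplies",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "12 Sample Street",
    "addressLocality": "Sydney",
    "addressRegion": "NSW",
    "postalCode": "2000",
    "addressCountry": "AU"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": -33.8688,
    "longitude": 151.2093
  },
  "openingHoursSpecification": {
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "09:00",
    "closes": "17:30"
  }
}
</script>
```

The JSON-LD block sits in the page `<head>` or `<body>` and can be checked with Google’s Rich Results Test before deployment.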
Augmented Reality and Strategies for Unified Search
Visual search is at the core of AR technology, with Google Lens and Circle to Search making the physical world easier to navigate. This means Universal Scene Description Zip (USDZ) files and 3D models have become important SEO assets, and image captions should be treated as metadata for AI contextualisation within AR experiences. It’s also crucial to keep your Google Business Profile photos up to date to ensure accurate AI-assisted navigation via AR overlays.
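One concrete way to expose a USDZ asset is Apple’s AR Quick Look convention: a `rel="ar"` link whose first child is an image. The file paths, product, and caption below are hypothetical:

```html
<!-- Hypothetical model and image paths. On iOS Safari, a rel="ar"
     link whose first child element is an <img> opens the linked
     USDZ file in AR Quick Look; descriptive alt text and a caption
     double as metadata for AI contextualisation. -->
<figure>
  <a rel="ar" href="/models/leggings-black.usdz">
    <img src="/images/leggings-black.jpg"
         alt="Black high-waisted leggings, side view, 3D preview available">
  </a>
  <figcaption>Black high-waisted leggings – tap to view in your space.</figcaption>
</figure>
```

Keeping the `alt` text and `<figcaption>` descriptive serves both accessibility and the image-as-metadata principle discussed above.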
All three of these modalities — AR, AI, and Voice — should be honed according to existing best practices to rank well in the modern web, as they’ll more than likely come into play simultaneously. A user might spot an intriguing pair of leggings, ask their smart glasses to identify them, and then have a roundup of existing reviews summarised by AI. Ensuring your web presence is strong in all three areas will be key as the internet continues to get smarter.