MoR, or Mixture of Routing, is emerging as the next step in the development of artificial intelligence (AI). So far, most AI successes have been powered by transformer technology, but MoR is lighter, faster, and uses less energy. It allows quicker processing, easier creation of large models, and smarter AI on mobile or local devices. This change could reshape how we use AI in mobile phones, cloud services, and local apps. As a result, the future of AI will be not just smarter but also more sustainable.
Not long ago, transformer models were seen as the most powerful tools in artificial intelligence. Popular systems like OpenAI’s GPT, Google’s BERT, and Meta’s LLaMA rely on transformers to change how machines understand and generate language. But even the best technologies have limits: transformers demand enormous computing power, consume a great deal of energy, and are becoming too costly to run at scale.
Now, a new idea is starting to get attention. It’s called MoR, which stands for Mixture of Routing. This is not just a small update; it could be a fundamental change in how AI works. MoR is designed to be faster, cheaper, and more energy-efficient. For those following the future of smart technology, MoR might show the way to a more affordable and accessible AI for everyone.
Mixture of Routing (MoR) is a new type of AI architecture that makes models faster and more efficient. It works by activating only the relevant parts of the neural network based on the input, an idea that builds on the earlier Mixture of Experts method.
In simple terms, MoR lets the model choose which parts should do the work for each word or token. Instead of running the full model every time, it picks the best path for each input, saving time, power, and compute. Ordinary transformer models run the entire network for every token, even when most of it is not needed; that wastes energy and slows things down. MoR changes that by using only the parts that really matter.
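As a rough illustration, this kind of per-token routing can be sketched as a gating function that scores a few candidate "expert" blocks and runs only the best one for each token. Everything below (the expert names, the gate weights, the toy vectors) is invented for illustration; in a real model the experts are learned sub-networks and the gate weights are trained, not hand-written:

```python
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical "experts": tiny functions standing in for sub-networks.
# In a real model each would be a learned feed-forward block.
experts = {
    "syntax":    lambda x: [v * 2 for v in x],
    "semantics": lambda x: [v + 1 for v in x],
    "math":      lambda x: [v ** 2 for v in x],
}

# Toy gate: one weight vector per expert (normally learned during training).
gate = {
    "syntax":    [1.0, 0.0],
    "semantics": [0.0, 1.0],
    "math":      [0.5, 0.5],
}

def route_token(token_vec):
    """Score every expert for this token, then run only the top-1."""
    names = list(experts)
    scores = [sum(w * v for w, v in zip(gate[n], token_vec)) for n in names]
    probs = softmax(scores)
    best = names[max(range(len(names)), key=lambda i: probs[i])]
    # Only the chosen expert executes; the others stay idle for this token.
    return best, experts[best](token_vec)

chosen, output = route_token([2.0, -1.0])
print(chosen, output)  # -> syntax [4.0, -2.0]
```

The key point is in `route_token`: for each token, one expert runs and the rest cost nothing, which is where the compute savings come from.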
Why is this important now? AI models are getting so big that they need a lot of money and powerful machines to run. Even OpenAI’s latest model requires a huge amount of computing power. MoR can help solve this problem and make AI faster and cheaper.
The transformer architecture changed the way natural language processing (NLP) works. But after years of leading the field, its problems are starting to show: every token activates the entire network, energy use keeps climbing with model size, and scaling demands ever more expensive hardware.
In comparison, MoR (Mixture of Routing) offers a smarter path. Instead of throwing more GPUs and energy at the problem, it avoids doing unnecessary work in the first place. That makes it especially attractive for resource-constrained platforms such as mobile devices, wearables, and edge systems.
In early 2025, researchers at DeepMind and Meta began working on a new type of AI model called Mixture of Routing (MoR). The approach builds on the same foundations as existing systems like GPT, LLaMA, and Claude, with the main goal of making such models cheaper and faster to run.
Recent tests show that MoR can reduce computing costs by over 60% for large AI tasks. It works by choosing only the important parts of the model to activate. This means it doesn’t waste power or time by using every part all the time like traditional models do.
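The arithmetic behind savings like these can be sketched with toy numbers. Everything below is an assumption for illustration (block counts, per-block cost), not a measurement from any real MoR system:

```python
# Back-of-envelope comparison of dense vs. routed compute per token.
# All numbers below are assumptions for illustration, not measurements.
n_blocks = 16          # expert blocks available in a layer (assumed)
k_active = 4           # blocks a router actually activates per token (assumed)
flops_per_block = 1e9  # cost of running one block on one token (assumed)

dense_cost = n_blocks * flops_per_block   # classic transformer: run everything
routed_cost = k_active * flops_per_block  # MoR-style: run only what is chosen

savings = 1 - routed_cost / dense_cost
print(f"compute reduction: {savings:.0%}")  # -> compute reduction: 75%
```

The reduction depends entirely on how many blocks the router activates relative to the total; the article's 60% figure would correspond to activating well under half the network per token.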
This change could help OpenAI and others make AI tools that are cheaper to use, quicker to launch, and easier to run on smartphones. This brings us one step closer to using powerful AI directly on our phones instead of relying on the cloud all the time.
The real impact of MoR lies not just in its theory, but in how it can change the way AI is used in everyday life.
- AI in smartphones: Upcoming phones such as the Google Pixel 10 or iPhone 17 could offer more advanced AI features because MoR needs far fewer resources.
- Smart technology from Apple: With Apple’s Neural Engine already running AI on the device, MoR could slot naturally into that system.
- AI in wearables and small devices: AR glasses, earbuds, and smart home gadgets could run real-time AI with the help of MoR.
- Changes to large AI systems: Cloud companies could run large AI models with far less electricity and fewer GPUs by using MoR.
For countries like Bangladesh, where electricity and internet speed are limited, running big AI models is often very difficult. MoR could be an easy solution to that problem. It opens up a big opportunity to use world-class AI even in our country.
| Features | Transformers | MoR (Mixture of Routing) |
| --- | --- | --- |
| Computation per token | High (all units active) | Low (selective routing) |
| Energy consumption | Very high | Up to 60% lower |
| Scalability | Limited by hardware | More scalable, efficient |
| Latency | Higher | Significantly reduced |
| Mobile compatibility | Poor | Excellent |
| Training cost | Extremely expensive | Cost-effective at scale |
| Customizability | Rigid | Flexible and context-driven |
This comparison shows that transformers laid the groundwork, but MoR points to the future of AI.
Every big AI company is looking for ways to be more efficient. MoR gives them a new tool to help with this.
OpenAI’s new language model could use MoR to make responses faster and cheaper. This would be a big change for ChatGPT on mobile devices.
Google's DeepMind is already trying out routing methods based on sparsity. We can expect a MoR-based version of their Gemini model soon.
Meta’s LLaMA 3 plan suggests they might use modular designs, possibly inspired by MoR.
Apple, which is usually very secretive, might be using routing-based improvements for Siri and on-device learning in iOS 19.
These changes show that the AI race is shifting. Instead of simply getting bigger, AI is getting smarter; instead of brute force, it is moving toward precise, efficient computation. MoR does not just add to transformer technology; it strips out the waste.
One of the most overlooked benefits of MoR is that it can improve perplexity, a standard measure of how well a language model predicts text (lower is better). Unlike regular transformers, which run the full network over everything, MoR concentrates computation on what is relevant in the context.
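For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood a model assigns to the correct tokens. A minimal sketch with made-up probabilities:

```python
import math

def perplexity(token_probs):
    # Perplexity = exp of the average negative log-likelihood the model
    # assigns to the correct next token at each position.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Made-up probabilities a model might assign to the true next tokens.
confident = [0.9, 0.8, 0.95, 0.85]  # model usually right -> low perplexity
uncertain = [0.2, 0.1, 0.3, 0.25]   # model often wrong  -> high perplexity

print(round(perplexity(confident), 2))  # low score, better predictions
print(round(perplexity(uncertain), 2))  # high score, worse predictions
```

A model that always assigns probability 0.5 to the true token has perplexity exactly 2, which is a handy sanity check for any implementation of the formula.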
In Bangladesh, where mobile internet is often slow and devices are not very powerful, using language models with MoR means users get faster and more accurate AI answers without depending on cloud servers. This is a big change, not just for individuals but for the country’s digital plans.
And yes, this helps make such AI systems more flexible, more local, and easier for everyone to use.
MoR is not just a technical improvement; it matters. As AI becomes part of our daily tools, efficiency cannot be an afterthought. It must be the foundation. From healthcare diagnostics to smart farming systems in South Asia, MoR can help AI reach places transformers could not, lowering costs and making results more reliable.
The future of AI is not about making models bigger and bigger. It is about making them smarter, smaller, and more connected to our daily life. MoR brings this future closer.
As someone who follows AI closely, I have rarely seen a change as exciting as the move from transformers to MoR. This is not just a research idea; it is a shift that is clearly coming. The era of brute-force, power-hungry AI is ending, and MoR opens a smarter one in which AI works well across different devices, places, and economies.
For Bangladesh and other growing tech economies, this is a great opportunity. We are no longer just users of AI; we can become creators, innovators, and leaders in applying it intelligently. That is why MoR is not just a trend. It is the foundation of a future in which AI works more efficiently.