What is mixture of experts routing?

Mixture of Experts (MoE) routing dynamically sends each input token to a small subset of specialised neural network 'experts' rather than processing it through all of the model's parameters. A gating network learns which experts handle different types of input best, scores every expert for each token, and activates only the top-scoring few (commonly 1 to 8, depending on the model) while the rest stay dormant. Because only the selected experts run, compute per token during inference falls well below that of a dense model with the same total parameter count, whilst model quality is largely maintained; every expert's weights must still be held in memory, though. This is why mistral.rs includes MoE optimisations for efficient local deployment.
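
To make the routing step concrete, here is a minimal sketch of top-k gating in plain Python. It is an illustration under stated assumptions, not how mistral.rs implements it: the names `route_token` and `moe_forward` are hypothetical, the experts are toy scaling functions, and the gate scores are passed in directly, whereas a real model would compute them from the token's hidden state with a learned linear layer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i],
                    reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, gate_logits, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_token(gate_logits, k))


# Toy setup (assumed for illustration): four "experts" that scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical gate scores for one token; experts 1 and 3 score highest.
gate_logits = [0.1, 2.0, -1.0, 2.0]

routes = route_token(gate_logits, k=2)
output = moe_forward(1.0, gate_logits, experts, k=2)
print(routes)
print(output)
```

With `k=2`, only two of the four experts are evaluated for this token; the other two are never called, which is where the compute saving comes from.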