
typeracer

Pit Stop
By dynamically updating the bias of each expert according to its recent load, Loss-Free Balancing can consistently maintain a balanced distribution of expert load. In addition, since Loss-Free Balancing does not produce any interference gradients, it also elevates the upper bound of model performance gained from MoE training.
— Loss-Free Balancing (other) by AUXILIARY-LOSS-FREE LOAD BALANCING STRATEGY FOR MIXTURE-OF-EXPERTS
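
The bias update described in the passage above can be sketched roughly as follows. This is only an illustrative toy under stated assumptions, not the paper's implementation: the sign-based update rule, the update rate, the synthetic skewed gating scores, and every function name here (route_top_k, update_bias, simulate) are hypothetical choices made for the example. The bias is added to the scores only when selecting experts, so the balancing step itself involves no loss term and therefore no gradient.

```python
import numpy as np


def route_top_k(scores, bias, k):
    """Select k experts per token from the biased scores.

    The bias only influences which experts are picked; it is not a loss term.
    """
    biased = scores + bias
    # argpartition on the negated scores puts the k largest entries first.
    return np.argpartition(-biased, k - 1, axis=1)[:, :k]


def update_bias(bias, chosen, num_experts, rate):
    """Nudge each expert's bias toward the mean load (assumed sign-based rule)."""
    load = np.bincount(chosen.ravel(), minlength=num_experts)
    error = load.mean() - load  # positive for underloaded experts
    return bias + rate * np.sign(error), load


def simulate(steps=300, tokens=512, num_experts=8, k=2, rate=0.01, seed=0):
    """Run the toy router on synthetic scores that favor later experts."""
    rng = np.random.default_rng(seed)
    skew = np.linspace(0.0, 1.0, num_experts)  # hypothetical built-in imbalance
    bias = np.zeros(num_experts)
    first_load = last_load = None
    for step in range(steps):
        scores = rng.normal(size=(tokens, num_experts)) + skew
        chosen = route_top_k(scores, bias, k)
        bias, load = update_bias(bias, chosen, num_experts, rate)
        if step == 0:
            first_load = load
        last_load = load
    return first_load, last_load, bias


if __name__ == "__main__":
    first, last, bias = simulate()
    print("load at first step:", first)
    print("load at last step: ", last)
    print("learned biases:    ", np.round(bias, 2))
```

In this toy setup the experts that start out underloaded should end up with larger biases than the overloaded ones, and the spread between the busiest and idlest expert should shrink over the run.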
Language: English

This text has been typed 4 times:
Avg. speed: 60 WPM
Avg. accuracy: 96.2%