
typeracer

Pit Stop
We validate the performance of Loss-Free Balancing on MoE models with up to 3B parameters trained on up to 200B tokens. Experimental results show that Loss-Free Balancing achieves both better performance and better load balance compared with traditional auxiliary-loss-controlled load balancing strategies.
Loss-Free Balancing (other), from "Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts"
Language: English

This text has been typed 17 times:
Avg. speed: 87 WPM
Avg. accuracy: 96.9%