Heterogeneous BERT
Published:
We implemented a neural architecture search (NAS) and super-network training framework for heterogeneous BERT models. Given a search space and a teacher model, the framework trains the super-network automatically and evaluates candidate network structures using balanced Pareto sampling. Compared with traditional NAS frameworks, our approach achieves higher accuracy, faster convergence for sub-models, and better performance under the same structural configurations.
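To make the idea of evaluating heterogeneous sub-networks on a Pareto front more concrete, here is a minimal sketch in Python. It is not the framework's actual code. The search space values, the `SubNet` class, and the helpers `sample_subnet`, `param_count`, and `pareto_front` are all hypothetical names chosen for illustration. The sketch covers only plain Pareto filtering (lower cost, higher score); the balancing strategy is not described above, so it is left out.

```python
import random
from dataclasses import dataclass

# Hypothetical search space (assumption, not taken from the project):
# each layer independently picks a head count and a feed-forward width.
SEARCH_SPACE = {
    "num_layers": [4, 6, 8, 12],
    "num_heads": [4, 8, 12],
    "ffn_dim": [512, 1024, 2048, 3072],
}


@dataclass(frozen=True)
class SubNet:
    """One heterogeneous sub-network: per-layer heads and FFN widths."""
    heads: tuple
    ffn: tuple


def sample_subnet(rng):
    """Draw a random sub-network with layer-wise (heterogeneous) choices."""
    n = rng.choice(SEARCH_SPACE["num_layers"])
    heads = tuple(rng.choice(SEARCH_SPACE["num_heads"]) for _ in range(n))
    ffn = tuple(rng.choice(SEARCH_SPACE["ffn_dim"]) for _ in range(n))
    return SubNet(heads, ffn)


def param_count(net, hidden=768, head_dim=64):
    """Rough weight count for the encoder layers (biases, embeddings ignored)."""
    total = 0
    for h, f in zip(net.heads, net.ffn):
        total += 4 * hidden * h * head_dim  # Q, K, V and output projections
        total += 2 * hidden * f             # two feed-forward matrices
    return total


def pareto_front(candidates):
    """Keep the non-dominated (cost, score, item) triples.

    A candidate survives if no other candidate has cost <= its cost
    and score >= its score (exact duplicates are collapsed to one).
    """
    front = []
    best_score = float("-inf")
    # Sort by ascending cost; within equal cost, best score first.
    for cost, score, item in sorted(candidates, key=lambda c: (c[0], -c[1])):
        if score > best_score:
            front.append((cost, score, item))
            best_score = score
    return front
```

In practice, the score for each sampled sub-network would come from evaluating it with weights inherited from the trained super-network. That evaluation step is omitted here because it depends on the model and data.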