*= Equal Contributors
Federated Learning (FL) is a technique to train models using data distributed across devices. Differential Privacy (DP) provides a formal privacy guarantee for sensitive data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the DP noise added to the model increases as the model size grows, which often prevents convergence. We propose Partial Embedding Updates (PEU), a novel technique to decrease noise by decreasing payload size. Furthermore, we adopt Low Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) to reduce the memory demands of large models on compute-constrained devices. This combination of techniques makes it possible to train large-vocabulary language models while preserving accuracy and privacy.
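To make the low-rank adaptation idea concrete, the sketch below shows a minimal LoRA-style layer in PyTorch: the pretrained weight is frozen and only two small low-rank factors are trained, shrinking both the on-device memory footprint and the update payload. This is an illustrative assumption of how such a layer can be written, not the paper's implementation; the class name, rank, and scaling are hypothetical choices.

```python
# Minimal sketch of a LoRA-style linear layer (illustrative; not the
# authors' code). Only lora_A and lora_B are trained and transmitted.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus trainable low-rank adapters A and B."""
    def __init__(self, in_features: int, out_features: int,
                 rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # freeze pretrained weight
        self.base.bias.requires_grad_(False)
        # Low-rank factors: A is small-random, B is zero so the adapter
        # initially contributes nothing to the output.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + b + scaling * B (A x); gradients flow only to A and B.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

With rank r, the trainable parameter count per layer drops from in_features × out_features to r × (in_features + out_features), which is what reduces the per-round payload a device must compute and upload under DP noise.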