We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks. Our approach, called Large LAnguage model Reinforcement Learning Policy (LLaRP), adapts a pre-trained frozen LLM to take as input text instructions and visual egocentric observations and output actions directly in the environment. Using reinforcement learning, we train LLaRP to see and act solely through environmental interactions. We show that LLaRP is robust to complex paraphrasings of task instructions and can generalize to new tasks that require novel optimal behavior. In particular, on 1,000 unseen tasks it achieves a 42% success rate, 1.7x the success rate of other common learned baselines or zero-shot applications of LLMs. Finally, to aid the community in studying language-conditioned, massively multi-task, embodied AI problems, we release a novel benchmark, Language Rearrangement, consisting of 150,000 training and 1,000 testing tasks for language-conditioned rearrangement.
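To make the interface concrete, the following is a minimal toy sketch of the policy structure the abstract describes: a frozen LLM backbone, a trainable visual encoder that projects egocentric observations into the LLM's token-embedding space, and a trainable action head over environment actions. This is not the authors' implementation; all module names, dimensions, and the stand-in "LLM" are hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

D_EMB = 32      # toy LLM embedding / hidden size
D_OBS = 64      # flattened egocentric-observation feature size
N_ACTIONS = 8   # toy discrete action-space size

# Frozen "LLM": stand-in for a pre-trained transformer. In LLaRP these
# weights stay fixed; only the modules below are updated by RL.
W_llm = rng.normal(size=(D_EMB, D_EMB))

# Trainable modules (updated by the RL objective in the real system).
W_vis = rng.normal(size=(D_OBS, D_EMB)) * 0.01      # visual encoder
W_act = rng.normal(size=(D_EMB, N_ACTIONS)) * 0.01  # action output head

def policy(instruction_embs: np.ndarray, obs: np.ndarray) -> np.ndarray:
    """Return action probabilities for one environment step.

    instruction_embs: (T, D_EMB) token embeddings of the text instruction.
    obs: (D_OBS,) features of the current egocentric observation.
    """
    obs_token = obs @ W_vis                            # observation -> token space
    tokens = np.vstack([instruction_embs, obs_token])  # instruction + obs tokens
    hidden = np.tanh(tokens @ W_llm).mean(axis=0)      # frozen "LLM" forward (toy)
    logits = hidden @ W_act                            # action head
    probs = np.exp(logits - logits.max())              # stable softmax
    return probs / probs.sum()

probs = policy(rng.normal(size=(5, D_EMB)), rng.normal(size=(D_OBS,)))
```

The key design choice the abstract highlights is that the LLM is never fine-tuned: gradients from the reinforcement-learning objective flow only into the observation encoder and action head, so the policy inherits the LLM's language understanding while learning to act purely from environmental interaction.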