Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications. However, their demanding computation during inference has raised significant challenges for deployment on resource-constrained devices. Despite recent trends favoring alternative activation functions such as GELU or SiLU, known for increased computation, this study strongly advocates for reinstating ReLU activation in LLMs. We demonstrate that using the ReLU activation function has a negligible impact on convergence and performance while significantly reducing computation and weight transfer. This reduction is particularly valuable during the memory-bound inference step, where efficiency is paramount. Exploring sparsity patterns in ReLU-based LLMs, we unveil the reutilization of activated neurons for generating new tokens, and leveraging these insights, we propose practical strategies to substantially reduce LLM inference computation, up to three times, using ReLU activations with minimal performance trade-offs.
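To illustrate the core idea behind the computation and weight-transfer savings, the following is a minimal sketch (not the paper's implementation) of how ReLU sparsity can be exploited in a feed-forward block: neurons whose activation is exactly zero contribute nothing to the output, so their corresponding down-projection weights need not be loaded or multiplied. All names and dimensions here (`d_model`, `d_ff`, `W_up`, `W_down`) are illustrative assumptions.

```python
import torch

d_model, d_ff = 8, 32
W_up = torch.randn(d_ff, d_model)    # up-projection weights
W_down = torch.randn(d_model, d_ff)  # down-projection weights
x = torch.randn(d_model)             # hidden state for one token

# Dense FFN with ReLU: many intermediate activations are exactly zero.
h = torch.relu(W_up @ x)
y_dense = W_down @ h

# Sparse evaluation: only the columns of W_down for active (non-zero)
# neurons are needed, so the rest need not be transferred or multiplied.
active = h.nonzero(as_tuple=True)[0]
y_sparse = W_down[:, active] @ h[active]

assert torch.allclose(y_dense, y_sparse, atol=1e-5)
print(f"{active.numel()}/{d_ff} neurons active")
```

In a real deployment the savings come from skipping the memory transfer of the inactive weight columns during the memory-bound decoding step, not merely from the reduced arithmetic shown above.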