On-device machine learning (ML) moves computation from the cloud to personal devices, protecting user privacy and enabling intelligent user experiences. However, fitting models on devices with limited resources presents a major technical challenge: practitioners need to optimize models and balance hardware metrics such as model size, latency, and power. To help practitioners create efficient ML models, we designed and developed Talaria: a model visualization and optimization system. Talaria enables practitioners to compile models to hardware, interactively visualize model statistics, and simulate optimizations to test the impact on inference metrics. Since its internal deployment two years ago, we have evaluated Talaria using three methodologies: (1) a log analysis highlighting its growth to 800+ practitioners submitting 3,600+ models; (2) a usability survey with 26 users assessing the utility of 20 Talaria features; and (3) a qualitative interview with the 7 most active users about their experience using Talaria.