*= Equal Contributors
In the context of a voice assistant system, steering refers to the phenomenon in which a user issues a follow-up command attempting to direct or clarify a previous turn. We propose STEER, a steering detection model that predicts whether a follow-up turn is a user’s attempt to steer the previous command. Constructing a training dataset for steering use cases poses challenges due to the cold-start problem. To overcome this, we developed heuristic rules to sample opt-in usage data, approximating positive and negative samples without any annotation. Our experimental results show promising performance in identifying steering intent, with over 95% accuracy on our sampled data. Moreover, STEER, in conjunction with our sampling strategy, aligns effectively with real-world steering scenarios, as evidenced by its strong zero-shot performance on a human-graded evaluation set. We also introduce STEER+, an enhanced version of the model that does not rely solely on user transcripts as input. STEER+ uses a semantic parse tree to provide more context on out-of-vocabulary words, such as named entities that often occur at the sentence boundary. This further improves model performance, reducing the error rate in domains where entities frequently appear, such as messaging. Lastly, we present a data analysis that highlights the improvement in user experience when voice assistants support steering use cases.