*= All authors listed contributed equally to this work
Successfully handling context is essential for any dialog understanding task. This context may be conversational (relying on previous user queries or system responses), visual (relying on what the user sees, for example, on their screen), or background (based on signals such as a ringing alarm or playing music). In this work, we present an overview of MARRS, or Multimodal Reference Resolution System, an on-device framework within a Natural Language Understanding system, responsible for handling conversational, visual, and background context. In particular, we present different machine learning models to enable handling contextual queries: specifically, one to enable reference resolution, and one to handle context via query rewriting. We also describe how these models complement each other to form a unified, coherent, lightweight system that can understand context while preserving user privacy.