We study the problem of stereo singing voice cancellation, a subtask of music source separation, whose goal is to estimate an instrumental background from a stereo mix. We explore how to achieve performance similar to large state-of-the-art source separation networks starting from a small, efficient model for real-time speech separation. Such a model is useful when memory and compute are limited and singing voice processing has to run with limited look-ahead. In practice, this is realised by adapting an existing mono model to handle stereo input. Improvements in quality are obtained by tuning model parameters and expanding the training set. Moreover, we highlight the benefits a stereo model brings by introducing a new metric which detects attenuation inconsistencies between channels. Our approach is evaluated using objective offline metrics and a large-scale MUSHRA trial, confirming the effectiveness of our techniques in stringent listening tests.