I haven't had time to play with your code, but I notice that you're using grob::always-vertical-skylines-from-stencil. That really should only be used for grobs that can calculate their stencil without knowing line breaks. Instead, use grob::unpure-vertical-skylines-from-stencil, which you can find in output-lib.scm. Incidentally, you could also use grob::unpure-Y-extent-from-stencil and save yourself a few lines of code. If the extra vertical space you're noticing is on systems where the grob is printed, fixing the skyline estimate may help.
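Roughly, the swap would look like the following. This is an untested sketch: TextSpanner is only a stand-in, since I don't know which grob your code actually overrides, so substitute your own grob and context.

```lilypond
\layout {
  \context {
    \Voice
    % TextSpanner is a stand-in for the grob you are overriding.
    % The "unpure" variants only consult the stencil once line breaks
    % are known; the pure phase uses an estimate instead.
    \override TextSpanner.vertical-skylines =
      #grob::unpure-vertical-skylines-from-stencil
    \override TextSpanner.Y-extent =
      #grob::unpure-Y-extent-from-stencil
  }
}
```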
The main issue with pure positioning in the approach you're using is that these markups are quite tall, yet you need to allocate space for them even on systems where they will not be printed. One approach could be to simply divide the pure height estimate by the number of broken spanner segments you expect. Even if the first system is then estimated a bit too short and the others a bit too tall, at least the total allocated space is about right.
That said, IMO it would be better if, instead of killing parts of the spanner, you just made this grob an item and wrote an after-line-breaking callback that finds the horizontal distance from the grob's anchor to the right edge of the system, or to the closest column to the right that has another instruction text grob as an element. All the information you need is available after translation by traversing grob object properties. You'd still need a pure height estimate, but pure positioning would only allocate space for the system containing the grob's anchor column, so the pure estimate you're using now would be pretty close.
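If you want to try the divide-by-segments idea first, here is a rough, untested sketch. The names my-pure-height-estimate, expected-segments, divide-extent and divided-Y-extent are all made up for illustration; the placeholder estimate just returns a fixed interval where your current pure estimate would go, and TextSpanner is again a stand-in for your grob.

```lilypond
% Hypothetical: how many broken segments you expect the spanner to have.
#(define expected-segments 3)

% Hypothetical placeholder for the pure height estimate you compute now.
#(define (my-pure-height-estimate grob start end)
   '(-1.0 . 12.0))

% Scale both ends of a (low . high) extent toward the reference point.
#(define (divide-extent extent n)
   (cons (/ (car extent) n) (/ (cdr extent) n)))

% Unpure: real stencil height. Pure: the estimate divided by the
% expected number of segments.
#(define divided-Y-extent
   (ly:make-unpure-pure-container
    ly:grob::stencil-height
    (lambda (grob start end)
      (divide-extent (my-pure-height-estimate grob start end)
                     expected-segments))))

\layout {
  \context {
    \Voice
    \override TextSpanner.Y-extent = #divided-Y-extent
  }
}
```

Dividing both ends of the interval is just one choice; if your markup only extends in one direction from the reference point, you may prefer to scale only that end.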
Saul