My interpretation of the statement, as made: "The light meter assumes you are reading a Zone V (middle tone) area of the subject, so if you meter the 'bright area with detail' of the scene (which should fall into Zone VII), you need to adjust the indicated exposure by -2EV so that detail is retained." That is clear to me, not confusing, and it covers the exposure principles of the Zone System (although not the accompanying darkroom usage).
Well, if that's what it is, then the approach will fail, as it'll result in horribly clipped highlights. If you place the highlights that are supposed to fall on Zone VII in the final image at +2EV exposure, most of Zones VIII and IX will be clipped. I'm sure that's not what the author of the book intends to convey, but it's precisely why I said this unnecessarily (and perhaps dangerously) confuses things. I'm sure that reading the entire book would resolve the matter, but then again, how does that fit with OP's desire to keep things simple? I find it risky to try to marry a digitally interpreted Zone System approach with ETTR when someone is looking for a simple approach.
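To put rough numbers on the clipping concern, here is a minimal Python sketch. It assumes one zone equals one stop, that the meter renders whatever it reads as Zone V, and a purely hypothetical highlight headroom of 2.5 stops above the metered middle tone. The real headroom depends on the camera and raw converter, so treat `headroom_stops` as a placeholder, not a measured figure. The function names are made up for illustration.

```python
# Zone names; the list index is the zone number (0 through 10).
ROMAN = ["0", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]
MIDDLE_ZONE = 5  # the meter renders whatever it reads as Zone V


def placement_compensation(target_zone):
    """Stops of exposure to add to the meter reading so that the
    metered area lands on target_zone (one zone = one stop)."""
    return target_zone - MIDDLE_ZONE


def clipped_zones(headroom_stops):
    """Zones lying more than headroom_stops above Zone V.
    headroom_stops is an assumed value, not a property of any real camera."""
    return [ROMAN[z] for z in range(len(ROMAN)) if z - MIDDLE_ZONE > headroom_stops]


print(placement_compensation(7))  # 2
print(clipped_zones(2.5))         # ['VIII', 'IX', 'X']
```

Under that assumed headroom, a highlight placed on Zone VII survives, but anything that lands on Zone VIII or above is already past the clipping point, which is the failure described above.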