My personal suspicion is that medium format sales are - or were - mainly driven by the professional market, and that meant, for most, the use of a tripod or the need to switch fast between portrait and landscape orientation. With the exception of those Mamiyas with a rotating back, comfortable work with a tripod requires a square format and a waist-level finder.
For portrait, ceremony, and product professionals, and for some other studio work such as fashion, a great deal of time was saved by being able to leave the camera alone on the tripod while re-composing the shot (one person: vertical; three people: horizontal; one steak: horizontal; one wine bottle: vertical; three wine bottles: horizontal, etc.) and leaving the orientation of the final picture for a later moment.
I also suspect that before Barnack's invention all cameras probably had a square or quasi-square format. Barnack's camera had a rectangular format just because it used motion picture film. Maybe if 24 x 36 had never existed, 4.5 x 6 would never have existed either.
My first camera was a Kodak Instamatic (126, that's square, which is nice for a beginner: you don't have to think about "orientation" when you take a picture). The second one was a Polaroid, and that was also square, very likely for the same reason: you don't really have to decide what to leave in and out of your picture at the moment you take it.
My third camera (at the age of 14) was an SLR (a beloved Minolta SrT 100x which I still own and use). That forced me, for every shot, to decide "in advance" the final destiny of the image, the orientation. Now it seems pretty obvious that it must be so, but - in retrospect - I understand, or remember, that it is not obvious at all. And tripod work is a bit slower with the rectangular format, for sure.