I agree with Charlie. The way I see it,
- If you start with one 2000 x 3000 image and stitch it with two more 2000 x 3000 images, the merged image is 2000 x 9000. That is a single 'higher resolution image' in the sense that it has a higher pixel count than the starting 2000 x 3000 frame.
- OTOH, I fully understand that if I have an image resolving 200 line pairs per millimeter, stitching it to two more images at 200 line pairs per millimeter gives one composite 2000 x 9000 image that still resolves 200 line pairs per millimeter, no finer detail.
So the ultimate detail in the second case is no finer than in a single image from the triplet, but the composite does have more total pixels than the single 2000 x 3000 image at 200 line pairs per millimeter, and 3x the total detail of a single image.
Semantics
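To put numbers on the two meanings of 'resolution', here is a minimal sketch in Python. It assumes the three frames are joined edge to edge with no overlap (a real stitch loses some pixels to overlap), and the variable names are just for illustration.

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height

single = (2000, 3000)   # one frame, in pixels
frames = 3              # number of frames stitched together

# Assumption: frames are butted together along one side with no overlap,
# which gives the 2000 x 9000 composite from the example.
composite = (single[0], single[1] * frames)

single_px = pixel_count(*single)        # pixels in one frame
composite_px = pixel_count(*composite)  # pixels in the stitched image
ratio = composite_px / single_px        # how many times more pixels

# Detail per millimeter is a property of the lens and sensor,
# so stitching leaves it unchanged.
lp_per_mm_single = 200
lp_per_mm_composite = lp_per_mm_single

print(f"single: {single_px:,} px, composite: {composite_px:,} px, ratio: {ratio:.0f}x")
print(f"detail: {lp_per_mm_single} lp/mm before, {lp_per_mm_composite} lp/mm after")
```

The pixel total triples while the line pairs per millimeter stay put, which is the whole disagreement in two lines of output.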
Back to the OP's question. This is a composite photo of two shots taken with a Canon G2, which produces 4-megapixel images. Cashel, Ireland, shot in 2004.
This photo was the result of two very quickly taken shots from the side of a highway, handheld rather than on a tripod: two grab shots, not carefully composed and with no careful angle shift about the lens nodal point. I forget what I used for stitching, but it was a public domain utility. In short, one does not necessarily need to be meticulous and methodical in taking the shots for a stitched composite, but it helps to understand that one needs 'identical exposure' across the frames, with no differences caused by automatic exposure control. More care and understanding in the planning can improve the result.
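One way to check the 'identical exposure' point after the fact is to compare the exposure value of each frame. The sketch below uses the standard formula EV = log2(N^2 / t), normalized to ISO 100. The function names, the 0.1 EV tolerance, and the sample settings are my own assumptions for illustration, not values from these two shots.

```python
import math

def exposure_value(f_number, shutter_s, iso=100):
    """Exposure value normalized to ISO 100: log2(N^2 / t) - log2(ISO / 100)."""
    return math.log2(f_number ** 2 / shutter_s) - math.log2(iso / 100)

def exposures_match(frames, tolerance_ev=0.1):
    """True if all frames are within tolerance_ev stops of each other.

    frames: list of (f_number, shutter_seconds, iso) tuples.
    tolerance_ev: hypothetical threshold; pick whatever suits your stitcher.
    """
    evs = [exposure_value(*frame) for frame in frames]
    return max(evs) - min(evs) <= tolerance_ev

# Hypothetical settings: manual exposure keeps both frames the same.
locked = [(8, 1 / 250, 100), (8, 1 / 250, 100)]
# Auto exposure changed the shutter speed by one stop between frames.
drifted = [(8, 1 / 250, 100), (8, 1 / 125, 100)]

print(exposures_match(locked))   # frames agree
print(exposures_match(drifted))  # frames differ by about one stop
```

Equal EV only says the overall exposure matches. For a clean seam it is still safer to lock aperture, shutter speed and ISO outright in manual mode.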