With film, sharpness is quite subtle and soft, even on medium format, whereas with digital the sharpness is much more glaring and just pops out. Basically, the two have a different character when it comes to sharpness. Is this just a result of the grain sort of masking everything?
Yes, that is true, and the effect is quite real. You asked for a technical explanation, so here it goes:
Film resolution depends on subject contrast, while digital sensor resolution does not.
What this means in practice is that high-contrast elements of the image, like eyelashes or sharp-edged textures, look extremely sharp on film, while low-contrast textures like skin blemishes aren't as sharp. This is where most of the visual difference you're observing comes from. Moreover, film's response to light is non-linear, particularly in the highlights. Parts of an image that receive a lot of light start to "compress" and lose contrast, so they look smoother.
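To make the highlight "compression" concrete, here is a minimal sketch. It assumes a toy saturating response, 1 - exp(-E), standing in for film; this is not a real film characteristic curve, and the function names and the 10% texture modulation are made up for illustration. The same fine texture is "photographed" at a mid-tone exposure and at a highlight exposure, and its contrast in the recorded image is compared against an idealized linear sensor.

```python
import math

def film_response(exposure):
    # Toy saturating curve (an assumption, not measured film data):
    # roughly linear at low exposure, flattening out in the highlights.
    return 1.0 - math.exp(-exposure)

def linear_response(exposure):
    # Idealized digital sensor below clipping: output proportional to exposure.
    return exposure

def recorded_contrast(response, mean_exposure, modulation=0.10):
    # A texture that swings +/- `modulation` around the mean exposure.
    dark = response(mean_exposure * (1.0 - modulation))
    bright = response(mean_exposure * (1.0 + modulation))
    # Michelson contrast of the recorded texture.
    return (bright - dark) / (bright + dark)

for label, exposure in [("mid-tone", 0.5), ("highlight", 3.0)]:
    film = recorded_contrast(film_response, exposure)
    digital = recorded_contrast(linear_response, exposure)
    print(f"{label}: film {film:.3f}, linear sensor {digital:.3f}")
```

In this toy model the linear sensor records the texture with the same contrast at both exposures, while the saturating curve records it with noticeably less contrast in the highlights, which is the smoothing described above.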
Finally, and this applies only to color film, where the image is formed by tiny dye clouds: these clouds grow larger in the over-exposed parts of an image, reducing resolution further.
TLDR: the primary technical difference in resolution between film and digital sensors is film's non-linearity. Film gains and loses resolution in different parts of a scene depending on local contrast and exposure, while digital sensors deliver the same resolution regardless of contrast and exposure.
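If it helps, the contrast dependence can be sketched with a rough toy model. Everything in it is an assumption for illustration: the film is given an exponential MTF, exp(-f / f0), and a detail counts as "resolved" while its modulation stays above a grain-noise threshold. The values of f0, the threshold, and the pixel pitch are hypothetical, not taken from any datasheet. Following the simplification used in this answer, the sensor's limit is modelled as set by pixel pitch alone.

```python
import math

def film_limit_lp_mm(subject_modulation, f0=40.0, threshold=0.05):
    # Toy model: modulation after the film is subject_modulation * exp(-f / f0).
    # The detail is "resolved" until that drops below the grain threshold,
    # so the limiting frequency grows with subject contrast.
    if subject_modulation <= threshold:
        return 0.0
    return f0 * math.log(subject_modulation / threshold)

def sensor_limit_lp_mm(pixel_pitch_um=5.0):
    # Nyquist limit: one line pair needs two pixels, independent of contrast.
    return 1000.0 / (2.0 * pixel_pitch_um)

for label, modulation in [("eyelashes (high contrast)", 0.9),
                          ("skin texture (low contrast)", 0.2)]:
    print(f"{label}: film ~{film_limit_lp_mm(modulation):.0f} lp/mm, "
          f"sensor ~{sensor_limit_lp_mm():.0f} lp/mm")
```

The absolute numbers mean nothing here. The point is only that the film's limit moves with subject contrast while the sensor's stays put.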