The future of multimodal artificial intelligence models for integrating imaging and clinical metadata: a narrative review