Optimising entanglement distribution policies under classical communication constraints assisted by reinforcement learning