You can now choose from preset LLMs with container images hosted by AKS and split inferencing across multiple lower GPU-count VMs
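
A minimal sketch of what deploying such a workload could look like, assuming the AKS AI toolchain operator's KAITO-style Workspace custom resource (group `kaito.sh`, version `v1alpha1`) and the official `kubernetes` Python client; the preset name, VM SKU, and node count below are illustrative, not prescriptive:

```python
from kubernetes import client, config

# Load kubeconfig for the AKS cluster (e.g. after `az aks get-credentials`).
config.load_kube_config()

# Workspace manifest: pick a preset LLM image and spread inferencing across
# several lower GPU-count VMs by requesting more than one node.
# Field names follow the KAITO Workspace CRD; treat them as an assumption.
workspace = {
    "apiVersion": "kaito.sh/v1alpha1",
    "kind": "Workspace",
    "metadata": {"name": "workspace-llm-inference", "namespace": "default"},
    "resource": {
        "instanceType": "Standard_NC12s_v3",   # lower GPU-count SKU (illustrative)
        "count": 2,                            # split inferencing across 2 VMs
        "labelSelector": {"matchLabels": {"apps": "llm-inference"}},
    },
    "inference": {
        "preset": {"name": "falcon-7b"},       # one of the preset LLMs (illustrative)
    },
}

# Create the custom resource; the operator provisions the GPU nodes and
# deploys the preset model image for distributed inferencing.
api = client.CustomObjectsApi()
api.create_namespaced_custom_object(
    group="kaito.sh",
    version="v1alpha1",
    namespace="default",
    plural="workspaces",
    body=workspace,
)
```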