Provide a configuration option to disable streaming of LLM responses for `OpenAIChatModel` (especially when using models served via Ollama or vLLM, where tool calls can be mangled during streaming), while still maintaining an event stream from Pydantic AI for UI interactions.
### Description

When using open-weight models served via Ollama or vLLM, tool calls are often mangled during streaming. This is documented in the vLLM and Ollama issues cited below. A simple workaround is to avoid streaming entirely and use the `run` methods on agents. However, new agent interaction protocols such as AG-UI require streamed events to drive UI interactions.

### Request

A way to disable streaming of LLM responses while maintaining an event stream from Pydantic AI to clients, something like the following:

```python
model = OpenAIChatModel('llama3', provider=OllamaProvider())
agent = Agent(model, model_settings={'openai_disable_streaming': True})
```

This would require "mocking" a streaming response that carries the entire LLM response in a single delta, but it would allow agents to swap models without significant refactoring of standards-based interactions.

### References

https://github.com/vllm-project/vllm/issues/27641
https://github.c
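To illustrate the "single-delta mock stream" idea, here is a minimal, framework-agnostic sketch. The names `fake_stream` and `complete` are hypothetical and are not Pydantic AI APIs; the point is only that a non-streaming completion can be replayed through an async iterator, so downstream consumers that expect streamed deltas need no changes:

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

async def fake_stream(
    complete: Callable[[str], Awaitable[str]], prompt: str
) -> AsyncIterator[str]:
    """Run one non-streaming completion, then replay it as a
    single-chunk 'stream' so event consumers are unchanged."""
    text = await complete(prompt)  # one full LLM round-trip, no streaming
    yield text                     # entire response emitted as a single delta

async def demo() -> list[str]:
    # Stand-in for a real non-streaming model call (hypothetical).
    async def complete(prompt: str) -> str:
        return f"echo: {prompt}"
    return [chunk async for chunk in fake_stream(complete, "hi")]

print(asyncio.run(demo()))  # prints ['echo: hi']
```

A real implementation inside Pydantic AI would additionally have to synthesize tool-call parts and usage data into that single delta, which is exactly what the proposed `openai_disable_streaming` setting would encapsulate.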