Implement dedicated provider support for popular local LLM frameworks such as `llama.cpp`, `sglang`, and `vLLM`, addressing the odd behaviors that currently occur when they are used through the generic OpenAI provider.
### Description

llama.cpp, sglang, vLLM, etc. are all very popular and probably have hundreds of different implementations now. llama.cpp in particular has reached a level of popularity where it ships bundled with several tools and is used in real deployments. In theory you should be able to use the generic OpenAI provider, but, as I learned the other day, that provider has some very odd behavior with llama.cpp (and with any generic OpenAI-compatible API).

As an example, the OpenAI provider enables or disables thinking based only on the first character of the model name. This is frustrating, undocumented behavior that was written specifically for the OpenAI service. The only way around it is through the custom profile options, which aren't exposed out-of-the-box via, for example, agent spec or environment variables. This makes it difficult to develop local agents using agent spec and/or the CLI.

The new capabilities feature does not work with llama.cpp either. For example, thinking should be enabled via the `chat_template_kwargs` reasoning `<level>` field.
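To illustrate why the first-character heuristic misfires, here is a hypothetical sketch (not the actual provider code) of that kind of check. OpenAI's reasoning models are named `o1`/`o3`/`o4-...`, so keying off the leading character works for the OpenAI service but silently triggers on unrelated local model names:

```python
def thinking_enabled(model: str) -> bool:
    """Hypothetical sketch of the heuristic described above:
    treat any model whose name starts with 'o' as a reasoning model."""
    return model.startswith("o")

# Works as intended for OpenAI's own naming scheme:
thinking_enabled("o3-mini")        # True
# But misfires on a local model served by llama.cpp:
thinking_enabled("openhermes-2.5")  # also True, unintentionally
thinking_enabled("llama-3.1-8b")    # False
```

A dedicated local provider could instead rely on an explicit capability flag or configuration rather than name-sniffing.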
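For the llama.cpp case, a minimal sketch of what a request with template-level reasoning control might look like. The `chat_template_kwargs` object is what llama.cpp's OpenAI-compatible server forwards to the chat template; the specific key (`reasoning_effort` here) and the model name are assumptions that depend on the model's template:

```python
import json

# Hedged sketch: build a chat-completions payload that passes a
# reasoning level through to the chat template, rather than relying
# on model-name heuristics. Key name and model are illustrative.
payload = {
    "model": "gpt-oss-20b",  # hypothetical local model name
    "messages": [{"role": "user", "content": "Hello"}],
    "chat_template_kwargs": {"reasoning_effort": "high"},
}
body = json.dumps(payload)
```

`body` would then be POSTed to the server's `/v1/chat/completions` endpoint; the generic OpenAI provider currently has no out-of-the-box way to attach this field.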