Enable Triton to run vLLM emulating the OpenAI API, allowing vLLM to be used as a drop-in replacement for ChatGPT.
**Is your feature request related to a problem? Please describe.**
I'd like to be able to run vLLM with its OpenAI-compatible API so that vLLM can serve as a drop-in replacement for ChatGPT.

**Describe the solution you'd like**
I'd like Triton to allow me to run vLLM as described in the [vLLM documentation](https://vllm.readthedocs.io/en/stable/getting_started/quickstart.html#openai-compatible-server). Example: `python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m`

**Describe alternatives you've considered**
It is possible to use Triton's REST API; however, for developers already building on the OpenAI API, serving open-source LLMs through the same OpenAI-compatible interface would offer a faster path to replacing OpenAI.
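For context, a minimal sketch of the workflow being requested, following the vLLM quickstart: start the OpenAI-compatible server, then query it with the same request shape the OpenAI completions API uses. The host and port (`localhost:8000`) are vLLM's defaults; the prompt and sampling parameters are illustrative only.

```shell
# Start vLLM's OpenAI-compatible server (from the vLLM quickstart)
python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m

# Query it exactly as one would query the OpenAI API,
# only pointing at the local server instead of api.openai.com
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "facebook/opt-125m",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
    }'
```

Because the request and response formats match the OpenAI API, existing OpenAI client code can be repointed at this server by changing only the base URL.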