Ollama embeddings are currently slow because only one embedding request runs at a time. The request is to use Ollama's 'OLLAMA_NUM_PARALLEL' support so that multiple embedding requests can run in parallel, improving performance.
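Ollama now supports OLLAMA_NUM_PARALLEL. From what I have seen so far, embedding with Ollama is very slow, with only one embedding request running at a time. Sending multiple requests concurrently should improve this.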
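
To illustrate the idea, here is a minimal client-side sketch of issuing several embedding requests at once. It is not the project's implementation. It assumes an Ollama server at the default local address with its `/api/embeddings` endpoint, started with OLLAMA_NUM_PARALLEL set to a value greater than 1. The model name `nomic-embed-text` and the helper names `embed_one` and `embed_many` are hypothetical and chosen only for this example.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def embed_one(text, model="nomic-embed-text", url=OLLAMA_URL):
    """Send a single embedding request and return the embedding vector."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]


def embed_many(texts, embed_fn=embed_one, max_workers=4):
    """Embed texts concurrently; results are returned in the order of `texts`.

    `max_workers` is a hypothetical knob; it only helps if the server
    accepts that many parallel requests (e.g. via OLLAMA_NUM_PARALLEL).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embed_fn, texts))
```

Calling `embed_many(["first chunk", "second chunk"])` would send both requests concurrently. Setting `max_workers` higher than the server's parallel limit is unlikely to help, since the extra requests would simply wait in the server's queue.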