Then why do HUGE LLMs like ChatGPT etc. charge two tokens for words like "unable" (one token for "un", one token for "able") instead of digesting the entire dictionary one time for lower compute cost? I'm genuinely asking because I clearly don't understand.
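For context on the splitting behavior being asked about: these models use subword tokenizers (typically byte-pair encoding), which break a word into pieces from a fixed vocabulary whenever the whole word isn't in that vocabulary. Here's a toy sketch of the idea using greedy longest-prefix matching; the vocabulary and matching rule are illustrative only, not OpenAI's actual tokenizer:

```python
def tokenize(word, vocab):
    """Toy subword tokenizer: greedily match the longest vocabulary
    entry at each position. Real BPE merges pairs instead, but the
    end result is similar: unknown words split into known pieces."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining prefix first, shrinking until a match.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i]!r}")
    return tokens

# Hypothetical vocabulary containing "un" and "able" but not "unable".
vocab = {"un", "able", "u", "n", "a", "b", "l", "e"}
print(tokenize("unable", vocab))  # ['un', 'able']
```

Because "unable" itself isn't in this toy vocabulary, it comes out as two tokens, which is the billing behavior the question describes.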