What if you could have conventional large language model output with 10 times to 20 times less energy consumption? And what if you could put a powerful LLM right on your phone? It turns out there are ...