Enhance the retry mechanism so that after validation failure or `ModelRetry`, the system sends only a minimal, correction-focused context instead of the full conversation history, reducing token usage.
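To make the request concrete, here is a minimal, library-agnostic sketch of what a "correction-focused context" could look like. It does not use Pydantic AI's message types or internals. `Message` and `build_retry_context` are hypothetical names introduced only for illustration, and the choice of what to keep (system instructions, the original task, the latest invalid output, the validation error) is an assumption about what a repair prompt needs.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for a chat message; not a Pydantic AI type.
    role: str  # "system", "user", or "assistant"
    content: str


def build_retry_context(history: list[Message], error: str) -> list[Message]:
    """Return a reduced message list for a repair-style retry (illustrative only)."""
    # Keep system instructions so the schema and rules stay in view.
    reduced = [m for m in history if m.role == "system"]

    # Keep only the first user message, treated here as the original task.
    first_user = next((m for m in history if m.role == "user"), None)
    if first_user is not None:
        reduced.append(first_user)

    # Keep only the most recent assistant output, i.e. the one that just failed.
    last_assistant = next(
        (m for m in reversed(history) if m.role == "assistant"), None
    )
    if last_assistant is not None:
        reduced.append(last_assistant)

    # End with a short correction instruction carrying the validation error.
    reduced.append(
        Message(
            "user",
            f"Your previous output was invalid:\n{error}\nReturn a corrected output only.",
        )
    )
    return reduced


history = [
    Message("system", "Reply with JSON matching the schema."),
    Message("user", "Extract the invoice fields."),
    Message("assistant", '{"totl": 10}'),
    Message("user", "Validation failed: field 'total' is required."),
    Message("assistant", '{"total": "ten"}'),
]
retry = build_retry_context(history, "field 'total' must be a number")
print(len(history), "->", len(retry))  # 5 -> 4
```

In this sketch the earlier invalid attempt (`{"totl": 10}`) and its stale error message are dropped, so the retry request has a fixed size of at most four messages however many retries came before it.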
### Initial Checks

- [x] I'm using the [latest version](https://github.com/pydantic/pydantic-ai/releases/latest) of Pydantic AI
- [x] I've searched for my issue in [the issue tracker](https://github.com/pydantic/pydantic-ai/issues) before opening this issue

### Description

When a run retries after validation failure or `ModelRetry`, the next request includes the full accumulated conversation history rather than a minimal correction-focused context.

This behaviour increases token usage and introduces unnecessary cognitive load for the model.

In structured generation workflows, retries are typically repair tasks (fixing schema violations or invalid fields), not continuation tasks. Including all prior messages (including invalid outputs) can:

* reinforce earlier incorrect structure or field names
* dilute the retry instruction with stale context
* degrade correction accuracy on smaller or weaker models
* significantly increase token consumption across multiple retrie