[WIP] Tracking token usage from LLM¶
Overview¶
Agents, workflows, and pipelines all invoke LLMs, so we need to track the number of tokens each call consumes in order to account for cost.
Algorithm¶
Algorithm: Retrieving Token Usage for Cost Accounting from the ChatGPT API
Input:
API key, `K`
Chat model identifier, `model`
User message sequence, `P`
Output:
Prompt tokens, `T_prompt`
Completion tokens, `T_completion`
Total tokens, `T_total`
1. Initialize: Obtain API key `K` from secure storage.
2. Set headers:
- `Authorization ← "Bearer " + K`
- `Content-Type ← "application/json"`
3. Construct request payload:
- Set `payload.model ← model`
- Set `payload.messages ← P`
4. Send request:
- Perform HTTP `POST` request to the Chat Completions endpoint.
5. Receive response:
- Store HTTP response as `resp`.
6. Validate response:
- If `resp.status_code ≠ 200`, terminate with error.
7. Parse response body:
- Decode JSON content from `resp`.
8. Extract token usage:
- Set `T_prompt ← resp.usage.prompt_tokens`
- Set `T_completion ← resp.usage.completion_tokens`
- Set `T_total ← resp.usage.total_tokens`
9. Return:
- Output `(T_prompt, T_completion, T_total)`.
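The steps above can be sketched in Python using only the standard library. The endpoint URL, function names, and error handling here are illustrative assumptions, not a prescribed implementation; the request/parse helpers are split out so the pure parts can be exercised without a network call.

```python
import json
import urllib.request

# Chat Completions endpoint (assumed here; verify against the provider docs).
CHAT_COMPLETIONS_URL = "https://api.openai.com/v1/chat/completions"

def build_request(api_key, model, messages):
    """Steps 2-3: construct headers and the JSON payload."""
    headers = {
        "Authorization": "Bearer " + api_key,
        "Content-Type": "application/json",
    }
    payload = {"model": model, "messages": messages}
    return headers, payload

def extract_usage(response_body):
    """Steps 7-8: decode the JSON body and pull out the usage counters."""
    data = json.loads(response_body)
    usage = data["usage"]
    return (usage["prompt_tokens"],
            usage["completion_tokens"],
            usage["total_tokens"])

def get_token_usage(api_key, model, messages):
    """Steps 4-6 and 9: send the POST, validate, and return the counters."""
    headers, payload = build_request(api_key, model, messages)
    req = urllib.request.Request(
        CHAT_COMPLETIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        if resp.status != 200:
            raise RuntimeError(f"request failed with status {resp.status}")
        return extract_usage(resp.read())
```

`get_token_usage` returns the `(T_prompt, T_completion, T_total)` tuple from the algorithm's output; a production version would also handle retries and non-JSON error bodies.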
Token Usage to Cost Mapping:
Let:
- `c_prompt(model)` be the cost per prompt token
- `c_completion(model)` be the cost per completion token
Then the total request cost is:
`cost = T_prompt × c_prompt(model) + T_completion × c_completion(model)`
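The cost mapping is a straight multiply-and-add over the two token counters. A minimal sketch, where the price table and its numbers are placeholder assumptions (real per-model rates must come from the provider's price sheet):

```python
def request_cost(t_prompt, t_completion, c_prompt, c_completion):
    """cost = T_prompt * c_prompt(model) + T_completion * c_completion(model)"""
    return t_prompt * c_prompt + t_completion * c_completion

# Illustrative per-token prices in USD. NOT real pricing; look up the
# actual rates for each model before using this for accounting.
PRICES = {
    "example-model": {"prompt": 0.000005, "completion": 0.000015},
}

def cost_for_model(model, t_prompt, t_completion):
    p = PRICES[model]
    return request_cost(t_prompt, t_completion, p["prompt"], p["completion"])
```

Keeping `c_prompt` and `c_completion` separate matters because most providers price completion tokens higher than prompt tokens.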
Notes¶
- Token usage is reported by the API, not estimated client-side
- Usage values appear only after a successful completion
- Streaming responses accumulate token usage incrementally
- Cost calculation depends on model-specific pricing
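Because usage is reported per response, a pipeline that makes many LLM calls needs to sum the counters across calls. A small accumulator sketch (the class name and method are hypothetical, and `usage` is assumed to be a dict shaped like the API's `usage` object):

```python
class UsageTracker:
    """Accumulate token usage across multiple LLM calls in a pipeline."""

    def __init__(self):
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.total_tokens = 0

    def record(self, usage):
        """Add one response's `usage` object to the running totals."""
        self.prompt_tokens += usage["prompt_tokens"]
        self.completion_tokens += usage["completion_tokens"]
        self.total_tokens += usage["total_tokens"]
```

One tracker per agent, workflow, or pipeline run gives per-unit totals that can then be fed into the cost mapping above.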