We charge for model usage when extracting information from documents. The price depends on the number of input tokens (template, examples, and input document) and output tokens (JSON output).
| Model | Input tokens | Output tokens | Token size (Image) | Token size (Text) |
|---|---|---|---|---|
| NuExtract 2.0 PRO | $1/M tokens | $5/M tokens | 32x32 pixels | word or sub-word |
If you need to process a large number of documents and need a lower price, do not hesitate to talk to us. The price per token can be significantly lowered for large volumes, either by batching or by using a fine-tuned model.
Text Tokens: In English, 1 word is about 1.3 tokens on average, which means a page typically contains about 1000 tokens. Some languages have a higher average token count per word.
Image Tokens: With NuExtract 2.0 PRO, a token corresponds to a patch of 32x32 pixels. An A4 page rasterized at 115dpi corresponds to about 1500 tokens.
Here is a price-estimation chart for typical documents:
| Modality | Size | Input Tokens | Input Price |
|---|---|---|---|
| Text | 1 page | ~1000 | ~$0.001 |
| Text | 100 pages | ~100k | ~$0.1 |
| Image | 1 A4 page at 115dpi | ~1500 | ~$0.0015 |
| Image | 100 A4 pages at 115dpi | ~150k | ~$0.15 |
NB: PDFs and other formatted documents are converted to images by default.
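
The arithmetic behind the chart can be reproduced with a few lines of Python. This is a minimal sketch rather than an official calculator: the function names `estimate_cost` and `estimate_text_tokens` are hypothetical, the rates are the NuExtract 2.0 PRO prices from the table above, and the 1.3 tokens-per-word ratio is the English average quoted earlier. Actual billing depends on the real token counts of your template, examples, document, and JSON output.

```python
# Rates for NuExtract 2.0 PRO, taken from the pricing table (USD per million tokens).
INPUT_PRICE_PER_M = 1.0
OUTPUT_PRICE_PER_M = 5.0


def estimate_cost(input_tokens: int, output_tokens: int = 0) -> float:
    """Estimate the price in USD for a given number of input and output tokens."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def estimate_text_tokens(word_count: int, tokens_per_word: float = 1.3) -> int:
    """Rough token estimate for English text (assumed average of 1.3 tokens per word)."""
    return round(word_count * tokens_per_word)


if __name__ == "__main__":
    # One text page (~1000 input tokens), input price only.
    print(f"1 text page:      ${estimate_cost(1_000):.4f}")
    # 100 A4 pages as images (~150k input tokens), input price only.
    print(f"100 image pages:  ${estimate_cost(150_000):.2f}")
    # One image page plus a hypothetical 200-token JSON output.
    print(f"1 page + output:  ${estimate_cost(1_500, 200):.4f}")
```

Note that the chart lists input prices only; output tokens are billed at the higher rate, so a long JSON output can noticeably change the total for short documents.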
We offer the possibility to use the NuExtract Platform in a fully private way, i.e. deployed on a private cloud of your choice or even on-premises. There are three reasons why you may want to use a private platform:
Currently, you would need to talk to us to make this happen.
Not yet available.