Mathematically Formalizing the TOON Efficiency Revolution versus JSON
🙋 Mateo Lafalce
✉️ mateolafalce@protonmail.com
The arrival of TOON a few weeks ago revolutionized the entire AI ecosystem by offering a much more efficient data transmission format that uses fewer tokens.
The official TOON project repository reports percentages showing how efficient it can be, but I haven't found any comparative mathematical formalization that demonstrates TOON's superiority over JSON (a superiority that exists and is evident). Therefore, in this article, I propose a formalization of this efficiency revolution.
Simple objects with primitive values in JSON (RFC 8259):
    { "id": 101, "status": "activo", "premium": true, "points": null }
We can derive a mathematical function that describes the number of characters it occupies (excluding optional spaces and line breaks; for ASCII content this equals its size in bytes), as follows:
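If $J = \{(k_1, v_1), \dots, (k_n, v_n)\}$, that is, the key-value pairs of the JSON file.
$W(x)$: The character count function for any JSON element $x$.
$L(s)$: A simple function that returns the length in characters of a string literal $s$, without including the quotes.
The function to define the number of characters needed in a JSON file is this:
$$W(J) = 2 + \sum_{i=1}^{n} \left( L(k_i) + 2 + 1 + W(v_i) \right) + (n - 1)$$
Here the leading 2 counts the two braces, each key contributes its length plus 2 quotes and 1 colon, and the last term counts the $n - 1$ commas between pairs.
Organizing the function, we obtain:
$$W(J) = 4n + 1 + \sum_{i=1}^{n} \left( L(k_i) + W(v_i) \right)$$
$W(v_i)$: This is the recursive call to the weight function for the value $v_i$.
Let's look at the specific cases for $W(v_i)$:
If $v_i$ is an Array:
$$v_i = [a_1, \dots, a_m]$$
with $m \geq 1$:
$$W(v_i) = 2 + (m - 1) + \sum_{j=1}^{m} W(a_j)$$
If $v_i$ is a String:
$$W(v_i) = L(v_i) + 2$$
If $v_i$ is a Primitive (Number, Boolean, null):
$$W(v_i) = L(v_i)$$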
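The recursive count described above can be checked against a real serializer. What follows is a minimal Python sketch, not part of the original article: the name `json_weight` is hypothetical, keys are assumed to be strings, and strings are assumed to be ASCII with no characters that need escaping.

```python
import json

def json_weight(x):
    """Character count of the compact (no whitespace) JSON form of x."""
    if isinstance(x, dict):
        n = len(x)
        if n == 0:
            return 2  # just "{}"
        # 2 braces + (n - 1) commas + per pair: key, 2 quotes, 1 colon, value
        return 2 + (n - 1) + sum(len(k) + 3 + json_weight(v) for k, v in x.items())
    if isinstance(x, list):
        m = len(x)
        if m == 0:
            return 2  # just "[]"
        # 2 brackets + (m - 1) commas + the elements themselves
        return 2 + (m - 1) + sum(json_weight(a) for a in x)
    if isinstance(x, str):
        return len(x) + 2  # the string plus its two quotes
    # Number, Boolean, null: length of the literal as JSON writes it
    return len(json.dumps(x))

sample = {"id": 101, "status": "activo", "premium": True, "points": None}
compact = json.dumps(sample, separators=(",", ":"))
print(json_weight(sample), len(compact))  # both counts agree: 57 57
```

For the sample object, the formula gives 4·4 + 1 + 21 + 19 = 57, which matches the length of the compact serialization.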
Now, let's find an expression for TOON.
Simple objects with primitive values in TOON (repo):
    id: 101
    status: activo
    premium: true
    points: null
We arrive at this general expression when we handle simple objects:
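One plausible form of this count, offered as a sketch under stated assumptions rather than as the article's exact expression: each line costs the key, a colon, a space and the unquoted value, and the n lines are joined by n − 1 line breaks, giving 3n − 1 plus the total length of keys and values. The function name `toon_simple_weight` is hypothetical, and values are assumed to be primitives or strings that TOON can leave unquoted.

```python
import json

def toon_simple_weight(obj):
    """Character count of a flat object written as `key: value` lines."""
    total = 0
    for key, value in obj.items():
        # Strings are written bare; other primitives use their JSON literal.
        literal = value if isinstance(value, str) else json.dumps(value)
        total += len(key) + 2 + len(literal)  # key + ": " + value
    # One line break between consecutive lines (assumption: breaks are counted).
    return total + max(len(obj) - 1, 0)

sample = {"id": 101, "status": "activo", "premium": True, "points": None}
print(toon_simple_weight(sample))  # 49
```

Counting the mandatory space and line breaks here, while ignoring the optional ones in JSON, is a deliberate choice: in TOON that whitespace is part of the syntax.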
Complex objects with primitive values in TOON
    users[2]{id,name,role}:
      1,Alice,admin
      2,Bob,user
    cats[2]{id,name,role}:
      1,Alice,admin
      2,Bob,user
We arrive at this general expression when we handle complex objects, where C is the name of the class to define:
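As a rough illustration of how such a count could be assembled for one tabular block, the sketch below adds up the header and the rows. It is an assumption-laden sketch, not the article's formula: `toon_table_weight` is a hypothetical name, rows are assumed to be indented by two spaces, line breaks are counted, and cells are assumed to be integers or unquoted strings.

```python
def toon_table_weight(name, fields, rows, indent=2):
    """Character count of one TOON tabular block: name[m]{f1,...,fp}: plus rows."""
    m, p = len(rows), len(fields)
    # Header: name, "[", digits of m, "]", "{", fields with commas, "}", ":"
    header = len(name) + len(str(m)) + 5 + sum(len(f) for f in fields) + (p - 1)
    # Each row: line break + indentation + cells + (p - 1) commas
    body = sum(1 + indent + sum(len(str(c)) for c in row) + (p - 1) for row in rows)
    return header + body

users = [(1, "Alice", "admin"), (2, "Bob", "user")]
print(toon_table_weight("users", ["id", "name", "role"], users))  # 52
```

The field names are paid for once in the header instead of once per row, which is where the tabular form gains most of its advantage over an array of JSON objects.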
Let's verify how many characters we save for each new line in a JSON file (with all attributes being primitive text-type values) compared to an implementation with TOON.
Since all values are text-type attributes, JSON always spends 2 extra characters per value on the surrounding quotes, in addition to the quotes around each key, the colon and the comma.
Therefore, in absolute terms, every new key-value pair adds more characters to the JSON file than to the TOON file, so the gap grows with $n$ (where $n$ is the number of elements in the JSON). This is the same as saying that for each new line in a JSON file, TOON is a fixed number of characters more efficient.
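To make the size of that gap concrete, here is one way the difference works out. It is a sketch that assumes TOON's colon, following space and line break are all counted, with $n$ the number of pairs and $S$ the total length of all keys and values (both symbols introduced here for illustration):

$$W_{JSON} = 6n + 1 + S$$
$$W_{TOON} = 3n - 1 + S$$
$$\Delta = W_{JSON} - W_{TOON} = 3n + 2$$

Under these assumptions each pair carries 6 characters of overhead in JSON (four quotes, a colon and a comma) against 3 in TOON (colon, space and line break), so every added text pair widens the gap by 3 characters.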
If we wanted to calculate this difference in terms of percentage efficiency, it should be calculated as follows:
Therefore, imagine you have 100 pairs, with the sum of the character counts for the keys and values being 50.
The savings are gigantic.
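The percentage comparison can be sketched numerically. This is an illustrative calculation, not the article's own formula: it takes the saving as the character difference divided by the JSON size, assumes all values are strings, and counts TOON's colon, space and line break. The name `savings_percent` is hypothetical, and the inputs are the figures quoted in the text above.

```python
def savings_percent(n, s):
    """Percentage of characters saved by TOON for n text pairs of total length s."""
    json_chars = 6 * n + 1 + s   # quotes, colons, commas and braces, plus content
    toon_chars = 3 * n - 1 + s   # colon, space and line breaks, plus content
    return 100 * (json_chars - toon_chars) / json_chars

print(round(savings_percent(100, 50), 1))  # 46.4
```

With those inputs the JSON form takes 651 characters and the TOON form 349, so the saving approaches the 50% limit that this model allows as the content gets shorter relative to the structure.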
Now, I'm leaving you a JSON to TOON converter so you can try it for yourself. The implementation is in vanilla JS, based on the current implementation from the TOON project, so if you are seeing this in the future, small details may have changed.