Mathematically Formalizing the TOON Efficiency Revolution versus JSON

🙋 Mateo Lafalce

✉️ mateolafalce@protonmail.com

🔑 My GPG public key

📚 Blog

The arrival of TOON a few weeks ago revolutionized the entire AI ecosystem by offering a much more efficient data transmission format that uses fewer tokens.

The official TOON project repository shows percentages of how efficient it can be , but I haven't found any comparative mathematical formalization that demonstrates TOON's superiority over JSON (which exists and is evident). Therefore, in this article, I am going to propose a formalization for this efficiency revolution.

Simple objects with primitive values in JSON ( RFC 8259 ):

{

  "id": 101,

  "status": "activo",

  "premium": true,

  "points": null

}

We could extrapolate its mathematical function, which describes the amount of space it occupies in bytes (excluding spaces and line breaks), as follows:

If  , that is, the key-value pairs of the JSON file.

: The character count function for any JSON element .

: A simple function that returns the length in characters of a string literal, without including the quotes.

The function to define the number of characters needed in a JSON file is this:

Organizing the fn we obtain:

: This is the recursive call to the weight function for the value .

Let's look at the specific cases for

If  is an Array:

 with :

If   is an  String :

If   is a Primitive (Number, Boolean, null)

Now, let gonna find an expression for the TOON

Simple objects with primitive values in TOON ( repo ):

id: 101

status: activo

premium: true

points: null

We arrive at this general expression when we handle simple objects:

Complex objects with primitive values in TOON

users[2]{id,name,role}:

  1,Alice,admin

  2,Bob,user

cats[2]{id,name,role}:

  1,Alice,admin

  2,Bob,user

We arrive at this general expression when we handle complex objects, where C is the name of the class to define:

Let's verify how many fewer characters we save for each new line in a JSON file (with all primitive text-type attributes) compared to an implementation with TOON.

Since all lines will be text-type attributes, there will always be 2 extra characters for each attribute

Therefore

In absolute terms, for each new key-value pair line in a JSON file,  more characters are added (where  is the number of elements in the JSON). This is the same as saying that for each new line in a JSON file, TOON is  characters more efficient.

If we wanted to calculate this difference in terms of percentage efficiency, it should be calculated as follows:

Therefore, imagine you have 100 pairs, with the sum of the character counts for the keys and values being 50.

The savings are gigantic.

Now, I'm leaving you a JSON to TOON converter so you can try it for yourself. The implementation is in vanilla JS, based on the current implementation from the TOON project, so if you are seeing this in the future, small details may have changed.


JSON to TOON Converter

Write your JSON here:

Characters: 0

Characters: 0


This blog is open source . See an error? Go ahead and propose a change.