The smallest unit of information in a computer is the distinction between two states (on or off), which can be represented as a binary digit (bit) with the value 1 or 0. To store more complex information, several bits are combined. For example, 8 bits in a row make up one byte (typically the smallest unit for information storage in computer memory). The combinations of ones and zeros in one byte can represent 256 different states (2^8). One can assign a specific meaning to each of these states. For example, a byte can represent a letter of the alphabet or a number. One such assignment is the American Standard Code for Information Interchange (ASCII), in which the lowercase letter “a” is encoded in binary code by 01100001, while the digit “1” is encoded by 00110001. Alternatively, the same 8 bits can be read as a binary number, which covers the 256 integers from 0 to 255: 00000000 is the decimal number 0, 00000001 is the decimal number 1, and 11111111 is the decimal number 255. Larger chunk sizes for storing pieces of information can denote many more states, increasing over 12-bit, 16-bit, 32-bit, 64-bit, and so on.
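The relationship between bits, states, and encoded characters can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `state_count` and `ascii_bits` are hypothetical and chosen for this example.

```python
# Illustrative helpers (hypothetical names, not from any library).

def state_count(n_bits: int) -> int:
    """Number of distinct states that n_bits bits can represent."""
    return 2 ** n_bits


def ascii_bits(char: str) -> str:
    """8-bit binary pattern of a single ASCII character."""
    # ord() gives the character's code; "08b" pads the binary form to 8 digits.
    return format(ord(char), "08b")


letter_bits = ascii_bits("a")        # bit pattern for the letter "a"
digit_bits = ascii_bits("1")         # bit pattern for the digit "1"
largest_byte = int("11111111", 2)    # all eight bits set, read as a number

print(state_count(8), letter_bits, digit_bits, largest_byte)
```

The same bit pattern therefore means different things depending on the convention applied: 00110001 is the character “1” under ASCII, but the number 49 when read as a binary integer.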
Computers essentially store, read, and transmit information as linear sequences of these bits. However, binary code is not what computer scientists use directly to write programs or store information. Instead, data is structured using programming languages and file specifications. A simple example is a table, which consists of rows and columns and is therefore structured in at least two dimensions. Yet, when the computer transmits or stores this information, it must do so as a sequence of bits. “Serialization” describes the process of transforming structured data into a linear series of bits, saving its state in a form that can be exchanged between computers. To ensure that the information preserved by the data structure is still available after serialization and can be restored on another computer (de-serialized), there are specifications of how multidimensional structures and relationships are preserved during serialization. These specifications are called “serialization formats”, of which there are several (e.g., XML, JSON, or YAML). Thus, when a program on one computer serializes information according to the JSON specification, another computer can receive the file encoding this information and de-serialize it using the same specification.
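The round trip described above can be sketched with Python's standard `json` module. The table contents and the function names `serialize` and `deserialize` are assumptions made for this illustration, and UTF-8 is assumed as the text encoding.

```python
import json

# A small two-dimensional table (rows and columns); the values are made up.
table = [
    {"sample": "A", "value": 1},
    {"sample": "B", "value": 2},
]


def serialize(data) -> bytes:
    """Turn structured data into a linear sequence of bytes via JSON."""
    return json.dumps(data).encode("utf-8")


def deserialize(payload: bytes):
    """Restore the structured data from its serialized form."""
    return json.loads(payload.decode("utf-8"))


payload = serialize(table)       # what would be stored or transmitted
restored = deserialize(payload)  # what the receiving program rebuilds

# The same payload written out as the underlying bits, 8 per byte.
bit_string = "".join(format(byte, "08b") for byte in payload)

print(payload)
print(restored == table)
```

Because both sides follow the same specification, the receiving program recovers the rows and columns even though only a flat sequence of bits was exchanged.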