YAML: Simple, Powerful, and Sleek!
Introduction
YAML originally stood for Yet Another Markup Language; today the name is the recursive acronym YAML Ain’t Markup Language. It is a versatile, human-readable data serialization format. This blog will guide you through everything you need to know about YAML, from its basic syntax to advanced techniques and practical applications.
Why use YAML?
- Easily understandable: YAML files are easy for people to read, and indentation defines the structure, keeping them clear and concise.
- Flexibility: YAML supports data structures such as lists, dictionaries, and nested objects, making it adaptable to complex configurations.
- Versatile: YAML is not tied to any specific programming language or platform. Parsers exist for most programming languages, so it integrates easily into applications.
- Reusability: YAML’s support for anchors and aliases lets you define a value once and reuse it elsewhere in the document, which reduces duplication and keeps shared settings consistent.
Getting Started with YAML
We’ll delve into the key components that make up YAML’s syntax and data structures:
The ‘---’ Indicator:
The ‘---’ indicator (three dashes) is the document start marker. It signals the beginning of a new YAML document within a stream, which is useful when you have multiple YAML documents within the same file.
---
name: kaido
age: 30
---
server:
  host: app.com
  port: 3000
---
database:
  name: mysql
Here, “---” divides the stream into three documents: a person record, the web server configuration, and the database details. Parsers will read them as separate documents.
The Rarest of YAML Constructs: “...”
While “---” reigns supreme for document separation, the YAML specification also defines “...” (three dots) as the document end marker. It signifies the end of a YAML document within a stream without starting a new one. It is optional in most cases and uncommon in practice, but it can be helpful when you want to explicitly denote the end of a document.
---
name: kaido
age: 30
...
Indentation-based Hierarchy (2-space rule)
YAML relies on indentation to denote hierarchy and structure. Each level of indentation represents a nested level in the data structure. Indentation must consist of spaces; tab characters are not allowed for indentation. The specification does not fix the number of spaces, but two per level is the common convention. Whatever width you choose, keep it consistent: items at the same level must share the same indentation, or the parser will report an error or misread the structure.
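As a minimal sketch of the two-space convention (the key names here are made up for illustration), each extra pair of spaces moves one level deeper:

```yaml
# Two spaces per level; keys at the same depth are siblings.
app:
  name: demo          # child of app
  logging:            # child of app, parent of level
    level: info       # child of logging
  retries: 3          # back at two spaces, so a child of app again
```

Replacing any of the leading spaces above with a tab character would make the file invalid.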
The Building Blocks: Scalar Types
Scalar types in YAML represent single values, such as strings, numbers, booleans, and null values. Unlike complex types like lists or mappings, scalar types are atomic and cannot contain other values.
Strings
Strings are sequences of characters and the workhorses for textual data. Most strings can be written without quotes, but enclosing them in single or double quotes adds clarity and is required when the text would otherwise be read as another type or contains special characters such as a colon followed by a space or a leading #.
database:
  name: "test-db"
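The following sketch, using hypothetical keys, shows a few cases where the choice of quoting changes how a value is read:

```yaml
plain: hello world             # unquoted (plain) string
version: "1.0"                 # quoted so it stays a string, not the float 1.0
answer: 'yes'                  # quoted so YAML 1.1 parsers do not read a boolean
path: 'C:\temp\new'            # single quotes keep backslashes literal
message: "line one\nline two"  # double quotes process escapes such as \n
```

Single quotes are the safer default for literal text, while double quotes are the ones to use when you need escape sequences.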
Numbers
YAML numbers cover whole numbers (integers), decimals (floating-point), and exponential notation. Numbers are written without quotes.
integer_number: 30
pi_value: 3.14
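A few more number forms, shown as a sketch with illustrative key names. The exponent below carries an explicit sign, which is assumed here to be the most portable spelling across parsers:

```yaml
negative: -12
avogadro: 6.02e+23   # exponential notation
hex_value: 0x1F      # hexadecimal integer, equal to 31
quoted: "30"         # quotes make this a string, not a number
```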
Booleans:
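Booleans represent true or false values, typically written as true and false (lowercase, without quotes). Parsers that follow YAML 1.1 also read unquoted yes, no, on, and off as booleans, so quote those words when you mean the strings.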
is_active: true
is_notactive: false
Null:
Null represents an empty or undefined value. It is written as null (without quotes); a tilde (~) or an empty value is read as null as well.
placeholder: null
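The spellings of null can be compared side by side. This is a small sketch with hypothetical keys:

```yaml
explicit_null: null
tilde_null: ~
empty_value:          # no value after the colon is also null
not_null: "null"      # quoted, so this is the four-character string
```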
Implicit and Explicit Types:
Implicit Typing
Implicit typing in YAML is the process where the YAML parser automatically determines the data type of a scalar value based on its syntax and content. This means that you don’t need to specify the data type explicitly using any tags.
age: 30 # Integer
is_active: true # Boolean
In this case, the value 30 is interpreted as an integer by the YAML parser because it consists solely of digits. The parser infers that age holds an integer without needing any additional information.
Explicit Typing
Explicit typing involves specifying the data type of a scalar value explicitly using YAML tags. YAML tags are used to provide additional information about the intended data type, overriding any implicit typing that the parser might perform.
pi_value: !!float 3.14159
In this case, the !!float tag explicitly specifies that the value 3.14159 should be interpreted as a floating-point number. Here the parser would have inferred a float anyway, but the tag matters when the syntax suggests another type: for instance, a value written as 3 would normally be inferred as an integer. Explicit typing is useful whenever you want to guarantee a specific data type, regardless of the scalar value’s syntax.
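Explicit tags are most useful when they change the result of implicit typing. A short sketch with made-up keys, assuming a parser that supports the standard !!str and !!float tags:

```yaml
zip_code: !!str 90210   # kept as the string "90210" rather than an integer
ratio: !!float 3        # read as the float 3.0 rather than the integer 3
flag: !!str true        # the string "true", not a boolean
```

Quoting the value achieves the same effect as !!str, so the tag form is mainly a matter of making the intent explicit.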
Sequences
Sequences (lists) are ordered collections of items, similar to arrays in programming. Each item goes on its own line, indented under the key and preceded by a dash (-) followed by a space. A list can contain scalar values, other lists, mappings (dictionaries), or a combination of these data types.
Languages:
  - Python
  - C
  - C++
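A list can also mix item types. The sketch below uses hypothetical values to show a scalar, a nested list written in inline (flow) style, and a mapping as items of the same list:

```yaml
mixed:
  - 42                   # scalar item
  - [red, green, blue]   # nested list in flow (inline) style
  - name: widget         # mapping item with two keys
    price: 9.99
```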
Mapping
Mappings are unordered collections of key-value pairs, similar to dictionaries or objects in programming. Keys name the property, and values hold the corresponding data. Each pair is written as a key followed by a colon (:) and a space, followed by the corresponding value. Indentation creates a hierarchy, allowing for nested structures.
database:
  host: localhost
  port: 3306
  user: db_user
  test: true
  password: secret
Nested Structures
Nested structures in YAML refer to the ability to embed one data structure within another. This means that lists can contain other lists or mappings (dictionaries), and mappings can contain other mappings or lists. This allows you to create hierarchical configurations, grouping related data under a single key. This enhances readability, maintainability, and the overall organization of your YAML files.
server: # Mapping
  host: app.com
  port: 3000
database: # Mapping
  host: localhost
  port: 3306
  user: db_user
  password: secret
users: # List of mappings
  - name: admin
    email: admin@example.com
    role: administrator
  - name: user1
    email: user1@example.com
    role: user
Here, we have three main keys: server, database, and users. Each key has its own sub-keys representing specific properties.
Nesting Levels
YAML allows for multiple levels of nesting. You can create sub-structures within sub-structures to model more complex data. Each additional level of indentation represents one more nested level in the data structure.
server:
  host: my-app.com
  port: 3000
  security:
    authentication:
      method: JWT
      secret_key: "secret"
Here, we’ve nested two further levels under the server key to define security settings, including the authentication method and secret key.
NOTE: YAML doesn’t support duplicate keys at the same level. Keys within a mapping must be unique so that each value can be identified unambiguously.
# Invalid: duplicate key. Strict parsers raise an error; some lenient
# parsers silently keep only the last value.
server:
  host: app.com
  port: 3000
  port: 8080 # Duplicate key 'port' on the same level.
To resolve this error, you would need to ensure that each key within a mapping is unique. If you need to represent multiple values for the same key, you can use a list or nested mappings.
server:
  host: app.com
  port:
    - 3000
    - 8080
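The nested-mapping alternative gives each value its own name. In this sketch the keys ports, http, and admin are hypothetical names chosen for illustration:

```yaml
server:
  host: app.com
  ports:
    http: 3000    # each port now has a unique, descriptive key
    admin: 8080
```

A list suits values that are interchangeable, while a nested mapping suits values that each play a distinct role.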
Anchors (&) and Aliases (*)
Anchors are markers within a YAML document that create a reference point for a specific data structure. They are denoted using the ampersand (&) symbol followed by a unique name. Anchors can be applied to any YAML node, including scalars, lists, and mappings.
Aliases are references to anchors within a YAML document. They are denoted using the asterisk (*) symbol followed by the name of the anchor. Aliases allow you to reuse values defined elsewhere in the YAML document.
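Anchors are not limited to mappings. As a minimal sketch with made-up keys, a scalar and a list can be anchored and reused in the same way:

```yaml
default_port: &port 8080   # anchor on a scalar
allowed_hosts: &hosts      # anchor on a list
  - app.com
  - api.app.com

staging:
  port: *port              # resolves to 8080
  hosts: *hosts            # resolves to the same two-item list
```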
# Anchoring a mapping
Red_house: &details
  color: red
  age_group: 20-25
  department: Engineering
# Referencing the anchor
employee1:
  <<: *details # merge key: copies the pairs from the anchored mapping
employee2: *details
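The “<<” (double left angle bracket) syntax is used for merging mappings (dictionaries): it combines the key-value pairs from one mapping into another. This is commonly referred to as the merge key. It originates from YAML 1.1 and is supported by many, but not all, parsers. In the example, employee1 merges the pairs from the anchored mapping, while employee2 aliases the whole mapping directly; both resolve to the same three key-value pairs.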
The merge does not overwrite existing keys: if a key is already defined in the target mapping, the target’s own value is kept and the merged value for that key is ignored.
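This precedence can be seen in a small sketch. The names defaults and employee3 are hypothetical, and the example assumes a parser that supports the merge key:

```yaml
defaults: &defaults
  color: red
  department: Engineering

employee3:
  <<: *defaults
  department: Sales   # the explicit key wins over the merged value
# employee3 resolves to {color: red, department: Sales}
```

This makes the merge key a convenient way to express shared defaults with per-entry overrides.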
Use Cases for Anchors and Aliases:
- Code Reuse: Anchors and aliases allow you to define a data structure once and reference it multiple times within a YAML document, reducing duplication and making the document more maintainable.
- Shared Data: They are useful for sharing common data across different parts of a YAML document, ensuring consistency and making updates easier, since a change to the anchored value applies everywhere it is referenced.