YAML::PP(3pm) | User Contributed Perl Documentation | YAML::PP(3pm) |
YAML::PP - YAML 1.2 processor
WARNING: Most of the inner API is not stable yet.
Here are a few examples of the basic load and dump methods:
    use YAML::PP;
    my $ypp = YAML::PP->new;

    my $yaml = <<'EOM';
    --- # Document one is a mapping
    name: Tina
    age: 29
    favourite language: Perl

    --- # Document two is a sequence
    - plain string
    - 'in single quotes'
    - "in double quotes we have escapes! like \t and \n"
    - | # a literal block scalar
      line1
      line2
    - > # a folded block scalar
      this is all one
      single line because the
      linebreaks will be folded
    EOM

    my @documents = $ypp->load_string($yaml);
    my @documents = $ypp->load_file($filename);

    my $yaml = $ypp->dump_string($data1, $data2);
    $ypp->dump_file($filename, $data1, $data2);

    # Enable perl data types and objects
    my $ypp = YAML::PP->new(schema => [qw/ + Perl /]);
    my $yaml = $ypp->dump_string($data_with_perl_objects);

    # Legacy interface
    use YAML::PP qw/ Load Dump LoadFile DumpFile /;
    my @documents = Load($yaml);
    my @documents = LoadFile($filename);
    my @documents = LoadFile($filehandle);
    my $yaml = Dump(@documents);
    DumpFile($filename, @documents);
    DumpFile($filehandle, @documents);
Some utility scripts, mostly useful for debugging:
    # Load YAML into a data structure and dump with Data::Dumper
    yamlpp-load < file.yaml

    # Load and Dump
    yamlpp-load-dump < file.yaml

    # Print the events from the parser in yaml-test-suite format
    yamlpp-events < file.yaml

    # Parse and emit events directly without loading
    yamlpp-parse-emit < file.yaml

    # Create ANSI colored YAML. Can also be useful for invalid YAML, showing
    # you the exact location of the error
    yamlpp-highlight < file.yaml
YAML::PP is a modular YAML processor.
It aims to support "YAML 1.2" and "YAML 1.1". See <https://yaml.org/>. Some (rare) syntax elements are not yet supported; they are documented below.
YAML is a serialization language. The YAML input is called "YAML Stream". A stream consists of one or more "Documents", separated by a line with a document start marker "---". A document optionally ends with the document end marker "...".
This allows one to process continuous streams in addition to a fixed input file or string.
The YAML::PP frontend currently loads all documents, and returns only the first one when called in scalar context.
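The difference between list and scalar context can be sketched as follows. This is a minimal example, assuming a two-document stream built inline; the variable names are only illustrative.

```perl
use strict;
use warnings;
use YAML::PP;

my $ypp = YAML::PP->new;

# Two documents in one stream, each introduced by the "---" marker
my $yaml = "--- first\n--- second\n";

# List context: every document in the stream is returned
my @docs = $ypp->load_string($yaml);

# Scalar context: only the first document is returned
my $first = $ypp->load_string($yaml);

print scalar(@docs), " documents, first is '$first'\n";
```

If you only ever expect one document, scalar context is enough; use list context whenever the input may be a multi-document stream.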
The YAML backend is implemented in a modular way that allows one to add custom handling of YAML tags, perl objects and data types. The inner API is not yet stable. Suggestions welcome.
You can check out all current parse and load results from the yaml-test-suite here: <https://perlpunk.github.io/YAML-PP-p5/test-suite.html>
    my $ypp = YAML::PP->new;

    # use YAML 1.2 Failsafe Schema
    my $ypp = YAML::PP->new( schema => ['Failsafe'] );

    # use YAML 1.2 JSON Schema
    my $ypp = YAML::PP->new( schema => ['JSON'] );

    # use YAML 1.2 Core Schema
    my $ypp = YAML::PP->new( schema => ['Core'] );

    # Die when detecting cyclic references
    my $ypp = YAML::PP->new( cyclic_refs => 'fatal' );

    my $ypp = YAML::PP->new(
        boolean => 'perl',
        schema => ['Core'],
        cyclic_refs => 'fatal',
        indent => 4,
        header => 1,
        footer => 0,
        version_directive => 0,
    );
Options:
boolean
This option is for loading and dumping.
In case of perl 5.36 and later, builtin booleans should work out of the box (since YAML::PP >= 0.38.0).
    print YAML::PP->new->dump_string([ builtin::true, !1 ]);
    # ---
    # - true
    # - false
For earlier perl versions, you can use "pseudo" booleans as documented in the following examples.
    # load/dump booleans via boolean.pm
    my $ypp = YAML::PP->new( boolean => 'boolean' );

    # load/dump booleans via JSON::PP::true/false
    my $ypp = YAML::PP->new( boolean => 'JSON::PP' );
You can also specify more than one class, comma separated. This is important for dumping.
    boolean => 'JSON::PP,boolean'
        Booleans will be loaded as JSON::PP::Booleans, but when dumping, also
        'boolean' objects will be recognized

    boolean => 'JSON::PP,*'
        Booleans will be loaded as JSON::PP::Booleans, but when dumping, all
        currently supported boolean classes will be recognized

    boolean => '*'
        Booleans will be loaded as perl booleans, but when dumping, all
        currently supported boolean classes will be recognized

    boolean => ''
        Booleans will be loaded as perl booleans, but when dumping, nothing
        will be recognized as booleans.
        This option is for backwards compatibility for perl versions
        < 5.36, if you rely on [!!1, !1] being dumped as [1, ''].
The option "perl_experimental" was introduced when experimental boolean support was added to perl 5.36. Since it will not be experimental anymore in perl 5.40 \o/ the option is deprecated and the same as "perl".
schema
This option is for loading and dumping.
Array reference. Here you can define what schema to use. Supported standard Schemas are: "Failsafe", "JSON", "Core", "YAML1_1".
To get an overview of how the different Schemas behave, see <https://perlpunk.github.io/YAML-PP-p5/schemas.html>
Additionally you can add further schemas, for example "Merge".
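As an illustration of adding a further schema, the following sketch enables "Merge" on top of the default schema (the "+" entry). The YAML input and its key names are made up for the example.

```perl
use strict;
use warnings;
use YAML::PP;

# '+' keeps the default schema; 'Merge' adds support for the '<<' merge key
my $ypp = YAML::PP->new( schema => [qw/ + Merge /] );

my $yaml = <<'EOM';
---
defaults: &defaults
  adapter: postgres
  port: 5432
development:
  <<: *defaults
  database: dev
EOM

my $config = $ypp->load_string($yaml);

# The keys of 'defaults' are merged into 'development'
print "$config->{development}{adapter}\n";
```

Without the "Merge" schema the "<<" key is not treated specially, so merge keys have to be enabled explicitly.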
cyclic_refs
Previously the default was "allow", but this can lead to memory leaks when loading untrusted data, so the default was changed to "fatal".
This option is for loading only.
Defines what to do when a cyclic reference is detected when loading.
    # fatal  - die (default)
    # warn   - Just warn about them and replace with undef
    # ignore - replace with undef
    # allow  - allow cyclic references (the former default)
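A minimal sketch of the "fatal" and "ignore" settings, assuming a small input in which node A links to node B and B links back to A. The anchor names are illustrative.

```perl
use strict;
use warnings;
use YAML::PP;

# Node A links to node B, which links back to node A
my $yaml = <<'EOM';
- &NODEA
  name: A
  link: &NODEB
    name: B
    link: *NODEA
EOM

# 'fatal': load_string dies, so wrap it in eval
my $strict = YAML::PP->new( cyclic_refs => 'fatal' );
my $ok = eval { $strict->load_string($yaml); 1 };
print "fatal: ", ($ok ? "loaded" : "died"), "\n";

# 'ignore': the alias that would close the cycle becomes undef
my $lenient = YAML::PP->new( cyclic_refs => 'ignore' );
my $data = $lenient->load_string($yaml);
my $back = $data->[0]{link}{link};
print "ignore: back link is ", (defined $back ? "defined" : "undef"), "\n";
```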
duplicate_keys
Since version 0.027
This option is for loading.
The YAML Spec says duplicate mapping keys should be forbidden.
When set to true, duplicate keys in mappings are allowed (and will overwrite the previous key).
When set to false, duplicate keys will result in an error when loading.
This is especially useful when you have a longer mapping and don't see the duplicate key in your editor:
    ---
    a: 1
    b: 2
    # .............
    a: 23 # error
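The two behaviours can be compared with a short sketch. It assumes the option is named "duplicate_keys" and sets it explicitly in both cases rather than relying on the default.

```perl
use strict;
use warnings;
use YAML::PP;

# The key "a" appears twice in the same mapping
my $yaml = "a: 1\nb: 2\na: 23\n";

# duplicate_keys => 0: loading fails, so wrap it in eval
my $strict = YAML::PP->new( duplicate_keys => 0 );
my $ok = eval { $strict->load_string($yaml); 1 };
print "strict: ", ($ok ? "loaded" : "error"), "\n";

# duplicate_keys => 1: the later value overwrites the earlier one
my $lenient = YAML::PP->new( duplicate_keys => 1 );
my $data = $lenient->load_string($yaml);
print "lenient: a = $data->{a}\n";
```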
indent
This option is for dumping.
Number of spaces to use for each indentation level.
width
Default: 80
This option is for dumping.
Maximum columns when dumping.
Right now this is only respected when dumping flow collections. In the future it will also be used for wrapping long strings.
header
This option is for dumping.
Print document header "---"
footer
This option is for dumping.
Print document footer "..."
yaml_version
This option is for loading and dumping.
Default: 1.2
Note that in this case, a directive "%YAML 1.1" will basically be ignored and everything loaded with the "1.2 Core" Schema.
If you want to support both YAML 1.1 and 1.2, you have to specify that, and the schema ("Core" or "YAML1_1") will be chosen automatically.
my $yp = YAML::PP->new( yaml_version => ['1.2', '1.1'], );
This is the same as
my $yp = YAML::PP->new( schema => ['+'], yaml_version => ['1.2', '1.1'], );
because the "+" stands for the default schema per version.
When loading, and there is no %YAML directive, 1.2 will be considered as default, and the "Core" schema will be used.
If there is a "%YAML 1.1" directive, the "YAML1_1" schema will be used.
Of course, you can also make 1.1 the default:
my $yp = YAML::PP->new( yaml_version => ['1.1', '1.2'], );
You can also specify 1.1 only:
my $yp = YAML::PP->new( yaml_version => ['1.1'], );
In this case also documents with "%YAML 1.2" will be loaded with the "YAML1_1" schema.
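The automatic schema selection can be sketched with a value that the two versions resolve differently. The example assumes that the "YAML1_1" schema resolves a plain `yes` as a boolean (as YAML 1.1 specifies), while the 1.2 "Core" schema keeps it as a string.

```perl
use strict;
use warnings;
use YAML::PP;

# Accept both versions; 1.2 is the default when no %YAML directive is present
my $ypp = YAML::PP->new( yaml_version => ['1.2', '1.1'] );

# No directive: the 1.2 "Core" schema applies
my $v12 = $ypp->load_string("---\nanswer: yes\n");

# "%YAML 1.1" directive: the "YAML1_1" schema applies
my $v11 = $ypp->load_string("%YAML 1.1\n---\nanswer: yes\n");

print "1.2: ", ($v12->{answer} eq 'yes' ? "string" : "not a string"), "\n";
print "1.1: ", ($v11->{answer} eq 'yes' ? "string" : "boolean"), "\n";
```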
version_directive
This option is for dumping.
Default: 0
Print Version Directive "%YAML 1.2" (or "%YAML 1.1") on top of each YAML document. It will use the first version specified in the "yaml_version" option.
preserve
Default: false
This option is for loading and dumping.
Preserving scalar styles is still experimental.
    use YAML::PP::Common qw/ :PRESERVE /;

    # Preserve the order of hash keys
    my $yp = YAML::PP->new( preserve => PRESERVE_ORDER );

    # Preserve the quoting style of scalars
    my $yp = YAML::PP->new( preserve => PRESERVE_SCALAR_STYLE );

    # Preserve block/flow style (since 0.024)
    my $yp = YAML::PP->new( preserve => PRESERVE_FLOW_STYLE );

    # Preserve alias names (since 0.027)
    my $yp = YAML::PP->new( preserve => PRESERVE_ALIAS );

    # Combine, e.g. preserve order and scalar style
    my $yp = YAML::PP->new( preserve => PRESERVE_ORDER | PRESERVE_SCALAR_STYLE );
Do NOT rely on the internal implementation of it.
If you load the following input:
    ---
    z: 1
    a: 2
    ---
    - plain
    - 'single'
    - "double"
    - |
      literal
    - >
      folded
    ---
    block mapping: &alias
      flow sequence: [a, b]
    same mapping: *alias
    flow mapping: {a: b}
with this code:
    my $yp = YAML::PP->new(
        preserve => PRESERVE_ORDER | PRESERVE_SCALAR_STYLE
                    | PRESERVE_FLOW_STYLE | PRESERVE_ALIAS
    );
    my ($hash, $styles, $flow) = $yp->load_file($file);
    # dump_file expects the filename as its first argument
    $yp->dump_file($file, $hash, $styles, $flow);
Then dumping it will return the same output.
Note that YAML allows repeated definition of anchors. They cannot be preserved with YAML::PP right now. Example:
    ---
    - &seq [a]
    - *seq
    - &seq [b]
    - *seq
Because the data could be shuffled before dumping again, the anchor definition could be broken. In this case repeated anchor names will be discarded when loading and dumped with numeric anchors like usual.
Implementation:
When loading, hashes will be tied to an internal class ("YAML::PP::Preserve::Hash") that keeps the key order.
Scalars will be returned as objects of an internal class ("YAML::PP::Preserve::Scalar") with overloading. If you assign to such a scalar, the object will be replaced by a simple scalar.
    # assignment, style gets lost
    $styles->[1] .= ' append';
You can also pass 1 as a value. In this case all preserving options will be enabled, also if there are new options added in the future.
There are also methods to create preserved nodes from scratch. See the preserved_(scalar|mapping|sequence) "METHODS" below.
load_string
    my $doc = $ypp->load_string("foo: bar");
    my @docs = $ypp->load_string("foo: bar\n---\n- a");
Input should be Unicode characters.
So if you read from a file, you should decode it, for example with Encode::decode().
Note that in scalar context, "load_string" and "load_file" return the first document (like YAML::Syck), while YAML and YAML::XS return the last.
load_file
    my $doc = $ypp->load_file("file.yaml");
    my @docs = $ypp->load_file("file.yaml");
Strings will be loaded as unicode characters.
dump_string
    my $yaml = $ypp->dump_string($doc);
    my $yaml = $ypp->dump_string($doc1, $doc2);
    my $yaml = $ypp->dump_string(@docs);
Input strings should be Unicode characters.
Output will return Unicode characters.
So if you want to write that to a file (or pass to YAML::XS, for example), you typically encode it via Encode::encode().
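The encode/decode steps around "dump_string" and "load_string" can be sketched like this. It is a minimal example using the core Encode module; the sample data is made up, and the non-ASCII character is written as an escape so the snippet does not depend on the source file encoding.

```perl
use strict;
use warnings;
use Encode qw/ encode decode /;
use YAML::PP;

my $ypp = YAML::PP->new;

# A string containing a non-ASCII character (u with diaeresis)
my $name = "M\x{fc}ller";

# dump_string returns Unicode characters ...
my $yaml = $ypp->dump_string({ name => $name });

# ... so encode them to bytes before writing to a file or passing them on
my $bytes = encode('UTF-8', $yaml);

# Going the other way: decode bytes before handing them to load_string
my $data = $ypp->load_string( decode('UTF-8', $bytes) );

print "round trip ", ($data->{name} eq $name ? "ok" : "failed"), "\n";
```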
dump_file
    $ypp->dump_file("file.yaml", $doc);
    $ypp->dump_file("file.yaml", $doc1, $doc2);
    $ypp->dump_file("file.yaml", @docs);
Input data should be Unicode characters.
dump
This will dump to a predefined writer. By default it will just use the YAML::PP::Writer and output a string.
    my $writer = MyWriter->new(\my $output);
    my $yp = YAML::PP->new(
        writer => $writer,
    );
    $yp->dump($data);
preserved_scalar
Since version 0.024
Experimental. Please report bugs or let me know this is useful and works.
You can define a certain scalar style when dumping data. Figuring out the best style is a hard task, and it is practically impossible to get it right for all cases. It's also a matter of taste.
    use YAML::PP::Common qw/
        PRESERVE_SCALAR_STYLE YAML_LITERAL_SCALAR_STYLE
    /;
    my $yp = YAML::PP->new(
        preserve => PRESERVE_SCALAR_STYLE,
    );
    # a single linebreak would normally be dumped with double quotes: "\n"
    my $scalar = $yp->preserved_scalar("\n", style => YAML_LITERAL_SCALAR_STYLE );

    my $data = { literal => $scalar };
    my $dump = $yp->dump_string($data);
    # output
    ---
    literal: |+

    ...
preserved_mapping, preserved_sequence
Since version 0.024
Experimental. Please report bugs or let me know this is useful and works.
With this you can define which nodes are dumped with the more compact flow style instead of block style.
If you add "PRESERVE_ORDER" to the "preserve" option, it will also keep the order of the keys in a hash.
    use YAML::PP::Common qw/
        PRESERVE_ORDER PRESERVE_FLOW_STYLE
        YAML_FLOW_MAPPING_STYLE YAML_FLOW_SEQUENCE_STYLE
    /;
    my $yp = YAML::PP->new(
        preserve => PRESERVE_FLOW_STYLE | PRESERVE_ORDER
    );

    my $hash = $yp->preserved_mapping({}, style => YAML_FLOW_MAPPING_STYLE);
    # Add values after initialization to preserve order
    %$hash = (z => 1, a => 2, y => 3, b => 4);

    my $array = $yp->preserved_sequence([23, 24], style => YAML_FLOW_SEQUENCE_STYLE);

    my $data = $yp->preserved_mapping({});
    %$data = ( map => $hash, seq => $array );

    my $dump = $yp->dump_string($data);
    # output
    ---
    map: {z: 1, a: 2, y: 3, b: 4}
    seq: [23, 24]
loader
Returns or sets the loader object, by default YAML::PP::Loader
dumper
Returns or sets the dumper object, by default YAML::PP::Dumper
schema
Returns or sets the schema object
default_schema
Creates and returns the default schema
The functions "Load", "LoadFile", "Dump" and "DumpFile" are provided as a drop-in replacement for other existing YAML processors. No function is exported by default.
Note that in scalar context, "Load" and "LoadFile" return the first document (like YAML::Syck), while YAML and YAML::XS return the last.
Load
    use YAML::PP qw/ Load /;
    my $doc = Load($yaml);
    my @docs = Load($yaml);
Works like "load_string".
LoadFile
    use YAML::PP qw/ LoadFile /;
    my $doc = LoadFile($file);
    my @docs = LoadFile($file);
    my @docs = LoadFile($filehandle);
Works like "load_file".
Dump
    use YAML::PP qw/ Dump /;
    my $yaml = Dump($doc);
    my $yaml = Dump(@docs);
Works like "dump_string".
DumpFile
    use YAML::PP qw/ DumpFile /;
    DumpFile($file, $doc);
    DumpFile($file, @docs);
    DumpFile($filehandle, @docs);
Works like "dump_file".
You can alter the behaviour of YAML::PP by using the following schema classes:
To make the parsing process faster, you can plug in the libyaml parser with YAML::PP::LibYAML.
The process of loading and dumping is split into the following steps:
    Load:

    YAML Stream        Tokens        Event List        Data Structure
              --------->    --------->        --------->
                lex           parse           construct


    Dump:

    Data Structure       Event List        YAML Stream
                --------->        --------->
                represent           emit
You can dump basic perl types like hashes, arrays, scalars (strings, numbers). For dumping blessed objects and things like coderefs, have a look at YAML::PP::Perl/YAML::PP::Schema::Perl.
Note that the API to retrieve the tokens will change.
Still TODO:
    ---
    [ a, b, c ]: value

    ---
    [ a, b, c: d ]
    # equals
    [ a, b, { c: d } ]

    ---
    key ends with two colons::: value
This was implemented in 0.037.
The Constructor now supports all three YAML 1.2 Schemas, Failsafe, JSON and Core. Additionally you can choose the schema for YAML 1.1 as "YAML1_1".
To see what strings are resolved as booleans, numbers, null etc., look at <https://perlpunk.github.io/YAML-PP-p5/schema-examples.html>.
You can choose the Schema like this:
my $ypp = YAML::PP->new(schema => ['JSON']); # default is 'Core'
The Tags "!!seq" and "!!map" are still ignored for now.
It supports:
YAML::XS uses real aliases, which allows also aliasing scalars. I might add an option for that since aliasing is now available in pure perl.
Example:
    use feature 'say';
    use YAML::PP;
    use JSON::PP;
    my $ypp = YAML::PP->new;
    my $coder = JSON::PP->new->ascii->pretty->allow_nonref->canonical;
    my $yaml = <<'EOM';
    complex:
        ?
            ?
                a: 1
                c: 2
            : 23
        : 42
    EOM
    my $data = $ypp->load_string($yaml);
    say $coder->encode($data);
    __END__
    {
       "complex" : {
          "{'{a => 1,c => 2}' => 23}" : 42
       }
    }
TODO:
The Dumper should be able to dump strings correctly, adding quotes whenever a plain scalar would look like a special string, like "true", or when it contains or starts with characters that are not allowed.
Most strings will be dumped as plain scalars without quotes. If they contain special characters or have a special meaning, they will be dumped with single quotes. If they contain control characters, including "\n", they will be dumped with double quotes.
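A small sketch to see the quoting decisions in practice. The sample strings are arbitrary; the example prints whatever the Dumper produces and then checks that reloading gives back the same strings, so that "true" stays a string instead of becoming a boolean.

```perl
use strict;
use warnings;
use YAML::PP;

my $ypp = YAML::PP->new;

# An ordinary word, a string that looks like a boolean,
# and a string containing a linebreak
my @strings = ('plain', 'true', "two\nlines");

my $yaml = $ypp->dump_string(\@strings);
print $yaml;

# Reloading should give back exactly the same strings
my $copy = $ypp->load_string($yaml);
my $same = 1;
for my $i (0 .. $#strings) {
    $same = 0 unless $copy->[$i] eq $strings[$i];
}
print "round trip ", ($same ? "ok" : "failed"), "\n";
```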
It will recognize JSON::PP::Boolean and boolean.pm objects and dump them correctly.
Numbers which also have a "PV" flag will be recognized as numbers and not as strings:
    my $int = 23;
    say "int: $int"; # $int will now also have a PV flag
That means that if you accidentally use a string in numeric context, it will also be recognized as a number:
    my $string = "23";
    my $something = $string + 0;
    print $yp->dump_string($string);
    # will be emitted as an integer without quotes!
The layout is like libyaml output:
    key:
    - a
    - b
    - c
    ---
    - key1: 1
      key2: 2
      key3: 3
    ---
    - - a1
      - a2
    - - b1
      - b2
Now there is also YAML::Tidy, which will format the given file according to your configuration. So far only a few configuration options exist, but they can already be quite helpful.
There are many YAML modules on CPAN. For historical reasons some of them aren't handling YAML correctly.
Most of them are not compatible with the YAML spec and with each other, meaning they can interpret the same YAML differently.
The behaviours we are discussing here can be divided into parsing issues (syntax) and loading/constructing issues (for example type resolving which decides what is a number, boolean or null).
See also <https://matrix.yaml.info/> (parsing) and <https://perlpunk.github.io/YAML-PP-p5/schema-examples.html> (loading).
1. (syntax) libyaml diverged from the spec in several aspects. They are rare though.
2. The type resolving does not adhere to YAML 1.1 or YAML 1.2, meaning it is incompatible with other YAML libraries in perl or other languages.
Why did I start to write a new YAML module?
All the available parsers and loaders for Perl behave differently and, more importantly, do not conform to the spec. YAML::XS is doing pretty well, but "libyaml" only handles YAML 1.1 and diverges a bit from the spec. The pure perl loaders lack support for a number of features.
I was going over YAML.pm issues at the end of 2016, integrating old patches from rt.cpan.org and creating some pull requests myself. I realized that it would be difficult to patch YAML.pm to parse YAML 1.1 or even 1.2, and that doing so would also break existing usages relying on the current behaviour.
In 2016 Ingy döt Net initiated two really cool projects:
These projects are a big help for any developer. So I got the idea to write my own parser and started on New Year's Day 2017. Without the test suite and the editor I would have never started this.
I also started another YAML Test project which allows one to get a quick overview of which frameworks support which YAML features:
<https://github.com/yaml/yaml-test-suite>
It contains almost 400 test cases and expected parsing events and more. There will be more tests coming. This test suite allows you to write parsers without turning the examples from the Specification into tests yourself. Also the examples aren't completely covering all cases - the test suite aims to do that.
Thanks also to Felix Krause, who is writing a YAML parser in Nim. He turned all the spec examples into test cases.
This is a tool to play around with several YAML parsers and loaders in vim.
<https://github.com/yaml/yaml-editor>
The project contains the code to build the frameworks (16 as of this writing) and put it into one big Docker image.
It also contains the yaml-editor itself, which will start a vim in the docker container. It uses a lot of funky vimscript that makes playing with it easy and useful. You can choose which frameworks you want to test and see the output in a grid of vim windows.
Especially when writing a parser it is extremely helpful to have all the test cases and be able to play around with your own examples to see how they are handled.
I was curious to see how the different frameworks handle the test cases, so, using the test suite and the docker image, I wrote some code that runs the tests, manipulates the output to compare it with the expected output, and created a matrix view.
<https://github.com/perlpunk/yaml-test-matrix>
You can find the latest build at <https://matrix.yaml.info>
The Perl Foundation <https://www.perlfoundation.org/> sponsored this project (and the YAML Test Suite) with a grant of 2500 USD in 2017-2018.
Copyright 2017-2022 by Tina Müller
This library is free software and may be distributed under the same terms as perl itself.
2024-02-04 | perl v5.38.2 |