The state of YAML in PHP

While working on the Jackknife, I had occasion to learn more than I ever wanted to know about parsing YAML in PHP. YAML (‘YAML Ain’t Markup Language’) is a data serialization format (otherwise known as markup) designed to be easily human readable. Pretty straightforward, on paper. While it succeeds at readability, it’s a pain to support beyond simple key: value pairings. This is relevant because CCP, the company responsible for EVE Online, has a fetish for YAML, using it in places where it flat out doesn’t make sense, with the promise of more to come. So dealing with YAML is a necessary evil as far as using the EVE API goes.

Here’s where the problem comes in: YAML parsing support is inconsistent as hell, and not just in PHP. PHP does have two PECL extensions that let you use an established library that actually, you know, works. The problem: neither is compiled in by default. Being on a shared host, I can’t use either of them, so it took some contortion to get a working YAML parser. I ended up testing six methods of parsing YAML in PHP.

The parsers I tried:

Two PECL extensions, ‘yaml’ and ‘syck’. These are extensions for PHP that must be compiled, installed, and then enabled in your PHP config. God help you if you are on a shared host or a Windows system.

Three native PHP implementations: ‘spyc’, ‘symfony/Yaml’, and ‘Horde_YAML’. These are just PHP scripts and require no installation.

One hack: using a working YAML parser via node.js, outputting JSON, and parsing that JSON in PHP. Yes, ew, but it does produce valid output.

The Tests:

1. Implicit arrays. wants should be an array containing an array with two children. (_ are spaces; WP is mangling them)

typeID: 20061
wants:
-_quantity: 321
__typeID: 4051

2. Inline arrays (list items at the same indentation as their key). characterLinkData should be an array with 3 entries.

characterLinkData:
- showinfo
- 1378
- 1018019804
characterName: Cziik

3. Quoted, multiline string. body should contain everything between the quotes, with no extra newlines.

body: 'Located at:: Inghenges V - Moon 15 - Impro Factory

Material Efficiency: 1

Productivity: 0'
subject: Lockdown the Orca Blueprint

4. Empty array.

locationOwnerID: 667531913
ownerID: 386010388
typeIDs: []

5. Standard key-value pairs

cloneBought: null
cloneStationID: 60012709
cloneTypeID: 9931
corpStationID: 60012709
lastCloned: null
podKillerID: 1375512897
skillID: null
skillPointsLost: null
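
For anyone who wants to rerun these checks, here is a rough sketch of how the expected results could be written down as PHP arrays and compared against a parser's output. The names `$cases` and `runYamlCases` are made up for this illustration and are not part of any of the libraries. Test 3 is left out because the exact expected string depends on how the blank lines in the quoted text were originally laid out. The comparison is strict, so a parser that returns numbers as strings would count as a failure; loosen it if that matters less to you.

```php
<?php
// Each case is [YAML input, expected PHP value]. Inputs use real spaces,
// not the underscores shown in the post.
$cases = [
    'implicit array' => [
        "typeID: 20061\nwants:\n- quantity: 321\n  typeID: 4051\n",
        ['typeID' => 20061, 'wants' => [['quantity' => 321, 'typeID' => 4051]]],
    ],
    'inline array' => [
        "characterLinkData:\n- showinfo\n- 1378\n- 1018019804\ncharacterName: Cziik\n",
        ['characterLinkData' => ['showinfo', 1378, 1018019804], 'characterName' => 'Cziik'],
    ],
    'empty array' => [
        "locationOwnerID: 667531913\nownerID: 386010388\ntypeIDs: []\n",
        ['locationOwnerID' => 667531913, 'ownerID' => 386010388, 'typeIDs' => []],
    ],
    'key-value pairs' => [
        "cloneBought: null\ncloneStationID: 60012709\ncloneTypeID: 9931\n"
        . "corpStationID: 60012709\nlastCloned: null\npodKillerID: 1375512897\n"
        . "skillID: null\nskillPointsLost: null\n",
        [
            'cloneBought' => null, 'cloneStationID' => 60012709, 'cloneTypeID' => 9931,
            'corpStationID' => 60012709, 'lastCloned' => null, 'podKillerID' => 1375512897,
            'skillID' => null, 'skillPointsLost' => null,
        ],
    ],
];

// Hypothetical helper: run every case through $parser and report pass/fail per case.
function runYamlCases(callable $parser, array $cases)
{
    $results = [];
    foreach ($cases as $name => $case) {
        list($yaml, $expected) = $case;
        try {
            $actual = $parser($yaml);
        } catch (Exception $e) {
            $actual = null; // a thrown exception counts as a failure
        }
        $results[$name] = ($actual === $expected);
    }
    return $results;
}

// Only runs if the PECL yaml extension happens to be loaded.
if (function_exists('yaml_parse')) {
    print_r(runYamlCases('yaml_parse', $cases));
}
```

Any of the other parsers can be dropped in by passing a different callable, for example a closure wrapping the library's own load function.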

 

RESULTS:

1. syck, node, libyaml produced correct output. spyc and horde did not parse the quoted string correctly, and symfony threw an exception.

2. syck, node, libyaml, symfony produced correct output. spyc and horde got the array members right, but put them as part of the parent object, not as children of the key.

3. syck, node, libyaml produced correct output. None of the native ones did, and symfony threw an exception. syck was the fastest.

4. All passed. spyc was the fastest of the natives, libyaml the fastest of the pecl. node.js was last.

5. All passed. spyc was the fastest of the natives, libyaml the fastest of the pecl. Of course, node.js was last.

Conclusion:

God help you if you are on a shared host (or Windows). Both PECL extensions performed flawlessly and were quick as greased lightning. Installing libyaml was very easy with the PECL command line tool, but if you don’t have direct access to the PHP install, you are SOL when it comes to using those two. The same goes if you are using Windows, where compiling anything descended from *nix is a pain.

The native PHP parsers are all duds, for various reasons, when handling anything more complex than key: value pairs. The node.js hack worked flawlessly, apart from being the heaviest and slowest option. Unfortunately, as I use a shared host, it’s what I’m stuck with unless I’m crazy enough to port jsyaml or PyYAML.
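
One way to live with this is to pick the parser at runtime: use a PECL extension when the host has one, and fall back to something slower otherwise. The following is only a sketch of that idea. `pickYamlParser` is a made-up name, and the fallback closure is a placeholder for whatever you have available (the node.js bridge, for instance). `yaml_parse` and `syck_load` are the entry points of the two PECL extensions.

```php
<?php
// Hypothetical helper: return the first YAML parser available on this host.
// $fallback is used when neither PECL extension is loaded.
function pickYamlParser(callable $fallback)
{
    if (function_exists('yaml_parse')) {   // PECL yaml (libyaml)
        return 'yaml_parse';
    }
    if (function_exists('syck_load')) {    // PECL syck
        return 'syck_load';
    }
    return $fallback;
}

$parse = pickYamlParser(function ($str) {
    // Placeholder: call the node.js bridge here instead of returning FALSE.
    return false;
});

// $data is the parsed array, or FALSE if only the placeholder was available.
$data = $parse("typeID: 20061\n");
```

The point of returning a callable is that the rest of the code never needs to know which parser it got.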

URLS:

// PECL yaml – http://php.net/manual/en/book.yaml.php
// PECL syck – http://pecl.php.net/package/syck
// spyc – http://code.google.com/p/spyc/
// symfony/Yaml – https://github.com/symfony/Yaml
// horde-yaml – http://pear.horde.org/

My yaml2json hack: http://ridetheclown.com/eveapi/yaml2json.js

PHP code for the rest of the hack:

// Recursively convert stdClass objects (what json_decode returns) into plain arrays.
function objectToArray($d) {
    if (is_object($d)) {
        $d = get_object_vars($d);
    }

    if (is_array($d)) {
        return array_map('objectToArray', $d);
    }

    return $d;
}

// Pipe a YAML string through the yaml2json.js script and return the result as a PHP array.
// Returns FALSE on any failure.
function nodeYAMLParse($str) {
    $descriptorspec = array(
        0 => array("pipe", "r"), // stdin
        1 => array("pipe", "w"), // stdout
        2 => array("pipe", "w")  // stderr
    );

    $process = proc_open("./yaml2json.js", $descriptorspec, $pipes);

    if ($process === FALSE) {
        trigger_error("Failed to open YAML parser process!");
        return FALSE;
    }

    // send the YAML to the script, then close stdin so it sees end of input
    fwrite($pipes[0], $str);
    fclose($pipes[0]);

    $json = stream_get_contents($pipes[1]); // get output, if any
    fclose($pipes[1]);

    $errors = stream_get_contents($pipes[2]); // get error output, if any
    fclose($pipes[2]);

    $return_value = proc_close($process);

    if ($return_value != 0) {
        // 255 is treated as an ordinary YAML error; anything else means something
        // beyond that occurred, like a shell error or node error
        if ($return_value != 255) {
            trigger_error("Failed to parse YAML, got return code $return_value - $errors \n$str");
        }
        return FALSE;
    }

    // object to array is done so the result can be used as an array;
    // stdClass object (what json_decode returns) does not let you add or modify items.
    return objectToArray(json_decode($json));
}
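
As an aside, `json_decode()` takes a second argument that makes it return associative arrays directly, which would make the object-to-array conversion step unnecessary. A small sketch, where the JSON string is only my guess at what the script would emit for test 1:

```php
<?php
// Assumed sample output for test 1; the real script's output may differ in detail.
$json = '{"typeID": 20061, "wants": [{"quantity": 321, "typeID": 4051}]}';

// Passing true as the second argument yields nested arrays instead of stdClass objects.
$data = json_decode($json, true);

// Plain arrays can be modified in place.
$data['wants'][0]['quantity'] = 322;
```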

2 thoughts on “The state of YAML in PHP”

  1. nice.
    here’s how i installed the yaml pecl extension on a shared host:
    (requires shell access)

    # i like to make things in a directory called src
    mkdir ~/src
    cd ~/src

    # check the website for link to current version
    wget http://pyyaml.org/download/libyaml/yaml-0.1.4.tar.gz
    tar xzf yaml-0.1.4.tar.gz
    cd yaml-0.1.4

    # i like to install things i compile myself to ~/usr
    ./configure --prefix=$HOME/usr
    make
    make install

    # check the website for link to current version
    wget http://pecl.php.net/get/yaml-1.1.0.tgz
    tar xzf yaml-1.1.0.tgz
    cd yaml-1.1.0

    phpize
    ./configure --prefix=$HOME/usr --with-yaml=$HOME/usr
    make
    #
    # make install doesn’t work for some reason
    # so i copied the newly-created modules manually
    #
    mkdir -p $HOME/usr/lib/php/modules/
    cp modules/* $HOME/usr/lib/php/modules/

    # cd to where your php.ini is at
    cd ~/public_html/

    # find out where your extensions dir is
    grep ^extension_dir php.ini
    # extension_dir = "/usr/lib64/php/modules"
    # this is your current extension directory
    # link the existing extensions to your personal modules directory

    ln -s /usr/lib64/php/modules/* ~/usr/lib/php/modules/

    # now edit your php.ini,
    # change the value of extension_dir (use the full path: /home/yerlogin/usr/lib/php/modules )
    # and add the yaml extension to the end of the [php] section
    # extension = yaml.so

    • Yeah, unfortunately, as far as I know DreamHost doesn’t let you override their php.ini… or, that’s what I was going to say, but on closer inspection it seems they may allow it; they’ve got a wiki entry, anyway. Will have to try that.
