Serialization, BLOB, INI, XML and JSON
As a regular programmer I was quite accustomed to storing program information as blob files (binary large objects). In C# this is really easy, you just mark a class as serializable via the [Serializable()] attribute, special fields that you don’t wish to be serialized you mark as [NonSerialized]. Finally you create a BinaryFormatter for the actual serialization and de-serialization and you’re done. Source code (note: there’s no error handling in any of the source code to keep things short) would look something like this:
Pointing the stream at a file would generate a binary file. Using blobs is very easy, it takes only a few lines of code to store and retrieve an object. However the generated output is not human readable. And a program which was written in a different language and/or has no access to the original class file (be it in compiled form or not) will not understand, let alone modify the file, correctly without a lot of effort.
A different take on this are INI-files, which are files where an attribute is named and a value is followed. An ini file could look like this:
This file is human readable and easily understood and edited by either computers or humans. This file was generated by the following code, which is not really complex, but there are some big problems which I haven’t solved yet.
As you can see, serialization and de-serialization is not as automated as you’d hope for. We have to write the code for each action ourselves, which can be cumbersome if we are dealing with large and/or numerous classes. Also classes that we serialize need to have only public fields that we want to store, or have some kind of constructor (that maybe accepts a nice struct) that allows us to set all fields. There’s also another problem, there is no clear syntax on arrays, we could of course invent some syntax, but this would differ from program to program. Another problem is how do we save text that contains a ‘=’ char. We could solve this by using escaping or by using a special dummy character, although these problems are all solvable, the way these problems are solved differ from application, which makes it hard to make applications interact with each other.
In the comments Alex correctly states that with reflection you could get automatically get all the objects members automatically, you can even introduce your own attributes or respond to the existing attributes like [nonserialized]. Of course this would make ini serialization and de-serialization a lot easier. However reflection code is often very complex so I won’t put any source code here. But maybe I’ll write a separate tutorial about reflection.
So let’s try another type of serialization, XML (eXtensible Markup Language) has been quite the buzz lately, it’s being used to create web pages (XHTML) and a lot of web services and web APIs communicate via XML. Some databases even allow us to save XML and to search through XML and in .Net there is even a separate XML namespace (System.XML). As with most things, it’s trivially easy to serialize to XML in .NET.
XML is human readable and very understandable by machines. However the syntax is a bit complex sometimes. The tree like structure is understandable, but when do we write an attribute between two tags like ‘Name’ or do we use the parameter syntax like in ‘Stuff’? Or both. There’s also a lot of confusion about arrays, we could use a tree like structure and creating a new leaf for each element in the array, but there are other solutions. The syntax for serialization to XML looks quite like that of serialization to BLOB. We mark the class as serializable by setting it as a root element via [XMLRootAttribute(…)] and stuff we don’t want to serialize we mark with [XmlIgnore]. To beautify the XML output we can even add attributes to different fields to specify the arguments, type and name).
New knowledge for me is that JSON in .Net is fully integrated into the .Net Framework. But you have to manually add the references to “System.ServiceModel.Web” and System.Runtime.Serialization.XmlObjectSerializer to your project to take full advantage of this. Take a look at the following sourcecode:
As you can see the JSON serialization syntax closely matches that of the BLOB (BinaryFormatter) syntax, which makes it easy to use, but there is less control over the output than with XML.
Special thanks to creator1988 for pointing me to the JSON serializer.