Errors when publishing a node

I've noticed this problem a few times. I'm not sure whether it's been fixed in newer versions of Umbraco, but it's definitely an issue in 4.x.

Basically, you get a report from a user saying that they can save a page but not publish it. The error is sometimes a .NET yellow screen of death, and sometimes just an Umbraco error message.

If you look in the umbracoLog table in the database, you'll see the publish error description contains something like this:

System.Xml.XmlException: ' ', hexadecimal value 0x1F, is an invalid character. Line 1, position 534

We worked out that the user can type (or more likely, paste) whatever they like into the various text boxes in the Umbraco back office, whether it's a visible character or not (0x1F in the example above is the "unit separator" control character).

Umbraco will happily save this in the database with no errors. However, as soon as you publish the node, Umbraco needs to add it to the cache XML document (/app_data/umbraco.config), and a .NET exception is thrown because 0x1F is not a valid character in XML 1.0.
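To make the "invalid character" rule concrete, here is a rough sketch in Python (used purely for illustration; it is not part of Umbraco or of the original fix). It checks each character against the XML 1.0 character ranges and reports the position and hex value of anything that falls outside them. The function names and the sample string are made up for this example.

```python
def is_valid_xml_char(ch):
    """True if ch is allowed by the XML 1.0 "Char" production."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x0A, 0x0D)        # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid_xml_chars(text):
    """Return (index, hex value) pairs for every character XML 1.0 rejects."""
    return [
        (i, "0x%02X" % ord(ch))
        for i, ch in enumerate(text)
        if not is_valid_xml_char(ch)
    ]


# Hypothetical field value with a pasted unit separator (0x1F) in the middle.
sample = "Hello\x1fWorld"
print(find_invalid_xml_chars(sample))  # [(5, '0x1F')]
```

Ordinary whitespace such as tabs and line breaks passes the check, which is why only the stray control characters cause the publish to fail.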

The resolution is (where possible) to edit the node in the back office to remove the characters and then save/publish.

However, in some situations this issue can leave you in a state where the umbraco.config cache file is broken and can't be rebuilt on application start.

In this case, you'll see something like this recorded in the umbracoLog table:

Error Republishing: System.Xml.XmlException: ' ', hexadecimal value 0x1F, is an invalid character

If this happens, you can search the cmsContentXml table in the database to find records with the invalid character, and manually edit the field value to remove them.

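I used the following query to identify the broken nodes (replace "your_schema" with the name of your database, and "0x1F" with the hex value of the character you're searching for). Note that CONCAT requires SQL Server 2012 or later; on older versions, join the strings with + instead: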

SELECT [nodeId], [xml] FROM [your_schema].[dbo].[cmsContentXml] WHERE [xml] LIKE CONCAT('%', CONVERT(VARCHAR, 0x1F), '%');
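Once the query has returned the affected rows, the xml value still has to be cleaned before it is written back. As a minimal sketch, assuming you copy the value out and clean it outside the database, the following Python function strips every character that XML 1.0 does not allow. The function name and the sample value are hypothetical, and Python is only used here for illustration.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF are allowed.
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Return text with all XML 1.0 invalid characters removed."""
    return _INVALID_XML_CHARS.sub("", text)


# Hypothetical value copied from the xml column of a broken row.
broken = "<bodyText>Hello\x1fWorld</bodyText>"
print(strip_invalid_xml_chars(broken))  # <bodyText>HelloWorld</bodyText>
```

Removing the character outright suits invisible control characters like 0x1F. If the stray character was standing in for something meaningful (a space, say), it may be better to replace it rather than delete it.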
