Apache IE8 HTML entities filter

One of the pages in a web application displays text log file output in popup browser windows.

If that output includes this statement:

<?xml version="1.0" encoding="utf-8"?>

IE8 will try to parse the content as XML, and it will show an error:

The XML page cannot be displayed
Cannot view XML input using style sheet.
Please correct the error and then click the Refresh button, or try again later.
Invalid at the top level of the document. Error processing resource:

I didn’t want to add any scripting to the pages, since they’re plain text, and I didn’t want to change any application code. One solution is to define an Apache output filter (ExtFilterDefine is provided by mod_ext_filter) that converts the text into HTML entities, and to force the document type to text/html.

# php -R runs the code once per input line; $argn holds the current line without its newline.
ExtFilterDefine htmlentities mode=output cmd="/usr/bin/php -R 'echo htmlentities($argn), PHP_EOL;'"

<FilesMatch "\.txt$">
  ForceType text/html
  SetOutputFilter htmlentities
</FilesMatch>
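
To see what the filter does to the problem line, here is a minimal Python sketch of the same per-line transformation. It is an illustration, not the PHP command from the configuration: the names `escape_line` and `filter_stream` are made up for this example. Python's `html.escape` only covers the markup-significant characters (`&`, `<`, `>` and the quote characters), whereas PHP's `htmlentities` also converts other characters that have named entities.

```python
import html
import io


def escape_line(line):
    """Escape the characters that a browser would treat as markup."""
    # html.escape replaces &, <, > and, with quote=True, both quote characters.
    return html.escape(line, quote=True)


def filter_stream(src, dst):
    """Copy src to dst line by line, escaping each line on the way."""
    for line in src:
        dst.write(escape_line(line))


# Demo on an in-memory log; as a real filter you would pass sys.stdin and sys.stdout.
log = io.StringIO('<?xml version="1.0" encoding="utf-8"?>\nJob finished\n')
out = io.StringIO()
filter_stream(log, out)
print(out.getvalue(), end="")
```

Once the angle brackets are escaped, the XML declaration is just text, so the browser has nothing left to mistake for an XML document.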

This is a quick fix. It may not be ideal for every situation, and it could certainly be refined.

The documents aren’t HTML, they are text. They aren’t supposed to contain tags, and anything in them that looks like a tag should be displayed as text rather than interpreted as markup. Forcing the type to text/plain didn’t work.

Regardless, this is one way you can convert characters into HTML entities without modifying your code.

Different solutions:

  • Extend the filter to add the HTML tags necessary for a true text/html document
  • Modify the code to convert the document to HTML
  • Install recode, which can also convert text to HTML entities (see link above)
  • Do something entirely different
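
The first option in the list, extending the filter so that it emits a complete HTML document, could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the function name `filter_to_html`, the wrapper markup, and the choice of a `<pre>` element to preserve the line breaks of the log. It uses `html.escape`, which handles only `&`, `<`, `>` and quotes.

```python
import html
import io

# Hypothetical wrapper markup; adjust the doctype, charset and title to taste.
HEADER = (
    "<!DOCTYPE html>\n"
    '<html><head><meta charset="utf-8"><title>Log output</title></head>\n'
    "<body><pre>\n"
)
FOOTER = "</pre></body></html>\n"


def filter_to_html(src, dst):
    """Write a complete HTML page to dst with the escaped text of src in a <pre>."""
    dst.write(HEADER)
    for line in src:
        dst.write(html.escape(line, quote=True))
    dst.write(FOOTER)


# Demo on an in-memory log; as a real filter you would pass sys.stdin and sys.stdout.
page = io.StringIO()
filter_to_html(io.StringIO("<job id='7'>\ndone & dusted\n"), page)
print(page.getvalue(), end="")
```

To try it as an output filter, the script would end with a call to `filter_to_html(sys.stdin, sys.stdout)` and the `cmd=` of the `ExtFilterDefine` line would point at wherever you saved it.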