Tuesday, October 20, 2015

SyntaxHighlighter on Blogger and an unexpected output in a Python snippet

Syntax highlighting on Blogger

I was looking for an easy way to add syntax highlighting to my code snippets here on Blogger. Yes, you can embed Gist code snippets, but that approach is not always ideal. There must be a better way to do this. And of course there is!

I found SyntaxHighlighter after a few minutes of googling. It does exactly what I want: it's easy to use, supports many syntaxes, and is nicely documented. But for some Python code, it does a bit more than it should.


SyntaxHighlighter on Blogger

So how do you enable SyntaxHighlighter here on Blogger? It's easy. From the left-side menu on the Blogger admin page, go to:

Template --> Edit HTML

and add the following code:

<!--SYNTAX HIGHLIGHTER STARTS--> 
<link href='http://alexgorbatchev.com/pub/sh/3.0.83/styles/shCore.css' rel='stylesheet' type='text/css'/> 
<link href='http://alexgorbatchev.com/pub/sh/3.0.83/styles/shThemeDefault.css' rel='stylesheet' type='text/css'/> 

<script src='http://alexgorbatchev.com/pub/sh/3.0.83/scripts/shCore.js' type='text/javascript'/>
<script src='http://alexgorbatchev.com/pub/sh/3.0.83/scripts/shBrushXml.js' type='text/javascript'/>
<script src='http://alexgorbatchev.com/pub/sh/3.0.83/scripts/shBrushBash.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/3.0.83/scripts/shBrushPython.js' type='text/javascript'/>
<script type='text/javascript'>
  SyntaxHighlighter.config.bloggerMode = true; 
  SyntaxHighlighter.all(); 
</script> 
<!--SYNTAX HIGHLIGHTER ENDS-->
right before the </head> tag.

Add the brushes for the languages you need, and change the version number (here 3.0.83) if you want to use a different version or if a newer one is out.

Then all you need to do is enclose your code in:

<pre class="brush:replace_this_with_brush_name;">
  your code here
</pre>
Refer to the project documentation for more details.

Unexpected output

I noticed some slightly odd behavior in Python snippets: SyntaxHighlighter adds an extra line of strange output. For example, if a snippet ends with a Python traceback, the following line is added:

</module></module></stdin>
A similar line is added if a Python snippet ends with a print statement.

I didn't know how to prevent this from happening, so I posted a question on StackOverflow: SyntaxHighlighter on Blogger - unexpected line added to Python snippet.
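
One plausible explanation, offered here as an assumption rather than something confirmed in that question: a traceback contains text such as <stdin> and <module>. If those angle brackets are pasted into the post unescaped, the browser may parse them as unknown HTML elements and close them at the end of the <pre> block, which would produce exactly this kind of trailing line. A minimal sketch of escaping a snippet before pasting it (escapeHtml is a hypothetical helper name, not part of SyntaxHighlighter):

```javascript
// Hypothetical helper: escape the characters that HTML treats specially,
// so that text like <stdin> is displayed literally instead of parsed as a tag.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')  // must run first, so the entities added below are not escaped again
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Illustrative input: a typical line from a Python traceback.
var snippet = 'File "<stdin>", line 1, in <module>';
console.log(escapeHtml(snippet));
// prints: File "&lt;stdin&gt;", line 1, in &lt;module&gt;
```

If the escaped text is what ends up between the <pre> tags, there should be no stray elements left for the browser to close.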

Fix aka jQuery workaround

If this behavior cannot be prevented, those extra lines can still be removed with JavaScript.

So I came up with a simple jQuery workaround (it is kind of a dirty trick) which removes the last line of the SyntaxHighlighter-formatted code if that line starts with </ and ends with >.

<script type='text/javascript'>
// Run once the page has fully loaded. .on('load', ...) is used because
// the .load() event shorthand was removed in jQuery 3.0.
$(window).on('load', function() {
    // line number regex (matches the "numberN" class of a line)
    var lineNumberRegex = /number\d+/;
    // </some_string> regex
    var unwantedOutputRegex = /^<\/.*>$/;

    // code for removing unwanted last lines added by SyntaxHighlighter to a Python code snippet
    // removes last lines like: </module></module></stdin>, i.e. starting with '</' and ending with '>'
    var syntaxHighlighters = $('.syntaxhighlighter');
    for (var i = 0; i < syntaxHighlighters.length; i++) {
      var syntaxHighlighter = syntaxHighlighters.eq(i);

      var lastLine = syntaxHighlighter.find('.line').last();
      var lastLineText = lastLine.text();

      if (unwantedOutputRegex.test(lastLineText)) {
        var lastLineClasses = lastLine.attr('class') || '';
        var lineNumberMatch = lastLineClasses.match(lineNumberRegex);

        // skip the snippet if the last line carries no "numberN" class
        if (lineNumberMatch) {
          // remove every element with that line-number class, using jQuery's .remove()
          syntaxHighlighter.find('.' + lineNumberMatch[0]).remove();
        }
      }
    }
});
</script>
If you add this code to the end of your page, it will remove the unwanted lines as soon as the page has completely loaded.
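
To see which last lines the workaround would treat as unwanted, the regex can be tried on its own, outside the browser. This is only a sketch: isUnwantedLine is a hypothetical helper name, and the sample lines are illustrative.

```javascript
// Same pattern as in the workaround: a whole line that starts with '</' and ends with '>'.
var unwantedOutputRegex = /^<\/.*>$/;

// Hypothetical helper: would the workaround remove this last line?
function isUnwantedLine(lineText) {
  return unwantedOutputRegex.test(lineText);
}

console.log(isUnwantedLine('</module></module></stdin>')); // true
console.log(isUnwantedLine('print "hello"'));               // false
console.log(isUnwantedLine('</html>'));                     // true
```

The last case is worth keeping in mind: the workaround checks every .syntaxhighlighter block on the page, so an XML or HTML snippet that legitimately ends with a closing tag would lose its last line too.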