Highlight documentation
Highlight manual
Overview
Highlight converts source code to HTML, XHTML, RTF, LaTeX, TeX, SVG, BBCode and terminal escape sequences with coloured syntax highlighting. Language definitions and colour themes are customizable.
Intended purpose
Highlight was designed to offer a flexible but easy to use syntax highlighter for several output formats. Instead of hardcoding syntax or colouring information, all relevant data is stored in configuration scripts. These scripts may be altered or enhanced with plug-in scripts.
Feature list
- highlighting of keywords, types, strings, numbers, escape sequences, comments, operators and preprocessor directives
- coloured output in HTML, XHTML 1.1, RTF, TeX, LaTeX, SVG, BBCode and terminal escape sequences
- supports referenced stylesheet files for HTML, LaTeX, TeX or SVG output
- syntax elements are defined as regular expressions or plain string lists
- customizable keyword groups
- recognition of nested languages within a file
- all configuration files are Lua scripts
- supports plug-in scripts to tweak language definitions and themes
- reformatting and indentation of C, C++, C# and Java source code
- wrapping of long lines
- configurable output of line numbers
Supported programming and markup languages
Please see the supported language list.
Usage and options
Quick introduction
The following examples show how to produce a highlighted C++ file, using an input file called main.cpp:
- Generate HTML:
  highlight -i main.cpp -o main.cpp.html
  highlight < main.cpp > main.cpp.html --syntax cpp
  You will find the HTML file and highlight.css in the working directory. If you use IO redirection, you must define the programming language with --syntax.
- Generate HTML with embedded CSS definitions and line numbers:
  highlight -i main.cpp -o main.cpp.html --include-style --line-numbers
- Generate HTML with inline CSS definitions:
  highlight -i main.cpp -o main.cpp.html --inline-css
- Generate HTML using "horstmann" source formatting style and "neon" colour theme:
  highlight -i main.cpp -o main.cpp.html --reformat horstmann --style neon
- Generate LaTeX:
  highlight --out-format=latex -i main.cpp -o main.cpp.tex
  The following output formats may be used with --out-format:
  html: HTML 4.01
  xhtml: XHTML 1.1
  tex: Plain TeX
  latex: LaTeX
  rtf: RTF
  ansi: Terminal 16 color escape codes
  xterm256: Terminal 256 color escape codes
  svg: SVG
  bbcode: BBCode
  Default output is HTML if no other format is specified.
- Customize font settings:
  highlight --syntax ada --out-format=xhtml --font-size 12 --font Consolas,\'Courier\ New\'
  highlight --syntax ada --out-format=latex --font-size tiny --font sffamily
- Define an output directory:
  highlight -d some/target/dir/ *.cpp *.h
CLI options
The command line version of highlight offers the following options:
USAGE: highlight [OPTIONS]... [FILES]...
General options:
-B, --batch-recursive=<wc> convert all matching files, searches subdirs
(Example: -B '*.cpp')
-D, --data-dir=<directory> set path to data directory (deprecated)
--config-file=<file> set path to a lang or theme file
-d, --outdir=<directory> name of output directory
-h, --help print this help
-i, --input=<file> name of single input file
-o, --output=<file> name of single output file
-p, --list-langs list installed language definitions (deprecated)
-P, --progress print progress bar in batch mode
-q, --quiet suppress progress info in batch mode
-S, --syntax=<type> specify type of source code
-v, --verbose print debug info
-w, --list-themes list installed colour themes (deprecated)
--force generate output if input syntax is unknown
--list-scripts=<type> list installed scripts
<type>=[langs, themes, plugins]
--plug-in=<script> execute Lua plug-in script; repeat option to
execute multiple plug-ins
--plug-in-read=<path> set input file for a plug-in (e.g. "tags")
--print-config print path configuration
--print-style print stylesheet only (see --style-outfile)
--skip=<list> ignore listed unknown file types
(Example: --skip='bak;c~;h~')
--start-nested=<lang> define nested language which starts input
without opening delimiter
--validate-input test if input is text, remove Unicode BOM
--version print version and copyright information
Output formatting options:
-O, --out-format=<format> output file in given format
<format>=[html, xhtml, latex, tex,
odt, rtf, ansi, xterm256, bbcode, svg]
-c, --style-outfile=<file> name of style file or print to stdout, if
'stdout' is given as file argument
-e, --style-infile=<file> file to be included in style-outfile
-f, --fragment omit document header and footer
-F, --reformat=<style> reformats and indents output in given style
<style>=[allman, banner, gnu,
horstmann, java, kr, linux, otbs,
stroustrup, whitesmith]
-I, --include-style include style definition
-J, --line-length=<num> line length before wrapping (see -W, -V)
-j, --line-number-length=<num> line number width incl. left padding
-k, --font=<font> set font (specific to output format)
-K, --font-size=<num?> set font size (specific to output format)
-l, --line-numbers print line numbers in output file
-m, --line-number-start=<cnt> start line numbering with cnt (assumes -l)
-s, --style=<style> set colour style (see -w)
-t, --replace-tabs=<num> replace tabs by <num> spaces
-T, --doc-title=<title> document title
-u, --encoding=<enc> set output encoding which matches input file
encoding; omit encoding info if set to NONE
-V, --wrap-simple wrap long lines without indenting function
parameters and statements
-W, --wrap wrap long lines
--wrap-no-numbers omit line numbers of wrapped lines
(assumes -l)
-z, --zeroes pad line numbers with 0's
--kw-case=<case> change case of case insensitive keywords
<case> = [upper, lower, capitalize]
--delim-cr set CR as end-of-line delimiter (MacOS 9)
--no-trailing-nl omit trailing newline
(X)HTML output options:
-a, --anchors attach anchor to line numbers
-y, --anchor-prefix=<str> set anchor name prefix
-N, --anchor-filename use input file name as anchor prefix
-C, --print-index print index with hyperlinks to output files
-n, --ordered-list print lines as ordered list items
--class-name=<name> set CSS class name prefix;
omit class name if set to NONE
--inline-css output CSS within each tag (verbose output)
--enclose-pre enclose fragmented output with pre tag
(assumes -f)
LaTeX output options:
-b, --babel disable Babel package shorthands
-r, --replace-quotes replace double quotes by \dq{}
--pretty-symbols improve appearance of brackets and other symbols
RTF output options:
-x, --page-size=<ps> set page size
<ps> = [a3, a4, a5, b4, b5, b6, letter]
--char-styles include character stylesheets
SVG output options:
--height set image height (units allowed)
--width set image width (see --height)
GNU source-highlight compatibility options:
--doc create stand alone document
--no-doc cancel the --doc option
--css=filename the external style sheet filename
--src-lang=STRING source language
-t, --tab=INT specify tab length
-n, --line-number[=0] number all output lines, optional padding
--line-number-ref[=p] number all output lines and generate an anchor,
made of the specified prefix p + the line
number (default='line')
--output-dir=path output directory
--failsafe if no language definition is found for the
input, it is simply copied to the output
GUI options
The Graphical User Interface offers a subset of the CLI features. It includes a dynamic preview of the output file's appearance. Please see screenshots and GUI animations.
Input and output
If no input or output file name is defined by --input and --output options, highlight will use stdin and stdout for file processing.
If no input filename is defined by --input or given at the prompt, highlight cannot determine the language type from the file extension (although some scripting languages are detected by the shebang in the first input line). In this case you have to specify the language with --syntax (in most cases the argument is the file suffix of the source file).
Example: If you want to convert a Python file, highlight needs to load the py.lang language definition. The correct argument of --syntax would be "py". If you pass the filename directly to highlight, the program fetches the ".py" extension from the file name.
highlight test.py
highlight < test.py --syntax py       # --syntax option necessary
cat test.py | highlight --syntax py
If a language has multiple suffixes (like C, cc, cpp, h for C++ files), assign them to the matching language definition in the file $CONF_DIR/filetypes.conf.
Highlight enters the batch processing mode if multiple input files are defined
or if --batch-recursive is set.
In batch mode, highlight will save the generated files using the original
filename, appending the extension of the chosen output type.
If files in the input directories happen to share the same name, the output
files will be prefixed with their source path name.
The --outdir option is recommended in batch mode. Use --quiet to improve
performance (recommended for usage in shell scripts).
HTML, TeX, LaTeX and SVG output
The HTML, TeX, LaTeX and SVG output formats can reference style definition files (stylesheets) which contain the formatting information.
In HTML and SVG output, this file contains CSS definitions and is saved as 'highlight.css'. In LaTeX and TeX, it contains macro definitions, and is saved as 'highlight.sty'.
Name and path of the stylesheet may be modified with --style-outfile. If the --outdir option is given, all generated output, including stylesheets, is stored in this directory.
Use --include-style to embed style information in the output documents without referencing a stylesheet.
Referenced style definitions have the advantage that all formatting information is kept in a single file, so a change to this file affects all referencing documents.
With --style-infile you define a file to be included in the final formatting information of the document. This way you enhance or redefine the default highlight style definitions without editing generated code.
GNU source-highlight compatibility
The command line interface is extensively harmonised with source-highlight
(http://www.gnu.org/software/src-highlite/).
The following highlight options have the same meaning as in source-highlight:
--input, --output, --help, --version, --out-format, --title, --data-dir, --verbose, --quiet, --ctags-file
These options were added to enhance compatibility:
--css, --doc, --failsafe, --line-number, --line-number-ref, --no-doc, --tab, --output-dir, --src-lang
These switches provide a common highlighter interface for scripts, plugins etc.
Advanced options
Adding Exuberant Ctags information
HTML output can be enhanced with descriptive tooltips based on ctags data:
ctags *.*
highlight --ctags-file *.cpp
The default ctags-file parameter is "tags", so it is omitted in this example. This command will add the type, namespace and definition file path of recognized language tokens.
Example: "member | class:highlight::HtmlGenerator | htmlgenerator.h"
Prevent parsing of binary input files
If highlight may be invoked with arbitrary input, you can disable parsing of binary files using --validate-input. This flag causes highlight to match the input file header against a list of magic numbers. If a binary file type is detected, highlight quits with an error message. This switch also removes the UTF-8 BOM.
Highlight nested code without starting delimiter
If a file starts with an embedded code section which lacks the starting delimiter, the --start-nested option will switch to the nested language mode. This can happen with LuaTeX files:
highlight luatex.tex --latex --start-nested=inc_luatex
The inc_luatex definition is a Lua definition with TeX line comments. Note that the nested code section has to end with the ending delimiter defined in the host language definition.
Tips and tricks
Test new configuration scripts
The option --config-file helps to test new config files before installing them.
The given file must be a lang or theme file.
highlight --config-file xxx.lang --config-file yyy.theme -I
Debug language definitions
Use --verbose to display the Lua and syntax data.
Modify HTML line number formatting
/* content of user.css (adds border and a line
   to the line numbering) */
pre.hl {
  border-width: 1px;
  border-style:solid;
  border-left-color: silver;
  border-top-color: silver;
  border-right-color: gray;
  border-bottom-color: gray;
}

.hl.lin {
  border-right:1px solid #555555;
  padding-left:0.5em;
  padding-right:0.5em;
  margin-right:1em;
  text-decoration:none;
}
Usage:
highlight -l --style-infile user.css main.cpp
HTML list formatting tricks
The following examples assume that HTML was generated as an ordered list using
the --ordered-list switch. Include the CSS snippets with --style-infile.
/* highlight odd lines */
ol li:nth-child(odd) {
  background-color: #1f3030;
}

/* highlight every 5th line*/
ol li:nth-child(5n) {
  background-color: #1f3030;
}

/* highlight every 10th line number*/
ol li:nth-child(10n) {
  color: #ffff00;
}
Remove a UTF-8 BOM
Use --validate-input to get rid of the UTF-8 byte order mark.
Configuration
File format
The configuration files are Lua scripts. These constructs are sufficient to edit the scripts:
Variable assignment:
name = value
(variables have no type; only values do)
Strings
string1="string literal with escape: \n"
string2=[[raw string without escape sequence]]
If a raw string value starts with "[" or ends with "]", pad the bracket with a
space to avoid a syntax error. Highlight will trim the surrounding whitespace.
Comments
-- line comment
--[[ block comment ]]
Arrays
array = { first=1, second="2", 3, { 4,5 } }
Arrays may have identifiers and can be nested.
Please refer to the Lua manual for more details about the Lua syntax.
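The constructs listed above can be tried out in a plain Lua interpreter. The following is a minimal sketch and not a highlight configuration file; all variable names are illustrative, and a stock Lua 5.1 interpreter is assumed.

```lua
-- Variables carry no type; the same name may hold values of different types.
name = "highlight"
name = 42

-- Escaped versus raw string: "\n" is one newline character,
-- [[\n]] is a backslash followed by the letter n.
escaped = "a\nb"
raw     = [[a\nb]]

-- A raw string that should start with "[" and end with "]" is padded
-- with spaces to keep the long-bracket syntax valid.
padded = [[ [x] ]]

-- Table with named and positional entries; positional entries count from 1.
array = { first=1, second="2", 3, { 4, 5 } }

print(#escaped, #raw)          -- lengths 3 and 4
print(array.first, array[1])   -- named entry 1, first positional entry 3
print(array[2][2])             -- second item of the nested table: 5
```

Note that the padded raw string still contains its spaces in plain Lua; according to the description above, the trimming is done by highlight when it reads the value.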
Regular expressions
Please see Regular expressions for the supported regex constructs.
Language definitions
A language definition describes all elements of a programming language which will be highlighted by different colours and font types. Save the new file in $HL_DIR/langDefs, using the following naming convention:
<usual extension of source code files>.lang
Examples: PHP -> php.lang, Java -> java.lang
If a language has multiple suffixes, list them in $HL_DIR/filetypes.conf.
Keywords = { Id, List|Regex, Group? }
Id: Integer, keyword group id (values 1-4, can be used for several keyword
groups)
List: List, list of keywords
Regex: String, regular expression
Group: Integer, capturing group id of regular expression, defines part of regex
which should be returned as keyword (optional; if not set, the match
with the highest group number is returned (counts from left to right))
Comments = { {Block, Nested?, Delimiter=} }
Block: Boolean, true if comment is a block comment
Nested: Boolean, true if block comments can be nested (optional)
Delimiter: List, contains open delimiter regex (line comment) or open and close
delimiter regexes (block comment)
Strings = { Delimiter|DelimiterPairs={Open, Close, Raw?}, Escape?, RawPrefix? }
Delimiter: String, regular expression which describes string delimiters
DelimiterPairs: List, includes open and close delimiters if not equal (regex),
includes optional Raw flag as boolean which marks
delimiter pair as raw string
Escape: String, regular expression of escape sequences (optional)
RawPrefix: String, defines raw string indicator (optional)
PreProcessor = { Prefix, Continuation? }
Prefix: String, regular expression which describes open delimiter
Continuation: String, contains continuation character (optional)
NestedSections = {Lang, Delimiter= {} }
Lang: String, name of nested language
Delimiter: List, contains open and close delimiters of the code section
Description: String, Defines syntax description
Digits: String, Regular expression which defines digits (optional)
Identifiers: String, Regular expression which defines identifiers
(optional)
Operators: String, Regular expression which defines operators
EnableIndentation: Boolean, set true if syntax may be reformatted and indented
IgnoreCase: Boolean, set true if keyword case should be ignored
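The Keywords format above allows both list-based and regex-based entries. The fragment below is a sketch of how the two forms might sit side by side; the keyword list, the group ids and the regular expression are illustrative assumptions and are not copied from an installed lang file.

```lua
-- Sketch of a Keywords section in a language definition (assumed content).
Keywords = {
  -- Group 1: plain string list.
  { Id=1, List={ "if", "else", "while", "return" } },

  -- Group 2: regex-based entry. The expression matches an identifier that is
  -- followed by an opening parenthesis. Group=1 selects the first capturing
  -- group, so only the identifier is returned as keyword, not the "(".
  { Id=2, Regex=[[(\w+)\s*\(]], Group=1 },
}
```

The regex is written as a raw string so that the backslashes need no additional escaping.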
Script Environment:
The following variables are defined when a script is executed:
hl_lang_dir: current path of language definitions (use with dofile)
Identifiers: Default regex for identifiers
Digits: Default regex for numbers
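The hl_lang_dir variable makes it possible to derive a definition from an installed one. The fragment below is a sketch under several assumptions: that lua.lang is present in the language directory, that hl_lang_dir ends with a path separator, and that the loaded file creates a Comments table. The description string and the "%" delimiter are illustrative. The fragment only works when executed by highlight, since hl_lang_dir is not defined in a plain Lua interpreter.

```lua
-- Hypothetical derived language definition (sketch, not an installed file).

-- Load the base definition first; it sets Description, Keywords, Comments etc.
dofile(hl_lang_dir .. "lua.lang")

-- Override the description after loading, otherwise the base file
-- would overwrite it again.
Description = "Lua with TeX line comments"

-- Add "%" as an additional line comment delimiter.
table.insert(Comments, { Block=false, Delimiter={ [[%]] } })
```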
The following variables are integers which represent the internal highlighting
states:
HL_STANDARD
HL_STRING
HL_NUMBER
HL_LINE_COMMENT
HL_BLOCK_COMMENT
HL_ESC_SEQ
HL_PREPROC
HL_PREPROC_STRING
HL_OPERATOR
HL_LINENUMBER
HL_KEYWORD
HL_STRING_END
HL_LINE_COMMENT_END
HL_BLOCK_COMMENT_END
HL_ESC_SEQ_END
HL_PREPROC_END
HL_OPERATOR_END
HL_KEYWORD_END
HL_EMBEDDED_CODE_BEGIN
HL_EMBEDDED_CODE_END
HL_IDENTIFIER_BEGIN
HL_IDENTIFIER_END
HL_UNKNOWN
Hook functions:
OnStateChange(oldState, newState, token)
Hook Event: Highlighting parser state change
Input: Old state, intended new state and the current token which led to
the new state
Returns: Correct state to continue
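A hook of this kind might look like the following sketch. The rule itself (never treating the token "self" as a keyword) is hypothetical. The two constants are given stand-in values only so that the sketch runs in a plain Lua interpreter; inside highlight they are predefined, and the numbers used here are assumptions that do not reflect highlight's internal values.

```lua
-- Stand-in values for use outside of highlight (assumed, see above).
HL_STANDARD = HL_STANDARD or 0
HL_KEYWORD  = HL_KEYWORD  or 1

-- Hypothetical rule: the token "self" is never highlighted as a keyword.
function OnStateChange(oldState, newState, token)
  if newState == HL_KEYWORD and token == "self" then
    return HL_STANDARD   -- continue in the standard state instead
  end
  return newState        -- otherwise accept the intended new state
end

print(OnStateChange(HL_STANDARD, HL_KEYWORD, "self") == HL_STANDARD)    -- true
print(OnStateChange(HL_STANDARD, HL_KEYWORD, "return") == HL_KEYWORD)   -- true
```

Returning newState unchanged in all other cases keeps the parser's default behaviour intact.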
See the file README_REGEX for a detailed description of the regular expression
syntax.
Example:
Description="C and C++"

Keywords={
  { Id=1,
    List={"goto", "break", "return", "continue", "asm", "case", "default",
          -- [..]
         }
  },
  -- [..]
}

Strings = {
  Delimiter=[["|']],
  RawPrefix="R",
}

Comments = {
  { Block=true,
    Nested=false,
    Delimiter = { [[\/\*]], [[\*\/]] } },
  { Block=false,
    Delimiter = { [[//]] } }
}

IgnoreCase=false

PreProcessor = {
  Prefix=[[#]],
  Continuation="\\",
}

Operators=[[\(|\)|\[|\]|\{|\}|\,|\;|\.|\:|\&|\<|\>|\!|\=|\/|\*|\%|\+|\-|\~]]

EnableIndentation=true
Theme definitions
Colour themes contain the formatting information of the language elements which are described in language definitions.
The files have to be stored as *.theme in HL_DIR/themes. Apply a style with the --style option.
Format attributes:
Attributes = {Colour, Bold?, Italic?, Underline? }
Colour: String, defines colour in HTML hex notation ("#rrggbb")
Bold: Boolean, true if font should be bold (optional)
Italic: Boolean, true if font should be italic (optional)
Underline: Boolean, true if font should be underlined (optional)
Theme elements:
Description: String, Defines theme description
Default = Attributes (Colour of unspecified text)
Canvas = Attributes (Background colour )
Number = Attributes (Formatting of numbers)
Escape = Attributes (Formatting of escape sequences)
String = Attributes (Formatting of strings)
PreProcessor = Attributes (Formatting of preprocessor directives)
StringPreProc = Attributes (Formatting of strings within
preprocessor directives)
BlockComment = Attributes (Formatting of block comments)
LineComment = Attributes (Formatting of line comments)
LineNum = Attributes (Formatting of line numbers)
Operator = Attributes (Formatting of operators)
Keywords= {
Attributes1,
Attributes2,
Attributes3,
Attributes4,
}
AttributesN: Formatting of keyword group N. There should be at least four items
to match the number of keyword groups defined in the language
definitions
Example:
Default = { Colour="#000000" }
Canvas = { Colour="#ffffff" }
Number = { Colour="#000000" }
Escape = { Colour="#bd8d8b" }
String = { Colour="#bd8d8b" }
StringPreProc = { Colour="#bd8d8b" }
BlockComment = { Colour="#ac2020", Italic=true }
PreProcessor = { Colour="#000000" }
LineNum = { Colour="#555555" }
Operator = { Colour="#000000" }
LineComment = BlockComment

Keywords = {
  { Colour= "#9c20ee", Bold=true },
  { Colour= "#208920" },
  { Colour= "#0000ff" },
  { Colour= "#000000" },
}
Keyword groups
You may define custom keyword groups and corresponding highlighting styles. This is useful if you want to highlight functions of a third party library, macros, constants etc.
You define a new group in two steps:
1. Define a new group in your language definition (lang file):
Keywords = {
-- add your keyword description:
{Id=5, List = {"ERROR", "DEBUG", "WARN"} }
}
2. Add a corresponding highlighting style in your colour theme (theme file):
Keywords= {
--add your keyword style as 5th item in the list:
{ Colour= "#ff0000", Bold=true },
}
It is recommended to define keyword groups in user-defined plugin scripts to avoid editing of original highlight files.
Plug-ins
The --plug-in option receives the name of a Lua script which can override and enhance the settings of theme and language definition files. Using plug-ins, it is possible to apply custom settings without editing installed highlight configuration files.
Format:
-- function definitions, variables etc
-- Plugin list:
Plugins={
{ Type, Chunk },
}
Type: String, is one of ["theme", "lang"]
Chunk: Name of Lua function
If type is "theme", the chunk will be applied to the colour theme. If type is "lang", the chunk will be applied to the language definition.
The chunk function receives an optional parameter which contains a string with the description of the theme or language ("Description" parameter). The chunks are interpreted after the theme or lang file has been loaded, so you can refer to elements of these files.
Example:
-- function to update language definition with syslog levels
function syntaxUpdate(desc)
  if desc=="C and C++" then
    table.insert( Keywords,
                  { Id=5, List={"LOG_EMERG", "LOG_CRIT", "LOG_ALERT",
                                "LOG_ERR", "LOG_WARNING","LOG_NOTICE","LOG_INFO",
                                "LOG_DEBUG"}
                  } )
  end
end

-- function to update theme definition
function themeUpdate(desc)
  if desc=="Kwrite Editor" then
    Canvas={ Colour="#E0EAEE" }
  end
  table.insert(Keywords, {Colour= "#ff0000", Bold=true})
end

Plugins={
  { Type="theme", Chunk=themeUpdate },
  { Type="lang", Chunk=syntaxUpdate },
}
File mapping
The script filetypes.conf assigns file extensions and shebang descriptions to language definitions.
Format:
FileMapping={
{ Lang, Extensions|Shebang },
}
Lang: String, name of language definition
Extensions: list of strings, contains file extensions referring to "Lang"
Shebang: String, Regular expression which matches the first line of the input
file
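Entries in this format might look like the sketch below. The language names assume that definitions called c.lang and py.lang are installed; the extension list follows the C++ suffixes mentioned in the Input and output section, and the shebang expression is an illustrative assumption rather than the one shipped with highlight.

```lua
-- Sketch of filetypes.conf entries (assumed content).
FileMapping = {
  -- Several file extensions that refer to one language definition.
  { Lang="c", Extensions={ "C", "cc", "cpp", "h" } },

  -- Detection by the first input line, e.g. "#!/usr/bin/python".
  { Lang="py", Shebang=[[^#!.*python]] },
}
```

Each entry uses either Extensions or Shebang, matching the "Extensions|Shebang" alternative in the format description.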
Edit the file gui_files/ext/fileopenfilter.conf to add new syntax types to the file open filter of the GUI.
Embedding highlight
Sample scripts
See the /examples subdirectory in the highlight source directory for some example scripts in PHP, Perl and Python which invoke highlight and retrieve its output as string. These scripts may be used to develop plug-ins for other applications.
SWIG interface
A SWIG interface file is located in /examples/swig. See README_SWIG for installation instructions and the example scripts as programming reference.
Third party scripts and plug-ins
See the /examples/web_plugins subdirectory in the highlight installation for some plugins which integrate highlight in Wiki and blogging software:
- DokuWiki
- MovableType
- Wordpress
- Serendipity
Other uses of highlight can be found online:
- MediaWiki plugin
- MacOS X Quicklook plugin
- CamlHighlight (Ocsigen extension)
- highlighter package for the R language
- Inkscape plug-in
Building and installing
Precompiled packages
Highlight is written in ISO C++. The following packages are available:
- UNIX console and GUI application
- W32 console and GUI application
- statically and dynamically linked library
The website www.andre-simon.de offers links to precompiled packages for several operating systems (like Debian, Arch Linux, Ubuntu, Darwin, FreeBSD). The website distributes the latest upstream sources.
Building dependencies
Highlight is known to compile with gcc and suncc.
It depends on Boost headers and Lua 5.1 developer packages.
The optional GUI depends on Qt4 developer packages.
Please see the makefile for further options.
Packaging example
See Packaging resources for Debian and Fedora packaging examples.
More information can be found in the Wiki.