How to Write Custom Neovim Queries
A Case-Study using Quarto
neovim, lua, syntax, syntax highlighting, quarto, pandoc, markdown, neovim injection, TreeSitter injection, vim, neovim customization
In this blog post, I discuss how I added a number of highlight and injection groups to Neovim to improve my quality of life while authoring quarto documents. quarto is a flavor of markdown for writing computational notebooks that I use very frequently (in fact, this document is written in quarto). While this post pertains to quarto, the principles discussed here apply to any language, and convenient ways to develop queries are covered along the way.
Much of the code discussed here is shared in the GitHub repository for my Neovim configuration.
Highlighting Quarto Metadata
My first objective was highlighting quarto metadata in python code. Metadata comments used in quarto code blocks for python start with #| or # | and should contain valid YAML; see, for instance, Figure 1. The end result is shown in Figure 2.
# | label: my-figure
# | fig-cap: An empty plot
# This is a normal comment
import matplotlib.pyplot as plt
plt.show()
Figure 2: python code with metadata comments highlighted for contrast in Neovim.
A Trick for Writing Queries Easily
First and foremost, it is much easier to write queries by using the :Inspect and :InspectTree commands. These allow you to peek into how TreeSitter sees the document and decides how syntax highlighting will work. In Figure 3 we can see just how easy it is:
Figure 3: Output of :InspectTree. On the left-hand side, the fenced_code_block contains some python code, which itself contains comments. See the derived query in Figure 4.
This is not too difficult: all that is required is to match any comment starting with #| or # | and label it with an appropriate capture. The following, added to my ~/.config/nvim/queries/python/highlights.scm, does exactly this:
;;extends

; Cannot do injections on comments, since they have no `inner` like strings.
; The optional space (` ?`) lets the pattern match both `#|` and `# |`.
(
  (comment) @comment.python.quarto_metadata
  (#match? @comment.python.quarto_metadata "^# ?\\|")
)
Figure 4: A query matching comments that start with #| or # | using the #match? predicate. The documentation on queries can tell you more about what each of the various predicates and directives does.
The capture name is arbitrary (so long as it does not collide with an existing capture). I named my capture @comment.python.quarto_metadata since the @comment.python capture group already exists and these metadata comments are a subset of that capture group, but the name is completely up to you, the user.
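To check a query like this one without restarting Neovim, it can also be run directly from Lua. The following is a minimal sketch, assuming Neovim 0.9 or newer, an installed python parser, and that the current buffer is a plain python file. The name list_captures is a hypothetical helper and not part of Neovim.

```lua
-- Sketch: list every node in a buffer that a query captures.
-- `list_captures` is a hypothetical helper name, not a Neovim API.
local function list_captures(bufnr, lang, query_text)
  local query = vim.treesitter.query.parse(lang, query_text)
  local parser = vim.treesitter.get_parser(bufnr, lang)
  local root = parser:parse()[1]:root()

  local found = {}
  for id, node in query:iter_captures(root, bufnr) do
    local row = node:range() -- first return value is the 0-based start row
    table.insert(found, {
      capture = query.captures[id],
      row = row + 1,
      text = vim.treesitter.get_node_text(node, bufnr),
    })
  end
  return found
end

-- Run against the current buffer (0), assumed here to be a python file.
vim.print(list_captures(0, "python", [[
  ((comment) @comment.python.quarto_metadata
    (#match? @comment.python.quarto_metadata "^# ?\\|"))
]]))
```

Pasting this into a scratch Lua file and running :luafile % while a python buffer is current should print one entry per matching metadata comment, which makes it quick to iterate on the pattern.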
Using the Query
Writing the query on its own does not tell Neovim what to do with the capture group. It is also necessary to specify how the highlight group should be displayed by providing a color, font weight, style, and more using nvim_set_hl, as in Figure 5.
vim.api.nvim_set_hl(0, "@comment.python.quarto_metadata", { fg = "#d3869b" })
Figure 5: Setting the highlight for the capture in ~/.config/nvim/init.lua.
Next (after saving all of your changes), restart Neovim, move the cursor over some code that should match the query, and use :Inspect. You should see something like Figure 6. Afterwards, the highlighting should look like Figure 2.
Figure 6: Inspecting quarto metadata comments using :Inspect. Notice that the highlight group @comment.python.quarto_metadata, as defined in Figure 4, is now shown.
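One practical caveat: loading a colorscheme usually resets highlight groups, so a highlight set once in init.lua can disappear after :colorscheme runs. A minimal sketch of one way to guard against this is shown here; the function name apply_quarto_highlights and the augroup name quarto_highlights are hypothetical.

```lua
-- Sketch: re-apply custom highlights whenever a colorscheme is loaded.
-- `apply_quarto_highlights` and "quarto_highlights" are hypothetical names.
local function apply_quarto_highlights()
  vim.api.nvim_set_hl(0, "@comment.python.quarto_metadata", { fg = "#d3869b" })
end

vim.api.nvim_create_autocmd("ColorScheme", {
  group = vim.api.nvim_create_augroup("quarto_highlights", { clear = true }),
  callback = apply_quarto_highlights,
})

-- Also apply once at startup, in case no ColorScheme event fires afterwards.
apply_quarto_highlights()
```

Keeping all custom nvim_set_hl calls in one function like this also gives a single place to adjust colors later.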
Highlighting Quarto Fenced Divs
Figure 7: Fenced divs highlighted in Neovim.
Fenced divs are a fundamental piece of pandoc flavored markdown and pandoc filters, and thus of quarto markdown and quarto filters. When they are not highlighted, I find it less convenient to keep track of the fences, for instance in Figure 8. With the fences highlighted, it is much easier to make sure that all of the fenced divs are closed; notice how easy it is to count the fenced divs in Figure 7. If fenced divs are not closed, some frustrating and strange errors can arise in quarto.
---
title: Listing page
listing:
  - id: my-listings
    type: grid
    image-height: 256px
    sort:
      - date desc
---
The outer div _(below, starting with four colon marks)_ will specify some extra
padding to add around the listing and the text above it.
:::: { .px-5 }
Page listings will show up in the fenced div below:
::: { #my-listings}
:::
::::
```default
```
Figure 8: By default, fenced divs come with no additional highlighting and can be quite difficult to manage when many are nested and mixed in with text.
Figure 9: Using :InspectTree over a quarto fence.
Using :InspectTree as in Figure 9, we can see that TreeSitter recognizes these fences as paragraph nodes, so the query (under ~/.config/nvim/queries/markdown/highlights.scm) will be:
(
  (
    (paragraph) @_
    (#match? @_ "^:::+ *(\\{ *.* *\\})?$")
  )
  @fence.start
)

(
  (
    (paragraph) @_
    (#match? @_ "^:::+ *$")
  )
  @fence.stop
)
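Figure 10: A query that captures a line starting with a sequence of three or more colons, optionally followed by attributes in braces, as @fence.start, and a line consisting only of three or more colons as @fence.stop. Notice the similarities to the output of :InspectTree in Figure 9.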
Figure 11: Using :Inspect over a quarto fence.
Using :Inspect will show that the capture was successful, as in Figure 11. Finally, highlighting can be added using vim.api.nvim_set_hl:
vim.api.nvim_set_hl(0, "@fence", { fg = "#dc322f", italic = true })
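The two fence patterns can also be tried against sample lines before they go into highlights.scm. This is a minimal sketch built on the assumption that #match? compiles its pattern as a "very magic" Vim regular expression, which is why the patterns below carry an explicit \v prefix; the sample lines are taken from the listing in Figure 8 plus one non-fence line.

```lua
-- Sketch: try the fence patterns against sample lines with vim.regex.
-- Assumption: `#match?` treats its pattern as very magic (`\v`).
local fence_start = vim.regex([[\v^:::+ *(\{ *.* *\})?$]])
local fence_stop = vim.regex([[\v^:::+ *$]])

local samples = {
  ":::: { .px-5 }",
  "::: { #my-listings}",
  ":::",
  "text ::: text",
}

for _, line in ipairs(samples) do
  vim.print({
    line = line,
    -- match_str returns byte offsets on a match and nil otherwise.
    start = fence_start:match_str(line) ~= nil,
    stop = fence_stop:match_str(line) ~= nil,
  })
end
```

Because the attribute group in the first pattern is optional, a bare closing fence such as ::: satisfies both patterns. With a single highlight applied to @fence this makes no visible difference, but it matters if the two captures are ever styled separately.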
Highlighting Quarto Raw HTML
Figure 12: HTML with nice syntax highlighting in quarto. Since HTML usually contains injected highlighting for scripts and styling (using javascript and css respectively), the highlighting for those will occur in the HTML too. Put simply, injections can also contain injections.
When writing normal HTML blocks using markdown code blocks in quarto documents, I found that I was getting complete feedback from my LSP and fantastic highlighting, including highlighting of injected code within the HTML, e.g. highlighting and LSP feedback for CSS in style tags and javascript in script tags.
I was not getting this when using raw HTML in quarto (HTML code that should be put directly into the rendered output, where catching errors in Neovim could save me a great deal of time). When authoring with quarto, I have often found it convenient to use raw HTML blocks like the one in Figure 13.
Some `HTML` directly in quarto:

```{=html}
<p>This is some HTML that should be put directly into the quarto output.</p>
<script type="module">
  import * as live from "/js/live/index.js"
  console.log("Nested highlighting!")
  console.log("Injections in injections!")
</script>
```

```html
<h1>
  This <code>HTML</code> will be displayed directly in the quarto document as
  code and not rendered in the browser.
</h1>
```
Figure 14: Using :InspectTree at an HTML code block. On the left-hand side, we can see the TreeSitter tree.
To make this happen, I used :InspectTree as in Figure 14 (a) to see which pattern I wanted to match, and came up with the query in Figure 15. In this case, it is enough to stop here: Neovim will recognize that the content inside of {=html} code blocks should be highlighted as HTML.
;;extends

(fenced_code_block
  (info_string
    (language) @_lang)
  (code_fence_content) @injection.content
  (#eq? @_lang "=html")
  (#set! injection.language "html"))
Figure 15: An injection query for {=html} code blocks. In my case, I added this to ~/.config/nvim/queries/markdown/injections.scm. Notice the resemblance to Figure 14.
In this case, unlike what was seen in the first section, the capture name must be @injection.content, since this is how Neovim finds injected content.
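Besides :InspectTree, an injection can be confirmed from Lua by asking the parser which language it assigns to the cursor position. This is a minimal sketch, assuming the buffer is parsed as markdown (as elsewhere in this post) and a reasonably recent Neovim; language_at_cursor is a hypothetical helper name.

```lua
-- Sketch: report which language the parser assigns to the cursor position.
-- `language_at_cursor` is a hypothetical helper name, not a Neovim API.
local function language_at_cursor()
  local parser = vim.treesitter.get_parser(0, "markdown")
  -- Newer Neovim versions only parse injected languages when asked to;
  -- older versions ignore the extra argument.
  parser:parse(true)

  local cursor = vim.api.nvim_win_get_cursor(0) -- { 1-based row, 0-based col }
  local row, col = cursor[1] - 1, cursor[2]
  return parser:language_for_range({ row, col, row, col }):lang()
end

vim.print(language_at_cursor())
```

With the cursor inside a {=html} block, this should report html once the injection query is picked up, and javascript inside a script tag if the nested injection is active.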
Quarto in Quarto
Additionally, I wanted to add a background to all quarto in quarto blocks (as in Figure 7) so that they are easier to distinguish from the rest of the markup. This is easily achieved by adding the following to the corresponding highlights.scm:
(
  (fenced_code_block
    (info_string (language) @_lang)
    (#eq? @_lang "quarto")
    (code_fence_content) @quarto_in_quarto)
)
and then adding
vim.api.nvim_set_hl(0, "@quarto_in_quarto", { bg = "#7c6f64", fg = "#fbf1c7" })
to ~/.config/nvim/init.lua to highlight the background. This only puts a background behind the text, not behind the entire code fence, and it would be bad practice to pad the text with trailing whitespace (since many linters will trim it). It therefore makes sense to fill in the remaining background using virtual text, the same mechanism that is used to display diagnostics from language servers and other tools.
Extra Fun: Extending Code Block Background Highlighting Using Virtual Text
To ensure that the background highlighting extends all the way to column 80 (even on empty lines), I added the following lua to init.lua:
---@alias CodeFenceHLData {language: string, start: number, stop: number, }
---@alias CodeFenceHLOptions {hl_group: string, codefence_language: string, include_delim: boolean, hl: table}

---If a node is a fenced code block, then return the language name if it can
---be determined.
---
---@param bufnr number - Buffer number.
---@param node TSNode -- Treesitter node
---@param options CodeFenceHLOptions
---
---@return CodeFenceHLData?
---
local function get_code_fence_data(bufnr, node, options)
  if node:type() ~= "fenced_code_block" then
    return
  end

  local node_info_string = nil
  local node_code_fence_content = nil

  -- NOTE: Look for the info string.
  for child in node:iter_children() do
    local ttt = child:type()
    if ttt == "info_string" then
      node_info_string = child
    end
    if ttt == "code_fence_content" then
      node_code_fence_content = child
    end
  end

  if not node_info_string then
    -- vim.print("No info string for code block")
    return
  end
  if not node_code_fence_content then
    return
  end

  local node_language = nil
  for child in node_info_string:iter_children() do
    if child:type() == "language" then
      node_language = child
    end
  end

  if node_language == nil then
    -- print("No language for code block.")
    return
  end

  local row_start, col_start, _ = node_language:start()
  local row_end, col_end, _ = node_language:end_()
  local lines = vim.api.nvim_buf_get_lines(bufnr, row_start, row_end + 1, false)
  if not lines then
    return
  end

  local start, stop
  if options.include_delim then
    start, _, stop, _ = node:range()
  else
    start, _, stop, _ = node_code_fence_content:range()
  end

  for _, line in ipairs(lines) do
    return { language = string.sub(line, col_start + 1, col_end), start = start, stop = stop }
  end
end

---Tack on extra characters to reach `80` character of background using
---some highlight group.
---
---This is used to fill out the background of the `@fenced_code_block.quarto` capture.
---
---@param ns number - Namespace number.
---@param bufnr number - Buffer number.
---@param data CodeFenceHLData
---@param options CodeFenceHLOptions
---@return nil
---
local function append_code_fence_virtual_text(ns, bufnr, data, options)
  vim.api.nvim_buf_clear_namespace(bufnr, ns, data.start, data.stop)

  local lines = vim.api.nvim_buf_get_lines(bufnr, data.start, data.stop, false)
  for index, line in ipairs(lines) do
    local line_len = string.len(line)
    local line_remainder = 80 - line_len
    local spacer = string.rep(" ", line_remainder)

    vim.api.nvim_buf_set_extmark(bufnr, ns, data.start + index - 1, -1, {
      hl_group = options.hl_group,
      virt_text = { { spacer, options.hl_group } }, -- Extend bg
      virt_text_pos = "inline",
    })
  end
end

---Recursively look for code fences.
---
---@param ns number - Namespace number.
---@param bufnr number - Buffer number.
---@param root TSNode - Node to inspect.
---@param options CodeFenceHLOptions
---@return nil
---
local function update_code_fence(ns, bufnr, root, options)
  for node in root:iter_children() do
    local code_fence_data = get_code_fence_data(bufnr, node, options)
    if code_fence_data ~= nil then
      vim.print(code_fence_data)
      -- Only decorate fences for the configured language. Fences in other
      -- languages are skipped (rather than returning) so that the remaining
      -- siblings are still visited.
      if code_fence_data.language == options.codefence_language then
        append_code_fence_virtual_text(ns, bufnr, code_fence_data, options)
      end
    else
      update_code_fence(ns, bufnr, node, options)
    end
  end
end

---Used to make codeblocks look like full pages by adding virtual text.
---Otherwise background only appears behind text.
---
---Using `options.include_delim` will require highlighting the entire code fence.
---
---@param options CodeFenceHLOptions
---@return nil
---
local function add_codefence_virtual_text(options)
  local ns = vim.api.nvim_create_namespace("quarto_code_bg")
  local bufnr = vim.api.nvim_get_current_buf()
  local parser = vim.treesitter.get_parser(bufnr, "markdown")
  local tree = parser:parse()[1]
  local root = tree:root()

  vim.print(options)
  update_code_fence(ns, bufnr, root, options)
end

---@param options CodeFenceHLOptions
---@return nil
local function _codefence(options)
  vim.api.nvim_set_hl(0, options.hl_group, options.hl)
  vim.api.nvim_create_autocmd({ "BufEnter", "TextChanged", "TextChangedI" }, {
    pattern = "*.qmd",
    callback = function()
      return add_codefence_virtual_text(options)
    end,
  })
end

---@param options_list CodeFenceHLOptions[]
local function codefence(options_list)
  for _, options in ipairs(options_list) do
    _codefence(options)
  end
end

-- NOTE: Colors from solarized palette: https://en.wikipedia.org/wiki/Solarized
vim.api.nvim_set_hl(0, "@comment.python.quarto_metadata", { fg = "#d3869b" })
vim.api.nvim_set_hl(0, "@comment.mermaid.quarto_metadata", { fg = "#d3869b" })
vim.api.nvim_set_hl(0, "@fence", { fg = "#dc322f", italic = true })

codefence({
  {
    codefence_language = "quarto",
    hl_group = "@fenced_code_block.quarto",
    include_delim = true,
    hl = { bg = "#002b36" },
  },
  {
    codefence_language = "python",
    hl_group = "@fenced_code_block.python",
    include_delim = true,
    hl = { bg = "#073642" },
  },
  {
    codefence_language = "default",
    hl_group = "@fenced_code_block.default",
    include_delim = true,
    hl = { bg = "#002b36", fg = "#268bd2" },
  },
  {
    codefence_language = "=html",
    hl_group = "@fenced_code_block.html",
    include_delim = true,
    hl = { bg = "#002b36", fg = "#268bd2" },
  },
  {
    hl = { bg = "#002b36", fg = "#268bd2" },
    hl_group = "@fenced_code_block.mermaid",
    include_delim = true,
    codefence_language = "mermaid",
  },
})
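
The padding step in append_code_fence_virtual_text measures each line with string.len, which counts bytes. For lines containing tabs or multi-byte characters, the number of screen cells can differ from the byte count, so the background may stop short of or overshoot column 80. A minimal sketch of an alternative width calculation follows; padding_for is a hypothetical helper name, and 80 simply mirrors the width used above.

```lua
-- Sketch: compute the spacer needed to pad a line out to a target column.
-- `padding_for` is a hypothetical helper, not part of the code above.
local function padding_for(line, target)
  -- strdisplaywidth counts screen cells, so tabs and wide characters are
  -- measured as they are displayed rather than as bytes.
  local width = vim.fn.strdisplaywidth(line)
  -- Lines already wider than the target get no padding.
  return string.rep(" ", math.max(0, target - width))
end

-- Example: the spacer for a short line of code, padded out to column 80.
local spacer = padding_for("print('hi')", 80)
vim.print(#spacer)
```

The returned string could be used in place of the spacer passed to virt_text, leaving the rest of the extmark call unchanged.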