Find Org Files
After some discussions with a friend, I wrote an essay with code for how to load a file in Emacs by first selecting a tag (that you would have added to your files), and then selecting from files that have that tag. Seemed like a good idea at the time, but it dawned on me that I could take advantage of the filtering capabilities with new extensions to completing-read.
In this essay, I explain how I select Org files based on filename, title, or associated tags … or some combination of them. Once, we would have written a program to search all the files, gather the information (typically indexing function names, but in this case, I’d index each file’s title and any associated tags) and write that into a TAGS file for Emacs to read, or at least, extend GNU Global. New search programs, like ripgrep, are so fast that we can use them to search the files dynamically at run-time.
Calling my new org-find-file and typing eshell shows the following filtered file collection:
org-find-file
As a thin wrapper around find-file, my org-find-file function interactively filters the list of available files by calling the org-find-file--choose-file function:
(defun org-find-file (file)
  "Load org-specific file like `find-file'.
If called interactively, the list of files includes each Org file's
title as well as any headline tags."
  (interactive (list (org-find-file--choose-file)))
  (find-file file))
The job of org-find-file--choose-file is two-fold:
- Display a nice version of all org files
- When chosen, return the filename of the selection
The completing-read function takes a prompt as a string, and a list of possible choices (in our case, from file-choices). Since most of us use something like Ivy, Vertico, or Selectrum, we can filter the list until we narrow it down to our selection.
(completing-read "Choose a fruit: " '("apple" "orange" "banana" "other"))
Shows this after typing an:
The completing-read function can also accept alists, obarrays, hash tables, etc.
(let ((alist '(("red" . "apple")
               ("orange" . "orange")
               ("yellow" . "banana"))))
  (completing-read "Choose a fruit: " alist))
In this example, completing-read shows the colors, and when I select yellow, the function returns yellow. But I could use that selection to look up the other value:
(let* ((alist '(("red" . "apple")
                ("orange" . "orange")
                ("yellow" . "banana")))
       (choice (completing-read "Choose a fruit: " alist)))
  (alist-get choice alist nil nil 'equal))
When running this code, selecting yellow returns banana.
So org-find-file--choose-file needs to create an associative list, like:
(
 ;; ...
 ("eshell-why.org : Why use EShell?" . "eshell-why.org")
 ("eshell.org : Introduction to Emacs Shell :emacs :technical" . "eshell.org")
 ("eshell-fun.org : Eschewing Zshell for Emacs Shell :emacs :technical" . "eshell-fun.org")
 ("piper-presentation.org : Death to the Shell :emacs :presentation :eshell" . "piper-presentation.org")
 ("eshell-present.org : Presenting the Eshell :technical :shell :emacs :presentation" . "eshell-present.org")
 ("eshell-presentation.org : Presenting the EShell :technical :shell :emacs :presentation" . "eshell-presentation.org")
 ("eshell-present-and-notes.org : Presenting Eshell :technical :shell :emacs :presentation :noexport" . "eshell-present-and-notes.org")
 ;; ...
 )
Where to select the org files?
- If given a directory, look for files there
- If in a project (project-current returns non-nil), start from the top of that project
- Otherwise, look in the current directory
With the default-directory set, we get a list of the files from org-find-file--file-choices, and assign it to a local variable, file-choices:
(defun org-find-file--choose-file (&optional directory)
  "Use `completing-read' to present Org files for selection.
Search DIRECTORY if given, else the current project's root, else
the current directory.  Acquires the list of files (and their
descriptive text) from calling `org-find-file--file-choices'
(which returns an alist)."
  (let* ((default-directory
          (cond (directory directory)
                ((project-current) (project-root (project-current)))
                (t default-directory)))
         (file-choices (org-find-file--file-choices))
         (chosen-file (completing-read "File: " file-choices)))
    ;; Map the displayed label back to its filename.
    (alist-get chosen-file file-choices nil nil 'equal)))
Creating the AList
The nicely displayed list of org files is a combination of filename, the title, and the tags, so I create two functions for this:
- org-find-file--gather-titles returns a list of (filename title) entries
- org-find-file--gather-tags returns a hash table where the key is the filename, and the value is a list of tags
To smash them, er, format them, I call seq-map with a λ that conses together a pretty title (from org-find-file--file-format) and the filename:
(defun org-find-file--file-choices ()
  "Return alist of file _labels_ and the file references."
  (let ((titles (org-find-file--gather-titles))
        (tags   (org-find-file--gather-tags)))
    (seq-map (lambda (entry)
               (seq-let (file title) entry
                 (cons (org-find-file--file-format file title (gethash file tags))
                       file)))
             titles)))
The pretty (and descriptive) filename comes from this function:
(defun org-find-file--file-format (file title tags)
  "Return a nicely formatted string containing the parameters."
  (let* ((title-color `(:foreground ,(face-attribute 'org-document-title :foreground)))
         (title-str (string-trim title))
         (title-pretty (propertize title-str 'face title-color))
         (tag-str (string-join tags " ")))   ; TAGS is a list of strings
    (format "%s : %s %s" file title-pretty tag-str)))
Note the use of propertize to distinguish the title from both the filename and the tags.
Gathering the Titles
At this point, I need to get the org-file’s titles and tags, and with quick grep-replacements, like ripgrep, I can get the titles of all files using a call like:
rg --ignore-case --no-heading --no-line-number "#\+title:"
Which can return something like:
...
Technical/Learning/index.org:#+TITLE: Teaching Programming to Middle Schoolers
Technical/Learning/python.org:#+TITLE: Programming with Python
Technical/Learning/Python/index.org:#+TITLE: Learning Python
Technical/index.org:#+TITLE: 1
Technical/Python/new-project.org:#+TITLE: New Projects in Python
Technical/Learning/java.org:#+TITLE: Learning Java
Technical/OpenStack/using-heat-templates.org:#+TITLE: Using Heat Templates
README.org:#+title: My Website
...
The org-find-file--gather-titles function calls rg, splits each line on the : character, and returns a list of the filename and the title:
(defun org-find-file--gather-titles ()
  "Return a list of (FILE TITLE) entries, one per Org file with a title.
Uses `--map' from dash.el."
  (thread-last "rg --ignore-case --no-heading --no-line-number '^#\\+title:'"
               (shell-command-to-list)
               (--map (split-string it ":"))
               (--map (list (nth 0 it) (nth 2 it)))))
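One caveat with splitting on every colon: a title that itself contains a colon (say, "Emacs: A Review") loses everything after it. Here is a minimal sketch of an alternative, assuming the file:#+title: text line shape shown above and a file path without colons. The name my/org-title-line-parse is hypothetical, not part of the original code:

```elisp
(require 'subr-x)  ; for `string-trim' on older Emacs versions

;; Hypothetical helper (not from the original code): parse one line of
;; rg output shaped like "path/file.org:#+TITLE: Some Title".
(defun my/org-title-line-parse (line)
  "Return (FILE TITLE) parsed from LINE, or nil if LINE does not match.
Unlike splitting on every colon, this keeps colons inside the title."
  (let ((case-fold-search t))            ; accept #+TITLE: and #+title:
    (when (string-match "\\`\\([^:]+\\):#\\+title:[ \t]*\\(.*\\)\\'" line)
      (list (match-string 1 line)
            (string-trim (match-string 2 line))))))
```

Mapping this over the output of shell-command-to-list (and dropping any nil results) would give the same list shape that org-find-file--file-choices expects.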
I call rg using shell-command-to-string, but I need the results as a list of strings, split on newline characters:
(defun shell-command-to-list (command) "Return call to COMMAND as a list of strings for each line." (thread-first command shell-command-to-string (string-lines t)))
Gathering the Tags
Getting all the tags for the files is a bit more complicated.
Brief Review of Org Tags
Org surprised me with how easy it is to add tags, but how little searchability it provides. I mean, you can display the headlines that match a tag (using org-tags-view) and have the agenda limit its display of TODO items (with org-agenda-filter-by-tag).
You associate tags with a headline, like:
* A headline about Something :foo:bar:bird:
This has three tags: foo, bar, and bird.
Like everything in Org, you can type those tags, or you can call the function org-set-tags-command (bound to C-c C-q by default).
You can also list tags at the file level, by adding something like this in your buffer:
#+TAGS: emacs technical
You’ll notice that on a headline, you surround tags with colons, but on the #+TAGS: line, you separate them with spaces. This becomes important later as we search for both types.
A feature of Org tags to consider is that headlines inherit tags from parent headlines. For instance:
* Top-level Headline :foo:
** Sub-level Headline :bar:
*** Interesting Headline :baz:
The Interesting Headline has three tags: foo, bar, and baz. Because of this, ripgrep (or any other line-oriented search tool) isn’t as effective at pinpointing a headline with a particular tag. Since my goal is to open a file based on a tag anywhere in the document, this won’t influence the design.
Regular Expression for Tags
While it wouldn’t take much to craft a regular expression to parse the tags from a headline, Org already supplies org-tag-line-re. But as an Emacs-oriented regexp (obviously), we need to convert it before passing it over to ripgrep. The pcre2el project can convert this, like:
(format "rg --no-heading '%s' %s" (rxt-elisp-to-pcre org-tag-line-re) project-dir)
The org-tag-line-re works for headline tags, but not for tags affecting all headlines in a file. I thought this could work:
(rx (or (regexp org-tag-line-re) (seq line-start "#+tags:" (one-or-more space) (group (one-or-more (any alnum "@" "_" space))))))
Since both our regular expression and org-tag-line-re start with line-start (i.e. ^), the or clause (i.e. |) fails to match. In other words, we need to expand this combination to create our own:
(defvar org-find-files-tag-line-re
  (rx line-start
      (or (seq (one-or-more "*") " " (+? any) ":"
               (group (one-or-more (any alnum "@_#%:"))) ":")
          (seq "#+tags:" (one-or-more space)
               (group (one-or-more (any alnum "@_#%" space)))))
      line-end)
  "Regular expression that matches either headline or global file tags.")
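Before handing the pattern to rg, it can be checked inside Emacs with string-match. The sketch below uses a local copy of the expression under a hypothetical name (my/tag-line-re) so that it runs on its own; my/tag-line-tags is likewise an illustrative helper, not part of the original code:

```elisp
(require 'rx)

;; Local copy of the pattern above, under a hypothetical name, so this
;; snippet is self-contained.
(defvar my/tag-line-re
  (rx line-start
      (or (seq (one-or-more "*") " " (+? any) ":"
               (group (one-or-more (any alnum "@_#%:"))) ":")
          (seq "#+tags:" (one-or-more space)
               (group (one-or-more (any alnum "@_#%" space)))))
      line-end))

(defun my/tag-line-tags (line)
  "Return the raw tag text matched in LINE, or nil when LINE has no tags."
  (let ((case-fold-search t))            ; match #+TAGS: as well as #+tags:
    (when (string-match my/tag-line-re line)
      ;; Group 1 holds headline tags, group 2 the #+TAGS: keyword tags.
      (or (match-string 1 line) (match-string 2 line)))))
```

A headline yields its colon-separated tags, a #+TAGS: line yields its space-separated tags, and a headline without tags yields nil. That is why the tag text still needs to be split on both colons and spaces afterwards.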
I’m glad to use the rx macro to make the regular expression more readable.
Code to Acquire the Tags
Using this regular expression to search for tags leads to a problem, as the following is possible:
focused-work.org:#+TAGS: emacs hamacs
focused-work.org:* Timers :noexport:
focused-work.org:* Technical Artifacts :noexport:
literate-database.org:#+tags: emacs technical
...
Here, a single file could have more than one entry, with the same tag repeated on different headlines.
The function org-find-file--gather-tags calls rg with a converted version of org-find-files-tag-line-re, which shows me all the filenames and tags, but I need to merge them. My solution was to use a hash table, where I could append the new tags (found on the current line) to any tags found earlier:
(defun org-find-file--gather-tags ()
  "Return hash-table of key as filename, and values are tags.
Note that the tags are _all_ tags in the file."
  (let ((results (make-hash-table :test 'equal))
        (tag-list (thread-last
                    (format "rg --ignore-case --no-heading --no-line-number '%s'"
                            (rxt-elisp-to-pcre org-find-files-tag-line-re))
                    (shell-command-to-list)
                    (--map (split-string it ":")))))
    (dolist (entry tag-list)
      ;; A headline like ":foo:bar:" splits into several fields, so
      ;; collect everything after the second field, not just the third.
      (seq-let (file _ &rest tag-fields) entry
        (let ((prev-tags (gethash file results))
              (new-tags (org-find-file--massage-tags
                         (string-join tag-fields ":"))))
          (puthash file (seq-union prev-tags new-tags) results))))
    results))
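The merging step can be tried on its own, without running rg. This is a minimal sketch, assuming each parsed line has already been reduced to a (file . tags) pair. The name my/merge-tags and the sample entries are hypothetical, and seq-union needs Emacs 28 or later:

```elisp
(require 'seq)

;; Hypothetical helper: ENTRIES stands in for parsed rg output, one
;; (FILE . LIST-OF-TAGS) pair per matching line.
(defun my/merge-tags (entries)
  "Return a hash table mapping each file in ENTRIES to its merged tags.
ENTRIES is a list of (FILE . TAGS) pairs; a file may appear many times."
  (let ((results (make-hash-table :test 'equal)))
    (dolist (entry entries results)
      (let ((file (car entry))
            (tags (cdr entry)))
        ;; Add only the tags not already recorded for this file.
        (puthash file
                 (seq-union (gethash file results) tags)
                 results)))))
```

For example, feeding it ("focused-work.org" ":emacs" ":hamacs") followed by two ("focused-work.org" ":noexport") entries leaves that file with three distinct tags, since the repeated :noexport is only recorded once.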
This function uses the following helper function to convert the tag portion of each rg output line into a list of tags:
(defun org-find-file--massage-tags (tag-string)
  "Return TAG-STRING as a list of tags.
For instance, the string: foo:bar -> '(\":foo\" \":bar\")"
  (let* ((tag-separators (rx (1+ (any space ":"))))
         (tag-list (split-string tag-string tag-separators t)))
    (--map (concat ":" it) tag-list)))
This last function is purely functional, and so easy to test:
(ert-deftest org-find-file--massage-tags-test ()
  (should (equal (org-find-file--massage-tags "foo") '(":foo")))
  (should (equal (org-find-file--massage-tags "foo bar") '(":foo" ":bar")))
  (should (equal (org-find-file--massage-tags "foo:bar") '(":foo" ":bar")))
  (should (equal (org-find-file--massage-tags " foo ") '(":foo"))))
There we go. That is everything needed for a search function to list org files, allowing you to select by filename, words in its title, or even tags. If you find this idea interesting, grab the source code.