Generating Emacs Autoload Files – Fast Barebone Style

The code to accompany this article can be found here.

To speed things up, Emacs has some tricks to delay loading code until it’s really needed. Autoloading is one of those. Certain functions can be marked as autoloaded, and for those Emacs creates just a stub that tells it where to find the code when the function is first requested.
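To make the stub idea concrete, here is a minimal sketch. The names my-demo-command and my-demo-file are hypothetical and exist only for illustration; the sketch only registers the stub and never triggers the actual load.

```elisp
;; Hypothetical names: `my-demo-command' is assumed to live in my-demo-file.el.
;; In that source file, a magic comment (the "autoload cookie") marks the
;; definition:
;;
;;   ;;;###autoload
;;   (defun my-demo-command () (interactive) (message "loaded"))
;;
;; A generated autoloads file then contains a stub like this one:
(autoload 'my-demo-command "my-demo-file" "Hypothetical demo command." t)

;; The symbol is now fbound, but its definition is only an autoload object.
(fboundp 'my-demo-command)                     ; non-nil
(autoloadp (symbol-function 'my-demo-command)) ; non-nil until the first call
```

Calling the command for the first time would make Emacs look for my-demo-file on load-path, load it, and then run the real definition.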

This mechanism is used extensively together with Emacs packages to load them on demand. When a package is installed, Emacs scrapes through all .el files in that package and collects all autoloaded symbols into a special file called *-autoloads.el. The anatomy of this file is quite simple: a statement that adds the package directory to Emacs load-path, followed by all the autoload statements. There is one such file for each installed package, and those files get loaded when Emacs starts.

It turned out to be a bit slow to iterate through each directory and load those files on start, so package-quickstart.el was born. It is nothing but all those autoload files concatenated into one big file. Instead of iterating through each directory and loading many small files, Emacs loads just one big file when it starts, which is a big speedup.

While I think that package-quickstart is a good idea, there is a small part I dislike about it. Load-path gets populated with long path strings, one per package directory, and because it is populated with add-to-list, it spends lots of CPU cycles on completely unnecessary calculations. For example, I have ~200 packages installed. Since add-to-list checks whether an element is already present in the list before adding it, adding N package directories costs on the order of N(N-1)/2 string comparisons among the package paths alone, each and every time we start Emacs. The speed is not the problem; what I dislike is spending CPU cycles, which translates to battery life, on calculations which we know are completely unnecessary and which we can quite easily skip.
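As a rough illustration of that cost, the sketch below counts the comparisons add-to-list performs when N distinct strings are added to an initially empty list. The variable my-al-demo-path and the function my-al-demo-count-comparisons are hypothetical names made up for this example; the counting is done by passing a custom comparison function as the optional fourth argument of add-to-list. A real load-path already contains the built-in Lisp directories, so the real number would be somewhat higher.

```elisp
(defvar my-al-demo-path nil
  "Hypothetical stand-in for `load-path', used only in this demo.")

(defun my-al-demo-count-comparisons (n)
  "Return how many comparisons `add-to-list' makes adding N distinct strings."
  (let ((count 0))
    (setq my-al-demo-path nil)
    (dotimes (i n)
      ;; The custom compare function counts every membership test.
      (add-to-list 'my-al-demo-path
                   (format "/elpa/pkg-%d" i)
                   nil
                   (lambda (a b)
                     (setq count (1+ count))
                     (string= a b))))
    count))

(my-al-demo-count-comparisons 200)
;; => 19900, i.e. 0 + 1 + ... + 199 = 200 * 199 / 2
```

Emitting the whole list in a single form avoids these membership checks entirely, since nothing has to be searched.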

My second disagreement with package-quickstart is not really a disagreement; it just does a bit more than I need it to. It takes in the documentation for symbols, which in itself is a good thing, since we can look things up before they are loaded. The problem is how people write their docs. Some people seem to put entire novels in their symbol documentation. Docs make package-quickstart.el really blow up in size. Mine is ~800 kb. I personally use helpful to look up help for symbols, which looks up the source code for a function anyway, so I can live without docs in autoload stubs.

Finally, package-quickstart also adds info paths to Emacs, so you can read info manuals for external packages in your Emacs as well. This is by no means a bad thing, but very few packages seem to come with info files, and I haven’t found myself reading those anyway, so I personally can live without those too.

Now, I don’t think that package-quickstart is badly designed. Considering the rest of the design it builds on, it does things as well as it can. Even the first point is not really something quickstart can do anything about. It is the way package management is designed, and I personally don’t think they could do much differently, to be honest. It is just that my personal needs are slightly more modest, and I am more specific in what I want from my Emacs, so I can get rid of some generality I know I won’t use anyway.

What I did before, to suit package-quickstart.el more to my taste, was to programmatically remove some statements from package-quickstart.el. That works, but it is error prone and feels unreliable. I am also a bit annoyed with the speed of package-quickstart.el generation. It takes a while, and it takes a while for my program which generates my init file to process it. For a while, I have been doing things slightly differently: I am generating my own quickstart file, or rather, autoloads file. Emacs has a handy function, update-file-autoloads, which can be called per file, and which can also emit all extracted autoloads into a file. So I had something like this:

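(let ((al (expand-file-name "autoloads.el" user-emacs-directory))
      (pkgdir (expand-file-name "elpa/" user-emacs-directory))
      (autoload-modified-buffers nil))
  ;; Regenerate only when the autoloads file does not exist yet.
  (unless (file-exists-p al)
    ;; `verbose' is not a built-in; it is a logging helper from my init.
    (verbose "Generating autoloads: %s" al)
    ;; Scrape every .el file under elpa/ and write its autoloads to AL.
    (dolist (src (directory-files-recursively pkgdir "\\.el$" nil t))
      (update-file-autoloads src t al))))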

What happens here is that I just collect all .el files recursively in the elpa directory and generate autoloads into the file ~/.emacs.d/autoloads.el.

So far so good, but I also have a directory called “lisp” in my .emacs.d, where I put code I write myself, or files I downloaded from the web that do not belong to any package, like from Emacs Wiki or so. I used to generate autoloads for this directory with

(package-generate-autoloads "lisp" lisp-dir)

but that places the autoloads into the “lisp” directory, and I had to cut and paste them programmatically into my init file. My whole emit-autoloads routine looked like this:

(defun emit-autoloads ()
  (message "Emitting autoloads")
  (let* ((al "autoloads.el")
         (lisp-dir (expand-file-name "lisp/" user-emacs-directory))
         (ll (expand-file-name "lisp-autoloads.el" lisp-dir))
         (pq (expand-file-name "package-quickstart.el" user-emacs-directory)))
    (when (file-exists-p pq)
      (verbose "removing %s" pq)
      (delete-file pq)
      (package-quickstart-refresh))
    (require 'package)
    (package-generate-autoloads "lisp" lisp-dir)
    (with-temp-file al
      (when (file-exists-p pq)
        (verbose "Baking package quickstart")
        (insert-file-contents pq))
      (goto-char (point-min))
      (kill-line 2)
      (insert "(defvar package-activated-list nil)")
      (goto-char (point-max))
      (insert "(package-activate-all)")
      (kill-line -4)
      (goto-char (point-max))
      (when (file-exists-p ll)
        (verbose "Baking site autoloads file %s" ll)
        (insert-file-contents ll))
      (goto-char (point-min))
      (while (not (eobp))
        (when (re-search-forward "^(add-to-list" (line-end-position) t)
          (beginning-of-line)
          (kill-line 2))
        (beginning-of-line)
        (forward-line 1)))))

Horrible, isn’t it? I never liked it, but it did the job.

So I wanted to simplify that, and rewrote the entire routine:

(defun emit-autoloads ()
  (message "Emitting autoloads")
  (let ((al (expand-file-name "autoloads.el" user-emacs-directory))
        (pkgdir (expand-file-name "elpa/" user-emacs-directory))
        (autoload-modified-buffers nil))
    (unless (file-exists-p al)
      (verbose "Generating autoloads: %s" al)
      (dolist (src (directory-files-recursively pkgdir "\\.el$" nil t))
        (update-file-autoloads src t al))
      (dolist (src (directory-files-recursively init-file-lisp-directory "\\.el$" nil t))
        (autoload-generate-file-autoloads src nil al)
        (autoload-save-buffers))
      (with-temp-file al
        (insert-file-contents al)
        (goto-char (point-min))
        (kill-line 2)
        (insert "(defvar package-activated-list nil)")
        (goto-char (point-max))
        (kill-line -4)
        (insert "(package-activate-all)")))))

Still not something I would take to a CS class, but a little bit better. But now the horror: when update-file-autoloads finishes, all file names in the generated autoloads for files in the “lisp” directory are prefixed with lisp/. I have “~/.emacs.d/lisp” in load-path, so “term-toggle” would be found, but not “lisp/term-toggle”. I would have to put my .emacs.d in load-path for this to work, which I don’t want. So I dug in and used the autoload-generate-file-autoloads routine. As a bonus, I spent quite some time in Emacs code until I realized that I have to call another routine to actually flush the generated autoloads out to a file: autoload-save-buffers. Hmm …? Oh well, I guess that code wasn’t meant to be used by end-users like me.

Unfortunately, that still left the generated autoloads prefixed with the name of the directory they were in. Also, frankly, the autoload file generation was so horrrrrrrribly slow. I had to sit and wait for autoloads for the entire elpa directory to be generated.

Fed up with failing to easily generate autoloads the way I prefer, I wrote my own program to generate them. How difficult was it? Actually, I am surprised by both the simplicity and the speed. It is just a fraction of the size of the autoload-generate-file-autoloads routine, which is the one that does most of the work in autoloads generation. It also does only a fraction of the job that the Emacs routine does :).

My strategy was to scrape each file for autoload cookies and put the corresponding symbol into a hash table. When all files are processed, I simply iterate over the hash table and write an autoload statement for each symbol to the autoloads file. The heavy lifting is done by directory-files-recursively, which is responsible for collecting all source files:

(defun al--get-sources (dir-tree-or-dir-tree-list)
  "Return all .el files under one directory tree or under a list of trees."
  (let (srcs)
    (if (listp dir-tree-or-dir-tree-list)
        (dolist (dir-tree dir-tree-or-dir-tree-list)
          (setq srcs (nconc srcs (directory-files-recursively dir-tree "\\.el$"))))
      (setq srcs (directory-files-recursively dir-tree-or-dir-tree-list "\\.el$")))
    srcs))

The only tricky part was to emit the correct thing into the autoload statement. It was tricky because, as I learned, people put entire programs in autoloads. Check, for example, the anaphora package. The author has put an entire program to be executed as an autoload! To be honest, I don’t know, maybe it is a really clever usage of autoload, but I have a feeling that autoload isn’t designed for that purpose. If you have to put an entire program in an autoload statement, then put it in your source file and tell me it has to be required; don’t masquerade an entire program as an autoloaded function. If it wasn’t for the fact that people put weird stuff into autoloads, it would be straightforward to scrape for autoloads: re-search-forward for the autoload cookie and read the next s-expression. However, due to the mentioned trickery I ended up with this:

(defun al-quoted (sym)
  (if (and (consp sym) (eq (car sym) 'quote))
      sym `(quote ,sym)))

(defun al-collect-autoloads (src index)
  (let (sxp sym)
    (with-current-buffer (get-buffer-create "*al-buffer*")
      (erase-buffer)
      (insert-file-contents src)
      (goto-char (point-min))
      (while (re-search-forward "^;;;###autoload" nil t)
        (setq sxp nil sym nil)
        (setq sxp (ignore-errors (read (current-buffer))))
        (when (listp sxp)
          (setq sym (al-quoted (cadr sxp)))
          (unless (listp (cadr sym))
            (puthash sym index al-amap)))))))

The extra checks ensure that we really have a simple symbol, not a complex s-expression. Obviously I simply ignore anything that isn’t just a function or variable name, which leads to the loss of some autoloaded symbols, for better or worse. I also don’t take the docs, nor do I care to tell Emacs that the autoloaded symbol is a function. I just record the bare minimum, the symbol name and the file name to load:
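The filtering described above can be tried in isolation. The following is a minimal standalone sketch, not the code from the repository: my-al-demo-scrape is a hypothetical helper that works on a string instead of a file and returns the plain symbols that follow autoload cookies, skipping anything more complex.

```elisp
(defun my-al-demo-scrape (text)
  "Return the simple symbols that follow autoload cookies in TEXT."
  (let (found)
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (while (re-search-forward "^;;;###autoload" nil t)
        ;; Read the form after the cookie; unreadable text yields nil.
        (let* ((form (ignore-errors (read (current-buffer))))
               (name (car-safe (cdr-safe form))))
          ;; Unwrap (quote foo), as produced by e.g. (defalias 'foo ...).
          (when (eq (car-safe name) 'quote)
            (setq name (car-safe (cdr name))))
          ;; Keep only plain symbols; skip arbitrary autoloaded forms.
          (when (and name (symbolp name))
            (push name found)))))
    (nreverse found)))

(my-al-demo-scrape
 (concat ";;;###autoload\n(defun my-foo () nil)\n"
         ";;;###autoload\n(defalias 'my-bar #'ignore)\n"
         ";;;###autoload\n(progn (message \"side effect\"))\n"))
;; => (my-foo my-bar)
```

The third cookie marks a whole progn form, so its second element is a list rather than a name, and it is dropped.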

(defun al-write-autoloads (tofile)
  (with-temp-file tofile
    (maphash (lambda (symbol index)
               (let ((path (file-name-nondirectory (gethash index al-lmap))))
                 (prin1 `(autoload ,symbol ,path)
                        (current-buffer))
                 (insert "\n"))) al-amap)))
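To show what one emitted line looks like, here is a small sketch under my own assumptions. The helper my-al-demo-stub is hypothetical. Unlike the routine above, it also strips the .el extension, so that load can pick up a byte-compiled file when one exists, and it binds print-quoted so the quote is printed as 'symbol regardless of the Emacs version's default.

```elisp
(defun my-al-demo-stub (symbol source-file)
  "Return the text of a minimal autoload form for SYMBOL in SOURCE-FILE."
  (let ((print-quoted t)                ; print (quote foo) as 'foo
        ;; Drop both the directory part and the extension of the file name.
        (feature (file-name-sans-extension
                  (file-name-nondirectory source-file))))
    (prin1-to-string `(autoload ',symbol ,feature))))

(my-al-demo-stub 'term-toggle "~/.emacs.d/lisp/term-toggle.el")
;; => "(autoload 'term-toggle \"term-toggle\")"
```

Because the directory part is dropped, no lisp/ prefix ends up in the file name, and the file is found as long as its directory is on load-path.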

That is pretty much it. Of course there is a small entry routine to kick things off, but all in all, it is less than 50 lines of code altogether. Somewhat simplistic, but thus far I am happy with it. I have used it for a day of normal usage, and I miss nothing. The size of my autoloads file is ~98kb, compared to ~780kb for package-quickstart.el. It is about 8x smaller. Init time has also decreased somewhat. Not by much: it used to be ~0.22 seconds, and it is now ~0.18 seconds. What I am really happy about is the generation time. It is so fast that the first time I ran the function I thought nothing was generated. Compared to what I used to do:

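;; Before: one update-file-autoloads call per file.
(dolist (src (directory-files-recursively pkgdir "\\.el$" nil t))
  (update-file-autoloads src t al))
;; Now: a single call over all directory trees.
(al-gen-autoloads '("~/.emacs.d/elpa" "~/.emacs.d/lisp"))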

it feels like it is flying. When benchmarked with benchmark-run, it says ~0.5 seconds. I used to sit and wait for the old one to finish, for what seemed like forever.
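For anyone who wants to repeat such a measurement, here is a minimal sketch of how benchmark-run can be wrapped. The helper my-al-demo-time is a hypothetical name; benchmark-run returns a list whose first element is the elapsed time in seconds, followed by garbage collection statistics.

```elisp
(require 'benchmark)

(defun my-al-demo-time (thunk)
  "Call THUNK once and return the elapsed wall-clock time in seconds."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED); keep ELAPSED.
  (car (benchmark-run 1 (funcall thunk))))

;; Runnable example with a trivial workload:
(my-al-demo-time (lambda () (dotimes (i 1000) (format "%d" i))))

;; Timing the generator from this article would look like this:
;; (my-al-demo-time
;;  (lambda () (al-gen-autoloads '("~/.emacs.d/elpa" "~/.emacs.d/lisp"))))
```

A single run includes the cost of reading files that may not be in the disk cache yet, so repeated runs can report noticeably lower numbers.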

In regards to package management, I could also hack package.el to generate simplified autoload files at installation time too, or to not generate them at all, but I haven’t done that yet.

Since I can generate one autoloads file for many directory trees, I can easily add some other repo directory which has nothing to do with the built-in package management, for example my ~/repos path where I clone git repositories I am interested in hacking on. It would need a tad more work to also emit load paths for the directories in there, but since they are recorded in a hash map anyway, it shouldn’t be hard to do; it is just a matter of writing them out to a file.
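One possible way to emit those load paths is sketched below. It is not part of the published code: my-al-demo-load-path-form is a hypothetical helper and the example paths are made up. It derives the unique directories from a list of source files and builds a single form that prepends them all to load-path at once, without any add-to-list calls.

```elisp
(defun my-al-demo-load-path-form (source-files)
  "Return a form that prepends the directories of SOURCE-FILES to `load-path'."
  (let ((dirs (delete-dups
               (mapcar (lambda (file)
                         ;; "/a/b/c.el" -> "/a/b/" -> "/a/b"
                         (directory-file-name (file-name-directory file)))
                       source-files))))
    `(setq load-path (append ',dirs load-path))))

(my-al-demo-load-path-form
 '("/home/me/repos/foo/foo.el"
   "/home/me/repos/foo/foo-utils.el"
   "/home/me/repos/bar/bar.el"))
;; => (setq load-path
;;          (append '("/home/me/repos/foo" "/home/me/repos/bar") load-path))
```

The resulting form could be written to the generated file with prin1, the same way the autoload statements are written.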

Finally, should I recommend this? I do not. This is really a bare-bones version of autoloading that works for me, but there are probably many use cases where it wouldn’t work. Info files are completely missing. Package-quickstart also adds paths to info files, so you have that documentation in your Emacs too. This does not, so don’t use it if you are used to reading info docs for external packages.

If you would still like to try this, to experiment with it or improve it, the code can be found on my github. A thing to note is that the generated file is probably not usable on its own. If you would like to replace package-quickstart.el with this, you will have to add some additional code to activate packages and to emit load paths. I concatenate the autoloads with my init file, where I also add the statement to activate packages, and I emit the entire precomputed load-path to my early-init.el. So for me, I just need the autoload statements scraped to one place, but for you it might not be enough. I do hope that you enjoyed this little experiment as a sort of Emacs entertainment at least :).
