10

Does elisp provide a builtin efficient set data structure, similar to set in Python and std::set in C++?

xuhdev
  • 1,839
  • 13
  • 26
  • 1
    Hash tables can be used for that purpose, in fact that's how sets in Ruby work under the hood. – wasamasa Nov 28 '16 at 08:09
  • @wasamasa Do you mean using hash tables with all values set to nil? – xuhdev Nov 28 '16 at 08:12
  • 1
    @xhudev No, with all values set to `t`. That makes it more convenient to check for set inclusion than with `nil` as value. – wasamasa Nov 28 '16 at 08:16
  • Specify more clearly what properties you want of the data structure. Sets can be implemented in various ways, depending on which of their properties are considered most important in a particular context. There is no single implementation that is best for all uses. – Drew Nov 28 '16 at 15:30

2 Answers2

11

The standard way to handle sets in lisps is to use hash tables. In fact, this is how sets are implemented in languages which do provide a dedicated interface (like Python). This offers O(1) access cost, so is perfect for larger sets.

E.g.:

(defun make-set ()
  (make-hash-table))
(defun add-to-set (element set)
  (puthash element t set))
(defun in-set-p (element set)
  (gethash element set nil))
(defun remove-from-set (element set)
  (remhash element set))

Set-level operations, like union and intersection, require more work.

PS. When I said "lisps" above I kinda lied. Common Lisp actually offers list-based set-level functions ("Lists as Sets"), but these have access complexity O(N) and set-level complexity O(N^2), so they are only suitable for smaller sets.

sds
  • 5,928
  • 20
  • 39
11

There are many different ways to implement sets, with very different efficiency profiles. In Elisp, the two simplest ways to implement sets are probably:

  • As a list: This tends to have poor algorithmic efficiency (such as O(N) complexity to test membership) but works well for small sets. Adding an element can be as simple as push or cl-pushnew, and testing is just memq or member.

  • As a hash table: This has a higher constant cost, but a better algorithmic efficiency, so it works much better for larger sets. Adding an element is just (puthash VAL t SET) and testing membership is just (gethash VAL SET).

Stefan
  • 26,154
  • 3
  • 46
  • 84