Action framework

The Dragonfly library contains an action framework which offers easy and flexible interfaces to common actions, such as emulating keystrokes. These action types are implemented as first-class Python objects, so they can be stored, passed around, and combined like any other object.

Perhaps the most important method of Dragonfly’s actions is dragonfly.actions.action_base.ActionBase.execute(), which performs the actual event associated with the action.

Dragonfly’s action types are derived from the dragonfly.actions.action_base.ActionBase class. This base class implements standard action behavior, such as the ability to concatenate multiple actions and to duplicate an action.
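
The concatenation and duplication behavior described above can be illustrated with a small stand-alone model. This is only a sketch and not Dragonfly's implementation: the class names SketchAction and SketchSeries are hypothetical, and executing an action merely appends a label to a list instead of sending any input.

```python
import copy


class SketchAction:
    """Hypothetical stand-in for an action; NOT Dragonfly's ActionBase."""

    def __init__(self, label):
        self.label = label

    def execute(self, log):
        # A real action would send keystrokes; this one only records its label.
        log.append(self.label)

    def __add__(self, other):
        # a + b builds a new series and leaves both operands unchanged.
        return SketchSeries([self, other])

    def copy(self):
        # Duplicate the action so the copy can be changed independently.
        return copy.deepcopy(self)


class SketchSeries(SketchAction):
    """Hypothetical container that executes its members in order."""

    def __init__(self, actions):
        self.actions = list(actions)

    def execute(self, log):
        for action in self.actions:
            action.execute(log)

    def __add__(self, other):
        return SketchSeries(self.actions + [other])

    def __iadd__(self, other):
        # a += b extends the existing series in place.
        self.actions.append(other)
        return self


a1 = SketchAction("keys")
a2 = SketchAction("text")
a4 = a1 + a2                      # Concatenation of a1 and a2.
a4 += SketchAction("more keys")   # Append a third action.

log = []
a4.execute(log)                   # Runs the three members in order.
```

Modelling a concatenation as its own action type means a combined action can be executed, copied, or extended exactly like a single one.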

Basic examples

The code below shows the basic usage of Dragonfly action objects. They can be created, combined, executed, etc.

from dragonfly.all import Key, Text

a1 = Key("up, left, down, right")   # Define action a1.
a1.execute()                        # Send the keystrokes.

a2 = Text("Hello world!")           # Define action a2, which
                                    #  will type the text.
a2.execute()                        # Send the keystrokes.

a4 = a1 + a2                        # a4 is now the concatenation
                                    #  of a1 and a2.
a4.execute()                        # Send the keystrokes.

a3 = Key("a-f, down/25:4")          # Press alt-f and then down 4 times
                                    #  with 25/100 s pause in between.
a4 += a3                            # a4 is now the concatenation
                                    #  of a1, a2, and a3.
a4.execute()                        # Send the keystrokes.

Key("w-b, right/25:5").execute()    # Define and execute together.

Combining voice commands and actions

A common use of Dragonfly is to control other applications by voice and to automate common desktop activities. To do this, voice commands can be associated with actions. When the command is spoken, the action is executed. Dragonfly’s action framework allows for easy definition of things to do, such as text input and sending keystrokes. It also allows these things to be dynamically coupled to voice commands, so as to enable the actions to contain dynamic elements from the recognized command.

An example would be a voice command to find some bit of text:

  • Command specification: please find <text>
  • Associated action: Key("c-f") + Text("%(text)s")
  • Special element: Dictation("text")

This triplet would allow the user to say “please find some words”, which would result in control-f being pressed to open the Find dialog, followed by “some words” being typed into that dialog. The special element is necessary to define what the dynamic element “text” is.
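
The placeholder in Text("%(text)s") follows Python's dictionary-based string formatting. The sketch below shows only that substitution step in plain Python. The extras dictionary is a hypothetical stand-in for the data a recognition would supply, and no Dragonfly code is involved.

```python
# Hypothetical data extracted from the utterance "please find some words".
extras = {"text": "some words"}

# The action specification contains a named placeholder for the dynamic element.
spec = "%(text)s"

# Standard Python %-formatting fills the placeholder from the dictionary.
typed = spec % extras
print(typed)
```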

Action class reference

ActionBase base class

class ActionBase
Base class for Dragonfly’s action classes.

Key action – send a sequence of keystrokes

This section describes the Key action object. This type of action is used for sending keystrokes to the foreground application. Examples of how to use this class are given in Example key actions.

Keystroke specification format

The spec argument passed to the Key constructor specifies which keystroke events will be emulated. It is a string consisting of one or more comma-separated keystroke elements. Each of these elements has one of the following two possible formats:

Normal press-release key action, optionally repeated several times:
[modifiers -] keyname [/ innerpause] [: repeat] [/ outerpause]
Press-and-hold a key, or release a held-down key:
[modifiers -] keyname : direction [/ outerpause]

The different parts of the keystroke specification are as follows. Note that only keyname is required; the other fields are optional.

  • modifiers – Modifiers for this keystroke. These keys are held down while pressing the main keystroke. Can be zero or more of the following:

    • a – alt key
    • c – control key
    • s – shift key
    • w – Windows key
  • keyname – Name of the keystroke. Valid names are listed in Key names.

  • innerpause – The time to pause between repetitions of this keystroke.

  • repeat – The number of times this keystroke should be repeated. If not specified, the key will be pressed and released once.

  • outerpause – The time to pause after this keystroke.

  • direction – Whether to press-and-hold or release the key. Must be one of the following:

    • down – press and hold the key
    • up – release the key

    Note that releasing a key which is not being held down does not cause an error. It harmlessly does nothing.
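
To make the two element formats concrete, the following sketch splits a specification string into its named parts. It is an illustrative parser written for this section, not the one Dragonfly uses. The names parse_element and parse_spec are hypothetical. Treating a lone pause (as in a/50) as the outer pause is an assumption of this sketch, and key names are not checked against the list of valid names.

```python
import re

MODIFIERS = {"a": "alt", "c": "control", "s": "shift", "w": "windows"}

ELEMENT = re.compile(
    r"""
    (?:(?P<modifiers>[acsw]+)\s*-\s*)?   # optional modifier letters and "-"
    (?P<keyname>\w+)\s*                  # required key name
    (?:/\s*(?P<inner>\d+)\s*)?           # optional "/ innerpause"
    (?::\s*(?P<special>\w+)\s*)?         # optional ": repeat" or ": direction"
    (?:/\s*(?P<outer>\d+)\s*)?           # optional "/ outerpause"
    """,
    re.VERBOSE,
)


def parse_element(element):
    """Split one keystroke element into its parts (illustrative only)."""
    match = ELEMENT.fullmatch(element.strip())
    if match is None:
        raise ValueError("invalid keystroke element: %r" % element)
    parts = match.groupdict()
    inner, special, outer = parts["inner"], parts["special"], parts["outer"]

    # Assumption of this sketch: a lone pause is the outer pause.
    if inner is not None and special is None and outer is None:
        inner, outer = None, inner

    result = {
        "modifiers": [MODIFIERS[m] for m in parts["modifiers"] or ""],
        "keyname": parts["keyname"],
        "innerpause": None if inner is None else int(inner),
        "outerpause": None if outer is None else int(outer),
        "repeat": None,
        "direction": None,
    }
    if special in ("down", "up"):
        if inner is not None:
            raise ValueError("press/release elements take no inner pause")
        result["direction"] = special
    elif special is not None:
        result["repeat"] = int(special)   # Raises ValueError if not a number.
    return result


def parse_spec(spec):
    """Parse a comma-separated keystroke specification."""
    return [parse_element(element) for element in spec.split(",")]


parsed = parse_spec("a-f, down/25:4")
```

Here parsed holds one dictionary per element: the first describes f pressed with the alt modifier, the second describes down repeated 4 times with an inner pause of 25.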

Key names

  • Lowercase alphabet: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
  • Uppercase alphabet: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
  • Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
  • Navigation keys: left, right, up, down, pgup, pgdown, home, end
  • Editing keys: space, enter, backspace, del, insert
  • Symbols: ampersand, apostrophe, asterisk, at, backslash, backtick, bar, caret, colon, comma, dollar, dot, dquote, equal, escape, exclamation, hash, hyphen, minus, percent, plus, question, slash, squote, tilde, underscore
  • Function keys: f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22, f23, f24
  • Modifiers: alt, ctrl, shift
  • Brackets: langle, lbrace, lbracket, lparen, rangle, rbrace, rbracket, rparen
  • Special keys: apps, win
  • Numberpad keys: np0, np1, np2, np3, np4, np5, np6, np7, np8, np9, npadd, npdec, npdiv, npmul, npsep, npsub

Example key actions

The following code types the text “Hello world!” into the foreground application:

Key("H, e, l, l, o, space, w, o, r, l, d, exclamation").execute()

The following code is a bit more useful, as it saves the current file with the name “dragonfly.txt” (this works for many English-language applications):

action = Key("a-f, a/50") + Text("dragonfly.txt") + Key("enter")
action.execute()

The following code selects the next four lines by holding down the shift key, slowly moving down 4 lines, and then releasing the shift key:

Key("shift:down, down/25:4, shift:up").execute()

The following code locks the screen by pressing the Windows key together with the l key:

Key("w-l").execute()

Key class reference

class Key(spec=None, static=False)

Keystroke emulation action.

Constructor arguments:
  • spec (str) – keystroke specification
  • static (boolean) – flag indicating whether the specification contains dynamic elements

The format of the keystroke specification spec is described in Keystroke specification format.

This class emulates keyboard activity by sending keystrokes to the foreground application. It does this using Dragonfly’s keyboard interface implemented in the keyboard and sendinput modules. These use the SendInput() function of the Win32 API.

Text action – type a given text

This section describes the Text action object. This type of action is used for typing text into the foreground application.

It differs from the Key action in that Text is used for typing literal text, while dragonfly.actions.action_key.Key emulates pressing keys on the keyboard. For example, the arrow keys are not part of any text and so cannot be typed using the Text action, but they can be sent by the dragonfly.actions.action_key.Key action.

class Text(spec=None, static=False, pause=0.02, autofmt=False)

Action that sends keyboard events to type text.

Arguments:
  • spec (str) – the text to type
  • static (boolean) – if True, do not dynamically interpret spec when executing this action
  • pause (float) – the time to pause between each keystroke, given in seconds
  • autofmt (boolean) – if True, attempt to format the text with correct spacing and capitalization. This is done by first mimicking a word recognition and then analyzing its spacing and capitalization and applying the same formatting to the text.

Paste action – insert a specific text by pasting it from the clipboard

class Paste(content, format=None, paste=None, static=False)

Paste-from-clipboard action.

Constructor arguments:
  • content (str) – content to paste
  • format (int, Win32 clipboard format) – clipboard format
  • paste (instance derived from ActionBase) – paste action
  • static (boolean) – flag indicating whether the specification contains dynamic elements

This action inserts the given content into the Windows system clipboard, and then performs the paste action to paste it into the foreground application. By default, the paste action is the Control-v keystroke. The default clipboard format to use is the Unicode text format.

Mimic action – mimic a recognition

class Mimic(*words)

Mimic recognition action.

The constructor arguments are the words which will be mimicked. These should be passed as a variable argument list. For example:

action = Mimic("hello", "world", r"!\exclamation-mark")
action.execute()

If an error occurs during mimicking the given recognition, then an ActionError is raised. A common error is that the engine does not know the given words and can therefore not recognize them. For example, the following attempts to mimic recognition of one single word including a space and an exclamation-mark; this will almost certainly fail:

Mimic("hello world!").execute()   # Will raise ActionError.

WaitWindow action – wait for a specific window context

class WaitWindow(title=None, executable=None, timeout=15)

Action that waits for a specific window context.

Constructor arguments:
  • title (str) – part of the window title; not case sensitive
  • executable (str) – part of the file name of the executable; not case sensitive
  • timeout (int or float) – the maximum number of seconds to wait for the correct context, after which an ActionError will be raised.

When this action is executed, it waits until the correct window context is present. This window context is specified by the desired window title of the foreground window and/or the executable name of the foreground application. These are specified using the constructor arguments listed above. The substring search used is not case sensitive.

If the correct window context is not found within timeout seconds, then this action will raise an ActionError to indicate the timeout.
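
The matching and timeout behavior can be sketched without any Windows-specific calls. The code below is an illustration under stated assumptions, not Dragonfly's implementation: matches_context, wait_for_context, the get_foreground callable, and the polling interval are all hypothetical, and the built-in TimeoutError stands in for ActionError.

```python
import time


def matches_context(window_title, window_executable,
                    title=None, executable=None):
    """Case-insensitive substring test on the title and/or executable."""
    if title is not None and title.lower() not in window_title.lower():
        return False
    if (executable is not None
            and executable.lower() not in window_executable.lower()):
        return False
    return True


def wait_for_context(get_foreground, title=None, executable=None,
                     timeout=15, interval=0.1):
    """Poll until the foreground window matches or the timeout expires.

    get_foreground is a hypothetical callable returning a
    (window title, executable path) tuple for the foreground window.
    """
    deadline = time.monotonic() + timeout
    while True:
        window_title, window_executable = get_foreground()
        if matches_context(window_title, window_executable,
                           title, executable):
            return True
        if time.monotonic() >= deadline:
            # Stand-in for the ActionError described above.
            raise TimeoutError("window context not found in %s s" % timeout)
        time.sleep(interval)


# Simulated sequence of foreground windows; the second one matches.
windows = iter([
    ("Untitled - Notepad", r"C:\Windows\notepad.exe"),
    ("Save As", r"C:\Windows\notepad.exe"),
])
found = wait_for_context(lambda: next(windows), title="save as",
                         interval=0.01)
```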

Pause action – wait for a specific amount of time

class Pause(spec=None, static=False)

Pause for the given amount of time.

The spec constructor argument should be a string giving the time to wait. It should be given in hundredths of a second. For example, the following code will pause for 20/100s = 0.2 seconds:

Pause("20").execute()

The spec must be given as a string so that it can be used in dynamic value evaluation. For example, the following code determines the time to pause at execution time:

action = Pause("%(time)d")
data = {"time": 37}
action.execute(data)
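
The arithmetic behind both examples can be written out in plain Python. The helper pause_seconds below is hypothetical and exists only to show the unit conversion and the placeholder substitution. It is not part of Dragonfly and does not actually wait.

```python
def pause_seconds(spec, data=None):
    """Return the pause length in seconds for a spec given in hundredths."""
    if data is not None:
        spec = spec % data        # Fill placeholders such as %(time)d.
    return float(spec) / 100      # Hundredths of a second to seconds.


static_pause = pause_seconds("20")
dynamic_pause = pause_seconds("%(time)d", {"time": 37})
print(static_pause, dynamic_pause)
```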