The Dragonfly library contains an action framework which which offers easy and flexible interfaces to common actions, such as emulating keystrokes. These action types have been objectified, which means that they are first-class Python objects and can be treated as such.
Perhaps the most important method of Dragonfly’s actions is their dragonfly.actions.action_base.ActionBase.execute() method, which performs the actual event associated with them.
Dragonfly’s action types are derived from the dragonfly.actions.action_base.ActionBase class. This base class implements standard action behavior, such as the ability to concatenate multiple actions and to duplicate an action.
The code below shows the basic usage of Dragonfly action objects. They can be created, combined, executed, etc.
from dragonfly.all import Key, Text
a1 = Key("up, left, down, right") # Define action a1.
a1.execute() # Send the keystrokes.
a2 = Text("Hello world!") # Define action a2, which
# will type the text.
a2.execute() # Send the keystrokes.
a4 = a1 + a2 # a4 is now the concatenation
# of a1 and a2.
a4.execute() # Send the keystrokes.
a3 = Key("a-f, down/25:4") # Press alt-f and then down 4 times
# with 25/100 s pause in between.
a4 += a3 # a4 is now the concatenation
# of a1, a2, and a3.
a4.execute() # Send the keystrokes.
Key("w-b, right/25:5").execute() # Define and execute together.
A common use of Dragonfly is to control other applications by voice and to automate common desktop activities. To do this, voice commands can be associated with actions. When the command is spoken, the action is executed. Dragonfly’s action framework allows for easy definition of things to do, such as text input and sending keystrokes. It also allows these things to be dynamically coupled to voice commands, so as to enable the actions to contain dynamic elements from the recognized command.
An example would be a voice command to find some bit of text:
- Command specification: please find <text>
- Associated action: Key("c-f") + Text("%(text)s")
- Special element: Dictation("text")
This triplet would allow the user to say “please find some words”, which would result in control-f being pressed to open the Find dialogue followed by “some words” being typed into the dialog. The special element is necessary to define what the dynamic element “text” is.
This section describes the Key action object. This type of action is used for sending keystrokes to the foreground application. Examples of how to use this class are given in Example key actions.
The spec argument passed to the Key constructor specifies which keystroke events will be emulated. It is a string consisting of one or more comma-separated keystroke elements. Each of these elements has one of the following two possible formats:
The different parts of the keystroke specification are as follows. Note that only keyname is required; the other fields are optional.
modifiers – Modifiers for this keystroke. These keys are held down while pressing the main keystroke. Can be zero or more of the following:
- a – alt key
- c – control key
- s – shift key
- w – Windows key
keyname – Name of the keystroke. Valid names are listed in Key names.
innerpause – The time to pause between repetitions of this keystroke.
repeat – The number of times this keystroke should be repeated. If not specified, the key will be pressed and released once.
outerpause – The time to pause after this keystroke.
direction – Whether to press-and-hold or release the key. Must be one of the following:
- down – press and hold the key
- up – release the key
Note that releasing a key which is not being held down does not cause an error. It harmlessly does nothing.
- Lowercase alphabet: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
- Uppercase alphabet: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
- Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
- Navigation keys: left, right, up, down, pgup, pgdown, home, end
- Editing keys: space, enter, backspace, del, insert
- Symbols: ampersand, apostrophe, asterisk, at, backslash, backtick, bar, caret, colon, comma, dollar, dot, dquote, equal, escape, exclamation, hash, hyphen, minus, percent, plus, question, slash, squote, tilde, underscore
- Function keys: f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22, f23, f24
- Modifiers: alt, ctrl, shift
- Brackets: langle, lbrace, lbracket, lparen, rangle, rbrace, rbracket, rparen
- Special keys: apps, win
- Numberpad keys: np0, np1, np2, np3, np4, np5, np6, np7, np8, np9, npadd, npdec, npdiv, npmul, npsep, npsub
The following code types the text “Hello world!” into the foreground application:
Key("H, e, l, l, o, space, w, o, r, l, d, exclamation").execute()
The following code is a bit more useful, as it saves the current file with the name “dragonfly.txt” (this works for many English-language applications):
action = Key("a-f, a/50") + Text("dragonfly.txt") + Key("enter")
action.execute()
The following code selects the next four lines by holding down the shift key, slowly moving down 4 lines, and then releasing the shift key:
Key("shift:down, down/25:4, shift:up").execute()
The following code locks the screen by pressing the Windows key together with the l:
Key("w-l").execute()
Keystroke emulation action.
The format of the keystroke specification spec is described in Keystroke specification format.
This class emulates keyboard activity by sending keystrokes to the foreground application. It does this using Dragonfly’s keyboard interface implemented in the keyboard and sendinput modules. These use the sendinput() function of the Win32 API.
This section describes the Text action object. This type of action is used for typing text into the foreground application.
It differs from the Key action in that Text is used for typing literal text, while dragonfly.actions.action_key.Key emulates pressing keys on the keyboard. An example of this is that the arrow-keys are not part of a text and so cannot be typed using the Text action, but can be sent by the dragonfly.actions.action_key.Key action.
Action that sends keyboard events to type text.
Paste-from-clipboard action.
This action inserts the given content into the Windows system clipboard, and then performs the paste action to paste it into the foreground application. By default, the paste action is the Control-v keystroke. The default clipboard format to use is the Unicode text format.
Mimic recognition action.
The constructor arguments are the words which will be mimicked. These should be passed as a variable argument list. For example:
action = Mimic("hello", "world", r"!\exclamation-mark")
action.execute()
If an error occurs during mimicking the given recognition, then an ActionError is raised. A common error is that the engine does not know the given words and can therefore not recognize them. For example, the following attempts to mimic recognition of one single word including a space and an exclamation-mark; this will almost certainly fail:
Mimic("hello world!").execute() # Will raise ActionError.
Wait for a specific window context action.
When this action is executed, it waits until the correct window context is present. This window context is specified by the desired window title of the foreground window and/or the executable name of the foreground application. These are specified using the constructor arguments listed above. The substring search used is not case sensitive.
If the correct window context is not found within timeout seconds, then this action will raise an :class`ActionError` to indicate the timeout.
Pause for the given amount of time.
The spec constructor argument should be a string giving the time to wait. It should be given in hundredths of a second. For example, the following code will pause for 20/100s = 0.2 seconds:
Pause("20").execute()
The reason the spec must be given as a string is because it can then be used in dynamic value evaluation. For example, the following code determines the time to pause at execution time:
action = Pause("%(time)d")
data = {"time": 37}
action.execute(data)