
1# -*- coding: utf-8 -*- 

2#@+leo-ver=5-thin 

3#@+node:ekr.20141012064706.18389: * @file leoAst.py 

4#@@first 

5# This file is part of Leo: https://leoeditor.com 

6# Leo's copyright notice is based on the MIT license: http://leoeditor.com/license.html 

7#@+<< docstring >> 

8#@+node:ekr.20200113081838.1: ** << docstring >> (leoAst.py) 

9""" 

10leoAst.py: This file does not depend on Leo in any way. 

11 

12The classes in this file unify python's token-based and ast-based worlds by 

13creating two-way links between tokens in the token list and ast nodes in 

14the parse tree. For more details, see the "Overview" section below. 

15 

16 

17**Stand-alone operation** 

18 

19usage: 

20 leoAst.py --help 

21 leoAst.py [--fstringify | --fstringify-diff | --orange | --orange-diff] PATHS 

22 leoAst.py --py-cov [ARGS] 

23 leoAst.py --pytest [ARGS] 

24 leoAst.py --unittest [ARGS] 

25 

26examples: 

27 --py-cov "-f TestOrange" 

28 --pytest "-f TestOrange" 

29 --unittest TestOrange 

30 

31positional arguments: 

32 PATHS directory or list of files 

33 

34optional arguments: 

35 -h, --help show this help message and exit 

36 --fstringify leonine fstringify 

37 --fstringify-diff show fstringify diff 

38 --orange leonine Black 

39 --orange-diff show orange diff 

40 --py-cov run pytest --cov on leoAst.py 

41 --pytest run pytest on leoAst.py 

42 --unittest run unittest on leoAst.py 

43 

44 

45**Overview** 

46 

47leoAst.py unifies python's token-oriented and ast-oriented worlds. 

48 

49leoAst.py defines classes that create two-way links between tokens 

50created by python's tokenize module and parse tree nodes created by 

51python's ast module: 

52 

53The Token Order Generator (TOG) class quickly creates the following 

54links: 

55 

56- An *ordered* children array from each ast node to its children. 

57 

58- A parent link from each ast node to its parent. 

59 

60- Two-way links between tokens in the token list, a list of Token 

61 objects, and the ast nodes in the parse tree: 

62 

 63 - For each token, token.node contains the ast node "responsible" for 

64 the token. 

65 

66 - For each ast node, node.first_i and node.last_i are indices into 

67 the token list. These indices give the range of tokens that can be 

68 said to be "generated" by the ast node. 

69 

70Once the TOG class has inserted parent/child links, the Token Order 

71Traverser (TOT) class traverses trees annotated with parent/child 

72links extremely quickly. 

73 
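For example, here is a minimal sketch (assuming only that leoAst.py is importable
as a stand-alone module) that creates the links for a short string and prints the
ast node linked to each significant token::

    from leoAst import TokenOrderGenerator

    contents = 'a = b + 1\n'
    tog = TokenOrderGenerator()
    tokens, tree = tog.init_from_string(contents, '<string>')
    for token in tokens:
        if token.node is not None:
            print(f"{token.kind:>8} {token.value!r:>6} {token.node.__class__.__name__}")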

74 

75**Applicability and importance** 

76 

77Many python developers will find asttokens meets all their needs. 

78asttokens is well documented and easy to use. Nevertheless, two-way 

79links are significant additions to python's tokenize and ast modules: 

80 

81- Links from tokens to nodes are assigned to the nearest possible ast 

82 node, not the nearest statement, as in asttokens. Links can easily 

83 be reassigned, if desired. 

84 

85- The TOG and TOT classes are intended to be the foundation of tools 

86 such as fstringify and black. 

87 

88- The TOG class solves real problems, such as: 

89 https://stackoverflow.com/questions/16748029/ 

90 

91**Known bug** 

92 

93This file has no known bugs *except* for Python version 3.8. 

94 

95For Python 3.8, syncing tokens will fail for function calls such as: 

96 

97 f(1, x=2, *[3, 4], y=5) 

98 

99that is, for calls where keywords appear before non-keyword args. 

100 

101There are no plans to fix this bug. The workaround is to use Python version 

1023.9 or above. 

103 

104 

105**Figures of merit** 

106 

107Simplicity: The code consists primarily of a set of generators, one 

108for every kind of ast node. 

109 

110Speed: The TOG creates two-way links between tokens and ast nodes in 

111roughly the time taken by python's tokenize.tokenize and ast.parse 

112library methods. This is substantially faster than the asttokens, 

113black or fstringify tools. The TOT class traverses trees annotated 

114with parent/child links even more quickly. 

115 

116Memory: The TOG class makes no significant demands on python's 

117resources. Generators add nothing to python's call stack. 

118TOG.node_stack is the only variable-length data. This stack resides in 

119python's heap, so its length is unimportant. In the worst case, it 

120might contain a few thousand entries. The TOT class uses no 

121variable-length data at all. 

122 

123**Links** 

124 

125Leo... 

126Ask for help: https://groups.google.com/forum/#!forum/leo-editor 

127Report a bug: https://github.com/leo-editor/leo-editor/issues 

128leoAst.py docs: http://leoeditor.com/appendices.html#leoast-py 

129 

130Other tools... 

131asttokens: https://pypi.org/project/asttokens 

132black: https://pypi.org/project/black/ 

133fstringify: https://pypi.org/project/fstringify/ 

134 

135Python modules... 

136tokenize.py: https://docs.python.org/3/library/tokenize.html 

137ast.py: https://docs.python.org/3/library/ast.html 

138 

139**Studying this file** 

140 

141I strongly recommend that you use Leo when studying this code so that you 

142will see the file's intended outline structure. 

143 

144Without Leo, you will see only special **sentinel comments** that create 

145Leo's outline structure. These comments have the form:: 

146 

147 `#@<comment-kind>:<user-id>.<timestamp>.<number>: <outline-level> <headline>` 

148""" 

149#@-<< docstring >> 

150#@+<< imports >> 

151#@+node:ekr.20200105054219.1: ** << imports >> (leoAst.py) 

152import argparse 

153import ast 

154import codecs 

155import difflib 

156import glob 

157import io 

158import os 

159import re 

160import sys 

161import textwrap 

162import tokenize 

163import traceback 

164from typing import List, Optional 

165#@-<< imports >> 

166v1, v2, junk1, junk2, junk3 = sys.version_info 

167py_version = (v1, v2) 

168 

169# Async tokens exist only in Python 3.5 and 3.6. 

170# https://docs.python.org/3/library/token.html 

171has_async_tokens = (3, 5) <= py_version <= (3, 6) 

172 

173# has_position_only_params = (v1, v2) >= (3, 8) 

174#@+others 

175#@+node:ekr.20191226175251.1: ** class LeoGlobals 

176#@@nosearch 

177 

178 

179class LeoGlobals: # pragma: no cover 

180 """ 

181 Simplified version of functions in leoGlobals.py. 

182 """ 

183 

184 total_time = 0.0 # For unit testing. 

185 

186 #@+others 

187 #@+node:ekr.20191226175903.1: *3* LeoGlobals.callerName 

188 def callerName(self, n): 

189 """Get the function name from the call stack.""" 

190 try: 

191 f1 = sys._getframe(n) 

192 code1 = f1.f_code 

193 return code1.co_name 

194 except Exception: 

195 return '' 

196 #@+node:ekr.20191226175426.1: *3* LeoGlobals.callers 

197 def callers(self, n=4): 

198 """ 

199 Return a string containing a comma-separated list of the callers 

 200 of the function that called g.callers. 

201 """ 

202 i, result = 2, [] 

203 while True: 

204 s = self.callerName(n=i) 

205 if s: 

206 result.append(s) 

207 if not s or len(result) >= n: 

208 break 

209 i += 1 

210 return ','.join(reversed(result)) 

211 #@+node:ekr.20191226190709.1: *3* leoGlobals.es_exception & helper 

212 def es_exception(self, full=True): 

213 typ, val, tb = sys.exc_info() 

214 for line in traceback.format_exception(typ, val, tb): 

215 print(line) 

216 fileName, n = self.getLastTracebackFileAndLineNumber() 

217 return fileName, n 

218 #@+node:ekr.20191226192030.1: *4* LeoGlobals.getLastTracebackFileAndLineNumber 

219 def getLastTracebackFileAndLineNumber(self): 

220 typ, val, tb = sys.exc_info() 

221 if typ == SyntaxError: 

222 # IndentationError is a subclass of SyntaxError. 

223 # SyntaxError *does* have 'filename' and 'lineno' attributes. 

224 return val.filename, val.lineno # type:ignore 

225 # 

226 # Data is a list of tuples, one per stack entry. 

227 # The tuples have the form (filename, lineNumber, functionName, text). 

228 data = traceback.extract_tb(tb) 

229 item = data[-1] # Get the item at the top of the stack. 

230 filename, n, functionName, text = item 

231 return filename, n 

232 #@+node:ekr.20200220065737.1: *3* LeoGlobals.objToString 

233 def objToString(self, obj, tag=None): 

234 """Simplified version of g.printObj.""" 

235 result = [] 

236 if tag: 

237 result.append(f"{tag}...") 

238 if isinstance(obj, str): 

239 obj = g.splitLines(obj) 

240 if isinstance(obj, list): 

241 result.append('[') 

242 for z in obj: 

243 result.append(f" {z!r}") 

244 result.append(']') 

245 elif isinstance(obj, tuple): 

246 result.append('(') 

247 for z in obj: 

248 result.append(f" {z!r}") 

249 result.append(')') 

250 else: 

251 result.append(repr(obj)) 

252 result.append('') 

253 return '\n'.join(result) 

254 #@+node:ekr.20191226190425.1: *3* LeoGlobals.plural 

255 def plural(self, obj): 

 256 """Return "s" or "" depending on obj (its length if it is a sequence, else its value).""" 

257 if isinstance(obj, (list, tuple, str)): 

258 n = len(obj) 

259 else: 

260 n = obj 

261 return '' if n == 1 else 's' 

262 #@+node:ekr.20191226175441.1: *3* LeoGlobals.printObj 

263 def printObj(self, obj, tag=None): 

264 """Simplified version of g.printObj.""" 

265 print(self.objToString(obj, tag)) 

266 #@+node:ekr.20191226190131.1: *3* LeoGlobals.splitLines 

267 def splitLines(self, s): 

268 """Split s into lines, preserving the number of lines and 

269 the endings of all lines, including the last line.""" 

270 # g.stat() 

271 if s: 

272 return s.splitlines(True) 

273 # This is a Python string function! 

274 return [] 

275 #@+node:ekr.20191226190844.1: *3* LeoGlobals.toEncodedString 

276 def toEncodedString(self, s, encoding='utf-8'): 

277 """Convert unicode string to an encoded string.""" 

278 if not isinstance(s, str): 

279 return s 

280 try: 

281 s = s.encode(encoding, "strict") 

282 except UnicodeError: 

283 s = s.encode(encoding, "replace") 

284 print(f"toEncodedString: Error converting {s!r} to {encoding}") 

285 return s 

286 #@+node:ekr.20191226190006.1: *3* LeoGlobals.toUnicode 

287 def toUnicode(self, s, encoding='utf-8'): 

288 """Convert bytes to unicode if necessary.""" 

289 tag = 'g.toUnicode' 

290 if isinstance(s, str): 

291 return s 

292 if not isinstance(s, bytes): 

293 print(f"{tag}: bad s: {s!r}") 

294 return '' 

295 b: bytes = s 

296 try: 

297 s2 = b.decode(encoding, 'strict') 

298 except(UnicodeDecodeError, UnicodeError): 

299 s2 = b.decode(encoding, 'replace') 

300 print(f"{tag}: unicode error. encoding: {encoding!r}, s2:\n{s2!r}") 

301 g.trace(g.callers()) 

302 except Exception: 

303 g.es_exception() 

304 print(f"{tag}: unexpected error! encoding: {encoding!r}, s2:\n{s2!r}") 

305 g.trace(g.callers()) 

306 return s2 

307 #@+node:ekr.20191226175436.1: *3* LeoGlobals.trace 

308 def trace(self, *args): 

309 """Print a tracing message.""" 

310 # Compute the caller name. 

311 try: 

312 f1 = sys._getframe(1) 

313 code1 = f1.f_code 

314 name = code1.co_name 

315 except Exception: 

316 name = '' 

317 print(f"{name}: {' '.join(str(z) for z in args)}") 

318 #@+node:ekr.20191226190241.1: *3* LeoGlobals.truncate 

319 def truncate(self, s, n): 

320 """Return s truncated to n characters.""" 

321 if len(s) <= n: 

322 return s 

323 s2 = s[: n - 3] + f"...({len(s)})" 

324 return s2 + '\n' if s.endswith('\n') else s2 

325 #@-others 

326#@+node:ekr.20200702114522.1: ** leoAst.py: top-level commands 

327#@+node:ekr.20200702114557.1: *3* command: fstringify_command 

328def fstringify_command(files): 

329 """ 

330 Entry point for --fstringify. 

331 

 332 Fstringify the given files, overwriting them. 

333 """ 

334 for filename in files: # pragma: no cover 

335 if os.path.exists(filename): 

336 print(f"fstringify {filename}") 

337 Fstringify().fstringify_file_silent(filename) 

338 else: 

339 print(f"file not found: {filename}") 

340#@+node:ekr.20200702121222.1: *3* command: fstringify_diff_command 

341def fstringify_diff_command(files): 

342 """ 

343 Entry point for --fstringify-diff. 

344 

345 Print the diff that would be produced by fstringify. 

346 """ 

347 for filename in files: # pragma: no cover 

348 if os.path.exists(filename): 

349 print(f"fstringify-diff {filename}") 

350 Fstringify().fstringify_file_diff(filename) 

351 else: 

352 print(f"file not found: {filename}") 

353#@+node:ekr.20200702115002.1: *3* command: orange_command 

354def orange_command(files): 

355 

356 for filename in files: # pragma: no cover 

357 if os.path.exists(filename): 

358 print(f"orange {filename}") 

359 Orange().beautify_file(filename) 

360 else: 

361 print(f"file not found: {filename}") 

362#@+node:ekr.20200702121315.1: *3* command: orange_diff_command 

363def orange_diff_command(files): 

364 

365 for filename in files: # pragma: no cover 

366 if os.path.exists(filename): 

367 print(f"orange-diff {filename}") 

368 Orange().beautify_file_diff(filename) 

369 else: 

370 print(f"file not found: {filename}") 

371#@+node:ekr.20160521104628.1: ** leoAst.py: top-level utils 

372if 1: # pragma: no cover 

373 #@+others 

374 #@+node:ekr.20200702102239.1: *3* function: main (leoAst.py) 

375 def main(): 

376 """Run commands specified by sys.argv.""" 

377 description = textwrap.dedent("""\ 

378 leo-editor/leo/unittests/core/test_leoAst.py contains unit tests (100% coverage). 

379 """) 

380 parser = argparse.ArgumentParser(description=description, formatter_class=argparse.RawTextHelpFormatter) 

381 parser.add_argument('PATHS', nargs='*', help='directory or list of files') 

382 group = parser.add_mutually_exclusive_group(required=False) # Don't require any args. 

383 add = group.add_argument 

384 add('--fstringify', dest='f', action='store_true', help='leonine fstringify') 

385 add('--fstringify-diff', dest='fd', action='store_true', help='show fstringify diff') 

386 add('--orange', dest='o', action='store_true', help='leonine Black') 

387 add('--orange-diff', dest='od', action='store_true', help='show orange diff') 

388 args = parser.parse_args() 

389 files = args.PATHS 

390 if len(files) == 1 and os.path.isdir(files[0]): 

391 files = glob.glob(f"{files[0]}{os.sep}*.py") 

392 if args.f: 

393 fstringify_command(files) 

394 if args.fd: 

395 fstringify_diff_command(files) 

396 if args.o: 

397 orange_command(files) 

398 if args.od: 

399 orange_diff_command(files) 

400 #@+node:ekr.20200107114409.1: *3* functions: reading & writing files 

401 #@+node:ekr.20200218071822.1: *4* function: regularize_nls 

402 def regularize_nls(s): 

403 """Regularize newlines within s.""" 

404 return s.replace('\r\n', '\n').replace('\r', '\n') 

405 #@+node:ekr.20200106171502.1: *4* function: get_encoding_directive 

406 # This is the pattern in PEP 263. 

407 encoding_pattern = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)') 

408 

409 def get_encoding_directive(bb): 

410 """ 

411 Get the encoding from the encoding directive at the start of a file. 

412 

413 bb: The bytes of the file. 

414 

415 Returns the codec name, or 'UTF-8'. 

416 

417 Adapted from pyzo. Copyright 2008 to 2020 by Almar Klein. 

418 """ 

419 for line in bb.split(b'\n', 2)[:2]: 

420 # Try to make line a string 

421 try: 

422 line2 = line.decode('ASCII').strip() 

423 except Exception: 

424 continue 

425 # Does the line match the PEP 263 pattern? 

426 m = encoding_pattern.match(line2) 

427 if not m: 

428 continue 

429 # Is it a known encoding? Correct the name if it is. 

430 try: 

431 c = codecs.lookup(m.group(1)) 

432 return c.name 

433 except Exception: 

434 pass 

435 return 'UTF-8' 
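
    # Illustrative sketch: how the PEP 263 detection above behaves. The helper
    # below is hypothetical (not part of Leo) and is never called.
    def _demo_get_encoding_directive():  # pragma: no cover
        # codecs.lookup normalizes codec names, so a 'latin-1' directive is
        # reported as 'iso8859-1'; files without a directive default to 'UTF-8'.
        assert get_encoding_directive(b"# -*- coding: latin-1 -*-\nx = 1\n") == 'iso8859-1'
        assert get_encoding_directive(b"x = 1\n") == 'UTF-8'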

436 #@+node:ekr.20200103113417.1: *4* function: read_file 

437 def read_file(filename, encoding='utf-8'): 

438 """ 

439 Return the contents of the file with the given name. 

440 Print an error message and return None on error. 

441 """ 

442 tag = 'read_file' 

443 try: 

444 # Translate all newlines to '\n'. 

445 with open(filename, 'r', encoding=encoding) as f: 

446 s = f.read() 

447 return regularize_nls(s) 

448 except Exception: 

449 print(f"{tag}: can not read {filename}") 

450 return None 

451 #@+node:ekr.20200106173430.1: *4* function: read_file_with_encoding 

452 def read_file_with_encoding(filename): 

453 """ 

454 Read the file with the given name, returning (e, s), where: 

455 

456 s is the string, converted to unicode, or '' if there was an error. 

457 

458 e is the encoding of s, computed in the following order: 

459 

460 - The BOM encoding if the file starts with a BOM mark. 

461 - The encoding given in the # -*- coding: utf-8 -*- line. 

462 - The encoding given by the 'encoding' keyword arg. 

463 - 'utf-8'. 

464 """ 

465 # First, read the file. 

466 tag = 'read_with_encoding' 

467 try: 

468 with open(filename, 'rb') as f: 

469 bb = f.read() 

470 except Exception: 

 471 print(f"{tag}: can not read {filename}"); return 'UTF-8', ''  # bb is undefined here. 

472 if not bb: 

473 return 'UTF-8', '' 

474 # Look for the BOM. 

475 e, bb = strip_BOM(bb) 

476 if not e: 

477 # Python's encoding comments override everything else. 

478 e = get_encoding_directive(bb) 

479 s = g.toUnicode(bb, encoding=e) 

480 s = regularize_nls(s) 

481 return e, s 

482 #@+node:ekr.20200106174158.1: *4* function: strip_BOM 

483 def strip_BOM(bb): 

484 """ 

485 bb must be the bytes contents of a file. 

486 

487 If bb starts with a BOM (Byte Order Mark), return (e, bb2), where: 

488 

489 - e is the encoding implied by the BOM. 

490 - bb2 is bb, stripped of the BOM. 

491 

492 If there is no BOM, return (None, bb) 

493 """ 

494 assert isinstance(bb, bytes), bb.__class__.__name__ 

495 table = ( 

496 # Test longer bom's first. 

497 (4, 'utf-32', codecs.BOM_UTF32_BE), 

498 (4, 'utf-32', codecs.BOM_UTF32_LE), 

499 (3, 'utf-8', codecs.BOM_UTF8), 

500 (2, 'utf-16', codecs.BOM_UTF16_BE), 

501 (2, 'utf-16', codecs.BOM_UTF16_LE), 

502 ) 

503 for n, e, bom in table: 

504 assert len(bom) == n 

505 if bom == bb[: len(bom)]: 

506 return e, bb[len(bom) :] 

507 return None, bb 
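
    # Illustrative sketch: strip_BOM in action. The helper below is
    # hypothetical (not part of Leo) and is never called.
    def _demo_strip_BOM():  # pragma: no cover
        # A UTF-8 BOM is detected and removed; BOM-less bytes pass through.
        assert strip_BOM(codecs.BOM_UTF8 + b"x = 1\n") == ('utf-8', b"x = 1\n")
        assert strip_BOM(b"x = 1\n") == (None, b"x = 1\n")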

508 #@+node:ekr.20200103163100.1: *4* function: write_file 

509 def write_file(filename, s, encoding='utf-8'): 

510 """ 

511 Write the string s to the file whose name is given. 

512 

 513 Handle all exceptions. 

514 

515 Before calling this function, the caller should ensure 

516 that the file actually has been changed. 

517 """ 

518 try: 

519 # Write the file with platform-dependent newlines. 

520 with open(filename, 'w', encoding=encoding) as f: 

521 f.write(s) 

522 except Exception as e: 

523 g.trace(f"Error writing {filename}\n{e}") 

524 #@+node:ekr.20200113154120.1: *3* functions: tokens 

525 #@+node:ekr.20191223093539.1: *4* function: find_anchor_token 

526 def find_anchor_token(node, global_token_list): 

527 """ 

528 Return the anchor_token for node, a token such that token.node == node. 

529 

 530 The search starts at node, then proceeds through the usual child nodes. 

531 """ 

532 

533 node1 = node 

534 

535 def anchor_token(node): 

536 """Return the anchor token in node.token_list""" 

537 # Careful: some tokens in the token list may have been killed. 

538 for token in get_node_token_list(node, global_token_list): 

539 if is_ancestor(node1, token): 

540 return token 

541 return None 

542 

543 # This table only has to cover fields for ast.Nodes that 

544 # won't have any associated token. 

545 

546 fields = ( 

547 # Common... 

548 'elt', 'elts', 'body', 'value', 

549 # Less common... 

550 'dims', 'ifs', 'names', 's', 

551 'test', 'values', 'targets', 

552 ) 

553 while node: 

554 # First, try the node itself. 

555 token = anchor_token(node) 

556 if token: 

557 return token 

558 # Second, try the most common nodes w/o token_lists: 

559 if isinstance(node, ast.Call): 

560 node = node.func 

561 elif isinstance(node, ast.Tuple): 

562 node = node.elts # type:ignore 

563 # Finally, try all other nodes. 

564 else: 

565 # This will be used rarely. 

566 for field in fields: 

567 node = getattr(node, field, None) 

568 if node: 

569 token = anchor_token(node) 

570 if token: 

571 return token 

572 else: 

573 break 

574 return None 

575 #@+node:ekr.20191231160225.1: *4* function: find_paren_token (changed signature) 

576 def find_paren_token(i, global_token_list): 

 577 """Return the index of the next paren token, starting at global_token_list[i].""" 

578 while i < len(global_token_list): 

579 token = global_token_list[i] 

580 if token.kind == 'op' and token.value in '()': 

581 return i 

582 if is_significant_token(token): 

583 break 

584 i += 1 

585 return None 

586 #@+node:ekr.20200113110505.4: *4* function: get_node_tokens_list 

587 def get_node_token_list(node, global_tokens_list): 

588 """ 

 589 global_tokens_list must be the global token list. 

590 Return the tokens assigned to the node, or []. 

591 """ 

592 i = getattr(node, 'first_i', None) 

593 j = getattr(node, 'last_i', None) 

594 return [] if i is None else global_tokens_list[i : j + 1] 
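
    # Note: node.first_i and node.last_i bracket one contiguous slice of the
    # global token list, so tokens_to_string(get_node_token_list(node, tokens))
    # recovers the text "generated" by the node, plus any intervening
    # insignificant tokens (whitespace, commas, etc.).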

595 #@+node:ekr.20191124123830.1: *4* function: is_significant & is_significant_token 

596 def is_significant(kind, value): 

597 """ 

598 Return True if (kind, value) represent a token that can be used for 

599 syncing generated tokens with the token list. 

600 """ 

601 # Making 'endmarker' significant ensures that all tokens are synced. 

602 return ( 

603 kind in ('async', 'await', 'endmarker', 'name', 'number', 'string') or 

604 kind == 'op' and value not in ',;()') 

605 

606 def is_significant_token(token): 

 607 """Return True if the given token is a synchronizing token.""" 

608 return is_significant(token.kind, token.value) 
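
    # Illustrative examples (hypothetical helper, never called): names, numbers,
    # strings and most operators are significant; commas, semicolons, parens and
    # whitespace are not.
    def _demo_is_significant():  # pragma: no cover
        assert is_significant('name', 'x') and is_significant('op', '+')
        assert not is_significant('op', ',') and not is_significant('ws', '    ')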

609 #@+node:ekr.20191224093336.1: *4* function: match_parens 

610 def match_parens(filename, i, j, tokens): 

611 """Match parens in tokens[i:j]. Return the new j.""" 

612 if j >= len(tokens): 

613 return len(tokens) 

614 # Calculate paren level... 

615 level = 0 

616 for n in range(i, j + 1): 

617 token = tokens[n] 

618 if token.kind == 'op' and token.value == '(': 

619 level += 1 

620 if token.kind == 'op' and token.value == ')': 

621 if level == 0: 

622 break 

623 level -= 1 

624 # Find matching ')' tokens *after* j. 

625 if level > 0: 

626 while level > 0 and j + 1 < len(tokens): 

627 token = tokens[j + 1] 

628 if token.kind == 'op' and token.value == ')': 

629 level -= 1 

630 elif token.kind == 'op' and token.value == '(': 

631 level += 1 

632 elif is_significant_token(token): 

633 break 

634 j += 1 

635 if level != 0: # pragma: no cover. 

636 line_n = tokens[i].line_number 

637 raise AssignLinksError( 

638 f"\n" 

639 f"Unmatched parens: level={level}\n" 

640 f" file: {filename}\n" 

641 f" line: {line_n}\n") 

642 return j 

643 #@+node:ekr.20191223053324.1: *4* function: tokens_for_node 

644 def tokens_for_node(filename, node, global_token_list): 

645 """Return the list of all tokens descending from node.""" 

646 # Find any token descending from node. 

647 token = find_anchor_token(node, global_token_list) 

648 if not token: 

649 if 0: # A good trace for debugging. 

650 print('') 

651 g.trace('===== no tokens', node.__class__.__name__) 

652 return [] 

653 assert is_ancestor(node, token) 

654 # Scan backward. 

655 i = first_i = token.index 

656 while i >= 0: 

657 token2 = global_token_list[i - 1] 

658 if getattr(token2, 'node', None): 

659 if is_ancestor(node, token2): 

660 first_i = i - 1 

661 else: 

662 break 

663 i -= 1 

664 # Scan forward. 

665 j = last_j = token.index 

666 while j + 1 < len(global_token_list): 

667 token2 = global_token_list[j + 1] 

668 if getattr(token2, 'node', None): 

669 if is_ancestor(node, token2): 

670 last_j = j + 1 

671 else: 

672 break 

673 j += 1 

674 last_j = match_parens(filename, first_i, last_j, global_token_list) 

675 results = global_token_list[first_i : last_j + 1] 

676 return results 

677 #@+node:ekr.20200101030236.1: *4* function: tokens_to_string 

678 def tokens_to_string(tokens): 

679 """Return the string represented by the list of tokens.""" 

680 if tokens is None: 

681 # This indicates an internal error. 

682 print('') 

683 g.trace('===== token list is None ===== ') 

684 print('') 

685 return '' 

686 return ''.join([z.to_string() for z in tokens]) 

687 #@+node:ekr.20191231072039.1: *3* functions: utils... 

688 # General utility functions on tokens and nodes. 

689 #@+node:ekr.20191119085222.1: *4* function: obj_id 

690 def obj_id(obj): 

691 """Return the last four digits of id(obj), for dumps & traces.""" 

692 return str(id(obj))[-4:] 

693 #@+node:ekr.20191231060700.1: *4* function: op_name 

694 #@@nobeautify 

695 

696 # https://docs.python.org/3/library/ast.html 

697 

698 _op_names = { 

699 # Binary operators. 

700 'Add': '+', 

701 'BitAnd': '&', 

702 'BitOr': '|', 

703 'BitXor': '^', 

704 'Div': '/', 

705 'FloorDiv': '//', 

706 'LShift': '<<', 

707 'MatMult': '@', # Python 3.5. 

708 'Mod': '%', 

709 'Mult': '*', 

710 'Pow': '**', 

711 'RShift': '>>', 

712 'Sub': '-', 

713 # Boolean operators. 

714 'And': ' and ', 

715 'Or': ' or ', 

716 # Comparison operators 

717 'Eq': '==', 

718 'Gt': '>', 

719 'GtE': '>=', 

720 'In': ' in ', 

721 'Is': ' is ', 

722 'IsNot': ' is not ', 

723 'Lt': '<', 

724 'LtE': '<=', 

725 'NotEq': '!=', 

726 'NotIn': ' not in ', 

727 # Context operators. 

728 'AugLoad': '<AugLoad>', 

729 'AugStore': '<AugStore>', 

730 'Del': '<Del>', 

731 'Load': '<Load>', 

732 'Param': '<Param>', 

733 'Store': '<Store>', 

734 # Unary operators. 

735 'Invert': '~', 

736 'Not': ' not ', 

737 'UAdd': '+', 

738 'USub': '-', 

739 } 

740 

741 def op_name(node): 

742 """Return the print name of an operator node.""" 

743 class_name = node.__class__.__name__ 

744 assert class_name in _op_names, repr(class_name) 

745 return _op_names[class_name].strip() 
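
    # Illustrative examples (hypothetical helper, never called): op_name strips
    # the spaces that _op_names carries for layout purposes.
    def _demo_op_name():  # pragma: no cover
        assert op_name(ast.Add()) == '+'
        assert op_name(ast.NotIn()) == 'not in'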

746 #@+node:ekr.20200107114452.1: *3* node/token creators... 

747 #@+node:ekr.20200103082049.1: *4* function: make_tokens 

748 def make_tokens(contents): 

749 """ 

750 Return a list (not a generator) of Token objects corresponding to the 

751 list of 5-tuples generated by tokenize.tokenize. 

752 

 753 Perform consistency checks and handle all exceptions. 

754 """ 

755 

756 def check(contents, tokens): 

757 result = tokens_to_string(tokens) 

758 ok = result == contents 

759 if not ok: 

760 print('\nRound-trip check FAILS') 

761 print('Contents...\n') 

762 g.printObj(contents) 

763 print('\nResult...\n') 

764 g.printObj(result) 

765 return ok 

766 

767 try: 

768 five_tuples = tokenize.tokenize( 

769 io.BytesIO(contents.encode('utf-8')).readline) 

770 except Exception: 

771 print('make_tokens: exception in tokenize.tokenize') 

772 g.es_exception() 

773 return None 

774 tokens = Tokenizer().create_input_tokens(contents, five_tuples) 

775 assert check(contents, tokens) 

776 return tokens 
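
    # Illustrative sketch (hypothetical helper, never called): the check above
    # guarantees that the token list round-trips exactly.
    def _demo_make_tokens():  # pragma: no cover
        contents = 'x = 1\n'
        assert tokens_to_string(make_tokens(contents)) == contents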

777 #@+node:ekr.20191027075648.1: *4* function: parse_ast 

778 def parse_ast(s): 

779 """ 

780 Parse string s, catching & reporting all exceptions. 

781 Return the ast node, or None. 

782 """ 

783 

784 def oops(message): 

785 print('') 

786 print(f"parse_ast: {message}") 

787 g.printObj(s) 

788 print('') 

789 

790 try: 

791 s1 = g.toEncodedString(s) 

792 tree = ast.parse(s1, filename='before', mode='exec') 

793 return tree 

794 except IndentationError: 

795 oops('Indentation Error') 

796 except SyntaxError: 

797 oops('Syntax Error') 

798 except Exception: 

799 oops('Unexpected Exception') 

800 g.es_exception() 

801 return None 

802 #@+node:ekr.20191231110051.1: *3* node/token dumpers... 

803 #@+node:ekr.20191027074436.1: *4* function: dump_ast 

804 def dump_ast(ast, tag='dump_ast'): 

805 """Utility to dump an ast tree.""" 

806 g.printObj(AstDumper().dump_ast(ast), tag=tag) 

807 #@+node:ekr.20191228095945.4: *4* function: dump_contents 

808 def dump_contents(contents, tag='Contents'): 

809 print('') 

810 print(f"{tag}...\n") 

811 for i, z in enumerate(g.splitLines(contents)): 

812 print(f"{i+1:<3} ", z.rstrip()) 

813 print('') 

814 #@+node:ekr.20191228095945.5: *4* function: dump_lines 

815 def dump_lines(tokens, tag='Token lines'): 

816 print('') 

817 print(f"{tag}...\n") 

818 for z in tokens: 

819 if z.line.strip(): 

820 print(z.line.rstrip()) 

821 else: 

822 print(repr(z.line)) 

823 print('') 

824 #@+node:ekr.20191228095945.7: *4* function: dump_results 

825 def dump_results(tokens, tag='Results'): 

826 print('') 

827 print(f"{tag}...\n") 

828 print(tokens_to_string(tokens)) 

829 print('') 

830 #@+node:ekr.20191228095945.8: *4* function: dump_tokens 

831 def dump_tokens(tokens, tag='Tokens'): 

832 print('') 

833 print(f"{tag}...\n") 

834 if not tokens: 

835 return 

836 print("Note: values shown are repr(value) *except* for 'string' tokens.") 

837 tokens[0].dump_header() 

838 for i, z in enumerate(tokens): 

839 # Confusing. 

840 # if (i % 20) == 0: z.dump_header() 

841 print(z.dump()) 

842 print('') 

843 #@+node:ekr.20191228095945.9: *4* function: dump_tree 

844 def dump_tree(tokens, tree, tag='Tree'): 

845 print('') 

846 print(f"{tag}...\n") 

847 print(AstDumper().dump_tree(tokens, tree)) 

848 #@+node:ekr.20200107040729.1: *4* function: show_diffs 

849 def show_diffs(s1, s2, filename=''): 

850 """Print diffs between strings s1 and s2.""" 

851 lines = list(difflib.unified_diff( 

852 g.splitLines(s1), 

853 g.splitLines(s2), 

854 fromfile=f"Old {filename}", 

855 tofile=f"New {filename}", 

856 )) 

857 print('') 

858 tag = f"Diffs for {filename}" if filename else 'Diffs' 

859 g.printObj(lines, tag=tag) 

860 #@+node:ekr.20191223095408.1: *3* node/token nodes... 

861 # Functions that associate tokens with nodes. 

862 #@+node:ekr.20200120082031.1: *4* function: find_statement_node 

863 def find_statement_node(node): 

864 """ 

865 Return the nearest statement node. 

866 Return None if node has only Module for a parent. 

867 """ 

868 if isinstance(node, ast.Module): 

869 return None 

870 parent = node 

871 while parent: 

872 if is_statement_node(parent): 

873 return parent 

874 parent = parent.parent 

875 return None 

876 #@+node:ekr.20191223054300.1: *4* function: is_ancestor 

877 def is_ancestor(node, token): 

878 """Return True if node is an ancestor of token.""" 

879 t_node = token.node 

880 if not t_node: 

881 assert token.kind == 'killed', repr(token) 

882 return False 

883 while t_node: 

884 if t_node == node: 

885 return True 

886 t_node = t_node.parent 

887 return False 

888 #@+node:ekr.20200120082300.1: *4* function: is_long_statement 

889 def is_long_statement(node): 

890 """ 

891 Return True if node is an instance of a node that might be split into 

892 shorter lines. 

893 """ 

894 return isinstance(node, ( 

895 ast.Assign, ast.AnnAssign, ast.AsyncFor, ast.AsyncWith, ast.AugAssign, 

896 ast.Call, ast.Delete, ast.ExceptHandler, ast.For, ast.Global, 

897 ast.If, ast.Import, ast.ImportFrom, 

898 ast.Nonlocal, ast.Return, ast.While, ast.With, ast.Yield, ast.YieldFrom)) 

899 #@+node:ekr.20200120110005.1: *4* function: is_statement_node 

900 def is_statement_node(node): 

901 """Return True if node is a top-level statement.""" 

902 return is_long_statement(node) or isinstance(node, ( 

903 ast.Break, ast.Continue, ast.Pass, ast.Try)) 

904 #@+node:ekr.20191231082137.1: *4* function: nearest_common_ancestor 

905 def nearest_common_ancestor(node1, node2): 

906 """ 

907 Return the nearest common ancestor node for the given nodes. 

908 

909 The nodes must have parent links. 

910 """ 

911 

912 def parents(node): 

913 aList = [] 

914 while node: 

915 aList.append(node) 

916 node = node.parent 

917 return list(reversed(aList)) 

918 

919 result = None 

920 parents1 = parents(node1) 

921 parents2 = parents(node2) 

922 while parents1 and parents2: 

923 parent1 = parents1.pop(0) 

924 parent2 = parents2.pop(0) 

925 if parent1 == parent2: 

926 result = parent1 

927 else: 

928 break 

929 return result 
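
    # For example, in the tree for "x + y" (after the TOG class has injected
    # parent links), the nearest common ancestor of the two Name nodes is the
    # enclosing BinOp node.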

930 #@+node:ekr.20191225061516.1: *3* node/token replacers... 

931 # Functions that replace tokens or nodes. 

932 #@+node:ekr.20191231162249.1: *4* function: add_token_to_token_list 

933 def add_token_to_token_list(token, node): 

934 """Insert token in the proper location of node.token_list.""" 

935 if getattr(node, 'first_i', None) is None: 

936 node.first_i = node.last_i = token.index 

937 else: 

938 node.first_i = min(node.first_i, token.index) 

939 node.last_i = max(node.last_i, token.index) 
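
    # For example, adding tokens with indices 7, 5 and 9 (in that order) to a
    # fresh node leaves node.first_i == 5 and node.last_i == 9.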

940 #@+node:ekr.20191225055616.1: *4* function: replace_node 

941 def replace_node(new_node, old_node): 

 942 """Replace old_node with new_node in the parse tree.""" 

943 parent = old_node.parent 

944 new_node.parent = parent 

945 new_node.node_index = old_node.node_index 

946 children = parent.children 

947 i = children.index(old_node) 

948 children[i] = new_node 

949 fields = getattr(old_node, '_fields', None) 

950 if fields: 

951 for field in fields: 

 952 value = getattr(old_node, field) 

 953 if value == old_node: 

954 setattr(old_node, field, new_node) 

955 break 

956 #@+node:ekr.20191225055626.1: *4* function: replace_token 

957 def replace_token(token, kind, value): 

958 """Replace kind and value of the given token.""" 

959 if token.kind in ('endmarker', 'killed'): 

960 return 

961 token.kind = kind 

962 token.value = value 

963 token.node = None # Should be filled later. 

964 #@-others 

965#@+node:ekr.20191027072910.1: ** Exception classes 

966class AssignLinksError(Exception): 

967 """Assigning links to ast nodes failed.""" 

968 

969 

970class AstNotEqual(Exception): 

971 """The two given AST's are not equivalent.""" 

972 

973 

974class FailFast(Exception): 

975 """Abort tests in TestRunner class.""" 

976#@+node:ekr.20141012064706.18390: ** class AstDumper 

977class AstDumper: # pragma: no cover 

978 """A class supporting various kinds of dumps of ast nodes.""" 

979 #@+others 

980 #@+node:ekr.20191112033445.1: *3* dumper.dump_tree & helper 

981 def dump_tree(self, tokens, tree): 

982 """Briefly show a tree, properly indented.""" 

983 self.tokens = tokens 

984 result = [self.show_header()] 

985 self.dump_tree_and_links_helper(tree, 0, result) 

986 return ''.join(result) 

987 #@+node:ekr.20191125035321.1: *4* dumper.dump_tree_and_links_helper 

988 def dump_tree_and_links_helper(self, node, level, result): 

 989 """Append the dump lines for node and its children to result.""" 

990 if node is None: 

991 return 

992 # Let block. 

993 indent = ' ' * 2 * level 

994 children: List[ast.AST] = getattr(node, 'children', []) 

995 node_s = self.compute_node_string(node, level) 

996 # Dump... 

997 if isinstance(node, (list, tuple)): 

998 for z in node: 

999 self.dump_tree_and_links_helper(z, level, result) 

1000 elif isinstance(node, str): 

1001 result.append(f"{indent}{node.__class__.__name__:>8}:{node}\n") 

1002 elif isinstance(node, ast.AST): 

1003 # Node and parent. 

1004 result.append(node_s) 

1005 # Children. 

1006 for z in children: 

1007 self.dump_tree_and_links_helper(z, level + 1, result) 

1008 else: 

1009 result.append(node_s) 

1010 #@+node:ekr.20191125035600.1: *3* dumper.compute_node_string & helpers 

1011 def compute_node_string(self, node, level): 

1012 """Return a string summarizing the node.""" 

1013 indent = ' ' * 2 * level 

1014 parent = getattr(node, 'parent', None) 

1015 node_id = getattr(node, 'node_index', '??') 

1016 parent_id = getattr(parent, 'node_index', '??') 

1017 parent_s = f"{parent_id:>3}.{parent.__class__.__name__} " if parent else '' 

1018 class_name = node.__class__.__name__ 

1019 descriptor_s = f"{node_id}.{class_name}: " + self.show_fields( 

1020 class_name, node, 30) 

1021 tokens_s = self.show_tokens(node, 70, 100) 

1022 lines = self.show_line_range(node) 

1023 full_s1 = f"{parent_s:<16} {lines:<10} {indent}{descriptor_s} " 

1024 node_s = f"{full_s1:<62} {tokens_s}\n" 

1025 return node_s 

1026 #@+node:ekr.20191113223424.1: *4* dumper.show_fields 

1027 def show_fields(self, class_name, node, truncate_n): 

1028 """Return a string showing interesting fields of the node.""" 

1029 val = '' 

1030 if class_name == 'JoinedStr': 

1031 values = node.values 

1032 assert isinstance(values, list) 

1033 # Str tokens may represent *concatenated* strings. 

1034 results = [] 

1035 fstrings, strings = 0, 0 

1036 for z in values: 

1037 assert isinstance(z, (ast.FormattedValue, ast.Str)) 

1038 if isinstance(z, ast.Str): 

1039 results.append(z.s) 

1040 strings += 1 

1041 else: 

1042 results.append(z.__class__.__name__) 

1043 fstrings += 1 

1044 val = f"{strings} str, {fstrings} f-str" 

1045 elif class_name == 'keyword': 

1046 if isinstance(node.value, ast.Str): 

1047 val = f"arg={node.arg}..Str.value.s={node.value.s}" 

1048 elif isinstance(node.value, ast.Name): 

1049 val = f"arg={node.arg}..Name.value.id={node.value.id}" 

1050 else: 

1051 val = f"arg={node.arg}..value={node.value.__class__.__name__}" 

1052 elif class_name == 'Name': 

1053 val = f"id={node.id!r}" 

1054 elif class_name == 'NameConstant': 

1055 val = f"value={node.value!r}" 

1056 elif class_name == 'Num': 

1057 val = f"n={node.n}" 

1058 elif class_name == 'Starred': 

1059 if isinstance(node.value, ast.Str): 

1060 val = f"s={node.value.s}" 

1061 elif isinstance(node.value, ast.Name): 

1062 val = f"id={node.value.id}" 

1063 else: 

1064 val = f"s={node.value.__class__.__name__}" 

1065 elif class_name == 'Str': 

1066 val = f"s={node.s!r}" 

1067 elif class_name in ('AugAssign', 'BinOp', 'BoolOp', 'UnaryOp'): # IfExp 

1068 name = node.op.__class__.__name__ 

1069 val = f"op={_op_names.get(name, name)}" 

1070 elif class_name == 'Compare': 

1071 ops = ','.join([op_name(z) for z in node.ops]) 

1072 val = f"ops='{ops}'" 

1073 else: 

1074 val = '' 

1075 return g.truncate(val, truncate_n) 

1076 #@+node:ekr.20191114054726.1: *4* dumper.show_line_range 

1077 def show_line_range(self, node): 

1078 

1079 token_list = get_node_token_list(node, self.tokens) 

1080 if not token_list: 

1081 return '' 

1082 min_ = min([z.line_number for z in token_list]) 

1083 max_ = max([z.line_number for z in token_list]) 

1084 return f"{min_}" if min_ == max_ else f"{min_}..{max_}" 

1085 #@+node:ekr.20191113223425.1: *4* dumper.show_tokens 

1086 def show_tokens(self, node, n, m, show_cruft=False): 

1087 """ 

1088 Return a string showing node.token_list. 

1089 

1090 Split the result if n + len(result) > m 

1091 """ 

1092 token_list = get_node_token_list(node, self.tokens) 

1093 result = [] 

1094 for z in token_list: 

1095 val = None 

1096 if z.kind == 'comment': 

1097 if show_cruft: 

1098 val = g.truncate(z.value, 10) # Short is good. 

1099 result.append(f"{z.kind}.{z.index}({val})") 

1100 elif z.kind == 'name': 

1101 val = g.truncate(z.value, 20) 

1102 result.append(f"{z.kind}.{z.index}({val})") 

1103 elif z.kind == 'newline': 

1104 # result.append(f"{z.kind}.{z.index}({z.line_number}:{len(z.line)})") 

1105 result.append(f"{z.kind}.{z.index}") 

1106 elif z.kind == 'number': 

1107 result.append(f"{z.kind}.{z.index}({z.value})") 

1108 elif z.kind == 'op': 

1109 if z.value not in ',()' or show_cruft: 

1110 result.append(f"{z.kind}.{z.index}({z.value})") 

1111 elif z.kind == 'string': 

1112 val = g.truncate(z.value, 30) 

1113 result.append(f"{z.kind}.{z.index}({val})") 

1114 elif z.kind == 'ws': 

1115 if show_cruft: 

1116 result.append(f"{z.kind}.{z.index}({len(z.value)})") 

1117 else: 

1118 # Indent, dedent, encoding, etc. 

1119 # Don't put a blank. 

1120 continue 

1121 if result and result[-1] != ' ': 

1122 result.append(' ') 

1123 # 

1124 # split the line if it is too long. 

1125 # g.printObj(result, tag='show_tokens') 

1126 if 1: 

1127 return ''.join(result) 

1128 line, lines = [], [] 

1129 for r in result: 

1130 line.append(r) 

1131 if n + len(''.join(line)) >= m: 

1132 lines.append(''.join(line)) 

1133 line = [] 

1134 lines.append(''.join(line)) 

1135 pad = '\n' + ' ' * n 

1136 return pad.join(lines) 

1137 #@+node:ekr.20191110165235.5: *3* dumper.show_header 

1138 def show_header(self): 

 1139 """Return a header string, but only the first time.""" 

1140 return ( 

1141 f"{'parent':<16} {'lines':<10} {'node':<34} {'tokens'}\n" 

1142 f"{'======':<16} {'=====':<10} {'====':<34} {'======'}\n") 

1143 #@+node:ekr.20141012064706.18392: *3* dumper.dump_ast & helper 

1144 annotate_fields = False 

1145 include_attributes = False 

1146 indent_ws = ' ' 

1147 

1148 def dump_ast(self, node, level=0): 

1149 """ 

1150 Dump an ast tree. Adapted from ast.dump. 

1151 """ 

1152 sep1 = '\n%s' % (self.indent_ws * (level + 1)) 

1153 if isinstance(node, ast.AST): 

1154 fields = [(a, self.dump_ast(b, level + 1)) for a, b in self.get_fields(node)] 

1155 if self.include_attributes and node._attributes: 

1156 fields.extend([(a, self.dump_ast(getattr(node, a), level + 1)) 

1157 for a in node._attributes]) 

1158 if self.annotate_fields: 

1159 aList = ['%s=%s' % (a, b) for a, b in fields] 

1160 else: 

1161 aList = [b for a, b in fields] 

1162 name = node.__class__.__name__ 

1163 sep = '' if len(aList) <= 1 else sep1 

1164 return '%s(%s%s)' % (name, sep, sep1.join(aList)) 

1165 if isinstance(node, list): 

1166 sep = sep1 

1167 return 'LIST[%s]' % ''.join( 

1168 ['%s%s' % (sep, self.dump_ast(z, level + 1)) for z in node]) 

1169 return repr(node) 

1170 #@+node:ekr.20141012064706.18393: *4* dumper.get_fields 

1171 def get_fields(self, node): 

1172 

1173 return ( 

1174 (a, b) for a, b in ast.iter_fields(node) 

1175 if a not in ['ctx',] and b not in (None, []) 

1176 ) 

1177 #@-others 

1178#@+node:ekr.20191227170628.1: ** TOG classes... 

1179#@+node:ekr.20191113063144.1: *3* class TokenOrderGenerator 

1180class TokenOrderGenerator: 

1181 """ 

1182 A class that traverses ast (parse) trees in token order. 

1183 

1184 Overview: https://github.com/leo-editor/leo-editor/issues/1440#issue-522090981 

1185 

1186 Theory of operation: 

1187 - https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-573661883 

1188 - http://leoeditor.com/appendices.html#tokenorder-classes-theory-of-operation 

1189 

1190 How to: http://leoeditor.com/appendices.html#tokenorder-class-how-to 

1191 

1192 Project history: https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-574145510 

1193 """ 

1194 

1195 n_nodes = 0 # The number of nodes that have been visited. 

1196 #@+others 

1197 #@+node:ekr.20200103174914.1: *4* tog: Init... 

1198 #@+node:ekr.20191228184647.1: *5* tog.balance_tokens 

1199 def balance_tokens(self, tokens): 

1200 """ 

1201 TOG.balance_tokens. 

1202 

1203 Insert two-way links between matching paren tokens. 

1204 """ 

1205 count, stack = 0, [] 

1206 for token in tokens: 

1207 if token.kind == 'op': 

1208 if token.value == '(': 

1209 count += 1 

1210 stack.append(token.index) 

1211 if token.value == ')': 

1212 if stack: 

1213 index = stack.pop() 

1214 tokens[index].matching_paren = token.index 

1215 tokens[token.index].matching_paren = index 

1216 else: # pragma: no cover 

1217 g.trace(f"unmatched ')' at index {token.index}") 

1218 if stack: # pragma: no cover 

 1219 g.trace(f"unmatched '(' at {','.join(str(z) for z in stack)}") 

1220 return count 
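
    # Illustrative note: after balance_tokens, matching parens point at each
    # other. For the tokens of "f(a, (b))" the outer '(' gets matching_paren
    # set to the index of the final ')', and vice versa.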

1221 #@+node:ekr.20191113063144.4: *5* tog.create_links 

1222 def create_links(self, tokens, tree, file_name=''): 

1223 """ 

 1224 A generator that creates two-way links between the given tokens and the ast tree. 

1225 

1226 Callers should call this generator with list(tog.create_links(...)) 

1227 

 1228 The sync_token method creates the links and verifies that the resulting 

1229 tree traversal generates exactly the given tokens in exact order. 

1230 

1231 tokens: the list of Token instances for the input. 

1232 Created by make_tokens(). 

1233 tree: the ast tree for the input. 

1234 Created by parse_ast(). 

1235 """ 

1236 # 

1237 # Init all ivars. 

1238 self.file_name = file_name # For tests. 

1239 self.level = 0 # Python indentation level. 

1240 self.node = None # The node being visited. 

1241 self.tokens = tokens # The immutable list of input tokens. 

1242 self.tree = tree # The tree of ast.AST nodes. 

1243 # 

1244 # Traverse the tree. 

1245 try: 

1246 while True: 

1247 next(self.visitor(tree)) 

1248 except StopIteration: 

1249 pass 

1250 # 

1251 # Ensure that all tokens are patched. 

1252 self.node = tree 

1253 yield from self.gen_token('endmarker', '') 

1254 #@+node:ekr.20191229071733.1: *5* tog.init_from_file 

1255 def init_from_file(self, filename): # pragma: no cover 

1256 """ 

1257 Create the tokens and ast tree for the given file. 

1258 Create links between tokens and the parse tree. 

1259 Return (contents, encoding, tokens, tree). 

1260 """ 

1261 self.level = 0 

1262 self.filename = filename 

1263 encoding, contents = read_file_with_encoding(filename) 

1264 if not contents: 

1265 return None, None, None, None 

1266 self.tokens = tokens = make_tokens(contents) 

1267 self.tree = tree = parse_ast(contents) 

1268 list(self.create_links(tokens, tree)) 

1269 return contents, encoding, tokens, tree 

1270 #@+node:ekr.20191229071746.1: *5* tog.init_from_string 

1271 def init_from_string(self, contents, filename): # pragma: no cover 

1272 """ 

1273 Tokenize, parse and create links in the contents string. 

1274 

1275 Return (tokens, tree). 

1276 """ 

1277 self.filename = filename 

1278 self.level = 0 

1279 self.tokens = tokens = make_tokens(contents) 

1280 self.tree = tree = parse_ast(contents) 

1281 list(self.create_links(tokens, tree)) 

1282 return tokens, tree 

1283 #@+node:ekr.20191223052749.1: *4* tog: Traversal... 

1284 #@+node:ekr.20191113063144.3: *5* tog.begin_visitor 

1285 begin_end_stack: List[str] = [] 

1286 node_index = 0 # The index into the node_stack. 

1287 node_stack: List[ast.AST] = [] # The stack of parent nodes. 

1288 

1289 def begin_visitor(self, node): 

1290 """Enter a visitor.""" 

1291 # Update the stats. 

1292 self.n_nodes += 1 

1293 # Do this first, *before* updating self.node. 

1294 node.parent = self.node 

1295 if self.node: 

1296 children = getattr(self.node, 'children', []) # type:ignore 

1297 children.append(node) 

1298 self.node.children = children 

1299 # Inject the node_index field. 

1300 assert not hasattr(node, 'node_index'), g.callers() 

1301 node.node_index = self.node_index 

1302 self.node_index += 1 

1303 # begin_visitor and end_visitor must be paired. 

1304 self.begin_end_stack.append(node.__class__.__name__) 

1305 # Push the previous node. 

1306 self.node_stack.append(self.node) 

1307 # Update self.node *last*. 

1308 self.node = node 

1309 #@+node:ekr.20200104032811.1: *5* tog.end_visitor 

1310 def end_visitor(self, node): 

1311 """Leave a visitor.""" 

1312 # begin_visitor and end_visitor must be paired. 

1313 entry_name = self.begin_end_stack.pop() 

1314 assert entry_name == node.__class__.__name__, f"{entry_name!r} {node.__class__.__name__}" 

1315 assert self.node == node, (repr(self.node), repr(node)) 

1316 # Restore self.node. 

1317 self.node = self.node_stack.pop() 

1318 #@+node:ekr.20200110162044.1: *5* tog.find_next_significant_token 

1319 def find_next_significant_token(self): 

1320 """ 

1321 Scan from *after* self.tokens[px] looking for the next significant 

1322 token. 

1323 

1324 Return the token, or None. Never change self.px. 

1325 """ 

1326 px = self.px + 1 

1327 while px < len(self.tokens): 

1328 token = self.tokens[px] 

1329 px += 1 

1330 if is_significant_token(token): 

1331 return token 

1332 # This will never happen, because endtoken is significant. 

1333 return None # pragma: no cover 

1334 #@+node:ekr.20191121180100.1: *5* tog.gen* 

1335 # Useful wrappers... 

1336 

1337 def gen(self, z): 

1338 yield from self.visitor(z) 

1339 

1340 def gen_name(self, val): 

1341 yield from self.visitor(self.sync_name(val)) # type:ignore 

1342 

1343 def gen_op(self, val): 

1344 yield from self.visitor(self.sync_op(val)) # type:ignore 

1345 

1346 def gen_token(self, kind, val): 

1347 yield from self.visitor(self.sync_token(kind, val)) # type:ignore 

1348 #@+node:ekr.20191113063144.7: *5* tog.sync_token & set_links 

1349 px = -1 # Index of the previously synced token. 

1350 

1351 def sync_token(self, kind, val): 

1352 """ 

1353 Sync to a token whose kind & value are given. The token need not be 

1354 significant, but it must be guaranteed to exist in the token list. 

1355 

1356 The checks in this method constitute a strong, ever-present, unit test. 

1357 

1358 Scan the tokens *after* px, looking for a token T matching (kind, val). 

1359 raise AssignLinksError if a significant token is found that doesn't match T. 

1360 Otherwise: 

1361 - Create two-way links between all assignable tokens between px and T. 

1362 - Create two-way links between T and self.node. 

1363 - Advance by updating self.px to point to T. 

1364 """ 

1365 node, tokens = self.node, self.tokens 

1366 assert isinstance(node, ast.AST), repr(node) 

1367 # g.trace( 

1368 # f"px: {self.px:2} " 

1369 # f"node: {node.__class__.__name__:<10} " 

1370 # f"kind: {kind:>10}: val: {val!r}") 

1371 # 

1372 # Step one: Look for token T. 

1373 old_px = px = self.px + 1 

1374 while px < len(self.tokens): 

1375 token = tokens[px] 

1376 if (kind, val) == (token.kind, token.value): 

1377 break # Success. 

1378 if kind == token.kind == 'number': 

1379 val = token.value 

1380 break # Benign: use the token's value, a string, instead of a number. 

1381 if is_significant_token(token): # pragma: no cover 

1382 line_s = f"line {token.line_number}:" 

1383 val = str(val) # for g.truncate. 

1384 raise AssignLinksError( 

1385 f" file: {self.filename}\n" 

1386 f"{line_s:>12} {token.line.strip()}\n" 

1387 f"Looking for: {kind}.{g.truncate(val, 40)!r}\n" 

1388 f" found: {token.kind}.{token.value!r}\n" 

1389 f"token.index: {token.index}\n") 

1390 # Skip the insignificant token. 

1391 px += 1 

1392 else: # pragma: no cover 

1393 val = str(val) # for g.truncate. 

1394 raise AssignLinksError( 

1395 f" file: {self.filename}\n" 

1396 f"Looking for: {kind}.{g.truncate(val, 40)}\n" 

1397 f" found: end of token list") 

1398 # 

1399 # Step two: Assign *secondary* links only for newline tokens. 

1400 # Ignore all other non-significant tokens. 

1401 while old_px < px: 

1402 token = tokens[old_px] 

1403 old_px += 1 

1404 if token.kind in ('comment', 'newline', 'nl'): 

1405 self.set_links(node, token) 

1406 # 

1407 # Step three: Set links in the found token. 

1408 token = tokens[px] 

1409 self.set_links(node, token) 

1410 # 

1411 # Step four: Advance. 

1412 self.px = px 

1413 #@+node:ekr.20191125120814.1: *6* tog.set_links 

1414 last_statement_node = None 

1415 

1416 def set_links(self, node, token): 

1417 """Make two-way links between token and the given node.""" 

 1418 # Don't bother assigning comment, comma, paren, ws and endmarker tokens. 

1419 if token.kind == 'comment': 

1420 # Append the comment to node.comment_list. 

1421 comment_list = getattr(node, 'comment_list', []) # type:ignore 

1422 node.comment_list = comment_list + [token] 

1423 return 

1424 if token.kind in ('endmarker', 'ws'): 

1425 return 

1426 if token.kind == 'op' and token.value in ',()': 

1427 return 

1428 # *Always* remember the last statement. 

1429 statement = find_statement_node(node) 

1430 if statement: 

1431 self.last_statement_node = statement # type:ignore 

1432 assert not isinstance(self.last_statement_node, ast.Module) 

1433 if token.node is not None: # pragma: no cover 

1434 line_s = f"line {token.line_number}:" 

1435 raise AssignLinksError( 

1436 f" file: {self.filename}\n" 

1437 f"{line_s:>12} {token.line.strip()}\n" 

1438 f"token index: {self.px}\n" 

1439 f"token.node is not None\n" 

1440 f" token.node: {token.node.__class__.__name__}\n" 

1441 f" callers: {g.callers()}") 

1442 # Assign newlines to the previous statement node, if any. 

1443 if token.kind in ('newline', 'nl'): 

 1444 # Set an *auxiliary* link for the split/join logic. 

1445 # Do *not* set token.node! 

1446 token.statement_node = self.last_statement_node 

1447 return 

1448 if is_significant_token(token): 

1449 # Link the token to the ast node. 

1450 token.node = node # type:ignore 

1451 # Add the token to node's token_list. 

1452 add_token_to_token_list(token, node) 

1453 #@+node:ekr.20191124083124.1: *5* tog.sync_name and sync_op 

1454 # It's valid for these to return None. 

1455 

1456 def sync_name(self, val): 

1457 aList = val.split('.') 

1458 if len(aList) == 1: 

1459 self.sync_token('name', val) 

1460 else: 

1461 for i, part in enumerate(aList): 

1462 self.sync_token('name', part) 

1463 if i < len(aList) - 1: 

1464 self.sync_op('.') 

1465 

1466 def sync_op(self, val): 

1467 """ 

1468 Sync to the given operator. 

1469 

1470 val may be '(' or ')' *only* if the parens *will* actually exist in the 

1471 token list. 

1472 """ 

1473 self.sync_token('op', val) 

1474 #@+node:ekr.20191113081443.1: *5* tog.visitor (calls begin/end_visitor) 

1475 def visitor(self, node): 

1476 """Given an ast node, return a *generator* from its visitor.""" 

1477 # This saves a lot of tests. 

1478 trace = False 

1479 if node is None: 

1480 return 

1481 if trace: # pragma: no cover 

1482 # Keep this trace. It's useful. 

1483 cn = node.__class__.__name__ if node else ' ' 

1484 caller1, caller2 = g.callers(2).split(',') 

1485 g.trace(f"{caller1:>15} {caller2:<14} {cn}") 

1486 # More general, more convenient. 

1487 if isinstance(node, (list, tuple)): 

1488 for z in node or []: 

1489 if isinstance(z, ast.AST): 

1490 yield from self.visitor(z) 

1491 else: # pragma: no cover 

1492 # Some fields may contain ints or strings. 

1493 assert isinstance(z, (int, str)), z.__class__.__name__ 

1494 return 

1495 # We *do* want to crash if the visitor doesn't exist. 

1496 method = getattr(self, 'do_' + node.__class__.__name__) 

1497 # Allow begin/end visitor to be generators. 

1498 self.begin_visitor(node) 

1499 yield from method(node) 

1500 self.end_visitor(node) 
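
        # Note: dispatch is by node class name: an ast.Name node is handled by
        # do_Name, an ast.Call node by do_Call, and so on. A missing visitor
        # raises AttributeError deliberately (see the comment above).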

1501 #@+node:ekr.20191113063144.13: *4* tog: Visitors... 

1502 #@+node:ekr.20191113063144.32: *5* tog.keyword: not called! 

1503 # keyword arguments supplied to call (NULL identifier for **kwargs) 

1504 

1505 # keyword = (identifier? arg, expr value) 

1506 

1507 def do_keyword(self, node): # pragma: no cover 

1508 """A keyword arg in an ast.Call.""" 

1509 # This should never be called. 

 1510 # tog.handle_call_arguments calls self.gen(kwarg_arg.value) instead. 

1511 filename = getattr(self, 'filename', '<no file>') 

1512 raise AssignLinksError( 

1513 f"file: {filename}\n" 

1514 f"do_keyword should never be called\n" 

1515 f"{g.callers(8)}") 

1516 #@+node:ekr.20191113063144.14: *5* tog: Contexts 

1517 #@+node:ekr.20191113063144.28: *6* tog.arg 

1518 # arg = (identifier arg, expr? annotation) 

1519 

1520 def do_arg(self, node): 

 1521 """This is one argument of a list of ast.FunctionDef or ast.Lambda arguments.""" 

1522 yield from self.gen_name(node.arg) 

1523 annotation = getattr(node, 'annotation', None) 

1524 if annotation is not None: 

1525 yield from self.gen_op(':') 

1526 yield from self.gen(node.annotation) 

1527 #@+node:ekr.20191113063144.27: *6* tog.arguments 

1528 # arguments = ( 

1529 # arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs, 

1530 # expr* kw_defaults, arg? kwarg, expr* defaults 

1531 # ) 

1532 

1533 def do_arguments(self, node): 

 1534 """Arguments to ast.FunctionDef or ast.Lambda, **not** ast.Call.""" 

1535 # 

1536 # No need to generate commas anywhere below. 

1537 # 

1538 # Let block. Some fields may not exist pre Python 3.8. 

1539 n_plain = len(node.args) - len(node.defaults) 

1540 posonlyargs = getattr(node, 'posonlyargs', []) # type:ignore 

1541 vararg = getattr(node, 'vararg', None) 

1542 kwonlyargs = getattr(node, 'kwonlyargs', []) # type:ignore 

1543 kw_defaults = getattr(node, 'kw_defaults', []) # type:ignore 

1544 kwarg = getattr(node, 'kwarg', None) 

1545 if 0: 

1546 g.printObj(ast.dump(node.vararg) if node.vararg else 'None', tag='node.vararg') 

1547 g.printObj([ast.dump(z) for z in node.args], tag='node.args') 

1548 g.printObj([ast.dump(z) for z in node.defaults], tag='node.defaults') 

1549 g.printObj([ast.dump(z) for z in posonlyargs], tag='node.posonlyargs') 

1550 g.printObj([ast.dump(z) for z in kwonlyargs], tag='kwonlyargs') 

1551 g.printObj([ast.dump(z) if z else 'None' for z in kw_defaults], tag='kw_defaults') 

1552 # 1. Sync the position-only args. 

1553 if posonlyargs: 

1554 for n, z in enumerate(posonlyargs): 

1555 # g.trace('pos-only', ast.dump(z)) 

1556 yield from self.gen(z) 

1557 yield from self.gen_op('/') 

1558 # 2. Sync all args. 

1559 for i, z in enumerate(node.args): 

1560 yield from self.gen(z) 

1561 if i >= n_plain: 

1562 yield from self.gen_op('=') 

1563 yield from self.gen(node.defaults[i - n_plain]) 

1564 # 3. Sync the vararg. 

1565 if vararg: 

1566 # g.trace('vararg', ast.dump(vararg)) 

1567 yield from self.gen_op('*') 

1568 yield from self.gen(vararg) 

1569 # 4. Sync the keyword-only args. 

1570 if kwonlyargs: 

1571 if not vararg: 

1572 yield from self.gen_op('*') 

1573 for n, z in enumerate(kwonlyargs): 

1574 # g.trace('keyword-only', ast.dump(z)) 

1575 yield from self.gen(z) 

1576 val = kw_defaults[n] 

1577 if val is not None: 

1578 yield from self.gen_op('=') 

1579 yield from self.gen(val) 

1580 # 5. Sync the kwarg. 

1581 if kwarg: 

1582 # g.trace('kwarg', ast.dump(kwarg)) 

1583 yield from self.gen_op('**') 

1584 yield from self.gen(kwarg) 

1585 

1586 #@+node:ekr.20191113063144.15: *6* tog.AsyncFunctionDef 

1587 # AsyncFunctionDef(identifier name, arguments args, stmt* body, expr* decorator_list, 

1588 # expr? returns) 

1589 

1590 def do_AsyncFunctionDef(self, node): 

1591 

1592 if node.decorator_list: 

1593 for z in node.decorator_list: 

1594 # '@%s\n' 

1595 yield from self.gen_op('@') 

1596 yield from self.gen(z) 

1597 # 'async def %s(%s): -> %s\n' 

1598 # 'async def %s(%s):\n' 

1599 async_token_type = 'async' if has_async_tokens else 'name' 

1600 yield from self.gen_token(async_token_type, 'async') 

1601 yield from self.gen_name('def') 

1602 yield from self.gen_name(node.name) # A string 

1603 yield from self.gen_op('(') 

1604 yield from self.gen(node.args) 

1605 yield from self.gen_op(')') 

1606 returns = getattr(node, 'returns', None) 

1607 if returns is not None: 

1608 yield from self.gen_op('->') 

1609 yield from self.gen(node.returns) 

1610 yield from self.gen_op(':') 

1611 self.level += 1 

1612 yield from self.gen(node.body) 

1613 self.level -= 1 

1614 #@+node:ekr.20191113063144.16: *6* tog.ClassDef 

1615 def do_ClassDef(self, node, print_body=True): 

1616 

1617 for z in node.decorator_list or []: 

1618 # @{z}\n 

1619 yield from self.gen_op('@') 

1620 yield from self.gen(z) 

1621 # class name(bases):\n 

1622 yield from self.gen_name('class') 

1623 yield from self.gen_name(node.name) # A string. 

1624 if node.bases: 

1625 yield from self.gen_op('(') 

1626 yield from self.gen(node.bases) 

1627 yield from self.gen_op(')') 

1628 yield from self.gen_op(':') 

1629 # Body... 

1630 self.level += 1 

1631 yield from self.gen(node.body) 

1632 self.level -= 1 

1633 #@+node:ekr.20191113063144.17: *6* tog.FunctionDef 

1634 # FunctionDef( 

1635 # identifier name, arguments args, 

1636 # stmt* body, 

1637 # expr* decorator_list, 

1638 # expr? returns, 

1639 # string? type_comment) 

1640 

1641 def do_FunctionDef(self, node): 

1642 

1643 # Guards... 

1644 returns = getattr(node, 'returns', None) 

1645 # Decorators... 

1646 # @{z}\n 

1647 for z in node.decorator_list or []: 

1648 yield from self.gen_op('@') 

1649 yield from self.gen(z) 

1650 # Signature... 

1651 # def name(args): -> returns\n 

1652 # def name(args):\n 

1653 yield from self.gen_name('def') 

1654 yield from self.gen_name(node.name) # A string. 

1655 yield from self.gen_op('(') 

1656 yield from self.gen(node.args) 

1657 yield from self.gen_op(')') 

1658 if returns is not None: 

1659 yield from self.gen_op('->') 

1660 yield from self.gen(node.returns) 

1661 yield from self.gen_op(':') 

1662 # Body... 

1663 self.level += 1 

1664 yield from self.gen(node.body) 

1665 self.level -= 1 

1666 #@+node:ekr.20191113063144.18: *6* tog.Interactive 

1667 def do_Interactive(self, node): # pragma: no cover 

1668 

1669 yield from self.gen(node.body) 

1670 #@+node:ekr.20191113063144.20: *6* tog.Lambda 

1671 def do_Lambda(self, node): 

1672 

1673 yield from self.gen_name('lambda') 

1674 yield from self.gen(node.args) 

1675 yield from self.gen_op(':') 

1676 yield from self.gen(node.body) 

1677 #@+node:ekr.20191113063144.19: *6* tog.Module 

1678 def do_Module(self, node): 

1679 

1680 # Encoding is a non-syncing statement. 

1681 yield from self.gen(node.body) 

1682 #@+node:ekr.20191113063144.21: *5* tog: Expressions 

1683 #@+node:ekr.20191113063144.22: *6* tog.Expr 

1684 def do_Expr(self, node): 

1685 """An outer expression.""" 

1686 # No need to put parentheses. 

1687 yield from self.gen(node.value) 

1688 #@+node:ekr.20191113063144.23: *6* tog.Expression 

1689 def do_Expression(self, node): # pragma: no cover 

1690 """An inner expression.""" 

1691 # No need to put parentheses. 

1692 yield from self.gen(node.body) 

1693 #@+node:ekr.20191113063144.24: *6* tog.GeneratorExp 

1694 def do_GeneratorExp(self, node): 

1695 

1696 # '<gen %s for %s>' % (elt, ','.join(gens)) 

1697 # No need to put parentheses or commas. 

1698 yield from self.gen(node.elt) 

1699 yield from self.gen(node.generators) 

1700 #@+node:ekr.20210321171703.1: *6* tog.NamedExpr 

1701 # NamedExpr(expr target, expr value) 

1702 

1703 def do_NamedExpr(self, node): # Python 3.8+ 

1704 

1705 yield from self.gen(node.target) 

1706 yield from self.gen_op(':=') 

1707 yield from self.gen(node.value) 

1708 #@+node:ekr.20191113063144.26: *5* tog: Operands 

1709 #@+node:ekr.20191113063144.29: *6* tog.Attribute 

1710 # Attribute(expr value, identifier attr, expr_context ctx) 

1711 

1712 def do_Attribute(self, node): 

1713 

1714 yield from self.gen(node.value) 

1715 yield from self.gen_op('.') 

1716 yield from self.gen_name(node.attr) # A string. 

1717 #@+node:ekr.20191113063144.30: *6* tog.Bytes 

1718 def do_Bytes(self, node): 

1719 

1720 """ 

1721 It's invalid to mix bytes and non-bytes literals, so just 

1722 advancing to the next 'string' token suffices. 

1723 """ 

1724 token = self.find_next_significant_token() 

1725 yield from self.gen_token('string', token.value) 

1726 #@+node:ekr.20191113063144.33: *6* tog.comprehension 

1727 # comprehension = (expr target, expr iter, expr* ifs, int is_async) 

1728 

1729 def do_comprehension(self, node): 

1730 

1731 # No need to put parentheses. 

1732 yield from self.gen_name('for') # #1858. 

1733 yield from self.gen(node.target) # A name 

1734 yield from self.gen_name('in') 

1735 yield from self.gen(node.iter) 

1736 for z in node.ifs or []: 

1737 yield from self.gen_name('if') 

1738 yield from self.gen(z) 

1739 #@+node:ekr.20191113063144.34: *6* tog.Constant 

1740 def do_Constant(self, node): # pragma: no cover 

1741 """ 

1742 

1743 https://greentreesnakes.readthedocs.io/en/latest/nodes.html 

1744 

1745 A constant. The value attribute holds the Python object it represents. 

1746 This can be simple types such as a number, string or None, but also 

1747 immutable container types (tuples and frozensets) if all of their 

1748 elements are constant. 

1749 """ 

1750 

1751 # Support Python 3.8. 

1752 if node.value is None or isinstance(node.value, bool): 

1753 # Weird: return a name! 

1754 yield from self.gen_token('name', repr(node.value)) 

1755 elif node.value == Ellipsis: 

1756 yield from self.gen_op('...') 

1757 elif isinstance(node.value, str): 

1758 yield from self.do_Str(node) 

1759 elif isinstance(node.value, (int, float)): 

1760 yield from self.gen_token('number', repr(node.value)) 

1761 elif isinstance(node.value, bytes): 

1762 yield from self.do_Bytes(node) 

1763 elif isinstance(node.value, tuple): 

1764 yield from self.do_Tuple(node) 

1765 elif isinstance(node.value, frozenset): 

1766 yield from self.do_Set(node) 

1767 else: 

1768 # Unknown type. 

1769 g.trace('----- Oops -----', repr(node.value), g.callers()) 

1770 #@+node:ekr.20191113063144.35: *6* tog.Dict 

1771 # Dict(expr* keys, expr* values) 

1772 

1773 def do_Dict(self, node): 

1774 

1775 assert len(node.keys) == len(node.values) 

1776 yield from self.gen_op('{') 

1777 # No need to put commas. 

1778 for i, key in enumerate(node.keys): 

1779 key, value = node.keys[i], node.values[i] 

1780 yield from self.gen(key) # a Str node. 

1781 yield from self.gen_op(':') 

1782 if value is not None: 

1783 yield from self.gen(value) 

1784 yield from self.gen_op('}') 

1785 #@+node:ekr.20191113063144.36: *6* tog.DictComp 

1786 # DictComp(expr key, expr value, comprehension* generators) 

1787 

1788 # d2 = {val: key for key, val in d} 

1789 

1790 def do_DictComp(self, node): 

1791 

1792 yield from self.gen_token('op', '{') 

1793 yield from self.gen(node.key) 

1794 yield from self.gen_op(':') 

1795 yield from self.gen(node.value) 

1796 for z in node.generators or []: 

1797 yield from self.gen(z) 

1798 yield from self.gen_token('op', '}') 

1799 #@+node:ekr.20191113063144.37: *6* tog.Ellipsis 

1800 def do_Ellipsis(self, node): # pragma: no cover (Does not exist for python 3.8+) 

1801 

1802 yield from self.gen_op('...') 

1803 #@+node:ekr.20191113063144.38: *6* tog.ExtSlice 

1804 # https://docs.python.org/3/reference/expressions.html#slicings 

1805 

1806 # ExtSlice(slice* dims) 

1807 

1808 def do_ExtSlice(self, node): # pragma: no cover (deprecated) 

1809 

1810 # ','.join(node.dims) 

1811 for i, z in enumerate(node.dims): 

1812 yield from self.gen(z) 

1813 if i < len(node.dims) - 1: 

1814 yield from self.gen_op(',') 

1815 #@+node:ekr.20191113063144.40: *6* tog.Index 

1816 def do_Index(self, node): # pragma: no cover (deprecated) 

1817 

1818 yield from self.gen(node.value) 

1819 #@+node:ekr.20191113063144.39: *6* tog.FormattedValue: not called! 

1820 # FormattedValue(expr value, int? conversion, expr? format_spec) 

1821 

1822 def do_FormattedValue(self, node): # pragma: no cover 

1823 """ 

1824 This node represents the *components* of a *single* f-string. 

1825 

1826 Happily, JoinedStr nodes *also* represent *all* f-strings, 

1827 so the TOG should *never* visit this node! 

1828 """ 

1829 filename = getattr(self, 'filename', '<no file>') 

1830 raise AssignLinksError( 

1831 f"file: {filename}\n" 

1832 f"do_FormattedValue should never be called") 

1833 

1834 # This code has no chance of being useful... 

1835 

1836 # conv = node.conversion 

1837 # spec = node.format_spec 

1838 # yield from self.gen(node.value) 

1839 # if conv is not None: 

1840 # yield from self.gen_token('number', conv) 

1841 # if spec is not None: 

1842 # yield from self.gen(node.format_spec) 

1843 #@+node:ekr.20191113063144.41: *6* tog.JoinedStr & helpers 

1844 # JoinedStr(expr* values) 

1845 

1846 def do_JoinedStr(self, node): 

1847 """ 

1848 JoinedStr nodes represent at least one f-string and all other strings 

1849 concatenated to it. 

1850 

1851 Analyzing JoinedStr.values would be extremely tricky, for reasons that 

1852 need not be explained here. 

1853 

1854 Instead, we get the tokens *from the token list itself*! 

1855 """ 

1856 for z in self.get_concatenated_string_tokens(): 

1857 yield from self.gen_token(z.kind, z.value) 

1858 #@+node:ekr.20191113063144.42: *6* tog.List 

1859 def do_List(self, node): 

1860 

1861 # No need to put commas. 

1862 yield from self.gen_op('[') 

1863 yield from self.gen(node.elts) 

1864 yield from self.gen_op(']') 

1865 #@+node:ekr.20191113063144.43: *6* tog.ListComp 

1866 # ListComp(expr elt, comprehension* generators) 

1867 

1868 def do_ListComp(self, node): 

1869 

1870 yield from self.gen_op('[') 

1871 yield from self.gen(node.elt) 

1872 for z in node.generators: 

1873 yield from self.gen(z) 

1874 yield from self.gen_op(']') 

1875 #@+node:ekr.20191113063144.44: *6* tog.Name & NameConstant 

1876 def do_Name(self, node): 

1877 

1878 yield from self.gen_name(node.id) 

1879 

1880 def do_NameConstant(self, node): # pragma: no cover (Does not exist in Python 3.8+) 

1881 

1882 yield from self.gen_name(repr(node.value)) 

1883 

1884 #@+node:ekr.20191113063144.45: *6* tog.Num 

1885 def do_Num(self, node): # pragma: no cover (Does not exist in Python 3.8+) 

1886 

1887 yield from self.gen_token('number', node.n) 

1888 #@+node:ekr.20191113063144.47: *6* tog.Set 

1889 # Set(expr* elts) 

1890 

1891 def do_Set(self, node): 

1892 

1893 yield from self.gen_op('{') 

1894 yield from self.gen(node.elts) 

1895 yield from self.gen_op('}') 

1896 #@+node:ekr.20191113063144.48: *6* tog.SetComp 

1897 # SetComp(expr elt, comprehension* generators) 

1898 

1899 def do_SetComp(self, node): 

1900 

1901 yield from self.gen_op('{') 

1902 yield from self.gen(node.elt) 

1903 for z in node.generators or []: 

1904 yield from self.gen(z) 

1905 yield from self.gen_op('}') 

1906 #@+node:ekr.20191113063144.49: *6* tog.Slice 

1907 # slice = Slice(expr? lower, expr? upper, expr? step) 
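# For example (hypothetical snippets): 'x[1:]' syncs a single colon,
# 'x[1:2:3]' syncs both colons and the step, and 'x[1:2:]' consults the
# token list to decide whether the optional second colon is present.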

1908 

1909 def do_Slice(self, node): 

1910 

1911 lower = getattr(node, 'lower', None) 

1912 upper = getattr(node, 'upper', None) 

1913 step = getattr(node, 'step', None) 

1914 if lower is not None: 

1915 yield from self.gen(lower) 

1916 # Always put the colon between lower and upper. 

1917 yield from self.gen_op(':') 

1918 if upper is not None: 

1919 yield from self.gen(upper) 

1920 # Put the second colon if it exists in the token list. 

1921 if step is None: 

1922 token = self.find_next_significant_token() 

1923 if token and token.value == ':': 

1924 yield from self.gen_op(':') 

1925 else: 

1926 yield from self.gen_op(':') 

1927 yield from self.gen(step) 

1928 #@+node:ekr.20191113063144.50: *6* tog.Str & helper 

1929 def do_Str(self, node): 

1930 """This node represents a string constant.""" 

1931 # This loop is necessary to handle string concatenation. 

1932 for z in self.get_concatenated_string_tokens(): 

1933 yield from self.gen_token(z.kind, z.value) 

1934 #@+node:ekr.20200111083914.1: *7* tog.get_concatenated_string_tokens 

1935 def get_concatenated_string_tokens(self): 

1936 """ 

1937 Return the next 'string' token and all 'string' tokens concatenated to 

1938 it. *Never* update self.px here. 

1939 """ 

1940 trace = False 

1941 tag = 'tog.get_concatenated_string_tokens' 

1942 i = self.px 

1943 # First, find the next significant token. It should be a string. 

1944 i, token = i + 1, None 

1945 while i < len(self.tokens): 

1946 token = self.tokens[i] 

1947 i += 1 

1948 if token.kind == 'string': 

1949 # Rescan the string. 

1950 i -= 1 

1951 break 

1952 # An error. 

1953 if is_significant_token(token): # pragma: no cover 

1954 break 

1955 # Raise an error if we didn't find the expected 'string' token. 

1956 if not token or token.kind != 'string': # pragma: no cover 

1957 if not token: 

1958 token = self.tokens[-1] 

1959 filename = getattr(self, 'filename', '<no filename>') 

1960 raise AssignLinksError( 

1961 f"\n" 

1962 f"{tag}...\n" 

1963 f"file: {filename}\n" 

1964 f"line: {token.line_number}\n" 

1965 f" i: {i}\n" 

1966 f"expected 'string' token, got {token!s}") 

1967 # Accumulate string tokens. 

1968 assert self.tokens[i].kind == 'string' 

1969 results = [] 

1970 while i < len(self.tokens): 

1971 token = self.tokens[i] 

1972 i += 1 

1973 if token.kind == 'string': 

1974 results.append(token) 

1975 elif token.kind == 'op' or is_significant_token(token): 

1976 # Any significant token *or* any op will halt string concatenation. 

1977 break 

1978 # 'ws', 'nl', 'newline', 'comment', 'indent', 'dedent', etc. 

1979 # The (significant) 'endmarker' token ensures we will have a result. 

1980 assert results 

1981 if trace: # pragma: no cover 

1982 g.printObj(results, tag=f"{tag}: Results") 

1983 return results 

1984 #@+node:ekr.20191113063144.51: *6* tog.Subscript 

1985 # Subscript(expr value, slice slice, expr_context ctx) 

1986 

1987 def do_Subscript(self, node): 

1988 

1989 yield from self.gen(node.value) 

1990 yield from self.gen_op('[') 

1991 yield from self.gen(node.slice) 

1992 yield from self.gen_op(']') 

1993 #@+node:ekr.20191113063144.52: *6* tog.Tuple 

1994 # Tuple(expr* elts, expr_context ctx) 

1995 

1996 def do_Tuple(self, node): 

1997 

1998 # Do not call gen_op for parens or commas here. 

1999 # They do not necessarily exist in the token list! 

2000 yield from self.gen(node.elts) 

2001 #@+node:ekr.20191113063144.53: *5* tog: Operators 

2002 #@+node:ekr.20191113063144.55: *6* tog.BinOp 

2003 def do_BinOp(self, node): 

2004 

2005 op_name_ = op_name(node.op) 

2006 yield from self.gen(node.left) 

2007 yield from self.gen_op(op_name_) 

2008 yield from self.gen(node.right) 

2009 #@+node:ekr.20191113063144.56: *6* tog.BoolOp 

2010 # BoolOp(boolop op, expr* values) 

2011 

2012 def do_BoolOp(self, node): 

2013 

2014 # op.join(node.values) 

2015 op_name_ = op_name(node.op) 

2016 for i, z in enumerate(node.values): 

2017 yield from self.gen(z) 

2018 if i < len(node.values) - 1: 

2019 yield from self.gen_name(op_name_) 

2020 #@+node:ekr.20191113063144.57: *6* tog.Compare 

2021 # Compare(expr left, cmpop* ops, expr* comparators) 

2022 

2023 def do_Compare(self, node): 

2024 

2025 assert len(node.ops) == len(node.comparators) 

2026 yield from self.gen(node.left) 

2027 for i, z in enumerate(node.ops): 

2028 op_name_ = op_name(node.ops[i]) 

2029 if op_name_ in ('not in', 'is not'): 

2030 for z in op_name_.split(' '): 

2031 yield from self.gen_name(z) 

2032 elif op_name_.isalpha(): 

2033 yield from self.gen_name(op_name_) 

2034 else: 

2035 yield from self.gen_op(op_name_) 

2036 yield from self.gen(node.comparators[i]) 

2037 #@+node:ekr.20191113063144.58: *6* tog.UnaryOp 

2038 def do_UnaryOp(self, node): 

2039 

2040 op_name_ = op_name(node.op) 

2041 if op_name_.isalpha(): 

2042 yield from self.gen_name(op_name_) 

2043 else: 

2044 yield from self.gen_op(op_name_) 

2045 yield from self.gen(node.operand) 

2046 #@+node:ekr.20191113063144.59: *6* tog.IfExp (ternary operator) 

2047 # IfExp(expr test, expr body, expr orelse) 

2048 

2049 def do_IfExp(self, node): 

2050 

2051 #'%s if %s else %s' 

2052 yield from self.gen(node.body) 

2053 yield from self.gen_name('if') 

2054 yield from self.gen(node.test) 

2055 yield from self.gen_name('else') 

2056 yield from self.gen(node.orelse) 

2057 #@+node:ekr.20191113063144.60: *5* tog: Statements 

2058 #@+node:ekr.20191113063144.83: *6* tog.Starred 

2059 # Starred(expr value, expr_context ctx) 

2060 

2061 def do_Starred(self, node): 

2062 """A starred argument to an ast.Call""" 

2063 yield from self.gen_op('*') 

2064 yield from self.gen(node.value) 

2065 #@+node:ekr.20191113063144.61: *6* tog.AnnAssign 

2066 # AnnAssign(expr target, expr annotation, expr? value, int simple) 

2067 

2068 def do_AnnAssign(self, node): 

2069 

2070 # {node.target}:{node.annotation}={node.value}\n' 

2071 yield from self.gen(node.target) 

2072 yield from self.gen_op(':') 

2073 yield from self.gen(node.annotation) 

2074 if node.value is not None: # #1851 

2075 yield from self.gen_op('=') 

2076 yield from self.gen(node.value) 

2077 #@+node:ekr.20191113063144.62: *6* tog.Assert 

2078 # Assert(expr test, expr? msg) 

2079 

2080 def do_Assert(self, node): 

2081 

2082 # Guards... 

2083 msg = getattr(node, 'msg', None) 

2084 # No need to put parentheses or commas. 

2085 yield from self.gen_name('assert') 

2086 yield from self.gen(node.test) 

2087 if msg is not None: 

2088 yield from self.gen(node.msg) 

2089 #@+node:ekr.20191113063144.63: *6* tog.Assign 

2090 def do_Assign(self, node): 

2091 

2092 for z in node.targets: 

2093 yield from self.gen(z) 

2094 yield from self.gen_op('=') 

2095 yield from self.gen(node.value) 

2096 #@+node:ekr.20191113063144.64: *6* tog.AsyncFor 

2097 def do_AsyncFor(self, node): 

2098 

2099 # The 'async for' line... 

2100 # Py 3.8 changes the kind of token. 

2101 async_token_type = 'async' if has_async_tokens else 'name' 

2102 yield from self.gen_token(async_token_type, 'async') 

2103 yield from self.gen_name('for') 

2104 yield from self.gen(node.target) 

2105 yield from self.gen_name('in') 

2106 yield from self.gen(node.iter) 

2107 yield from self.gen_op(':') 

2108 # Body... 

2109 self.level += 1 

2110 yield from self.gen(node.body) 

2111 # Else clause... 

2112 if node.orelse: 

2113 yield from self.gen_name('else') 

2114 yield from self.gen_op(':') 

2115 yield from self.gen(node.orelse) 

2116 self.level -= 1 

2117 #@+node:ekr.20191113063144.65: *6* tog.AsyncWith 

2118 def do_AsyncWith(self, node): 

2119 

2120 async_token_type = 'async' if has_async_tokens else 'name' 

2121 yield from self.gen_token(async_token_type, 'async') 

2122 yield from self.do_With(node) 

2123 #@+node:ekr.20191113063144.66: *6* tog.AugAssign 

2124 # AugAssign(expr target, operator op, expr value) 

2125 

2126 def do_AugAssign(self, node): 

2127 

2128 # %s%s=%s\n' 

2129 op_name_ = op_name(node.op) 

2130 yield from self.gen(node.target) 

2131 yield from self.gen_op(op_name_ + '=') 

2132 yield from self.gen(node.value) 

2133 #@+node:ekr.20191113063144.67: *6* tog.Await 

2134 # Await(expr value) 

2135 

2136 def do_Await(self, node): 

2137 

2138 #'await %s\n' 

2139 async_token_type = 'await' if has_async_tokens else 'name' 

2140 yield from self.gen_token(async_token_type, 'await') 

2141 yield from self.gen(node.value) 

2142 #@+node:ekr.20191113063144.68: *6* tog.Break 

2143 def do_Break(self, node): 

2144 

2145 yield from self.gen_name('break') 

2146 #@+node:ekr.20191113063144.31: *6* tog.Call & helpers 

2147 # Call(expr func, expr* args, keyword* keywords) 

2148 

2149 # Python 3 ast.Call nodes do not have 'starargs' or 'kwargs' fields. 

2150 

2151 def do_Call(self, node): 

2152 

2153 # The calls to gen_op(')') and gen_op('(') do nothing by default. 

2154 # Subclasses might handle them in an overridden tog.set_links. 

2155 yield from self.gen(node.func) 

2156 yield from self.gen_op('(') 

2157 # No need to generate any commas. 

2158 yield from self.handle_call_arguments(node) 

2159 yield from self.gen_op(')') 

2160 #@+node:ekr.20191204114930.1: *7* tog.arg_helper 

2161 def arg_helper(self, node): 

2162 """ 

2163 Yield the node, with a special case for strings. 

2164 """ 

2165 if isinstance(node, str): 

2166 yield from self.gen_token('name', node) 

2167 else: 

2168 yield from self.gen(node) 

2169 #@+node:ekr.20191204105506.1: *7* tog.handle_call_arguments 

2170 def handle_call_arguments(self, node): 

2171 """ 

2172 Generate arguments in the correct order. 

2173 

2174 Call(expr func, expr* args, keyword* keywords) 

2175 

2176 https://docs.python.org/3/reference/expressions.html#calls 

2177 

2178 Warning: This code will fail on Python 3.8 only for calls 

2179 containing kwargs in unexpected places. 

2180 """ 

2181 # *args: in node.args[]: Starred(value=Name(id='args')) 

2182 # *[a, 3]: in node.args[]: Starred(value=List(elts=[Name(id='a'), Num(n=3)]) 

2183 # **kwargs: in node.keywords[]: keyword(arg=None, value=Name(id='kwargs')) 

2184 # 
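# For example, with Python 3.9+ a hypothetical call such as g(a, *b, k=1, **kw)
# is synced in source order: a, '*', b, k, '=', 1, '**', kw.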

2185 # Scan args for *name or *List 

2186 args = node.args or [] 

2187 keywords = node.keywords or [] 

2188 

2189 def get_pos(obj): 

2190 line1 = getattr(obj, 'lineno', None) 

2191 col1 = getattr(obj, 'col_offset', None) 

2192 return line1, col1, obj 

2193 

2194 def sort_key(aTuple): 

2195 line, col, obj = aTuple 

2196 return line * 1000 + col 

2197 

2198 if 0: 

2199 g.printObj([ast.dump(z) for z in args], tag='args') 

2200 g.printObj([ast.dump(z) for z in keywords], tag='keywords') 

2201 

2202 if py_version >= (3, 9): 

2203 places = [get_pos(z) for z in args + keywords] 

2204 places.sort(key=sort_key) 

2205 ordered_args = [z[2] for z in places] 

2206 for z in ordered_args: 

2207 if isinstance(z, ast.Starred): 

2208 yield from self.gen_op('*') 

2209 yield from self.gen(z.value) 

2210 elif isinstance(z, ast.keyword): 

2211 if getattr(z, 'arg', None) is None: 

2212 yield from self.gen_op('**') 

2213 yield from self.arg_helper(z.value) 

2214 else: 

2215 yield from self.arg_helper(z.arg) 

2216 yield from self.gen_op('=') 

2217 yield from self.arg_helper(z.value) 

2218 else: 

2219 yield from self.arg_helper(z) 

2220 else: # pragma: no cover 

2221 # 

2222 # Legacy code: May fail for Python 3.8 

2223 # 

2224 # Scan args for *arg and *[...] 

2225 kwarg_arg = star_arg = None 

2226 for z in args: 

2227 if isinstance(z, ast.Starred): 

2228 if isinstance(z.value, ast.Name): # *Name. 

2229 star_arg = z 

2230 args.remove(z) 

2231 break 

2232 elif isinstance(z.value, (ast.List, ast.Tuple)): # *[...] 

2233 # star_list = z 

2234 break 

2235 raise AttributeError(f"Invalid * expression: {ast.dump(z)}") # pragma: no cover 

2236 # Scan keywords for **name. 

2237 for z in keywords: 

2238 if hasattr(z, 'arg') and z.arg is None: 

2239 kwarg_arg = z 

2240 keywords.remove(z) 

2241 break 

2242 # Sync the plain arguments. 

2243 for z in args: 

2244 yield from self.arg_helper(z) 

2245 # Sync the keyword args. 

2246 for z in keywords: 

2247 yield from self.arg_helper(z.arg) 

2248 yield from self.gen_op('=') 

2249 yield from self.arg_helper(z.value) 

2250 # Sync the * arg. 

2251 if star_arg: 

2252 yield from self.arg_helper(star_arg) 

2253 # Sync the ** kwarg. 

2254 if kwarg_arg: 

2255 yield from self.gen_op('**') 

2256 yield from self.gen(kwarg_arg.value) 

2257 #@+node:ekr.20191113063144.69: *6* tog.Continue 

2258 def do_Continue(self, node): 

2259 

2260 yield from self.gen_name('continue') 

2261 #@+node:ekr.20191113063144.70: *6* tog.Delete 

2262 def do_Delete(self, node): 

2263 

2264 # No need to put commas. 

2265 yield from self.gen_name('del') 

2266 yield from self.gen(node.targets) 

2267 #@+node:ekr.20191113063144.71: *6* tog.ExceptHandler 

2268 def do_ExceptHandler(self, node): 

2269 

2270 # Except line... 

2271 yield from self.gen_name('except') 

2272 if getattr(node, 'type', None): 

2273 yield from self.gen(node.type) 

2274 if getattr(node, 'name', None): 

2275 yield from self.gen_name('as') 

2276 yield from self.gen_name(node.name) 

2277 yield from self.gen_op(':') 

2278 # Body... 

2279 self.level += 1 

2280 yield from self.gen(node.body) 

2281 self.level -= 1 

2282 #@+node:ekr.20191113063144.73: *6* tog.For 

2283 def do_For(self, node): 

2284 

2285 # The 'for' line... 

2286 yield from self.gen_name('for') 

2287 yield from self.gen(node.target) 

2288 yield from self.gen_name('in') 

2289 yield from self.gen(node.iter) 

2290 yield from self.gen_op(':') 

2291 # Body... 

2292 self.level += 1 

2293 yield from self.gen(node.body) 

2294 # Else clause... 

2295 if node.orelse: 

2296 yield from self.gen_name('else') 

2297 yield from self.gen_op(':') 

2298 yield from self.gen(node.orelse) 

2299 self.level -= 1 

2300 #@+node:ekr.20191113063144.74: *6* tog.Global 

2301 # Global(identifier* names) 

2302 

2303 def do_Global(self, node): 

2304 

2305 yield from self.gen_name('global') 

2306 for z in node.names: 

2307 yield from self.gen_name(z) 

2308 #@+node:ekr.20191113063144.75: *6* tog.If & helpers 

2309 # If(expr test, stmt* body, stmt* orelse) 

2310 

2311 def do_If(self, node): 

2312 #@+<< do_If docstring >> 

2313 #@+node:ekr.20191122222412.1: *7* << do_If docstring >> 

2314 """ 

2315 The parse trees for the following are identical! 

2316 

2317 if 1:              if 1: 

2318     pass               pass 

2319 else:              elif 2: 

2320     if 2:              pass 

2321         pass 

2322 

2323 So there is *no* way for the 'if' visitor to disambiguate the above two 

2324 cases from the parse tree alone. 

2325 

2326 Instead, we scan the tokens list for the next 'if', 'else' or 'elif' token. 

2327 """ 

2328 #@-<< do_If docstring >> 

2329 # Use the next significant token to distinguish between 'if' and 'elif'. 

2330 token = self.find_next_significant_token() 

2331 yield from self.gen_name(token.value) 

2332 yield from self.gen(node.test) 

2333 yield from self.gen_op(':') 

2334 # 

2335 # Body... 

2336 self.level += 1 

2337 yield from self.gen(node.body) 

2338 self.level -= 1 

2339 # 

2340 # Else and elif clauses... 

2341 if node.orelse: 

2342 self.level += 1 

2343 token = self.find_next_significant_token() 

2344 if token.value == 'else': 

2345 yield from self.gen_name('else') 

2346 yield from self.gen_op(':') 

2347 yield from self.gen(node.orelse) 

2348 else: 

2349 yield from self.gen(node.orelse) 

2350 self.level -= 1 

2351 #@+node:ekr.20191113063144.76: *6* tog.Import & helper 

2352 def do_Import(self, node): 

2353 

2354 yield from self.gen_name('import') 

2355 for alias in node.names: 

2356 yield from self.gen_name(alias.name) 

2357 if alias.asname: 

2358 yield from self.gen_name('as') 

2359 yield from self.gen_name(alias.asname) 

2360 #@+node:ekr.20191113063144.77: *6* tog.ImportFrom 

2361 # ImportFrom(identifier? module, alias* names, int? level) 

2362 

2363 def do_ImportFrom(self, node): 

2364 

2365 yield from self.gen_name('from') 

2366 for i in range(node.level): 

2367 yield from self.gen_op('.') 

2368 if node.module: 

2369 yield from self.gen_name(node.module) 

2370 yield from self.gen_name('import') 

2371 # No need to put commas. 

2372 for alias in node.names: 

2373 if alias.name == '*': # #1851. 

2374 yield from self.gen_op('*') 

2375 else: 

2376 yield from self.gen_name(alias.name) 

2377 if alias.asname: 

2378 yield from self.gen_name('as') 

2379 yield from self.gen_name(alias.asname) 

2380 #@+node:ekr.20191113063144.78: *6* tog.Nonlocal 

2381 # Nonlocal(identifier* names) 

2382 

2383 def do_Nonlocal(self, node): 

2384 

2385 # nonlocal %s\n' % ','.join(node.names)) 

2386 # No need to put commas. 

2387 yield from self.gen_name('nonlocal') 

2388 for z in node.names: 

2389 yield from self.gen_name(z) 

2390 #@+node:ekr.20191113063144.79: *6* tog.Pass 

2391 def do_Pass(self, node): 

2392 

2393 yield from self.gen_name('pass') 

2394 #@+node:ekr.20191113063144.81: *6* tog.Raise 

2395 # Raise(expr? exc, expr? cause) 

2396 

2397 def do_Raise(self, node): 

2398 

2399 # No need to put commas. 

2400 yield from self.gen_name('raise') 

2401 exc = getattr(node, 'exc', None) 

2402 cause = getattr(node, 'cause', None) 

2403 tback = getattr(node, 'tback', None) 

2404 yield from self.gen(exc) 

2405 if cause: 

2406 yield from self.gen_name('from') # #2446. 

2407 yield from self.gen(cause) 

2408 yield from self.gen(tback) 

2409 #@+node:ekr.20191113063144.82: *6* tog.Return 

2410 def do_Return(self, node): 

2411 

2412 yield from self.gen_name('return') 

2413 yield from self.gen(node.value) 

2414 #@+node:ekr.20191113063144.85: *6* tog.Try 

2415 # Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) 

2416 

2417 def do_Try(self, node): 

2418 

2419 # Try line... 

2420 yield from self.gen_name('try') 

2421 yield from self.gen_op(':') 

2422 # Body... 

2423 self.level += 1 

2424 yield from self.gen(node.body) 

2425 yield from self.gen(node.handlers) 

2426 # Else... 

2427 if node.orelse: 

2428 yield from self.gen_name('else') 

2429 yield from self.gen_op(':') 

2430 yield from self.gen(node.orelse) 

2431 # Finally... 

2432 if node.finalbody: 

2433 yield from self.gen_name('finally') 

2434 yield from self.gen_op(':') 

2435 yield from self.gen(node.finalbody) 

2436 self.level -= 1 

2437 #@+node:ekr.20191113063144.88: *6* tog.While 

2438 def do_While(self, node): 

2439 

2440 # While line... 

2441 # while %s:\n' 

2442 yield from self.gen_name('while') 

2443 yield from self.gen(node.test) 

2444 yield from self.gen_op(':') 

2445 # Body... 

2446 self.level += 1 

2447 yield from self.gen(node.body) 

2448 # Else clause... 

2449 if node.orelse: 

2450 yield from self.gen_name('else') 

2451 yield from self.gen_op(':') 

2452 yield from self.gen(node.orelse) 

2453 self.level -= 1 

2454 #@+node:ekr.20191113063144.89: *6* tog.With 

2455 # With(withitem* items, stmt* body) 

2456 

2457 # withitem = (expr context_expr, expr? optional_vars) 

2458 

2459 def do_With(self, node): 

2460 

2461 expr: Optional[ast.AST] = getattr(node, 'context_expression', None) 

2462 items: List[ast.AST] = getattr(node, 'items', []) 

2463 yield from self.gen_name('with') 

2464 yield from self.gen(expr) 

2465 # No need to put commas. 

2466 for item in items: 

2467 yield from self.gen(item.context_expr) # type:ignore 

2468 optional_vars = getattr(item, 'optional_vars', None) 

2469 if optional_vars is not None: 

2470 yield from self.gen_name('as') 

2471 yield from self.gen(item.optional_vars) # type:ignore 

2472 # End the line. 

2473 yield from self.gen_op(':') 

2474 # Body... 

2475 self.level += 1 

2476 yield from self.gen(node.body) 

2477 self.level -= 1 

2478 #@+node:ekr.20191113063144.90: *6* tog.Yield 

2479 def do_Yield(self, node): 

2480 

2481 yield from self.gen_name('yield') 

2482 if hasattr(node, 'value'): 

2483 yield from self.gen(node.value) 

2484 #@+node:ekr.20191113063144.91: *6* tog.YieldFrom 

2485 # YieldFrom(expr value) 

2486 

2487 def do_YieldFrom(self, node): 

2488 

2489 yield from self.gen_name('yield') 

2490 yield from self.gen_name('from') 

2491 yield from self.gen(node.value) 

2492 #@-others 

2493#@+node:ekr.20191226195813.1: *3* class TokenOrderTraverser 

2494class TokenOrderTraverser: 

2495 """ 

2496 Traverse an ast tree using the parent/child links created by the 

2497 TokenOrderInjector class. 

2498 """ 

2499 #@+others 

2500 #@+node:ekr.20191226200154.1: *4* TOT.traverse 

2501 def traverse(self, tree): 

2502 """ 

2503 Call visit, in token order, for all nodes in tree. 

2504 

2505 Recursion is not allowed. 

2506 

2507 The code follows p.moveToThreadNext exactly. 

2508 """ 

2509 

2510 def has_next(i, node, stack): 

2511 """Return True if stack[i] is a valid child of node.parent.""" 

2512 # g.trace(node.__class__.__name__, stack) 

2513 parent = node.parent 

2514 return bool(parent and parent.children and i < len(parent.children)) 

2515 

2516 # Update stats 

2517 

2518 self.last_node_index = -1 # For visit 

2519 # The stack contains child indices. 

2520 node, stack = tree, [0] 

2521 seen = set() 

2522 while node and stack: 

2523 # g.trace( 

2524 # f"{node.node_index:>3} " 

2525 # f"{node.__class__.__name__:<12} {stack}") 

2526 # Visit the node. 

2527 assert node.node_index not in seen, node.node_index 

2528 seen.add(node.node_index) 

2529 self.visit(node) 

2530 # if p.v.children: p.moveToFirstChild() 

2531 children: List[ast.AST] = getattr(node, 'children', []) 

2532 if children: 

2533 # Move to the first child. 

2534 stack.append(0) 

2535 node = children[0] 

2536 # g.trace(' child:', node.__class__.__name__, stack) 

2537 continue 

2538 # elif p.hasNext(): p.moveToNext() 

2539 stack[-1] += 1 

2540 i = stack[-1] 

2541 if has_next(i, node, stack): 

2542 node = node.parent.children[i] 

2543 continue 

2544 # else... 

2545 # p.moveToParent() 

2546 node = node.parent 

2547 stack.pop() 

2548 # while p: 

2549 while node and stack: 

2550 # if p.hasNext(): 

2551 stack[-1] += 1 

2552 i = stack[-1] 

2553 if has_next(i, node, stack): 

2554 # Move to the next sibling. 

2555 node = node.parent.children[i] 

2556 break # Found. 

2557 # p.moveToParent() 

2558 node = node.parent 

2559 stack.pop() 

2560 # not found. 

2561 else: 

2562 break # pragma: no cover 

2563 return self.last_node_index 

2564 #@+node:ekr.20191227160547.1: *4* TOT.visit 

2565 def visit(self, node): 

2566 

2567 self.last_node_index += 1 

2568 assert self.last_node_index == node.node_index, ( 

2569 self.last_node_index, node.node_index) 

2570 #@-others 

2571#@+node:ekr.20200107165250.1: *3* class Orange 

2572class Orange: 

2573 """ 

2574 A flexible and powerful beautifier for Python. 

2575 Orange is the new black. 

2576 

2577 *Important*: This is predominantly a *token*-based beautifier. 

2578 However, orange.colon and orange.possible_unary_op use the parse 

2579 tree to provide context that would otherwise be difficult to 

2580 deduce. 

2581 """ 

2582 # This switch is really a comment. It will always be false. 

2583 # It marks the code that simulates the operation of the black tool. 

2584 black_mode = False 

2585 

2586 # Patterns... 

2587 nobeautify_pat = re.compile(r'\s*#\s*pragma:\s*no\s*beautify\b|#\s*@@nobeautify') 

2588 

2589 # Patterns from FastAtRead class, specialized for python delims. 

2590 node_pat = re.compile(r'^(\s*)#@\+node:([^:]+): \*(\d+)?(\*?) (.*)$') # @node 

2591 start_doc_pat = re.compile(r'^\s*#@\+(at|doc)?(\s.*?)?$') # @doc or @ 

2592 at_others_pat = re.compile(r'^(\s*)#@(\+|-)others\b(.*)$') # @others 

2593 

2594 # Doc parts end with @c or a node sentinel. Specialized for python. 

2595 end_doc_pat = re.compile(r"^\s*#@(@(c(ode)?)|([+]node\b.*))$") 

2596 #@+others 

2597 #@+node:ekr.20200107165250.2: *4* orange.ctor 

2598 def __init__(self, settings=None): 

2599 """Ctor for Orange class.""" 

2600 if settings is None: 

2601 settings = {} 

2602 valid_keys = ( 

2603 'allow_joined_strings', 

2604 'max_join_line_length', 

2605 'max_split_line_length', 

2606 'orange', 

2607 'tab_width', 

2608 ) 

2609 # For mypy... 

2610 self.kind: str = '' 

2611 # Default settings... 

2612 self.allow_joined_strings = False # EKR's preference. 

2613 self.max_join_line_length = 88 

2614 self.max_split_line_length = 88 

2615 self.tab_width = 4 

2616 # Override from settings dict... 

2617 for key in settings: # pragma: no cover 

2618 value = settings.get(key) 

2619 if key in valid_keys and value is not None: 

2620 setattr(self, key, value) 

2621 else: 

2622 g.trace(f"Unexpected setting: {key} = {value!r}") 

2623 #@+node:ekr.20200107165250.51: *4* orange.push_state 

2624 def push_state(self, kind, value=None): 

2625 """Append a state to the state stack.""" 

2626 state = ParseState(kind, value) 

2627 self.state_stack.append(state) 

2628 #@+node:ekr.20200107165250.8: *4* orange: Entries 

2629 #@+node:ekr.20200107173542.1: *5* orange.beautify (main token loop) 

2630 def oops(self): # pragma: no cover 

2631 g.trace(f"Unknown kind: {self.kind}") 

2632 

2633 def beautify(self, contents, filename, tokens, tree, max_join_line_length=None, max_split_line_length=None): 

2634 """ 

2635 The main line. Create output tokens and return the result as a string. 

2636 """ 

2637 # Config overrides 

2638 if max_join_line_length is not None: 

2639 self.max_join_line_length = max_join_line_length 

2640 if max_split_line_length is not None: 

2641 self.max_split_line_length = max_split_line_length 

2642 # State vars... 

2643 self.curly_brackets_level = 0 # Number of unmatched '{' tokens. 

2644 self.decorator_seen = False # Set by do_name for do_op. 

2645 self.in_arg_list = 0 # > 0 if in an arg list of a def. 

2646 self.level = 0 # Set only by do_indent and do_dedent. 

2647 self.lws = '' # Leading whitespace. 

2648 self.paren_level = 0 # Number of unmatched '(' tokens. 

2649 self.square_brackets_stack: List[bool] = [] # A stack of bools, for self.word(). 

2650 self.state_stack: List["ParseState"] = [] # Stack of ParseState objects. 

2651 self.val = None # The input token's value (a string). 

2652 self.verbatim = False # True: don't beautify. 

2653 # 

2654 # Init output list and state... 

2655 self.code_list: List[Token] = [] # The list of output tokens. 

2656 self.code_list_index = 0 # The token's index. 

2657 self.tokens = tokens # The list of input tokens. 

2658 self.tree = tree 

2659 self.add_token('file-start', '') 

2660 self.push_state('file-start') 

2661 for i, token in enumerate(tokens): 

2662 self.token = token 

2663 self.kind, self.val, self.line = token.kind, token.value, token.line 

2664 if self.verbatim: 

2665 self.do_verbatim() 

2666 else: 

2667 func = getattr(self, f"do_{token.kind}", self.oops) 

2668 func() 

2669 # Any post pass would go here. 

2670 return tokens_to_string(self.code_list) 

2671 #@+node:ekr.20200107172450.1: *5* orange.beautify_file (entry) 

2672 def beautify_file(self, filename): # pragma: no cover 

2673 """ 

2674 Orange: Beautify the given external file. 

2675 

2676 Return True if the file was changed. 

2677 """ 

2678 tag = 'beautify-file' 

2679 self.filename = filename 

2680 tog = TokenOrderGenerator() 

2681 contents, encoding, tokens, tree = tog.init_from_file(filename) 

2682 if not contents or not tokens or not tree: 

2683 print(f"{tag}: Can not beautify: {filename}") 

2684 return False 

2685 # Beautify. 

2686 results = self.beautify(contents, filename, tokens, tree) 

2687 # Something besides newlines must change. 

2688 if regularize_nls(contents) == regularize_nls(results): 

2689 print(f"{tag}: Unchanged: {filename}") 

2690 return False 

2691 if 0: # This obscures more important error messages. 

2692 # Show the diffs. 

2693 show_diffs(contents, results, filename=filename) 

2694 # Write the results 

2695 print(f"{tag}: Wrote {filename}") 

2696 write_file(filename, results, encoding=encoding) 

2697 return True 

2698 #@+node:ekr.20200107172512.1: *5* orange.beautify_file_diff (entry) 

2699 def beautify_file_diff(self, filename): # pragma: no cover 

2700 """ 

2701 Orange: Print the diffs that would result from the orange-file command. 

2702 

2703 Return True if the file would be changed. 

2704 """ 

2705 tag = 'diff-beautify-file' 

2706 self.filename = filename 

2707 tog = TokenOrderGenerator() 

2708 contents, encoding, tokens, tree = tog.init_from_file(filename) 

2709 if not contents or not tokens or not tree: 

2710 print(f"{tag}: Can not beautify: {filename}") 

2711 return False 

2712 # Beautify. 

2713 results = self.beautify(contents, filename, tokens, tree) 

2714 # Something besides newlines must change. 

2715 if regularize_nls(contents) == regularize_nls(results): 

2716 print(f"{tag}: Unchanged: {filename}") 

2717 return False 

2718 # Show the diffs. 

2719 show_diffs(contents, results, filename=filename) 

2720 return True 

2721 #@+node:ekr.20200107165250.13: *4* orange: Input token handlers 

2722 #@+node:ekr.20200107165250.14: *5* orange.do_comment 

2723 in_doc_part = False 

2724 

2725 def do_comment(self): 

2726 """Handle a comment token.""" 

2727 val = self.val 

2728 # 

2729 # Leo-specific code... 

2730 if self.node_pat.match(val): 

2731 # Clear per-node state. 

2732 self.in_doc_part = False 

2733 self.verbatim = False 

2734 self.decorator_seen = False 

2735 # Do *not* clear other state, which may persist across @others. 

2736 # self.curly_brackets_level = 0 

2737 # self.in_arg_list = 0 

2738 # self.level = 0 

2739 # self.lws = '' 

2740 # self.paren_level = 0 

2741 # self.square_brackets_stack = [] 

2742 # self.state_stack = [] 

2743 else: 

2744 # Keep track of verbatim mode. 

2745 if self.beautify_pat.match(val): 

2746 self.verbatim = False 

2747 elif self.nobeautify_pat.match(val): 

2748 self.verbatim = True 

2749 # Keep track of @doc parts, to honor the convention for splitting lines. 

2750 if self.start_doc_pat.match(val): 

2751 self.in_doc_part = True 

2752 if self.end_doc_pat.match(val): 

2753 self.in_doc_part = False 

2754 # 

2755 # General code: Generate the comment. 

2756 self.clean('blank') 

2757 entire_line = self.line.lstrip().startswith('#') 

2758 if entire_line: 

2759 self.clean('hard-blank') 

2760 self.clean('line-indent') 

2761 # #1496: No further munging needed. 

2762 val = self.line.rstrip() 

2763 else: 

2764 # Exactly two spaces before trailing comments. 

2765 val = ' ' + self.val.rstrip() 

2766 self.add_token('comment', val) 

2767 #@+node:ekr.20200107165250.15: *5* orange.do_encoding 

2768 def do_encoding(self): 

2769 """ 

2770 Handle the encoding token. 

2771 """ 

2772 pass 

2773 #@+node:ekr.20200107165250.16: *5* orange.do_endmarker 

2774 def do_endmarker(self): 

2775 """Handle an endmarker token.""" 

2776 # Ensure exactly one blank at the end of the file. 

2777 self.clean_blank_lines() 

2778 self.add_token('line-end', '\n') 

2779 #@+node:ekr.20200107165250.18: *5* orange.do_indent & do_dedent & helper 

2780 def do_dedent(self): 

2781 """Handle dedent token.""" 

2782 self.level -= 1 

2783 self.lws = self.level * self.tab_width * ' ' 

2784 self.line_indent() 

2785 if self.black_mode: # pragma: no cover (black) 

2786 state = self.state_stack[-1] 

2787 if state.kind == 'indent' and state.value == self.level: 

2788 self.state_stack.pop() 

2789 state = self.state_stack[-1] 

2790 if state.kind in ('class', 'def'): 

2791 self.state_stack.pop() 

2792 self.handle_dedent_after_class_or_def(state.kind) 

2793 

2794 def do_indent(self): 

2795 """Handle indent token.""" 

2796 new_indent = self.val 

2797 old_indent = self.level * self.tab_width * ' ' 

2798 if new_indent > old_indent: 

2799 self.level += 1 

2800 elif new_indent < old_indent: # pragma: no cover (defensive) 

2801 g.trace('\n===== can not happen', repr(new_indent), repr(old_indent)) 

2802 self.lws = new_indent 

2803 self.line_indent() 

2804 #@+node:ekr.20200220054928.1: *6* orange.handle_dedent_after_class_or_def 

2805 def handle_dedent_after_class_or_def(self, kind): # pragma: no cover (black) 

2806 """ 

2807 Insert blank lines after a class or def as the result of a 'dedent' token. 

2808 

2809 Normal comment lines may precede the 'dedent'. 

2810 Insert the blank lines *before* such comment lines. 

2811 """ 

2812 # 

2813 # Compute the tail. 

2814 i = len(self.code_list) - 1 

2815 tail: List[Token] = [] 

2816 while i > 0: 

2817 t = self.code_list.pop() 

2818 i -= 1 

2819 if t.kind == 'line-indent': 

2820 pass 

2821 elif t.kind == 'line-end': 

2822 tail.insert(0, t) 

2823 elif t.kind == 'comment': 

2824 # Only underindented single-line comments belong in the tail. 

2825 # @+node comments must never be in the tail. 

2826 single_line = self.code_list[i].kind in ('line-end', 'line-indent') 

2827 lws = len(t.value) - len(t.value.lstrip()) 

2828 underindent = lws <= len(self.lws) 

2829 if underindent and single_line and not self.node_pat.match(t.value): 

2830 # A single-line comment. 

2831 tail.insert(0, t) 

2832 else: 

2833 self.code_list.append(t) 

2834 break 

2835 else: 

2836 self.code_list.append(t) 

2837 break 

2838 # 

2839 # Remove leading 'line-end' tokens from the tail. 

2840 while tail and tail[0].kind == 'line-end': 

2841 tail = tail[1:] 

2842 # 

2843 # Put the newlines *before* the tail. 

2844 # For Leo, always use one blank line. 

2845 n = 1 # n = 2 if kind == 'class' else 1 

2846 # Retain the token (intention) for debugging. 

2847 self.add_token('blank-lines', n) 

2848 for i in range(0, n + 1): 

2849 self.add_token('line-end', '\n') 

2850 if tail: 

2851 self.code_list.extend(tail) 

2852 self.line_indent() 

2853 #@+node:ekr.20200107165250.20: *5* orange.do_name 

2854 def do_name(self): 

2855 """Handle a name token.""" 

2856 name = self.val 

2857 if self.black_mode and name in ('class', 'def'): # pragma: no cover (black) 

2858 # Handle newlines before and after 'class' or 'def' 

2859 self.decorator_seen = False 

2860 state = self.state_stack[-1] 

2861 if state.kind == 'decorator': 

2862 # Always do this, regardless of @bool clean-blank-lines. 

2863 self.clean_blank_lines() 

2864 # Suppress split/join. 

2865 self.add_token('hard-newline', '\n') 

2866 self.add_token('line-indent', self.lws) 

2867 self.state_stack.pop() 

2868 else: 

2869 # Always do this, regardless of @bool clean-blank-lines. 

2870 self.blank_lines(2 if name == 'class' else 1) 

2871 self.push_state(name) 

2872 self.push_state('indent', self.level) 

2873 # For trailing lines after inner classes/defs. 

2874 self.word(name) 

2875 return 

2876 # 

2877 # Leo mode... 

2878 if name in ('class', 'def'): 

2879 self.word(name) 

2880 elif name in ( 

2881 'and', 'elif', 'else', 'for', 'if', 'in', 'not', 'not in', 'or', 'while' 

2882 ): 

2883 self.word_op(name) 

2884 else: 

2885 self.word(name) 

2886 #@+node:ekr.20200107165250.21: *5* orange.do_newline & do_nl 

2887 def do_newline(self): 

2888 """Handle a regular newline.""" 

2889 self.line_end() 

2890 

2891 def do_nl(self): 

2892 """Handle a continuation line.""" 

2893 self.line_end() 

2894 #@+node:ekr.20200107165250.22: *5* orange.do_number 

2895 def do_number(self): 

2896 """Handle a number token.""" 

2897 self.blank() 

2898 self.add_token('number', self.val) 

2899 #@+node:ekr.20200107165250.23: *5* orange.do_op 

2900 def do_op(self): 

2901 """Handle an op token.""" 

2902 val = self.val 

2903 if val == '.': 

2904 self.clean('blank') 

2905 prev = self.code_list[-1] 

2906 # #2495: Special case for 'from .' 

2907 if prev.kind == 'word' and prev.value == 'from': 

2908 self.blank() 

2909 self.add_token('op', val) 

2910 self.blank() 

2911 else: 

2912 self.add_token('op-no-blanks', val) 

2913 elif val == '@': 

2914 if self.black_mode: # pragma: no cover (black) 

2915 if not self.decorator_seen: 

2916 self.blank_lines(1) 

2917 self.decorator_seen = True 

2918 self.clean('blank') 

2919 self.add_token('op-no-blanks', val) 

2920 self.push_state('decorator') 

2921 elif val == ':': 

2922 # Treat slices differently. 

2923 self.colon(val) 

2924 elif val in ',;': 

2925 # Pep 8: Avoid extraneous whitespace immediately before 

2926 # comma, semicolon, or colon. 

2927 self.clean('blank') 

2928 self.add_token('op', val) 

2929 self.blank() 

2930 elif val in '([{': 

2931 # Pep 8: Avoid extraneous whitespace immediately inside 

2932 # parentheses, brackets or braces. 

2933 self.lt(val) 

2934 elif val in ')]}': 

2935 # Ditto. 

2936 self.rt(val) 

2937 elif val == '=': 

2938 # Pep 8: Don't use spaces around the = sign when used to indicate 

2939 # a keyword argument or a default parameter value. 
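# For example (hypothetical): 'f(x=1)' keeps 'x=1' tight, whereas 'x = 1' at
# statement level gets one space on each side of the '='.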

2940 if self.paren_level: 

2941 self.clean('blank') 

2942 self.add_token('op-no-blanks', val) 

2943 else: 

2944 self.blank() 

2945 self.add_token('op', val) 

2946 self.blank() 

2947 elif val in '~+-': 

2948 self.possible_unary_op(val) 

2949 elif val == '*': 

2950 self.star_op() 

2951 elif val == '**': 

2952 self.star_star_op() 

2953 else: 

2954 # Pep 8: always surround binary operators with a single space. 

2955 # '==','+=','-=','*=','**=','/=','//=','%=','!=','<=','>=','<','>', 

2956 # '^','~','*','**','&','|','/','//', 

2957 # Pep 8: If operators with different priorities are used, 

2958 # consider adding whitespace around the operators with the lowest priority(ies). 

2959 self.blank() 

2960 self.add_token('op', val) 

2961 self.blank() 

2962 #@+node:ekr.20200107165250.24: *5* orange.do_string 

2963 def do_string(self): 

2964 """Handle a 'string' token.""" 

2965 # Careful: continued strings may contain '\r' 

2966 val = regularize_nls(self.val) 

2967 self.add_token('string', val) 

2968 self.blank() 

2969 #@+node:ekr.20200210175117.1: *5* orange.do_verbatim 

2970 beautify_pat = re.compile( 

2971 r'#\s*pragma:\s*beautify\b|#\s*@@beautify|#\s*@\+node|#\s*@[+-]others|#\s*@[+-]<<') 

2972 

2973 def do_verbatim(self): 

2974 """ 

2975 Handle one token in verbatim mode. 

2976 End verbatim mode when the appropriate comment is seen. 

2977 """ 

2978 kind = self.kind 

2979 # 

2980 # Careful: tokens may contain '\r' 

2981 val = regularize_nls(self.val) 

2982 if kind == 'comment': 

2983 if self.beautify_pat.match(val): 

2984 self.verbatim = False 

2985 val = val.rstrip() 

2986 self.add_token('comment', val) 

2987 return 

2988 if kind == 'indent': 

2989 self.level += 1 

2990 self.lws = self.level * self.tab_width * ' ' 

2991 if kind == 'dedent': 

2992 self.level -= 1 

2993 self.lws = self.level * self.tab_width * ' ' 

2994 self.add_token('verbatim', val) 

2995 #@+node:ekr.20200107165250.25: *5* orange.do_ws 

2996 def do_ws(self): 

2997 """ 

2998 Handle the "ws" pseudo-token. 

2999 

3000 Put the whitespace only if it ends with backslash-newline. 

3001 """ 

3002 val = self.val 

3003 # Handle backslash-newline. 

3004 if '\\\n' in val: 

3005 self.clean('blank') 

3006 self.add_token('op-no-blanks', val) 

3007 return 

3008 # Handle start-of-line whitespace. 

3009 prev = self.code_list[-1] 

3010 inner = self.paren_level or self.square_brackets_stack or self.curly_brackets_level 

3011 if prev.kind == 'line-indent' and inner: 

3012 # Retain the indent that won't be cleaned away. 

3013 self.clean('line-indent') 

3014 self.add_token('hard-blank', val) 

3015 #@+node:ekr.20200107165250.26: *4* orange: Output token generators 

3016 #@+node:ekr.20200118145044.1: *5* orange.add_line_end 

3017 def add_line_end(self): 

3018 """Add a line-end request to the code list.""" 

3019 # This may be called from do_name as well as do_newline and do_nl. 

3020 assert self.token.kind in ('newline', 'nl'), self.token.kind 

3021 self.clean('blank') # Important! 

3022 self.clean('line-indent') 

3023 t = self.add_token('line-end', '\n') 

3024 # Distinguish between kinds of 'line-end' tokens. 

3025 t.newline_kind = self.token.kind 

3026 return t 

3027 #@+node:ekr.20200107170523.1: *5* orange.add_token 

3028 def add_token(self, kind, value): 

3029 """Add an output token to the code list.""" 

3030 tok = Token(kind, value) 

3031 tok.index = self.code_list_index # For debugging only. 

3032 self.code_list_index += 1 

3033 self.code_list.append(tok) 

3034 return tok 

3035 #@+node:ekr.20200107165250.27: *5* orange.blank 

3036 def blank(self): 

3037 """Add a blank request to the code list.""" 

3038 prev = self.code_list[-1] 

3039 if prev.kind not in ( 

3040 'blank', 

3041 'blank-lines', 

3042 'file-start', 

3043 'hard-blank', # Unique to orange. 

3044 'line-end', 

3045 'line-indent', 

3046 'lt', 

3047 'op-no-blanks', 

3048 'unary-op', 

3049 ): 

3050 self.add_token('blank', ' ') 

3051 #@+node:ekr.20200107165250.29: *5* orange.blank_lines (black only) 

3052 def blank_lines(self, n): # pragma: no cover (black) 

3053 """ 

3054 Add a request for n blank lines to the code list. 

3055 Multiple blank-lines requests yield at least the maximum of all requests. 

3056 """ 

3057 self.clean_blank_lines() 

3058 prev = self.code_list[-1] 

3059 if prev.kind == 'file-start': 

3060 self.add_token('blank-lines', n) 

3061 return 

3062 for i in range(0, n + 1): 

3063 self.add_token('line-end', '\n') 

3064 # Retain the token (intention) for debugging. 

3065 self.add_token('blank-lines', n) 

3066 self.line_indent() 

3067 #@+node:ekr.20200107165250.30: *5* orange.clean 

3068 def clean(self, kind): 

3069 """Remove the last item of token list if it has the given kind.""" 

3070 prev = self.code_list[-1] 

3071 if prev.kind == kind: 

3072 self.code_list.pop() 

3073 #@+node:ekr.20200107165250.31: *5* orange.clean_blank_lines 

3074 def clean_blank_lines(self): 

3075 """ 

3076 Remove all vestiges of previous blank lines. 

3077 

3078 Return True if any of the cleaned 'line-end' tokens represented "hard" newlines. 

3079 """ 

3080 cleaned_newline = False 

3081 table = ('blank-lines', 'line-end', 'line-indent') 

3082 while self.code_list[-1].kind in table: 

3083 t = self.code_list.pop() 

3084 if t.kind == 'line-end' and getattr(t, 'newline_kind', None) != 'nl': 

3085 cleaned_newline = True 

3086 return cleaned_newline 

3087 #@+node:ekr.20200107165250.32: *5* orange.colon 

3088 def colon(self, val): 

3089 """Handle a colon.""" 

3090 

3091 def is_expr(node): 

3092 """True if node is any expression other than += number.""" 

3093 if isinstance(node, (ast.BinOp, ast.Call, ast.IfExp)): 

3094 return True 

3095 return isinstance( 

3096 node, ast.UnaryOp) and not isinstance(node.operand, ast.Num) 

3097 

3098 node = self.token.node 

3099 self.clean('blank') 

3100 if not isinstance(node, ast.Slice): 

3101 self.add_token('op', val) 

3102 self.blank() 

3103 return 

3104 # A slice. 

3105 lower = getattr(node, 'lower', None) 

3106 upper = getattr(node, 'upper', None) 

3107 step = getattr(node, 'step', None) 

3108 if any(is_expr(z) for z in (lower, upper, step)): 

3109 prev = self.code_list[-1] 

3110 if prev.value not in '[:': 

3111 self.blank() 

3112 self.add_token('op', val) 

3113 self.blank() 

3114 else: 

3115 self.add_token('op-no-blanks', val) 

3116 #@+node:ekr.20200107165250.33: *5* orange.line_end 

3117 def line_end(self): 

3118 """Add a line-end request to the code list.""" 

3119 # This should be called only by do_newline and do_nl. 

3120 node, token = self.token.statement_node, self.token 

3121 assert token.kind in ('newline', 'nl'), (token.kind, g.callers()) 

3122 # Create the 'line-end' output token. 

3123 self.add_line_end() 

3124 # Attempt to split the line. 

3125 was_split = self.split_line(node, token) 

3126 # Attempt to join the line only if it has not just been split. 

3127 if not was_split and self.max_join_line_length > 0: 

3128 self.join_lines(node, token) 

3129 self.line_indent() 

3130 # Add the indentation for all lines 

3131 # until the next indent or unindent token. 

3132 #@+node:ekr.20200107165250.40: *5* orange.line_indent 

3133 def line_indent(self): 

3134 """Add a line-indent token.""" 

3135 self.clean('line-indent') 

3136 # Defensive. Should never happen. 

3137 self.add_token('line-indent', self.lws) 

3138 #@+node:ekr.20200107165250.41: *5* orange.lt & rt 

3139 #@+node:ekr.20200107165250.42: *6* orange.lt 

3140 def lt(self, val): 

3141 """Generate code for a left paren or curly/square bracket.""" 

3142 assert val in '([{', repr(val) 

3143 if val == '(': 

3144 self.paren_level += 1 

3145 elif val == '[': 

3146 self.square_brackets_stack.append(False) 

3147 else: 

3148 self.curly_brackets_level += 1 

3149 self.clean('blank') 

3150 prev = self.code_list[-1] 

3151 if prev.kind in ('op', 'word-op'): 

3152 self.blank() 

3153 self.add_token('lt', val) 

3154 elif prev.kind == 'word': 

3155 # Only suppress blanks before '(' or '[' for non-keywords. 

3156 if val == '{' or prev.value in ('if', 'else', 'return', 'for'): 

3157 self.blank() 

3158 elif val == '(': 

3159 self.in_arg_list += 1 

3160 self.add_token('lt', val) 

3161 else: 

3162 self.clean('blank') 

3163 self.add_token('op-no-blanks', val) 

3164 #@+node:ekr.20200107165250.43: *6* orange.rt 

3165 def rt(self, val): 

3166 """Generate code for a right paren or curly/square bracket.""" 

3167 assert val in ')]}', repr(val) 

3168 if val == ')': 

3169 self.paren_level -= 1 

3170 self.in_arg_list = max(0, self.in_arg_list - 1) 

3171 elif val == ']': 

3172 self.square_brackets_stack.pop() 

3173 else: 

3174 self.curly_brackets_level -= 1 

3175 self.clean('blank') 

3176 self.add_token('rt', val) 
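# A minimal sketch of the intended spacing around brackets, assuming
# hypothetical inputs:
#
#   print (x)      -->  print(x)       # call: no blank before '('.
#   return (x, y)  -->  return (x, y)  # keyword: the blank is kept.
#   d ['key']      -->  d['key']       # subscript: no blank before '['.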

3177 #@+node:ekr.20200107165250.45: *5* orange.possible_unary_op & unary_op 

3178 def possible_unary_op(self, s): 

3179 """Add a unary or binary op to the token list.""" 

3180 node = self.token.node 

3181 self.clean('blank') 

3182 if isinstance(node, ast.UnaryOp): 

3183 self.unary_op(s) 

3184 else: 

3185 self.blank() 

3186 self.add_token('op', s) 

3187 self.blank() 

3188 

3189 def unary_op(self, s): 

3190 """Add an operator request to the code list.""" 

3191 assert s and isinstance(s, str), repr(s) 

3192 self.clean('blank') 

3193 prev = self.code_list[-1] 

3194 if prev.kind == 'lt': 

3195 self.add_token('unary-op', s) 

3196 else: 

3197 self.blank() 

3198 self.add_token('unary-op', s) 
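# A minimal sketch, assuming hypothetical inputs: '-' gets blanks only when
# the corresponding ast node is *not* a UnaryOp:
#
#   c = a - b   # binary minus: blanks on both sides.
#   c = f(-b)   # unary minus after '(': no surrounding blanks.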

3199 #@+node:ekr.20200107165250.46: *5* orange.star_op 

3200 def star_op(self): 

3201 """Put a '*' op, with special cases for *args.""" 

3202 val = '*' 

3203 self.clean('blank') 

3204 if self.paren_level > 0: 

3205 prev = self.code_list[-1] 

3206 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','): 

3207 self.blank() 

3208 self.add_token('op', val) 

3209 return 

3210 self.blank() 

3211 self.add_token('op', val) 

3212 self.blank() 

3213 #@+node:ekr.20200107165250.47: *5* orange.star_star_op 

3214 def star_star_op(self): 

3215 """Put a ** operator, with a special case for **kwargs.""" 

3216 val = '**' 

3217 self.clean('blank') 

3218 if self.paren_level > 0: 

3219 prev = self.code_list[-1] 

3220 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','): 

3221 self.blank() 

3222 self.add_token('op', val) 

3223 return 

3224 self.blank() 

3225 self.add_token('op', val) 

3226 self.blank() 
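# A minimal sketch, assuming hypothetical inputs: inside an argument list,
# '*' and '**' bind to their argument; elsewhere they are spaced like
# ordinary binary operators:
#
#   f(*args, **kwargs)   # after '(' or ',': no blank between op and argument.
#   c = a * b ** 2       # top level: blanks on both sides.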

3227 #@+node:ekr.20200107165250.48: *5* orange.word & word_op 

3228 def word(self, s): 

3229 """Add a word request to the code list.""" 

3230 assert s and isinstance(s, str), repr(s) 

3231 if self.square_brackets_stack: 

3232 # A previous 'op-no-blanks' token may cancel this blank. 

3233 self.blank() 

3234 self.add_token('word', s) 

3235 elif self.in_arg_list > 0: 

3236 self.add_token('word', s) 

3237 self.blank() 

3238 else: 

3239 self.blank() 

3240 self.add_token('word', s) 

3241 self.blank() 

3242 

3243 def word_op(self, s): 

3244 """Add a word-op request to the code list.""" 

3245 assert s and isinstance(s, str), repr(s) 

3246 self.blank() 

3247 self.add_token('word-op', s) 

3248 self.blank() 

3249 #@+node:ekr.20200118120049.1: *4* orange: Split/join 

3250 #@+node:ekr.20200107165250.34: *5* orange.split_line & helpers 

3251 def split_line(self, node, token): 

3252 """ 

3253 Split token's line, if possible and enabled. 

3254 

3255 Return True if the line was broken into two or more lines. 

3256 """ 

3257 assert token.kind in ('newline', 'nl'), repr(token) 

3258 # Return if splitting is disabled: 

3259 if self.max_split_line_length <= 0: # pragma: no cover (user option) 

3260 return False 

3261 # Return if the node can't be split. 

3262 if not is_long_statement(node): 

3263 return False 

3264 # Find the *output* tokens of the previous lines. 

3265 line_tokens = self.find_prev_line() 

3266 line_s = ''.join([z.to_string() for z in line_tokens]) 

3267 # Do nothing for short lines. 

3268 if len(line_s) < self.max_split_line_length: 

3269 return False 

3270 # Return if the previous line has no opening delim: (, [ or {. 

3271 if not any(z.kind == 'lt' for z in line_tokens): # pragma: no cover (defensive) 

3272 return False 

3273 prefix = self.find_line_prefix(line_tokens) 

3274 # Calculate the tail before cleaning the prefix. 

3275 tail = line_tokens[len(prefix) :] 

3276 # Cut back the token list: subtract 1 for the trailing line-end. 

3277 self.code_list = self.code_list[: len(self.code_list) - len(line_tokens) - 1] 

3278 # Append the tail, splitting it further, as needed. 

3279 self.append_tail(prefix, tail) 

3280 # Add the line-end token deleted by find_line_prefix. 

3281 self.add_token('line-end', '\n') 

3282 return True 

3283 #@+node:ekr.20200107165250.35: *6* orange.append_tail 

3284 def append_tail(self, prefix, tail): 

3285 """Append the tail tokens, splitting the line further as necessary.""" 

3286 tail_s = ''.join([z.to_string() for z in tail]) 

3287 if len(tail_s) < self.max_split_line_length: 

3288 # Add the prefix. 

3289 self.code_list.extend(prefix) 

3290 # Start a new line and increase the indentation. 

3291 self.add_token('line-end', '\n') 

3292 self.add_token('line-indent', self.lws + ' ' * 4) 

3293 self.code_list.extend(tail) 

3294 return 

3295 # Still too long. Split the line at commas. 

3296 self.code_list.extend(prefix) 

3297 # Start a new line and increase the indentation. 

3298 self.add_token('line-end', '\n') 

3299 self.add_token('line-indent', self.lws + ' ' * 4) 

3300 open_delim = Token(kind='lt', value=prefix[-1].value) 

3301 value = open_delim.value.replace('(', ')').replace('[', ']').replace('{', '}') 

3302 close_delim = Token(kind='rt', value=value) 

3303 delim_count = 1 

3304 lws = self.lws + ' ' * 4 

3305 for i, t in enumerate(tail): 

3306 if t.kind == 'op' and t.value == ',': 

3307 if delim_count == 1: 

3308 # Start a new line. 

3309 self.add_token('op-no-blanks', ',') 

3310 self.add_token('line-end', '\n') 

3311 self.add_token('line-indent', lws) 

3312 # Kill a following blank. 

3313 if i + 1 < len(tail): 

3314 next_t = tail[i + 1] 

3315 if next_t.kind == 'blank': 

3316 next_t.kind = 'no-op' 

3317 next_t.value = '' 

3318 else: 

3319 self.code_list.append(t) 

3320 elif t.kind == close_delim.kind and t.value == close_delim.value: 

3321 # Done if the delims match. 

3322 delim_count -= 1 

3323 if delim_count == 0: 

3324 # Start a new line 

3325 self.add_token('op-no-blanks', ',') 

3326 self.add_token('line-end', '\n') 

3327 self.add_token('line-indent', self.lws) 

3328 self.code_list.extend(tail[i:]) 

3329 return 

3330 lws = lws[:-4] 

3331 self.code_list.append(t) 

3332 elif t.kind == open_delim.kind and t.value == open_delim.value: 

3333 delim_count += 1 

3334 lws = lws + ' ' * 4 

3335 self.code_list.append(t) 

3336 else: 

3337 self.code_list.append(t) 

3338 g.trace('BAD DELIMS', delim_count) # pragma: no cover 

3339 #@+node:ekr.20200107165250.36: *6* orange.find_prev_line 

3340 def find_prev_line(self): 

3341 """Return the previous line, as a list of tokens.""" 

3342 line = [] 

3343 for t in reversed(self.code_list[:-1]): 

3344 if t.kind in ('hard-newline', 'line-end'): 

3345 break 

3346 line.append(t) 

3347 return list(reversed(line)) 

3348 #@+node:ekr.20200107165250.37: *6* orange.find_line_prefix 

3349 def find_line_prefix(self, token_list): 

3350 """ 

3351 Return all tokens up to and including the first lt token. 

3352 Also add all lt tokens directly following the first lt token. 

3353 """ 

3354 result = [] 

3355 for i, t in enumerate(token_list): 

3356 result.append(t) 

3357 if t.kind == 'lt': 

3358 break 

3359 return result 
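# A minimal sketch of the effect of split_line and its helpers, assuming a
# hypothetical max_split_line_length that the following call exceeds:
#
#   result = some_function(argument_one, argument_two, argument_three)
#
# becomes, when the tail still fits on one continuation line,
#
#   result = some_function(
#       argument_one, argument_two, argument_three)
#
# and, when even the tail is too long, append_tail splits it at commas and
# adds a trailing comma before the closing delimiter.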

3360 #@+node:ekr.20200107165250.39: *5* orange.join_lines 

3361 def join_lines(self, node, token): 

3362 """ 

3363 Join preceding lines, if possible and enabled. 

3364 token is a line_end token. node is the corresponding ast node. 

3365 """ 

3366 if self.max_join_line_length <= 0: # pragma: no cover (user option) 

3367 return 

3368 assert token.kind in ('newline', 'nl'), repr(token) 

3369 if token.kind == 'nl': 

3370 return 

3371 # Scan backward in the *code* list, 

3372 # looking for 'line-end' tokens with tok.newline_kind == 'nl' 

3373 nls = 0 

3374 i = len(self.code_list) - 1 

3375 t = self.code_list[i] 

3376 assert t.kind == 'line-end', repr(t) 

3377 # Not all tokens have a newline_kind ivar. 

3378 assert t.newline_kind == 'newline' # type:ignore 

3379 i -= 1 

3380 while i >= 0: 

3381 t = self.code_list[i] 

3382 if t.kind == 'comment': 

3383 # Can't join. 

3384 return 

3385 if t.kind == 'string' and not self.allow_joined_strings: 

3386 # An EKR preference: don't join strings, no matter what black does. 

3387 # This allows "short" f-strings to be aligned. 

3388 return 

3389 if t.kind == 'line-end': 

3390 if getattr(t, 'newline_kind', None) == 'nl': 

3391 nls += 1 

3392 else: 

3393 break # pragma: no cover 

3394 i -= 1 

3395 # Retain the file-start token. 

3396 if i <= 0: 

3397 i = 1 

3398 if nls <= 0: # pragma: no cover (rare) 

3399 return 

3400 # Retain the line-end and any following line-indent. 

3401 # Required, so that the regex below won't eat too much. 

3402 while True: 

3403 t = self.code_list[i] 

3404 if t.kind == 'line-end': 

3405 if getattr(t, 'newline_kind', None) == 'nl': # pragma: no cover (rare) 

3406 nls -= 1 

3407 i += 1 

3408 elif self.code_list[i].kind == 'line-indent': 

3409 i += 1 

3410 else: 

3411 break # pragma: no cover (defensive) 

3412 if nls <= 0: # pragma: no cover (defensive) 

3413 return 

3414 # Calculate the joined line. 

3415 tail = self.code_list[i:] 

3416 tail_s = tokens_to_string(tail) 

3417 tail_s = re.sub(r'\n\s*', ' ', tail_s) 

3418 tail_s = tail_s.replace('( ', '(').replace(' )', ')') 

3419 tail_s = tail_s.rstrip() 

3420 # Don't join the lines if they would be too long. 

3421 if len(tail_s) > self.max_join_line_length: # pragma: no cover (defensive) 

3422 return 

3423 # Cut back the code list. 

3424 self.code_list = self.code_list[:i] 

3425 # Add the new output tokens. 

3426 self.add_token('string', tail_s) 

3427 self.add_token('line-end', '\n') 
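# A minimal sketch of the effect of join_lines, assuming a hypothetical
# wrapped call whose joined form fits within max_join_line_length:
#
#   f(
#       a, b)
#
# is rewritten as
#
#   f(a, b)
#
# Only soft ('nl') line breaks are collapsed; comments and (by preference)
# strings block the join.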

3428 #@-others 

3429#@+node:ekr.20200107170847.1: *3* class OrangeSettings 

3430class OrangeSettings: 

3431 

3432 pass 

3433#@+node:ekr.20200107170126.1: *3* class ParseState 

3434class ParseState: 

3435 """ 

3436 A class representing items in the parse state stack. 

3437 

3438 The present states: 

3439 

3440 'file-start': Ensures the stack is never empty. 

3441 

3442 'decorator': The last '@' was a decorator. 

3443 

3444 do_op(): push_state('decorator') 

3445 do_name(): pops the stack if state.kind == 'decorator'. 

3446 

3447 'indent': The indentation level for 'class' and 'def' names. 

3448 

3449 do_name(): push_state('indent', self.level) 

3450 do_dedent(): pops the stack once or twice if state.value == self.level. 

3451 

3452 """ 

3453 

3454 def __init__(self, kind, value): 

3455 self.kind = kind 

3456 self.value = value 

3457 

3458 def __repr__(self): 

3459 return f"State: {self.kind} {self.value!r}" # pragma: no cover 

3460 

3461 __str__ = __repr__ 

3462#@+node:ekr.20200122033203.1: ** TOT classes... 

3463#@+node:ekr.20191222083453.1: *3* class Fstringify (TOT) 

3464class Fstringify(TokenOrderTraverser): 

3465 """A class to fstringify files.""" 

3466 

3467 silent = True # for pytest. Defined in all entries. 

3468 line_number = 0 

3469 line = '' 

3470 

3471 #@+others 

3472 #@+node:ekr.20191222083947.1: *4* fs.fstringify 

3473 def fstringify(self, contents, filename, tokens, tree): 

3474 """ 

3475 Fstringify.fstringify: 

3476 

3477 f-stringify the sources given by (tokens, tree). 

3478 

3479 Return the resulting string. 

3480 """ 

3481 self.filename = filename 

3482 self.tokens = tokens 

3483 self.tree = tree 

3484 # Prepass: reassign tokens. 

3485 ReassignTokens().reassign(filename, tokens, tree) 

3486 # Main pass. 

3487 self.traverse(self.tree) 

3488 results = tokens_to_string(self.tokens) 

3489 return results 

3490 #@+node:ekr.20200103054101.1: *4* fs.fstringify_file (entry) 

3491 def fstringify_file(self, filename): # pragma: no cover 

3492 """ 

3493 Fstringify.fstringify_file. 

3494 

3495 The entry point for the fstringify-file command. 

3496 

3497 f-stringify the given external file with the Fstringify class. 

3498 

3499 Return True if the file was changed. 

3500 """ 

3501 tag = 'fstringify-file' 

3502 self.filename = filename 

3503 self.silent = False 

3504 tog = TokenOrderGenerator() 

3505 try: 

3506 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3507 if not contents or not tokens or not tree: 

3508 print(f"{tag}: Can not fstringify: {filename}") 

3509 return False 

3510 results = self.fstringify(contents, filename, tokens, tree) 

3511 except Exception as e: 

3512 print(e) 

3513 return False 

3514 # Something besides newlines must change. 

3515 changed = regularize_nls(contents) != regularize_nls(results) 

3516 status = 'Wrote' if changed else 'Unchanged' 

3517 print(f"{tag}: {status:>9}: {filename}") 

3518 if changed: 

3519 write_file(filename, results, encoding=encoding) 

3520 return changed 

3521 #@+node:ekr.20200103065728.1: *4* fs.fstringify_file_diff (entry) 

3522 def fstringify_file_diff(self, filename): # pragma: no cover 

3523 """ 

3524 Fstringify.fstringify_file_diff. 

3525 

3526 The entry point for the diff-fstringify-file command. 

3527 

3528 Print the diffs that would result from the fstringify-file command. 

3529 

3530 Return True if the file would be changed. 

3531 """ 

3532 tag = 'diff-fstringify-file' 

3533 self.filename = filename 

3534 self.silent = False 

3535 tog = TokenOrderGenerator() 

3536 try: 

3537 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3538 if not contents or not tokens or not tree: 

3539 return False 

3540 results = self.fstringify(contents, filename, tokens, tree) 

3541 except Exception as e: 

3542 print(e) 

3543 return False 

3544 # Something besides newlines must change. 

3545 changed = regularize_nls(contents) != regularize_nls(results) 

3546 if changed: 

3547 show_diffs(contents, results, filename=filename) 

3548 else: 

3549 print(f"{tag}: Unchanged: {filename}") 

3550 return changed 

3551 #@+node:ekr.20200112060218.1: *4* fs.fstringify_file_silent (entry) 

3552 def fstringify_file_silent(self, filename): # pragma: no cover 

3553 """ 

3554 Fstringify.fstringify_file_silent. 

3555 

3556 The entry point for the silent-fstringify-file command. 

3557 

3558 fstringify the given file, suppressing all but serious error messages. 

3559 

3560 Return True if the file would be changed. 

3561 """ 

3562 self.filename = filename 

3563 self.silent = True 

3564 tog = TokenOrderGenerator() 

3565 try: 

3566 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3567 if not contents or not tokens or not tree: 

3568 return False 

3569 results = self.fstringify(contents, filename, tokens, tree) 

3570 except Exception as e: 

3571 print(e) 

3572 return False 

3573 # Something besides newlines must change. 

3574 changed = regularize_nls(contents) != regularize_nls(results) 

3575 status = 'Wrote' if changed else 'Unchanged' 

3576 # Write the results. 

3577 print(f"{status:>9}: {filename}") 

3578 if changed: 

3579 write_file(filename, results, encoding=encoding) 

3580 return changed 

3581 #@+node:ekr.20191222095754.1: *4* fs.make_fstring & helpers 

3582 def make_fstring(self, node): 

3583 """ 

3584 node is a BinOp node representing a '%' operator. 

3585 node.left is an ast.Str node. 

3586 node.right represents the RHS of the '%' operator. 

3587 

3588 Convert this tree to an f-string, if possible. 

3589 Replace the node's entire tree with a new ast.Str node. 

3590 Replace all the relevant tokens with a single new 'string' token. 

3591 """ 

3592 trace = False 

3593 assert isinstance(node.left, ast.Str), (repr(node.left), g.callers()) 

3594 # Careful: use the tokens, not Str.s. This preserves spelling. 

3595 lt_token_list = get_node_token_list(node.left, self.tokens) 

3596 if not lt_token_list: # pragma: no cover 

3597 print('') 

3598 g.trace('Error: no token list in Str') 

3599 dump_tree(self.tokens, node) 

3600 print('') 

3601 return 

3602 lt_s = tokens_to_string(lt_token_list) 

3603 if trace: 

3604 g.trace('lt_s:', lt_s) # pragma: no cover 

3605 # Get the RHS values, a list of token lists. 

3606 values = self.scan_rhs(node.right) 

3607 if trace: # pragma: no cover 

3608 for i, z in enumerate(values): 

3609 dump_tokens(z, tag=f"RHS value {i}") 

3610 # Compute rt_s, self.line and self.line_number for later messages. 

3611 token0 = lt_token_list[0] 

3612 self.line_number = token0.line_number 

3613 self.line = token0.line.strip() 

3614 rt_s = ''.join(tokens_to_string(z) for z in values) 

3615 # Get the % specs in the LHS string. 

3616 specs = self.scan_format_string(lt_s) 

3617 if len(values) != len(specs): # pragma: no cover 

3618 self.message( 

3619 f"can't create f-fstring: {lt_s!r}\n" 

3620 f":f-string mismatch: " 

3621 f"{len(values)} value{g.plural(len(values))}, " 

3622 f"{len(specs)} spec{g.plural(len(specs))}") 

3623 return 

3624 # Replace specs with values. 

3625 results = self.substitute_values(lt_s, specs, values) 

3626 result = self.compute_result(lt_s, results) 

3627 if not result: 

3628 return 

3629 # Remove whitespace before ! and :. 

3630 result = self.clean_ws(result) 

3631 # Show the results 

3632 if trace: # pragma: no cover 

3633 before = (lt_s + ' % ' + rt_s).replace('\n', '<NL>') 

3634 after = result.replace('\n', '<NL>') 

3635 self.message( 

3636 f"trace:\n" 

3637 f":from: {before!s}\n" 

3638 f": to: {after!s}") 

3639 # Adjust the tree and the token list. 

3640 self.replace(node, result, values) 

3641 #@+node:ekr.20191222102831.3: *5* fs.clean_ws 

3642 ws_pat = re.compile(r'(\s+)([:!][0-9]\})') 

3643 

3644 def clean_ws(self, s): 

3645 """Carefully remove whitespace before ! and : specifiers.""" 

3646 s = re.sub(self.ws_pat, r'\2', s) 

3647 return s 

3648 #@+node:ekr.20191222102831.4: *5* fs.compute_result & helpers 

3649 def compute_result(self, lt_s, tokens): 

3650 """ 

3651 Create the final result, with various kinds of munges. 

3652 

3653 Return the result string, or None if there are errors. 

3654 """ 

3655 # Fail if there is a backslash within { and }. 

3656 if not self.check_back_slashes(lt_s, tokens): 

3657 return None # pragma: no cover 

3658 # Ensure consistent quotes. 

3659 if not self.change_quotes(lt_s, tokens): 

3660 return None # pragma: no cover 

3661 return tokens_to_string(tokens) 

3662 #@+node:ekr.20200215074309.1: *6* fs.check_back_slashes 

3663 def check_back_slashes(self, lt_s, tokens): 

3664 """ 

3665 Return False if any backslash appears within a {} expression. 

3666 

3667 tokens is a list of tokens on the RHS. 

3668 """ 

3669 count = 0 

3670 for z in tokens: 

3671 if z.kind == 'op': 

3672 if z.value == '{': 

3673 count += 1 

3674 elif z.value == '}': 

3675 count -= 1 

3676 if (count % 2) == 1 and '\\' in z.value: 

3677 if not self.silent: 

3678 self.message( # pragma: no cover (silent during unit tests) 

3679 f"can't create f-fstring: {lt_s!r}\n" 

3680 f":backslash in {{expr}}:") 

3681 return False 

3682 return True 

3683 #@+node:ekr.20191222102831.7: *6* fs.change_quotes 

3684 def change_quotes(self, lt_s, aList): 

3685 """ 

3686 Carefully check quotes in all "inner" tokens as necessary. 

3687 

3688 Return False if the f-string would contain backslashes. 

3689 

3690 We expect the following "outer" tokens. 

3691 

3692 aList[0]: ('string', 'f') 

3693 aList[1]: ('string', a single or double quote) 

3694 aList[-1]: ('string', a single or double quote matching aList[1]) 

3695 """ 

3696 # Sanity checks. 

3697 if len(aList) < 4: 

3698 return True # pragma: no cover (defensive) 

3699 if not lt_s: # pragma: no cover (defensive) 

3700 self.message("can't create f-fstring: no lt_s!") 

3701 return False 

3702 delim = lt_s[0] 

3703 # Check tokens 0, 1 and -1. 

3704 token0 = aList[0] 

3705 token1 = aList[1] 

3706 token_last = aList[-1] 

3707 for token in token0, token1, token_last: 

3708 # These are the only kinds of tokens we expect to generate. 

3709 ok = ( 

3710 token.kind == 'string' or 

3711 token.kind == 'op' and token.value in '{}') 

3712 if not ok: # pragma: no cover (defensive) 

3713 self.message( 

3714 f"unexpected token: {token.kind} {token.value}\n" 

3715 f": lt_s: {lt_s!r}") 

3716 return False 

3717 # These checks are important... 

3718 if token0.value != 'f': 

3719 return False # pragma: no cover (defensive) 

3720 val1 = token1.value 

3721 if delim != val1: 

3722 return False # pragma: no cover (defensive) 

3723 val_last = token_last.value 

3724 if delim != val_last: 

3725 return False # pragma: no cover (defensive) 

3726 # 

3727 # Check for conflicting delims, preferring f"..." to f'...'. 

3728 for delim in ('"', "'"): 

3729 aList[1] = aList[-1] = Token('string', delim) 

3730 for z in aList[2:-1]: 

3731 if delim in z.value: 

3732 break 

3733 else: 

3734 return True 

3735 if not self.silent: # pragma: no cover (silent unit test) 

3736 self.message( 

3737 f"can't create f-fstring: {lt_s!r}\n" 

3738 f": conflicting delims:") 

3739 return False 

3740 #@+node:ekr.20191222102831.6: *5* fs.munge_spec 

3741 def munge_spec(self, spec): 

3742 """ 

3743 Return (head, tail). 

3744 

3745 The spec format is !head:tail or :tail. 

3746 

3747 Example specs: s2, r3 

3748 """ 

3749 # To do: handle more specs. 

3750 head, tail = [], [] 

3751 if spec.startswith('+'): 

3752 pass # Leave it alone! 

3753 elif spec.startswith('-'): 

3754 tail.append('>') 

3755 spec = spec[1:] 

3756 if spec.endswith('s'): 

3757 spec = spec[:-1] 

3758 if spec.endswith('r'): 

3759 head.append('r') 

3760 spec = spec[:-1] 

3761 tail_s = ''.join(tail) + spec 

3762 head_s = ''.join(head) 

3763 return head_s, tail_s 
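# A minimal sketch: traced results of munge_spec for a few hypothetical
# specs, as later assembled by substitute_values:
#
#   munge_spec('s')   --> ('', '')    # '%s'  becomes {value}
#   munge_spec('r')   --> ('r', '')   # '%r'  becomes {value!r}
#   munge_spec('3s')  --> ('', '3')   # '%3s' becomes {value:3}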

3764 #@+node:ekr.20191222102831.9: *5* fs.scan_format_string 

3765 # format_spec ::= [[fill]align][sign][#][0][width][,][.precision][type] 

3766 # fill ::= <any character> 

3767 # align ::= "<" | ">" | "=" | "^" 

3768 # sign ::= "+" | "-" | " " 

3769 # width ::= integer 

3770 # precision ::= integer 

3771 # type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%" 

3772 

3773 format_pat = re.compile(r'%(([+-]?[0-9]*(\.)?[0-9]*)*[bcdeEfFgGnoxrsX]?)') 

3774 

3775 def scan_format_string(self, s): 

3776 """Scan the format string s, returning a list match objects.""" 

3777 result = list(re.finditer(self.format_pat, s)) 

3778 return result 

3779 #@+node:ekr.20191222104224.1: *5* fs.scan_rhs 

3780 def scan_rhs(self, node): 

3781 """ 

3782 Scan the right-hand side of a potential f-string. 

3783 

3784 Return a list of the token lists for each element. 

3785 """ 

3786 trace = False 

3787 # First, try the most common cases. 

3788 if isinstance(node, ast.Str): 

3789 token_list = get_node_token_list(node, self.tokens) 

3790 return [token_list] 

3791 if isinstance(node, (list, tuple, ast.Tuple)): 

3792 result = [] 

3793 elts = node.elts if isinstance(node, ast.Tuple) else node 

3794 for i, elt in enumerate(elts): 

3795 tokens = tokens_for_node(self.filename, elt, self.tokens) 

3796 result.append(tokens) 

3797 if trace: # pragma: no cover 

3798 g.trace(f"item: {i}: {elt.__class__.__name__}") 

3799 g.printObj(tokens, tag=f"Tokens for item {i}") 

3800 return result 

3801 # Now we expect only one result. 

3802 tokens = tokens_for_node(self.filename, node, self.tokens) 

3803 return [tokens] 

3804 #@+node:ekr.20191226155316.1: *5* fs.substitute_values 

3805 def substitute_values(self, lt_s, specs, values): 

3806 """ 

3807 Replace specifiers with values in lt_s string. 

3808 

3809 Double { and } as needed. 

3810 """ 

3811 i, results = 0, [Token('string', 'f')] 

3812 for spec_i, m in enumerate(specs): 

3813 value = tokens_to_string(values[spec_i]) 

3814 start, end, spec = m.start(0), m.end(0), m.group(1) 

3815 if start > i: 

3816 val = lt_s[i:start].replace('{', '{{').replace('}', '}}') 

3817 results.append(Token('string', val[0])) 

3818 results.append(Token('string', val[1:])) 

3819 head, tail = self.munge_spec(spec) 

3820 results.append(Token('op', '{')) 

3821 results.append(Token('string', value)) 

3822 if head: 

3823 results.append(Token('string', '!')) 

3824 results.append(Token('string', head)) 

3825 if tail: 

3826 results.append(Token('string', ':')) 

3827 results.append(Token('string', tail)) 

3828 results.append(Token('op', '}')) 

3829 i = end 

3830 # Add the tail. 

3831 tail = lt_s[i:] 

3832 if tail: 

3833 tail = tail.replace('{', '{{').replace('}', '}}') 

3834 results.append(Token('string', tail[:-1])) 

3835 results.append(Token('string', tail[-1])) 

3836 return results 
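# A minimal sketch of the combined effect of scan_format_string, munge_spec
# and substitute_values on a hypothetical statement:
#
#   print('%s = %r' % (name, value))  -->  print(f'{name} = {value!r}')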

3837 #@+node:ekr.20200214142019.1: *4* fs.message 

3838 def message(self, message): # pragma: no cover. 

3839 """ 

3840 Print one or more message lines aligned on the first colon of the message. 

3841 """ 

3842 # Print a leading blank line. 

3843 print('') 

3844 # Calculate the padding. 

3845 lines = g.splitLines(message) 

3846 pad = max(lines[0].find(':'), 30) 

3847 # Print the first line. 

3848 z = lines[0] 

3849 i = z.find(':') 

3850 if i == -1: 

3851 print(z.rstrip()) 

3852 else: 

3853 print(f"{z[:i+2].strip():>{pad+1}} {z[i+2:].strip()}") 

3854 # Print the remaining message lines. 

3855 for z in lines[1:]: 

3856 if z.startswith('<'): 

3857 # Print left aligned. 

3858 print(z[1:].strip()) 

3859 elif z.startswith(':') and -1 < z[1:].find(':') <= pad: 

3860 # Align with the first line. 

3861 i = z[1:].find(':') 

3862 print(f"{z[1:i+2].strip():>{pad+1}} {z[i+2:].strip()}") 

3863 elif z.startswith('>'): 

3864 # Align after the aligning colon. 

3865 print(f"{' ':>{pad+2}}{z[1:].strip()}") 

3866 else: 

3867 # Default: Put the entire line after the aligning colon. 

3868 print(f"{' ':>{pad+2}}{z.strip()}") 

3869 # Print the standard message lines. 

3870 file_s = f"{'file':>{pad}}" 

3871 ln_n_s = f"{'line number':>{pad}}" 

3872 line_s = f"{'line':>{pad}}" 

3873 print( 

3874 f"{file_s}: {self.filename}\n" 

3875 f"{ln_n_s}: {self.line_number}\n" 

3876 f"{line_s}: {self.line!r}") 

3877 #@+node:ekr.20191225054848.1: *4* fs.replace 

3878 def replace(self, node, s, values): 

3879 """ 

3880 Replace node with an ast.Str node for s. 

3881 Replace all tokens in the range of values with a single 'string' node. 

3882 """ 

3883 # Replace the tokens... 

3884 tokens = tokens_for_node(self.filename, node, self.tokens) 

3885 i1 = i = tokens[0].index 

3886 replace_token(self.tokens[i], 'string', s) 

3887 j = 1 

3888 while j < len(tokens): 

3889 replace_token(self.tokens[i1 + j], 'killed', '') 

3890 j += 1 

3891 # Replace the node. 

3892 new_node = ast.Str() 

3893 new_node.s = s 

3894 replace_node(new_node, node) 

3895 # Update the token. 

3896 token = self.tokens[i1] 

3897 token.node = new_node # type:ignore 

3898 # Update the token list. 

3899 add_token_to_token_list(token, new_node) 

3900 #@+node:ekr.20191231055008.1: *4* fs.visit 

3901 def visit(self, node): 

3902 """ 

3903 Fstringify.visit. (Overrides TOT visit). 

3904 

3905 Call fs.make_fstring if node is a BinOp that might be converted to an 

3906 f-string. 

3907 """ 

3908 if ( 

3909 isinstance(node, ast.BinOp) 

3910 and op_name(node.op) == '%' 

3911 and isinstance(node.left, ast.Str) 

3912 ): 

3913 self.make_fstring(node) 

3914 #@-others 
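# A minimal usage sketch, assuming this file is importable as leoAst and that
# 'my_module.py' exists:
#
#   from leoAst import Fstringify
#   Fstringify().fstringify_file_diff('my_module.py')  # print the diff only.
#   Fstringify().fstringify_file('my_module.py')       # rewrite the file.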

3915#@+node:ekr.20191231084514.1: *3* class ReassignTokens (TOT) 

3916class ReassignTokens(TokenOrderTraverser): 

3917 """A class that reassigns tokens to more appropriate ast nodes.""" 

3918 #@+others 

3919 #@+node:ekr.20191231084640.1: *4* reassign.reassign 

3920 def reassign(self, filename, tokens, tree): 

3921 """The main entry point.""" 

3922 self.filename = filename 

3923 self.tokens = tokens 

3924 self.tree = tree 

3925 self.traverse(tree) 

3926 #@+node:ekr.20191231084853.1: *4* reassign.visit 

3927 def visit(self, node): 

3928 """ReassignTokens.visit""" 

3929 # For now, just handle call nodes. 

3930 if not isinstance(node, ast.Call): 

3931 return 

3932 tokens = tokens_for_node(self.filename, node, self.tokens) 

3933 node0, node9 = tokens[0].node, tokens[-1].node 

3934 nca = nearest_common_ancestor(node0, node9) 

3935 if not nca: 

3936 return 

3937 # g.trace(f"{self.filename:20} nca: {nca.__class__.__name__}") 

3938 # Associate () with the call node. 

3939 i = tokens[-1].index 

3940 j = find_paren_token(i + 1, self.tokens) 

3941 if j is None: 

3942 return # pragma: no cover 

3943 k = find_paren_token(j + 1, self.tokens) 

3944 if k is None: 

3945 return # pragma: no cover 

3946 self.tokens[j].node = nca # type:ignore 

3947 self.tokens[k].node = nca # type:ignore 

3948 add_token_to_token_list(self.tokens[j], nca) 

3949 add_token_to_token_list(self.tokens[k], nca) 

3950 #@-others 

3951#@+node:ekr.20191227170803.1: ** Token classes 

3952#@+node:ekr.20191110080535.1: *3* class Token 

3953class Token: 

3954 """ 

3955 A class representing a 5-tuple, plus additional data. 

3956 

3957 The Tokenizer class creates a list of such tokens. 

3958 """ 

3959 

3960 def __init__(self, kind, value): 

3961 

3962 self.kind = kind 

3963 self.value = value 

3964 # 

3965 # Injected by Tokenizer.add_token. 

3966 self.five_tuple = None 

3967 self.index = 0 

3968 # The entire line containing the token. 

3969 # Same as five_tuple.line. 

3970 self.line = '' 

3971 # The line number, for errors and dumps. 

3972 # Same as five_tuple.start[0] 

3973 self.line_number = 0 

3974 # 

3975 # Injected by Tokenizer.add_token. 

3976 self.level = 0 

3977 self.node = None 

3978 

3979 def __repr__(self): # pragma: no cover 

3980 nl_kind = getattr(self, 'newline_kind', '') 

3981 s = f"{self.kind:}.{self.index:<3}" 

3982 return f"{s:>18}:{nl_kind:7} {self.show_val(80)}" 

3983 

3984 def __str__(self): # pragma: no cover 

3985 nl_kind = getattr(self, 'newline_kind', '') 

3986 return f"{self.kind}.{self.index:<3}{nl_kind:8} {self.show_val(80)}" 

3987 

3988 def to_string(self): 

3989 """Return the contribution of the token to the source file.""" 

3990 return self.value if isinstance(self.value, str) else '' 

3991 #@+others 

3992 #@+node:ekr.20191231114927.1: *4* token.brief_dump 

3993 def brief_dump(self): # pragma: no cover 

3994 """Dump a token.""" 

3995 return ( 

3996 f"{self.index:>3} line: {self.line_number:<2} " 

3997 f"{self.kind:>11} {self.show_val(100)}") 

3998 #@+node:ekr.20200223022950.11: *4* token.dump 

3999 def dump(self): # pragma: no cover 

4000 """Dump a token and related links.""" 

4001 # Let block. 

4002 node_id = self.node.node_index if self.node else '' 

4003 node_cn = self.node.__class__.__name__ if self.node else '' 

4004 return ( 

4005 f"{self.line_number:4} " 

4006 f"{node_id:5} {node_cn:16} " 

4007 f"{self.index:>5} {self.kind:>11} " 

4008 f"{self.show_val(100)}") 

4009 #@+node:ekr.20200121081151.1: *4* token.dump_header 

4010 def dump_header(self): # pragma: no cover 

4011 """Print the header for token.dump""" 

4012 print( 

4013 f"\n" 

4014 f" node {'':10} token token\n" 

4015 f"line index class {'':10} index kind value\n" 

4016 f"==== ===== ===== {'':10} ===== ==== =====\n") 

4017 #@+node:ekr.20191116154328.1: *4* token.error_dump 

4018 def error_dump(self): # pragma: no cover 

4019 """Dump a token or result node for error message.""" 

4020 if self.node: 

4021 node_id = obj_id(self.node) 

4022 node_s = f"{node_id} {self.node.__class__.__name__}" 

4023 else: 

4024 node_s = "None" 

4025 return ( 

4026 f"index: {self.index:<3} {self.kind:>12} {self.show_val(20):<20} " 

4027 f"{node_s}") 

4028 #@+node:ekr.20191113095507.1: *4* token.show_val 

4029 def show_val(self, truncate_n): # pragma: no cover 

4030 """Return the token.value field.""" 

4031 if self.kind in ('ws', 'indent'): 

4032 val = len(self.value) 

4033 elif self.kind == 'string': 

4034 # Important: don't add a repr for 'string' tokens. 

4035 # repr just adds another layer of confusion. 

4036 val = g.truncate(self.value, truncate_n) # type:ignore 

4037 else: 

4038 val = g.truncate(repr(self.value), truncate_n) # type:ignore 

4039 return val 

4040 #@-others 
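# A minimal sketch of Token behavior, assuming hypothetical values:
#
#   t = Token('op', '+')
#   t.to_string()                    # '+': the token's contribution to the file.
#   Token('killed', '').to_string()  # '': killed tokens contribute nothing.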

4041#@+node:ekr.20191110165235.1: *3* class Tokenizer 

4042class Tokenizer: 

4043 

4044 """Create a list of Tokens from contents.""" 

4045 

4046 results: List[Token] = [] 

4047 

4048 #@+others 

4049 #@+node:ekr.20191110165235.2: *4* tokenizer.add_token 

4050 token_index = 0 

4051 prev_line_token = None 

4052 

4053 def add_token(self, kind, five_tuple, line, s_row, value): 

4054 """ 

4055 Add a token to the results list. 

4056 

4057 Subclasses could override this method to filter out specific tokens. 

4058 """ 

4059 tok = Token(kind, value) 

4060 tok.five_tuple = five_tuple 

4061 tok.index = self.token_index 

4062 # Bump the token index. 

4063 self.token_index += 1 

4064 tok.line = line 

4065 tok.line_number = s_row 

4066 self.results.append(tok) 

4067 #@+node:ekr.20191110170551.1: *4* tokenizer.check_results 

4068 def check_results(self, contents): 

4069 

4070 # Split the results into lines. 

4071 result = ''.join([z.to_string() for z in self.results]) 

4072 result_lines = g.splitLines(result) 

4073 # Check. 

4074 ok = result == contents and result_lines == self.lines 

4075 assert ok, ( 

4076 f"\n" 

4077 f" result: {result!r}\n" 

4078 f" contents: {contents!r}\n" 

4079 f"result_lines: {result_lines}\n" 

4080 f" lines: {self.lines}" 

4081 ) 

4082 #@+node:ekr.20191110165235.3: *4* tokenizer.create_input_tokens 

4083 def create_input_tokens(self, contents, tokens): 

4084 """ 

4085 Generate a list of Token's from tokens, a list of 5-tuples. 

4086 """ 

4087 # Create the physical lines. 

4088 self.lines = contents.splitlines(True) 

4089 # Create the list of character offsets of the start of each physical line. 

4090 last_offset, self.offsets = 0, [0] 

4091 for line in self.lines: 

4092 last_offset += len(line) 

4093 self.offsets.append(last_offset) 

4094 # Handle each token, appending tokens and between-token whitespace to results. 

4095 self.prev_offset, self.results = -1, [] 

4096 for token in tokens: 

4097 self.do_token(contents, token) 

4098 # Check the results. 

4099 self.check_results(contents) 

4100 # Return results, as a list. 

4101 return self.results 

4102 #@+node:ekr.20191110165235.4: *4* tokenizer.do_token (the gem) 

4103 header_has_been_shown = False 

4104 

4105 def do_token(self, contents, five_tuple): 

4106 """ 

4107 Handle the given token, optionally including between-token whitespace. 

4108 

4109 This is part of the "gem". 

4110 

4111 Links: 

4112 

4113 - 11/13/19: ENB: A much better untokenizer 

4114 https://groups.google.com/forum/#!msg/leo-editor/DpZ2cMS03WE/VPqtB9lTEAAJ 

4115 

4116 - Untokenize does not round-trip ws before bs-nl 

4117 https://bugs.python.org/issue38663 

4118 """ 

4119 import token as token_module 

4120 # Unpack. 

4121 tok_type, val, start, end, line = five_tuple 

4122 s_row, s_col = start # row/col offsets of start of token. 

4123 e_row, e_col = end # row/col offsets of end of token. 

4124 kind = token_module.tok_name[tok_type].lower() 

4125 # Calculate the token's start/end offsets: character offsets into contents. 

4126 s_offset = self.offsets[max(0, s_row - 1)] + s_col 

4127 e_offset = self.offsets[max(0, e_row - 1)] + e_col 

4128 # tok_s is corresponding string in the line. 

4129 tok_s = contents[s_offset:e_offset] 

4130 # Add any preceding between-token whitespace. 

4131 ws = contents[self.prev_offset:s_offset] 

4132 if ws: 

4133 # No need for a hook. 

4134 self.add_token('ws', five_tuple, line, s_row, ws) 

4135 # Always add token, even if it contributes no text! 

4136 self.add_token(kind, five_tuple, line, s_row, tok_s) 

4137 # Update the ending offset. 

4138 self.prev_offset = e_offset 

4139 #@-others 
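# A minimal, self-contained sketch of using the Tokenizer; the function name
# _demo_tokenizer is hypothetical and not part of leoAst.py proper.
def _demo_tokenizer():  # pragma: no cover
    """Round-trip a one-line program through the Tokenizer."""
    import io
    import tokenize
    contents = "a = 1\n"
    five_tuples = list(tokenize.generate_tokens(io.StringIO(contents).readline))
    tokens = Tokenizer().create_input_tokens(contents, five_tuples)
    # check_results (called internally) already asserts the round trip.
    assert ''.join(t.to_string() for t in tokens) == contents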

4140#@-others 

4141g = LeoGlobals() 

4142if __name__ == '__main__': 

4143 main() # pragma: no cover 

4144#@@language python 

4145#@@tabwidth -4 

4146#@@pagewidth 70 

4147#@-leo