filesys - General filesystem tests and utilities

Purpose:

This module contains tests and utilities relating to files and the filesystem.

Platform:

Linux/Windows | Python 3.7+

Developer:

J Berendt

Email:

development@s3dev.uk

Comments:

n/a

Example:

Example for comparing two files:

>>> from utils4 import filesys

>>> filesys.compare_files(file1='/path/to/file1.txt',
                          file2='/path/to/file2.txt')
True

If the files are expected to have different line endings but otherwise identical contents, pass contents_only=True. This skips the file signature test, which would otherwise fail because the file sizes differ:

>>> from utils4 import filesys

>>> filesys.compare_files(file1='/path/to/file1.txt',
                          file2='/path/to/file2.txt',
                          contents_only=True)
True

filesys.compare_files(file1: str, file2: str, encoding: str = 'utf-8', contents_only: bool = False, sig_only: bool = False) → bool

Test if two files are the same.

This function is modelled after the standard library's filecmp.cmp() function, yet has been modified to ignore line endings. That is, if two files have the same signature and their contents differ only in their line endings, a result of True is returned.

Parameters:
  • file1 (str) – Full path to a file to be tested.

  • file2 (str) – Full path to a file to be tested.

  • encoding (str, optional) – Encoding to be used when reading the files. Defaults to ‘utf-8’.

  • contents_only (bool, optional) – Compare only the file contents; do not test the signatures. This is useful if the line endings are expected to differ, as a file with DOS line endings is marginally larger than the same file with UNIX line endings, which causes the file signature test to fail. Defaults to False.

  • sig_only (bool, optional) – Only compare the file signatures. The files’ contents are not compared. Defaults to False.

Tests:

If any of the following tests fail, a value of False is returned immediately, and no further tests are conducted.

The following tests are conducted, given default function parameters:

  • Test both files are ‘regular’ files.

  • Test the file signatures are the same: the files have the same size (in bytes), are both regular files, and have the same inode mode.

  • Test the contents are the same; ignoring line endings.

Returns:

True if all tests pass, indicating the files are the same; otherwise False.

Return type:

bool
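
The test sequence described above can be illustrated with a minimal sketch. This is not the library's implementation; compare_files_sketch is a hypothetical name, and the way the signature is built (file-type bits plus size) and the use of universal-newline reads to ignore line endings are assumptions made for illustration only.

```python
import os
import stat
import tempfile


def compare_files_sketch(file1, file2, encoding='utf-8',
                         contents_only=False, sig_only=False):
    """Hypothetical sketch of the documented tests; not the utils4 code."""
    st1, st2 = os.stat(file1), os.stat(file2)
    # Test 1: both paths must be regular files.
    if not (stat.S_ISREG(st1.st_mode) and stat.S_ISREG(st2.st_mode)):
        return False
    # Test 2: compare signatures (assumed here: file-type bits and size).
    if not contents_only:
        sig1 = (stat.S_IFMT(st1.st_mode), st1.st_size)
        sig2 = (stat.S_IFMT(st2.st_mode), st2.st_size)
        if sig1 != sig2:
            return False
    if sig_only:
        return True
    # Test 3: compare contents. Text mode with the default newline handling
    # translates '\r\n' and '\r' to '\n', so line endings are ignored.
    with open(file1, encoding=encoding) as f1, \
            open(file2, encoding=encoding) as f2:
        return f1.read() == f2.read()


with tempfile.TemporaryDirectory() as tmp:
    unix = os.path.join(tmp, 'unix.txt')
    dos = os.path.join(tmp, 'dos.txt')
    with open(unix, 'wb') as f:
        f.write(b'a\nb\n')        # 4 bytes, UNIX line endings
    with open(dos, 'wb') as f:
        f.write(b'a\r\nb\r\n')    # 6 bytes, DOS line endings
    default_result = compare_files_sketch(unix, dos)
    contents_result = compare_files_sketch(unix, dos, contents_only=True)

print(default_result, contents_result)
```

With the default arguments the size mismatch fails the signature test, so the sketch returns False before reading either file; with contents_only=True the signature test is skipped and the normalised contents compare equal.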

filesys.dirsplit(path: str, nfiles: int, pattern: str = '*', pairs: bool = False, repl: tuple = (None,)) → bool

Move all files from a single directory into (n) sub-directories.

Parameters:
  • path (str) – Full path to the source files. Additionally, all files will be moved into sub-directories in this path.

  • nfiles (int) – Number of source files to be moved into each directory.

  • pattern (str, optional) – A shell-style wildcard pattern used for collecting the source files. For example: *.csv. Defaults to ‘*’.

  • pairs (bool, optional) – Whether the files are in pairs. If True, the repl argument is used to replace a sub-string of the source file's name with that of the paired file, so both files of each pair are moved into the same directory. Defaults to False.

  • repl (tuple, optional) –

    A tuple containing the old and new replacement strings. This argument is only in effect if the pairs argument is True. Defaults to (None,).

    For example:

    ('_input.csv', '_output.txt')
    

Raises:

FileNotFoundError – If the input file path does not exist.

Returns:

True if the operation completes, otherwise False.

Return type:

bool
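
To make the interaction between nfiles, pattern, pairs and repl more concrete, the following is a rough sketch of how such a split might work. It is not the library's implementation: dirsplit_sketch is a hypothetical name, the sub-directory naming scheme (batch_0001, batch_0002, ...) is invented for this example, and it assumes that pattern matches only the first file of each pair and that nfiles counts those matched files.

```python
import shutil
import tempfile
from pathlib import Path


def dirsplit_sketch(path, nfiles, pattern='*', pairs=False, repl=(None,)):
    """Hypothetical sketch of a directory split; not the utils4 code."""
    src = Path(path)
    if not src.exists():
        raise FileNotFoundError(f'Path not found: {src}')
    # Collect the source files up front, before any sub-directory exists.
    files = sorted(p for p in src.glob(pattern) if p.is_file())
    for i, file in enumerate(files):
        # Assumed naming scheme: one numbered directory per nfiles files.
        dest = src / f'batch_{i // nfiles + 1:04d}'
        dest.mkdir(exist_ok=True)
        shutil.move(str(file), str(dest / file.name))
        if pairs:
            # Derive the paired file's name by sub-string replacement.
            partner = src / file.name.replace(repl[0], repl[1])
            if partner.is_file():
                shutil.move(str(partner), str(dest / partner.name))
    return True


with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        Path(tmp, f'f{i}_input.csv').touch()
        Path(tmp, f'f{i}_output.txt').touch()
    ok = dirsplit_sketch(tmp, nfiles=2, pattern='*_input.csv',
                         pairs=True, repl=('_input.csv', '_output.txt'))
    # Map each sub-directory name to the sorted file names it now holds.
    layout = {d.name: sorted(p.name for p in d.iterdir())
              for d in sorted(Path(tmp).iterdir()) if d.is_dir()}

print(ok, layout)
```

In this sketch, five input files with nfiles=2 yield three sub-directories, the last holding the single remaining pair.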