wordle.frequency

Functions to determine word token frequency for wordclouds.

New in version 0.2.0.

Functions:

frequency_from_directory(directory[, …])

Returns a dictionary mapping the words in files in directory to their frequencies.

frequency_from_file(filename[, exclude_words])

Returns a dictionary mapping the words in the file to their frequencies.

frequency_from_git(git_url[, sha, depth, …])

Returns a dictionary mapping the words in files in the git repository to their frequencies.

get_tokens(filename)

Returns a collections.Counter of the tokens in a file.

frequency_from_directory(directory, exclude_words=(), exclude_dirs=())[source]

Returns a dictionary mapping the words in files in directory to their frequencies.

Parameters

New in version 0.2.0.

Return type

Counter
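As a rough illustration of the behaviour described above, the following sketch walks a directory, skips excluded directories, and sums the per-file word counts. It is not the library's implementation: sketch_frequency_from_directory is a hypothetical name, the regular-expression tokenizer is a stand-in, and treating exclude_dirs as paths relative to directory is an assumption.

```python
import os
import re
import tempfile
from collections import Counter
from pathlib import Path


def sketch_frequency_from_directory(directory, exclude_words=(), exclude_dirs=()) -> Counter:
    """Hypothetical stand-in: sum per-file word counts under ``directory``."""
    root = Path(directory)
    # Assumption: excluded directories are given relative to ``directory``.
    excluded = {(root / d).resolve() for d in exclude_dirs}
    total = Counter()
    for current, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into excluded directories.
        dirnames[:] = [d for d in dirnames if (Path(current) / d).resolve() not in excluded]
        for name in filenames:
            text = (Path(current) / name).read_text(encoding="utf-8", errors="ignore")
            # Assumption: a plain identifier regex; the real tokenizer may differ.
            total.update(re.findall(r"[A-Za-z_]\w*", text))
    for word in exclude_words:
        total.pop(word, None)
    return total


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pkg").mkdir()
    (Path(tmp) / "build").mkdir()
    (Path(tmp) / "pkg" / "a.py").write_text("alpha beta alpha\n", encoding="utf-8")
    (Path(tmp) / "build" / "b.py").write_text("gamma gamma\n", encoding="utf-8")
    result = sketch_frequency_from_directory(tmp, exclude_words=["beta"], exclude_dirs=["build"])

print(result)  # Counter({'alpha': 2})
```

With the real function, the equivalent call would presumably pass the same three arguments and return a Counter in the same shape.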

frequency_from_file(filename, exclude_words=())[source]

Returns a dictionary mapping the words in the file to their frequencies.

Parameters

New in version 0.2.0.

See also

get_tokens()

Return type

Counter
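The sketch below shows the general shape of a per-file word count with an exclusion list. It is a minimal stand-in, not the library's code: sketch_frequency_from_file is a hypothetical name, and the regular expression used to split words is an assumption about tokenization.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path
from typing import Sequence


def sketch_frequency_from_file(filename, exclude_words: Sequence[str] = ()) -> Counter:
    """Hypothetical stand-in: count words in a file, skipping excluded ones."""
    text = Path(filename).read_text(encoding="utf-8")
    # Assumption: a plain identifier regex; the real tokenizer may differ.
    counts = Counter(re.findall(r"[A-Za-z_]\w*", text))
    for word in exclude_words:
        counts.pop(word, None)
    return counts


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.py"
    sample.write_text("import os\nimport sys\nprint(os.name)\n", encoding="utf-8")
    frequencies = sketch_frequency_from_file(sample, exclude_words=["import"])

print(frequencies.most_common(1))  # [('os', 2)]
```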

frequency_from_git(git_url, sha=None, depth=None, exclude_words=(), exclude_dirs=())[source]

Returns a dictionary mapping the words in files in the git repository to their frequencies.

Parameters
  • git_url (str) – The URL of the git repository to process.

  • sha (Optional[str]) – An optional SHA hash of a commit to check out. Default None.

  • depth (Optional[int]) – An optional depth to clone at. If None and sha is None, the depth is 1. If None and sha is given, the depth is unlimited. Default None.

  • exclude_words (Sequence[str]) – An optional list of words to exclude. Default ().

  • exclude_dirs (Sequence[Union[str, Path, PathLike]]) – An optional list of directories to exclude. Default ().

New in version 0.2.0.

Return type

Counter
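The default-depth rule for the depth parameter can be written out as a small function. This is only a sketch of the documented rule: resolve_clone_depth is a hypothetical helper, not part of the library, and returning None to mean "unlimited" is an assumption made for the example.

```python
from typing import Optional


def resolve_clone_depth(sha: Optional[str] = None, depth: Optional[int] = None) -> Optional[int]:
    """Hypothetical helper: return the effective clone depth, or None for unlimited."""
    if depth is not None:
        return depth  # an explicit depth is used as given
    if sha is None:
        return 1      # no commit requested: a depth of 1 is enough
    return None       # a specific commit was requested: unlimited depth


print(resolve_clone_depth())                       # 1
print(resolve_clone_depth(sha="abc123"))           # None
print(resolve_clone_depth(sha="abc123", depth=5))  # 5
```

A shallow clone is sufficient when only the latest state is needed, whereas an arbitrary commit may lie anywhere in the history, which is presumably why the default changes when sha is given.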

get_tokens(filename)[source]

Returns a collections.Counter of the tokens in a file.

Parameters

filename (Union[str, Path, PathLike]) – The file to parse.

Return type

Counter[str]

Returns

A count of each token (words and other tokens) found in the file.
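Because the result is a collections.Counter, the usual Counter operations apply to it. The example below uses hand-built counters as stand-ins for two get_tokens() results (the token names and counts are invented for illustration) to show how counts can be combined and queried.

```python
from collections import Counter

# Hand-built counters standing in for two get_tokens() results.
first = Counter({"def": 3, "return": 2, "self": 5})
second = Counter({"def": 1, "import": 3})

combined = first + second       # adds counts key by key
print(combined.most_common(2))  # [('self', 5), ('def', 4)]
print(combined["missing"])      # 0: absent keys count as zero
```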