wordle
Create wordclouds from git repositories, directories and source files.
Classes:
Wordle – Generate word clouds from source code.
Functions:
Export a wordcloud to a file.
-
class Wordle(font_path=None, width=400, height=200, prefer_horizontal=0.9, mask=None, contour_width=0, contour_color='black', scale=1, min_font_size=4, font_step=1, max_words=200, background_color='black', max_font_size=None, mode='RGB', relative_scaling='auto', color_func=None, regexp=None, collocations=True, colormap=None, repeat=False, include_numbers=False, min_word_length=0, random_state=None)
Bases: WordCloud
Generate word clouds from source code.
- Parameters
font_path (Optional[str]) – Path to the font that will be used (OTF or TTF). Defaults to the DroidSansMono path on a Linux machine. If you are on another OS or don’t have this font, you need to adjust this path. Default None.
width (int) – The width of the canvas. Default 400.
height (int) – The height of the canvas. Default 200.
prefer_horizontal (float) – The ratio of times to try horizontal fitting as opposed to vertical. If prefer_horizontal < 1, the algorithm will try rotating the word if it doesn’t fit. (There is currently no built-in way to get only vertical words.) Default 0.9.
mask (Optional[ndarray]) – If not None, gives a binary mask on where to draw words. If mask is not None, width and height will be ignored and the shape of mask will be used instead. All white (#FF or #FFFFFF) entries will be considered “masked out” while other entries will be free to draw on. Default None.
contour_width (float) – If mask is not None and contour_width > 0, draw the mask contour. Default 0.
contour_color (str) – Mask contour color. Default 'black'.
scale (float) – Scaling between computation and drawing. For large word-cloud images, using scale instead of a larger canvas size is significantly faster, but might lead to a coarser fit for the words. Default 1.
min_font_size (int) – Smallest font size to use. Layout stops when there is no more room at this size. Default 4.
font_step (int) – Step size for the font. font_step > 1 might speed up computation but give a worse fit. Default 1.
max_words (int) – The maximum number of words. Default 200.
background_color (str) – Background color for the word cloud image. Default 'black'.
max_font_size (Optional[int]) – Maximum font size for the largest word. If None, the height of the image is used. Default None.
mode (str) – A transparent background will be generated when mode is “RGBA” and background_color is None. Default 'RGB'.
relative_scaling (Union[str, float]) – Importance of relative word frequencies for font size. With relative_scaling=0, only word ranks are considered. With relative_scaling=1, a word that is twice as frequent will have twice the size. If you want to consider the word frequencies and not only their rank, a relative_scaling around 0.5 often looks good. If 'auto', it will be set to 0.5 unless repeat is true, in which case it will be set to 0. Default 'auto'.
color_func (Optional[Callable]) – Callable with parameters word, font_size, position, orientation, font_path, random_state which returns a PIL color for each word. Overrides colormap. See colormap for specifying a matplotlib colormap instead. To create a word cloud with a single color, use color_func=lambda *args, **kwargs: "white". The single color can also be specified as an RGB tuple. For example, color_func=lambda *args, **kwargs: (255,0,0) sets the color to red. Default None.
regexp (Optional[str]) – Regular expression to split the input text into tokens in process_text. If None is specified, r"\w[\w']+" is used. Ignored if using generate_from_frequencies. Default None.
collocations (bool) – Whether to include collocations (bigrams) of two words. Ignored if using generate_from_frequencies. Default True.
colormap (Union[None, str, Colormap]) – Matplotlib colormap to randomly draw colors from for each word. Ignored if color_func is specified. Default “viridis”.
repeat (bool) – Whether to repeat words and phrases until max_words or min_font_size is reached. Default False.
include_numbers (bool) – Whether to include numbers as phrases or not. Default False.
min_word_length (int) – Minimum number of letters a word must have to be included. Default 0.
random_state (Union[RandomState, int, None]) – Seed for the randomness that determines the colour and position of words. Default None.
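The effect of relative_scaling described above can be illustrated with a small standalone sketch. The helper names next_font_size and resolve_relative_scaling are hypothetical, and the blending formula is an assumption chosen to reproduce the documented endpoints (0 ignores frequencies, 1 scales size in proportion to frequency); it is not necessarily the exact code used by the library.

```python
def resolve_relative_scaling(relative_scaling, repeat):
    """Apply the documented 'auto' rule: 0.5 unless repeat is true, then 0."""
    if relative_scaling == "auto":
        return 0 if repeat else 0.5
    return relative_scaling


def next_font_size(prev_size, prev_freq, freq, relative_scaling):
    """Blend rank-only sizing (0) with frequency-proportional sizing (1).

    Assumed formula: the previous size is multiplied by a weighted mix of
    the frequency ratio and 1.
    """
    ratio = freq / prev_freq
    factor = relative_scaling * ratio + (1 - relative_scaling)
    return int(round(factor * prev_size))


# A word half as frequent as the previous one, previously drawn at size 100:
for rs in (0, 0.5, 1):
    print(rs, next_font_size(100, 1.0, 0.5, rs))
```

Under this assumption, a value of 1 halves the size for a word that is half as frequent, while a value of 0 leaves the starting size unchanged, so only the placement order (rank) matters.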
Note
Larger canvases make the code significantly slower. If you need a large word cloud, try a smaller canvas size and set the scale parameter. The algorithm might give more weight to the ranking of the words than their actual frequencies, depending on the max_font_size and the scaling heuristic.
Methods:
Returns the wordcloud image as numpy array.
generate_from_directory(directory[, …]) – Create a word_cloud from a directory of source code files.
generate_from_file(filename[, outfile, …]) – Create a word_cloud from a source code file.
generate_from_git(git_url[, outfile, sha, …]) – Create a word_cloud from a git repository.
recolor([random_state, color_func, colormap]) – Recolour the existing layout.
to_array() – Returns the wordcloud image as numpy array.
to_file(filename) – Export the wordle to a file.
to_image() – Returns the wordcloud as an image.
to_svg(*[, embed_font, …]) – Export the wordle to an SVG.
Attributes:
color_func – Callable with parameters word, font_size, position, orientation, font_path, random_state which returns a PIL color for each word.
-
color_func
Type: Callable
Callable with parameters word, font_size, position, orientation, font_path, random_state which returns a PIL color for each word.
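As a sketch of what such a callable might look like, the two functions below follow the parameter names listed above. The names single_colour and size_based_colour are hypothetical, and the sketch assumes that random_state, when provided, behaves like a random.Random instance and that an "hsl(...)" string is an acceptable PIL color.

```python
import random


def single_colour(*args, **kwargs):
    """Ignore all inputs and colour every word white."""
    return "white"


def size_based_colour(word, font_size, position=None, orientation=None,
                      font_path=None, random_state=None):
    """Pick a random hue, with larger words drawn lighter.

    Assumes random_state is a random.Random instance or None.
    """
    rng = random_state if random_state is not None else random.Random()
    hue = rng.randint(0, 360)
    lightness = min(80, 30 + font_size // 2)  # capped so text stays visible
    return f"hsl({hue}, 80%, {lightness}%)"


print(size_based_colour("import", 40, random_state=random.Random(0)))
```

Either function could then be passed as color_func when constructing the class or when calling recolor.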
-
generate_from_directory(directory, outfile=None, *, exclude_words=(), exclude_dirs=(), max_font_size=None)
Create a word_cloud from a directory of source code files.
- Parameters
directory (Union[str, Path, PathLike]) – The directory to process.
outfile (Union[str, Path, PathLike, None]) – The file to save the wordle as. Supported formats are PNG, JPEG and SVG. If None, the wordle is not saved. Default None.
exclude_words (Sequence[str]) – An optional list of words to exclude. Default ().
exclude_dirs (Sequence[Union[str, Path, PathLike]]) – An optional list of directories to exclude. Each entry is treated as a regular expression to match at the beginning of the relative path. Default ().
max_font_size (Optional[int]) – Use this font size instead of self.max_font_size. Default None.
Changed in version 0.2.1: exclude_words, exclude_dirs, max_font_size are now keyword-only.
- Return type
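The exclude_dirs rule (each entry is a regular expression matched at the beginning of the relative path) can be sketched as follows. The helper is_excluded is hypothetical and the use of POSIX-style relative paths is an assumption; the library's own matching code may differ in detail.

```python
import re
from pathlib import PurePosixPath


def is_excluded(relative_path, exclude_dirs):
    """Return True if any pattern matches at the start of the relative path."""
    path = PurePosixPath(relative_path).as_posix()
    # re.match anchors at the beginning of the string, as the rule describes.
    return any(re.match(str(pattern), path) for pattern in exclude_dirs)


paths = [
    "src/app.py",
    "tests/test_app.py",
    "docs/conf.py",
    "src/tests/helpers.py",
]
kept = [p for p in paths if not is_excluded(p, ["tests", r"docs/"])]
print(kept)
```

Because matching happens only at the start of the path, src/tests/helpers.py is kept here. Under the same assumption, a bare pattern such as tests would also exclude a sibling directory named tests_old, so a trailing slash in the pattern is the safer choice.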
-
generate_from_file(filename, outfile=None, *, exclude_words=(), max_font_size=None)
Create a word_cloud from a source code file.
- Parameters
outfile (Union[str, Path, PathLike, None]) – The file to save the wordle as. Supported formats are PNG, JPEG and SVG. If None, the wordle is not saved. Default None.
exclude_words (Sequence[str]) – An optional list of words to exclude. Default ().
max_font_size (Optional[int]) – Use this font size instead of self.max_font_size. Default None.
Changed in version 0.2.1: exclude_words, max_font_size are now keyword-only.
- Return type
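To illustrate what excluding words means for the resulting cloud, the sketch below drops a list of words from a frequency table before layout. The helper filter_frequencies is hypothetical, and both the whitespace tokenisation and the case-sensitive comparison are assumptions made for this example only; the library's own tokeniser is not shown in this page.

```python
from collections import Counter


def filter_frequencies(frequencies, exclude_words=()):
    """Return a copy of the word -> count mapping without excluded words."""
    excluded = set(exclude_words)
    return {word: count for word, count in frequencies.items()
            if word not in excluded}


# Toy token stream standing in for words extracted from a source file.
counts = Counter("def load self path return self path self".split())
filtered = filter_frequencies(counts, exclude_words=["self"])
print(filtered)
```

Very common identifiers such as self would otherwise dominate the image, which is the usual reason to pass exclude_words.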
-
generate_from_git(git_url, outfile=None, *, sha=None, depth=None, exclude_words=(), exclude_dirs=(), max_font_size=None)
Create a word_cloud from a git repository.
- Parameters
git_url (str) – The URL of the git repository to process.
outfile (Union[str, Path, PathLike, None]) – The file to save the wordle as. Supported formats are PNG, JPEG and SVG. If None, the wordle is not saved. Default None.
sha (Optional[str]) – An optional SHA hash of a commit to check out. Default None.
depth (Optional[int]) – An optional depth to clone at. If None and sha is None, the depth is 1. If None and sha is given, the depth is unlimited. Default None.
exclude_words (Sequence[str]) – An optional list of words to exclude. Default ().
exclude_dirs (Sequence[Union[str, Path, PathLike]]) – An optional list of directories to exclude. Default ().
max_font_size (Optional[int]) – Use this font size instead of self.max_font_size. Default None.
Changed in version 0.2.1: exclude_words, exclude_dirs, max_font_size are now keyword-only. Added the sha and depth keyword-only arguments.
- Return type
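The interaction between sha and depth described above can be written out as a small decision function. The name resolve_clone_depth is hypothetical, and returning None to mean an unlimited (full) clone is an assumption of this sketch.

```python
from typing import Optional


def resolve_clone_depth(sha: Optional[str] = None,
                        depth: Optional[int] = None) -> Optional[int]:
    """Return the clone depth to use; None stands for an unlimited clone."""
    if depth is not None:
        # An explicit depth always wins.
        return depth
    # No depth given: shallow clone unless a specific commit is requested.
    return 1 if sha is None else None


print(resolve_clone_depth())               # no sha, no depth
print(resolve_clone_depth(sha="abc123"))   # sha given, no depth
print(resolve_clone_depth(depth=5))        # explicit depth
```

A shallow clone of depth 1 is enough when only the latest commit is needed, whereas an arbitrary commit may lie outside a shallow history, which is presumably why the depth becomes unlimited once sha is given.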
-
recolor(random_state=None, color_func=None, colormap=None)
Recolour the existing layout.
Applying a new coloring is much faster than regenerating the whole wordle.
- Parameters
random_state (Union[RandomState, int, None]) – If not None, a fixed random state is used. If an int is given, this is used as the seed for a random.Random state. Default None.
color_func (Optional[Callable]) – Function to generate new color from word count, font size, position and orientation. If None, self.color_func is used. Default None.
colormap (Union[None, str, Colormap]) – Use this colormap to generate new colors. Ignored if color_func is specified. If None, self.color_func or self.colormap is used. Default None.
- Return type
- Returns
self
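The random_state handling described for recolor (an int is used as the seed for a random.Random state) might be sketched as below. The helper as_random is hypothetical, and passing any non-int value through unchanged is an assumption of this sketch.

```python
import random
from typing import Optional, Union


def as_random(random_state: Union[random.Random, int, None]
              ) -> Optional[random.Random]:
    """Turn an int seed into a random.Random; pass other values through."""
    if isinstance(random_state, int):
        return random.Random(random_state)
    return random_state


# The same seed yields the same sequence, so colours are reproducible.
first = as_random(42).random()
second = as_random(42).random()
print(first == second)
```

Fixing the seed this way makes repeated recolouring runs produce identical output, which is useful when regenerating images for documentation or tests.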
-
to_svg(*, embed_font=False, optimize_embedded_font=True, embed_image=False)
Export the wordle to an SVG.
- Parameters
embed_font (bool) – Whether to include the font inside the resulting SVG file. Default False.
optimize_embedded_font (bool) – Whether to be aggressive when embedding a font, to reduce size. In particular, hinting tables are dropped, which may introduce slight changes to character shapes (w.r.t. the to_image baseline). Default True.
embed_image (bool) – Whether to include a rasterized image inside the resulting SVG file. Useful for debugging. Default False.
- Return type
- Returns
The content of the SVG image.