
Python: Parameters of the wordcloud.WordCloud() class and what they mean

一个处女座的程序猿 Published: 2020-07-30 23:32:37

Contents

Parameters of the wordcloud.WordCloud() class and what they mean

Parameters of the wordcloud.WordCloud() class and what they mean

class WordCloud Found at: wordcloud.wordcloud

class WordCloud(object):
    """Word cloud object for generating and drawing.
    
    Parameters
    ----------
    font_path : string
        Font path to the font that will be used (OTF or TTF).
        Defaults to DroidSansMono path on a Linux machine. If you are on
        another OS or don't have this font, you need to adjust this path.

    width : int (default=400)
        Width of the canvas.

    height : int (default=200)
        Height of the canvas.

    prefer_horizontal : float (default=0.90)
        The ratio of times to try horizontal fitting as opposed to vertical.
        If prefer_horizontal < 1, the algorithm will try rotating the word
        if it doesn't fit. (There is currently no built-in way to get only
        vertical words.)

    mask : nd-array or None (default=None)
        If not None, gives a binary mask on where to draw words. If mask is
        not None, width and height will be ignored and the shape of mask
        will be used instead. All white (#FF or #FFFFFF) entries will be
        considered "masked out" while other entries will be free to draw on.
        [This changed in the most recent version!]

    scale : float (default=1)
        Scaling between computation and drawing. For large word-cloud
        images, using scale instead of larger canvas size is significantly
        faster, but might lead to a coarser fit for the words.

    min_font_size : int (default=4)
        Smallest font size to use. Will stop when there is no more room in
        this size.

    font_step : int (default=1)
        Step size for the font. font_step > 1 might speed up computation but
        give a worse fit.

    max_words : number (default=200)
        The maximum number of words.

    stopwords : set of strings or None
        The words that will be eliminated. If None, the built-in STOPWORDS
        list will be used.

    background_color : color value (default="black")
        Background color for the word cloud image.

    max_font_size : int or None (default=None)
        Maximum font size for the largest word. If None, height of the image
        is used.

    mode : string (default="RGB")
        Transparent background will be generated when mode is "RGBA" and
        background_color is None.

    relative_scaling : float (default=.5)
        Importance of relative word frequencies for font-size. With
        relative_scaling=0, only word-ranks are considered. With
        relative_scaling=1, a word that is twice as frequent will have twice
        the size. If you want to consider the word frequencies and not only
        their rank, relative_scaling around .5 often looks good.

        .. versionchanged: 2.0
           Default is now 0.5.

    color_func : callable, default=None
        Callable with parameters word, font_size, position, orientation,
        font_path, random_state that returns a PIL color for each word.
        Overwrites "colormap". See colormap for specifying a matplotlib
        colormap instead.

    regexp : string or None (optional)
        Regular expression to split the input text into tokens in
        process_text. If None is specified, ``r"\w[\w']+"`` is used.

    collocations : bool, default=True
        Whether to include collocations (bigrams) of two words.

        .. versionadded: 2.0

    colormap : string or matplotlib colormap, default="viridis"
        Matplotlib colormap to randomly draw colors from for each word.
        Ignored if "color_func" is specified.

        .. versionadded: 2.0

    normalize_plurals : bool, default=True
        Whether to remove trailing 's' from words. If True and a word
        appears with and without a trailing 's', the one with trailing 's'
        is removed and its counts are added to the version without
        trailing 's' -- unless the word ends with 'ss'.
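
To make the color_func description concrete, here is a minimal sketch of a custom color function. It accepts the six parameters named in the docstring and returns a color string in the "hsl(...)" form that PIL understands. The name grey_color_func and the lightness range of 60 to 100 are illustrative choices, not part of the library.

```python
from random import Random


def grey_color_func(word, font_size, position, orientation,
                    font_path, random_state):
    """Return a random shade of grey as a PIL-style color string.

    The six parameter names follow the color_func description in the
    docstring; the function name and the lightness range are assumptions
    made for this example.
    """
    # Fall back to a fresh generator if none is supplied.
    rng = random_state if random_state is not None else Random()
    lightness = rng.randint(60, 100)  # inclusive on both ends
    return "hsl(0, 0%%, %d%%)" % lightness


# Called directly here so the sketch runs without drawing anything.
color = grey_color_func("python", font_size=40, position=(0, 0),
                        orientation=None, font_path=None,
                        random_state=Random(42))
print(color)
```

A function like this would be passed as WordCloud(color_func=grey_color_func); as the docstring says, it then takes precedence over colormap.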
    

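
The regexp and normalize_plurals descriptions can be tried out with the standard library alone. The sketch below tokenizes with the default pattern quoted in the docstring and then folds plural counts into the singular form, following the rule as it is worded there. The helper names tokenize and merge_plurals are made up for this example; this is not the library's own code, and it ignores details such as letter case.

```python
import re
from collections import Counter

# Default token pattern quoted in the docstring: a word character followed
# by one or more word characters or apostrophes (so single letters drop out).
DEFAULT_REGEXP = r"\w[\w']+"


def tokenize(text, regexp=None):
    """Split text into tokens (hypothetical helper for this example)."""
    return re.findall(regexp if regexp is not None else DEFAULT_REGEXP, text)


def merge_plurals(counts):
    """Fold 'word' + 's' into 'word' when both occur, unless it ends in 'ss'."""
    merged = dict(counts)
    for word in list(merged):
        if word.endswith("s") and not word.endswith("ss"):
            singular = word[:-1]
            if singular in merged:
                merged[singular] += merged.pop(word)
    return merged


tokens = tokenize("a cat, two cats, more cats; don't skip class")
counts = merge_plurals(Counter(tokens))
print(tokens)
print(counts)
```

Note that the one-letter word "a" never becomes a token, "don't" survives as a single token, and "class" is left alone because it ends in "ss".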
    Attributes
    ----------
    ``words_`` : dict of string to float
        Word tokens with associated frequency.

        .. versionchanged: 2.0
           ``words_`` is now a dictionary

    ``layout_`` : list of tuples (string, int, (int, int), int, color)
        Encodes the fitted word cloud. Encodes for each word the string,
        font size, position, orientation and color.

    Notes
    -----
    Larger canvases will make the code significantly slower. If you need a
    large word cloud, try a lower canvas size, and set the scale parameter.

    The algorithm might give more weight to the ranking of the words than
    their actual frequencies, depending on the ``max_font_size`` and the
    scaling heuristic.
    """
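
What relative_scaling does to font sizes can be pictured as a blend between "keep the previous size" and "scale by the frequency ratio". The formula below is an assumption chosen to match the two endpoints the docstring describes (at 1, a word that is half as frequent gets half the size; at 0, frequencies are ignored). The listing in this post does not show the actual computation, and the name next_font_size is hypothetical.

```python
def next_font_size(font_size, freq, last_freq, relative_scaling):
    """Starting font size for the next word, given the previous word's size.

    Assumed blend: relative_scaling weights the frequency ratio,
    (1 - relative_scaling) weights "leave the size unchanged".
    """
    if not 0 <= relative_scaling <= 1:
        raise ValueError("relative_scaling needs to be between 0 and 1, "
                         "got %f." % relative_scaling)
    ratio = freq / float(last_freq)
    blend = relative_scaling * ratio + (1 - relative_scaling)
    return int(round(blend * font_size))


# Previous word: frequency 1.0 drawn at size 100; next word: frequency 0.5.
sizes = {rs: next_font_size(100, 0.5, 1.0, rs) for rs in (0, 0.5, 1)}
print(sizes)
```

With relative_scaling=0 this sketch leaves the size untouched, so any shrinking would have to come from the fitting step, which is not modeled here.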

    def __init__(self, font_path=None, width=400, height=200, margin=2,
                 ranks_only=None, prefer_horizontal=.9, mask=None, scale=1,
                 color_func=None, max_words=200, min_font_size=4,
                 stopwords=None, random_state=None, background_color='black',
                 max_font_size=None, font_step=1, mode="RGB",
                 relative_scaling=.5, regexp=None, collocations=True,
                 colormap=None, normalize_plurals=True):
        if font_path is None:
            font_path = FONT_PATH
        if color_func is None and colormap is None:
            # we need a color map
            import matplotlib
            version = matplotlib.__version__
            if version[0] < "2" and version[2] < "5":
                colormap = "hsv"
            else:
                colormap = "viridis"
        self.colormap = colormap
        self.collocations = collocations
        self.font_path = font_path
        self.width = width
        self.height = height
        self.margin = margin
        self.prefer_horizontal = prefer_horizontal
        self.mask = mask
        self.scale = scale
        self.color_func = color_func or colormap_color_func(colormap)
        self.max_words = max_words
        self.stopwords = stopwords if stopwords is not None else STOPWORDS
        self.min_font_size = min_font_size
        self.font_step = font_step
        self.regexp = regexp
        if isinstance(random_state, int):
            random_state = Random(random_state)
        self.random_state = random_state
        self.background_color = background_color
        self.max_font_size = max_font_size
        self.mode = mode
        if relative_scaling < 0 or relative_scaling > 1:
            raise ValueError(
                "relative_scaling needs to be "
                "between 0 and 1, got %f." % 
                relative_scaling)
        self.relative_scaling = relative_scaling
        if ranks_only is not None:
            warnings.warn("ranks_only is deprecated and will be removed as"
                          " it had no effect. Look into relative_scaling.",
                          DeprecationWarning)
        self.normalize_plurals = normalize_plurals
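
One detail in __init__ worth a closer look is the matplotlib version check, which compares single characters of the version string. That works for a version such as "1.4.3" but would misread a two-digit minor version. The sketch below shows a tuple-based comparison, assuming (as the original check implies) that "viridis" is available from matplotlib 1.5 onward. default_colormap is a hypothetical helper, not part of wordcloud.

```python
def default_colormap(version):
    """Pick a default colormap name from a matplotlib version string.

    Hypothetical helper: compares (major, minor) as integers instead of
    comparing single characters of the string.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return "hsv" if (major, minor) < (1, 5) else "viridis"


def original_check(version):
    """The character-based test from __init__, reproduced for comparison."""
    if version[0] < "2" and version[2] < "5":
        return "hsv"
    return "viridis"


# "1.10.0" is a made-up version string used only to show the difference.
results = {v: (original_check(v), default_colormap(v))
           for v in ("1.4.3", "1.5.0", "2.0.0", "1.10.0")}
print(results)
```

The two functions agree on the first three inputs and differ only on the made-up "1.10.0", where the character test sees "1" < "5" and falls back to "hsv".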
    
    def fit_words(self, frequencies):
        """Create a word_cloud from words and frequencies.

        Alias to generate_from_frequencies.

        Parameters
        ----------
        frequencies : dict from string to float
            Contains words and associated frequency.

        Returns
        -------
        self
        """
        return self.generate_from_frequencies(frequencies)
    
    def generate_from_frequencies(self, frequencies, max_font_size=None):
        """Create a word_cloud from words and frequencies.

        Parameters
        ----------
        frequencies : dict from string to float
            Contains words and associated frequency.

        max_font_size : int
            Use this font-size instead of self.max_font_size

        Returns
        -------
        self

        """
        # make sure frequencies are sorted and normalized
        frequencies = sorted(frequencies.items(), key=itemgetter(1),
                             reverse=True)
        if len(frequencies)
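
The listing is cut off at this point. Independently of how the original continues, the comment "make sure frequencies are sorted and normalized" can be illustrated with a short sketch. It assumes that normalizing means dividing by the largest frequency and that the list is trimmed to max_words; both are assumptions for illustration, and prepare_frequencies is a hypothetical name.

```python
from operator import itemgetter


def prepare_frequencies(frequencies, max_words=200):
    """Sort (word, frequency) pairs by frequency and rescale them.

    Assumptions for this sketch: normalization divides by the largest
    frequency (so the top word becomes 1.0), and only the max_words most
    frequent entries are kept.
    """
    ordered = sorted(frequencies.items(), key=itemgetter(1), reverse=True)
    ordered = ordered[:max_words]
    if not ordered:
        raise ValueError("need at least one word to normalize")
    top = float(ordered[0][1])
    return [(word, freq / top) for word, freq in ordered]


pairs = prepare_frequencies({"cloud": 2, "word": 8, "python": 4}, max_words=2)
print(pairs)
```

Here "cloud" is dropped by the max_words=2 limit, and the remaining frequencies are expressed relative to the most frequent word.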
