Possible to reduce the number of words in a Vega wordcloud?

Asked by: dsx  Asked: 10/24/2023  Last edited by: dsx  Updated: 10/29/2023  Views: 82

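Q:

Situation

I'm using the Vega visualization grammar in Looker Studio, specifically a word cloud chart.

Documentation: https://vega.github.io/vega/docs/transforms/wordcloud/

What I've tried / results

The word cloud I'm generating contains too many words, and I'm trying to work out how to reduce the number.

Working with the example editor on that page, I can get close to the result I want by setting "wordPadding" to 5. However, that seems to just push some words out of view, rather than reducing the number of words and then distributing the remaining ones across the size range.

I also tried changing "padding" in the style properties in Looker Studio, but it did not change anything in the chart.

Any input on this would be greatly appreciated.

Update with current code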

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "data": [
    {
      "name": "default",
      "transform": [
        {
          "type": "formula", "as": "rotate",
          "expr": "[0, 90][~~(datum.index % 2)]"
        },
        {
          "type": "formula", "as": "weight",
          "expr": "if(datum.index==0, 600, 400)"
        },
        {
          "type": "wordcloud",
          "size": [{"signal": "width"}, {"signal": "height"}],
          "text": {"field": "$dimension0"},
          "fontSize": {"field": "$metric0"},
          "fontWeight": {"field": "weight"},
          "fontSizeRange": [{"signal": "(width+height)/96"}, {"signal": "(width+height)/24"}],
          "padding": {"value": 2},
          "rotate": {"field": "rotate"}
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "default", "field": "$dimension0"},
      "scheme": "datastudio20"
    }
  ],

  "marks": [
    {
      "type": "text",
      "from": {"data": "default"},
      "encode": {
        "enter": {
          "text": {"field": "$dimension0"},
          "align": {"value": "center"},
          "baseline": {"value": "alphabetic"},
          "fill": {"scale": "color", "field": "$dimension0"}
        },
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "angle": {"field": "angle"},
          "fontSize": {"field": "fontSize"},
          "fontWeight": {"field": "weight"},
          "fillOpacity": {"value": 0.7}
        },
        "hover": {
          "fillOpacity": {"value": 1}
        }
      }
    }
  ]
}

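What I tried based on the answer

@Davide

From your answer, I picked out this line in your code:

{"type": "filter", "expr": "datum.row < 25"},

So I tried adding it under the "transform" section of my code.

Result

However, this only produced a blank chart.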
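
One plausible cause of the blank chart: the filter reads `datum.row`, but nothing earlier in the pipeline creates a `row` field. The comparison is then false for every datum, so no words are left to draw. The fragment below is a minimal sketch of the `window` and `filter` pair, adapted to the Looker Studio field names used in the spec above. It assumes each incoming row is already a single word and that `$metric0` is the numeric size measure; neither is confirmed by the post. Since `row_number` starts at 1, it uses `<=` so that exactly 25 rows survive. The two transforms would go ahead of the existing `wordcloud` transform.

```json
{
  "name": "default",
  "transform": [
    {
      "type": "window",
      "sort": {"field": "$metric0", "order": "descending"},
      "ops": ["row_number"],
      "fields": [null],
      "as": ["row"]
    },
    {"type": "filter", "expr": "datum.row <= 25"}
  ]
}
```

If `$dimension0` holds free text rather than single words, this ranking is not meaningful until the text has been split and counted first.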

Updated code with the "window" and "filter" transforms

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "data": [
    {
      "name": "default",
      "transform": [
        {
          "type": "formula", "as": "rotate",
          "expr": "[0, 90][~~(datum.index % 2)]"
        },
        {
          "type": "formula", "as": "weight",
          "expr": "if(datum.index==0, 600, 400)"
        },
        {
          "type": "formula",
          "as": "weight",
          "expr": "if(datum.text=='VEGA', 600, 300)"
        },
        {
          "type": "formula",
          "as": "rotate",
          "expr": "[-rotate, 0, rotate][~~(random() * 3)]"
        },
        {
          "type": "window",
          "sort": {"field": "count", "order": "descending"},
          "ops": ["row_number"],
          "fields": [null],
          "as": ["row"]
        },
        {"type": "filter", "expr": "datum.row < 25"},
        {
          "type": "wordcloud",
          "size": [{"signal": "width"}, {"signal": "height"}],
          "text": {"field": "text"},
          "font": "Helvetica Neue, Arial",
          "fontSize": {"field": "count"},
          "fontWeight": {"field": "weight"},
          "fontSizeRange": [
            {"signal": "fontSizeRange0"},
            {"signal": "fontSizeRange1"}
          ],
          "padding": {"signal": "wordPadding"},
          "rotate": {"field": "rotate"}
        },
        {
          "type": "wordcloud",
          "size": [{"signal": "width"}, {"signal": "height"}],
          "text": {"field": "$dimension0"},
          "fontSize": {"field": "$metric0"},
          "fontWeight": {"field": "weight"},
          "fontSizeRange": [{"signal": "(width+height)/96"}, {"signal": "(width+height)/24"}],
          "padding": {"value": 2},
          "rotate": {"field": "rotate"}
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "default", "field": "$dimension0"},
      "scheme": "datastudio20"
    }
  ],

  "marks": [
    {
      "type": "text",
      "from": {"data": "default"},
      "encode": {
        "enter": {
          "text": {"field": "$dimension0"},
          "align": {"value": "center"},
          "baseline": {"value": "alphabetic"},
          "fill": {"scale": "color", "field": "$dimension0"}
        },
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "angle": {"field": "angle"},
          "fontSize": {"field": "fontSize"},
          "fontWeight": {"field": "weight"},
          "fillOpacity": {"value": 0.7}
        },
        "hover": {
          "fillOpacity": {"value": 1}
        }
      }
    }
  ]
}

Code that works

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "data": [
    {
      "name": "default",
      "transform": [
        {
          "type": "countpattern",
          "field": "$dimension0",
          "pattern": "[\\w']{3,}",
          "stopwords": "very|now|can't|are|800|every|also|ever|just|dont|don't|been|pnly|I've|I'm|you|why|try|but|was|it's|her|2021|where|com|not|for|that|from|and|out|this|the|has|have|2022|2021"
        },
        {
          "type": "formula", "as": "weight",
          "expr": "log(datum.count)*5"
        },
        {
          "type": "window",
          "sort": {"field": "count", "order": "descending"},
          "ops": ["row_number"],
          "fields": [null],
          "as": ["row"]
        },
        {"type": "filter", "expr": "datum.row < 40"},
        {
          "type": "formula", "as": "rotate",
          "expr": "[0, 90][~~(datum.count % 2)]"
        },
        {
          "type": "wordcloud",
          "size": [{"signal": "width"}, {"signal": "height"}],
          "text": {"field": "text"},
          "font": "Helvetica Neue",
          "fontSize": {"field": "count"},
          "fontWeight": {"field": "weight"},
          "fontSizeRange": [
            {"signal": "(width+height)/96"},
            {"signal": "(width+height)/24"}
          ],
          "rotate": {"field": "rotate"},
          "padding": 2
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "default", "field": "text"},
      "scheme": "set3"
    }
  ],

  "marks": [
    {
      "type": "text",
      "from": {"data": "default"},
      "encode": {
        "enter": {
          "text": {"field": "text"},
          "align": {"value": "center"},
          "baseline": {"value": "alphabetic"},
          "fill": {"scale": "color", "field": "text"}
        },
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "angle": {"field": "angle"},
          "fontSize": {"field": "fontSize"},
          "fontWeight":{"field": "weight"},
          "fillOpacity": {"value": 0.6}
        },
        "hover": {
          "fillOpacity": {"value": 1}
        }
      }
    }
  ]
}

Tags: visualization looker-studio vega word-cloud


A:

2 upvotes  Davide Bacci  10/24/2023  #1

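This example ranks the words by count and keeps only the top-ranked ones. Because row numbers start at 1, the filter "datum.row < 25" keeps ranks 1 through 24; use "datum.row <= 25" for exactly 25 words.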

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "name": "wordcloud",
  "width": 400,
  "height": 200,
  "padding": 0,
  "autosize": "none",
  "signals": [
    {
      "name": "wordPadding",
      "value": 1,
      "bind": {"input": "range", "min": 0, "max": 5, "step": 1}
    },
    {
      "name": "fontSizeRange0",
      "value": 8,
      "bind": {"input": "range", "min": 8, "max": 42, "step": 1}
    },
    {
      "name": "fontSizeRange1",
      "value": 24,
      "bind": {"input": "range", "min": 8, "max": 42, "step": 1}
    },
    {
      "name": "rotate",
      "value": 45,
      "bind": {"input": "select", "options": [0, 30, 45, 60, 90]}
    }
  ],
  "data": [
    {
      "name": "table",
      "values": [
        "Declarative visualization grammars can accelerate development, facilitate retargeting across platforms, and allow language-level optimizations. However, existing declarative visualization languages are primarily concerned with visual encoding, and rely on imperative event handlers for interactive behaviors. In response, we introduce a model of declarative interaction design for data visualizations. Adopting methods from reactive programming, we model low-level events as composable data streams from which we form higher-level semantic signals. Signals feed predicates and scale inversions, which allow us to generalize interactive selections at the level of item geometry (pixels) into interactive queries over the data domain. Production rules then use these queries to manipulate the visualization’s appearance. To facilitate reuse and sharing, these constructs can be encapsulated as named interactors: standalone, purely declarative specifications of interaction techniques. We assess our model’s feasibility and expressivity by instantiating it with extensions to the Vega visualization grammar. Through a diverse range of examples, we demonstrate coverage over an established taxonomy of visualization interaction techniques.",
        "We present Reactive Vega, a system architecture that provides the first robust and comprehensive treatment of declarative visual and interaction design for data visualization. Starting from a single declarative specification, Reactive Vega constructs a dataflow graph in which input data, scene graph elements, and interaction events are all treated as first-class streaming data sources. To support expressive interactive visualizations that may involve time-varying scalar, relational, or hierarchical data, Reactive Vega’s dataflow graph can dynamically re-write itself at runtime by extending or pruning branches in a data-driven fashion. We discuss both compile- and run-time optimizations applied within Reactive Vega, and share the results of benchmark studies that indicate superior interactive performance to both D3 and the original, non-reactive Vega system.",
        "We present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. Vega-Lite combines a traditional grammar of graphics, providing visual encoding rules and a composition algebra for layered and multi-view displays, with a novel grammar of interaction. Users specify interactive semantics by composing selections. In Vega-Lite, a selection is an abstraction that defines input event processing, points of interest, and a predicate function for inclusion testing. Selections parameterize visual encodings by serving as input data, defining scale extents, or by driving conditional logic. The Vega-Lite compiler automatically synthesizes requisite data flow and event handling logic, which users can override for further customization. In contrast to existing reactive specifications, Vega-Lite selections decompose an interaction design into concise, enumerable semantic units. We evaluate Vega-Lite through a range of examples, demonstrating succinct specification of both customized interaction methods and common techniques such as panning, zooming, and linked selection."
      ],
      "transform": [
        {
          "type": "countpattern",
          "field": "data",
          "case": "upper",
          "pattern": "[\\w']{3,}",
          "stopwords": "(i|me|my|myself|we|us|our|ours|ourselves|you|your|yours|yourself|yourselves|he|him|his|himself|she|her|hers|herself|it|its|itself|they|them|their|theirs|themselves|what|which|who|whom|whose|this|that|these|those|am|is|are|was|were|be|been|being|have|has|had|having|do|does|did|doing|will|would|should|can|could|ought|i'm|you're|he's|she's|it's|we're|they're|i've|you've|we've|they've|i'd|you'd|he'd|she'd|we'd|they'd|i'll|you'll|he'll|she'll|we'll|they'll|isn't|aren't|wasn't|weren't|hasn't|haven't|hadn't|doesn't|don't|didn't|won't|wouldn't|shan't|shouldn't|can't|cannot|couldn't|mustn't|let's|that's|who's|what's|here's|there's|when's|where's|why's|how's|a|an|the|and|but|if|or|because|as|until|while|of|at|by|for|with|about|against|between|into|through|during|before|after|above|below|to|from|up|upon|down|in|out|on|off|over|under|again|further|then|once|here|there|when|where|why|how|all|any|both|each|few|more|most|other|some|such|no|nor|not|only|own|same|so|than|too|very|say|says|said|shall)"
        },
        {
          "type": "formula",
          "as": "weight",
          "expr": "if(datum.text=='VEGA', 600, 300)"
        },
        {
          "type": "formula",
          "as": "rotate",
          "expr": "[-rotate, 0, rotate][~~(random() * 3)]"
        },
        {
          "type": "window",
          "sort": {"field": "count", "order": "descending"},
          "ops": ["row_number"],
          "fields": [null],
          "as": ["row"]
        },
        {"type": "filter", "expr": "datum.row < 25"},
        {
          "type": "wordcloud",
          "size": [{"signal": "width"}, {"signal": "height"}],
          "text": {"field": "text"},
          "font": "Helvetica Neue, Arial",
          "fontSize": {"field": "count"},
          "fontWeight": {"field": "weight"},
          "fontSizeRange": [
            {"signal": "fontSizeRange0"},
            {"signal": "fontSizeRange1"}
          ],
          "padding": {"signal": "wordPadding"},
          "rotate": {"field": "rotate"}
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "range": ["#d5a928", "#652c90", "#939597"]
    }
  ],
  "marks": [
    {
      "type": "text",
      "from": {"data": "table"},
      "encode": {
        "enter": {
          "text": {"field": "text"},
          "align": {"value": "center"},
          "baseline": {"value": "alphabetic"},
          "fill": {"scale": "color", "field": "text"},
          "font": {"value": "Helvetica Neue, Arial"},
          "fontWeight": {"field": "weight"}
        },
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "angle": {"field": "angle"},
          "fontSize": {"field": "fontSize"},
          "fillOpacity": {"value": 1}
        },
        "hover": {"fillOpacity": {"value": 0.5}}
      }
    }
  ]
}
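
The trimming in the spec above comes from the `window` transform (which ranks the words by `count`) followed by the `filter` transform. As a minimal sketch of the same technique with an adjustable cut-off, the spec below replaces the fixed threshold with a signal. The signal name `maxWords` and the sample sentence are hypothetical placeholders, not part of the original answer, and everything unrelated to the word limit has been stripped out.

```json
{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "width": 400,
  "height": 200,
  "padding": 0,
  "signals": [
    {
      "name": "maxWords",
      "value": 10,
      "bind": {"input": "range", "min": 1, "max": 50, "step": 1}
    }
  ],
  "data": [
    {
      "name": "table",
      "values": [
        "placeholder text: replace this string with the text you want to count"
      ],
      "transform": [
        {"type": "countpattern", "field": "data", "pattern": "[\\w']{3,}"},
        {
          "type": "window",
          "sort": {"field": "count", "order": "descending"},
          "ops": ["row_number"],
          "fields": [null],
          "as": ["row"]
        },
        {"type": "filter", "expr": "datum.row <= maxWords"},
        {
          "type": "wordcloud",
          "size": [{"signal": "width"}, {"signal": "height"}],
          "text": {"field": "text"},
          "fontSize": {"field": "count"},
          "fontSizeRange": [10, 40]
        }
      ]
    }
  ],
  "marks": [
    {
      "type": "text",
      "from": {"data": "table"},
      "encode": {
        "enter": {
          "text": {"field": "text"},
          "align": {"value": "center"},
          "baseline": {"value": "alphabetic"}
        },
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "angle": {"field": "angle"},
          "fontSize": {"field": "fontSize"}
        }
      }
    }
  ]
}
```

The filter sits before `wordcloud`, so the layout and the font-size scaling only ever see the words that survive the cut. The slider from `bind` is meant for the Vega editor; if a host such as Looker Studio does not render bound inputs (an assumption worth checking), setting `value` directly has the same effect.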

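Comments

0 upvotes  dsx  10/26/2023
Thanks! I sort of got it working. I was able to plug in your code and get it to display, but I couldn't adapt what I think is the key element of your code. I've added a section at the bottom of my post above to clarify. Your input is greatly appreciated!
0 upvotes  Davide Bacci  10/26/2023
You need the window transform and the filter transform, in that order. If you found the solution helpful, please don't forget to mark it as solved and/or upvote.
0 upvotes  dsx  10/27/2023
Thanks @Davide. I'm not sure what I'm doing wrong, but it still displays nothing with both the "window" and "filter" transforms in place. I've pasted the latest version above.
0 upvotes  Davide Bacci  10/27/2023
You need the countpattern transform at the start, and for some reason you also have two wordcloud transforms...
0 upvotes  dsx  10/29/2023
Yes, thank you. That worked! I don't know why, but the cloud sits slightly toward the top-right corner of the chart. I thought this code might fix it, but it is already set to "center" ("align": {"value": "center"},). I've put the new working code at the bottom of my post above. Thanks for your help!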