Does pushing a string into an array, or setting it as the value to an object property, copy the string or keep a reference in JavaScript?

Asked by: Lance Asked: 11/7/2023 Updated: 11/8/2023 Views: 110

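Q:

Specifically in v8 / Node.js, when you push a primitive type (string, number, boolean) into an array, does it clone the value or store a reference?

I know you can't do this and change the string: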

let array = []
let x = 'foo'
array.push(x)
x = 'bar'
console.log(array) //=> ['foo']

But if I do this, is the string copied multiple times (thereby increasing the memory footprint)?

let array = []
let x = 'foo'
array.push(x)
array.push(x)
array.push(x)
...

Same question for object properties: does it clone the string if I do this?

let object = {}
let x = 'foo'
object.a = x
object.b = x
object.c = x

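I searched around a bit but didn't find a straightforward answer to this question.

Do objects pushed into an array in javascript deep or shallow copy?

This blog post says:

Objects and arrays are pushed as a pointer to the original object. Built-in primitive types like numbers or booleans are pushed as a copy.

But I'm not sure whether that is correct (it gives no source). I would have to run a bunch of thorough tests to really check whether memory grows when I push to an array. I'm not sure of the simplest way to do that, so perhaps a v8 engineer or someone else versed in compiler theory knows how this is implemented.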

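I want to use this to calculate the size of each string I add to a trie with Buffer.byteLength(text, 'utf8'), and then keep track of the rough size of the trie (summing the sizes of the strings used in it, plus a rough guesstimate of the bytes used to store n object properties and arrays of length x). So the first step is to understand: when I push a string to multiple places, is the string copied, or does each place hold the same reference?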

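I'm hoping the blog is incorrect and that a reference is pushed; it's just that once the variable has been passed somewhere else, you can't modify the string through it. So the string would still be a reference until you try to change the variable, something like that.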

javascript string v8

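Comments

0 upvotes Bergi 11/7/2023
The js value is cloned, since it is a primitive. However, the underlying representation is typically a reference to the text bytes, and only that reference needs to be copied, since the referenced bytes never change.
0 upvotes jmrk 11/7/2023
@Bergi The key point is that the string is not cloned. I don't think the duplicate-target questions answer this question at all. Want to reopen it so I can post my answer?
0 upvotes Bergi 11/7/2023
@jmrk Heh, you're right, none of them states it explicitly (although imo it is strongly implied / can easily be inferred). Feel free to reopen.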

Answers:

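1 upvote Yoric 11/7/2023 #1

A JavaScript VM will never copy a string if it can avoid doing so. In this case, not copying the string is trivial.

If you really want to copy a string, you need to go through some shenanigans, such as converting it to another encoding and back, or splitting it and concatenating it back together. If my memory serves, the last time I checked, once strings have been copied the VM makes no attempt to deduplicate them.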
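
A minimal sketch of the two tricks mentioned above, assuming Node.js (for the global Buffer). The variable names are illustrative. Whether the engine really allocates a fresh string in each case is an implementation detail that plain JavaScript cannot observe, so the code only shows that the results compare equal by value:

```javascript
// Illustrative only: two ways to ask the engine to rebuild a string.
const original = 'foo'.padEnd(32, '!');

// 1. Convert to another representation (UTF-8 bytes) and back.
const viaEncoding = Buffer.from(original, 'utf8').toString('utf8');

// 2. Split into pieces and join them back together.
const viaSplitJoin = original.split('').join('');

// Strings have no identity, so === only compares the characters.
console.log(viaEncoding === original);  // true
console.log(viaSplitJoin === original); // true
```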

Source: I used to work on SpiderMonkey.

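2 upvotes jmrk 11/7/2023 #2

(V8 developer here.)

When you store a string in an array (or anywhere else, really), the string is not copied. Neither are booleans.

For numbers, it depends: they are usually stored as references too, except in certain optimized cases where a more efficient alternative exists.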

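The reason is twofold:
(1) There is no need to clone strings.
(2) Not cloning strings is simpler and faster.

The snippet you quoted is plain wrong as far as implementation details are concerned. One could argue that it is not entirely incorrect as far as observable semantics are concerned: a program's behavior cannot tell whether a copy of the string was stored, or just another reference to it. (But of course that just makes the whole statement pointless: if objects are stored as references, and for primitives we can't tell the difference, why not simply assume that everything is stored as a reference?)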
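
To illustrate why sharing is unobservable, here is a small sketch (the names tryToMutate, shared, list and holder are made up for this example). Strings are immutable, so no code holding one reference can change what the other holders see:

```javascript
// Attempt to overwrite the first character of a string.
// In strict mode, writing to a string index throws a TypeError.
function tryToMutate(str) {
  'use strict';
  try {
    str[0] = 'b';
    return true;
  } catch (err) {
    return false;
  }
}

const shared = 'foo';
const list = [shared, shared];
const holder = { a: shared, b: shared };

const mutated = tryToMutate(list[0]);
console.log(mutated);            // false
console.log(list[1], holder.a);  // foo foo

// Reassigning a variable rebinds the variable; it does not touch stored values.
let x = shared;
x = 'bar';
console.log(list[0]);            // foo
```

Since no holder can modify the characters, a copy and a shared reference behave identically from the program's point of view.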

As a rule of thumb: VMs for dynamic languages like JavaScript treat everything as a reference, except for whichever special cases they choose to optimize (typically some definition of number; search for the terms "smi-tagging" and "nan-boxing" if you want to dig deeper).
Whether a value is a "primitive" or not only affects whether it has object identity:

{foo: 42} === {foo: 42}  // false, objects have identity
42 === 42                // true, numbers have no identity
"foo" === "foo"          // true, strings have no identity

Being a primitive does not affect how a value is stored in arrays/objects/variables/whatever, nor where it is allocated (a related myth I sometimes see is "primitives are allocated on the stack" -- nope, they are not).


Added clarification on @Bergi's request:
Of course, when you repeatedly call array.push(x), the size of the array's backing store grows, because it needs to store increasingly many references to the string. So while the string won't be copied, overall memory usage will increase (on average by one pointer per push, but actually happening in chunks).
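
A rough way to check this yourself is sketched below, assuming Node.js. The constants, the usedBytes helper and the choice of heapUsed plus external as the measure are assumptions made for this illustration; the numbers are noisy because garbage collection can run at any time, so treat the result as an order-of-magnitude check only:

```javascript
// Hypothetical measurement sketch; sizes and names are illustrative.
const STRING_BYTES = 1 << 20; // about 1 MiB of one-byte characters
const PUSH_COUNT = 1000;

// Build the string from a Buffer so the result does not depend on how
// repeat() or + happen to represent their output internally.
const big = Buffer.alloc(STRING_BYTES, 'x').toString('latin1');

function usedBytes() {
  const m = process.memoryUsage();
  return m.heapUsed + m.external;
}

const before = usedBytes();
const refs = [];
for (let i = 0; i < PUSH_COUNT; i++) {
  refs.push(big); // stores another reference to the same string
}
const after = usedBytes();

const growth = after - before;
const ifCopied = STRING_BYTES * PUSH_COUNT; // about 1 GiB if each push copied
console.log(`growth: ${growth} bytes; full copies would need about ${ifCopied} bytes`);
```

If every push cloned the string, the growth would be on the order of a gigabyte; with references only, the array's backing store accounts for the increase.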

Comments

0 upvotes Bergi 11/7/2023
"The snippet you quoted is plain wrong as far as implementation details are concerned" - well that blog article is not concerned with implementation details, it tries to explain the difference between primitive and reference values (objects) to beginners.
0 upvotes Bergi 11/7/2023
"the string is not copied. Neither are booleans." - you might first want to qualify what exactly you mean by "string" or "boolean". There must be something that will be copied to store it in the array/object.
0 upvotes jmrk 11/7/2023
@Bergi "it tries to explain the difference between primitive and reference values" Well, I don't think it does a good job of that. Even "primitives don't have properties" would arguably be a more meaningful characterization. "There must be something that will be copied" Yes, a reference to the thing (no matter whether "the thing" is a boolean or a string or an object). a[0] = foo and var x = foo are very similar: x and a[0] will afterwards hold a reference to whatever value foo is also referencing. That value could be an object, or an immutable primitive.
0 upvotes Bergi 11/7/2023
Please add that to your answer - the array/object will grow by the size of the reference when the new property is added. Btw, do you not consider the reference to be the value? I usually refer to the "referenced thing" as the "contents of the value". But maybe that's just the JS perspective not the engine perspective.
0 upvotes jmrk 11/8/2023
@Bergi: "do you not consider the reference to be the value?" No, I don't. Of course, references are a particular kind of value, but distinguishing concepts like "pass by reference" and "pass by value" only makes sense when you don't equate values and references. I'll happily agree that "value" is an overloaded term though, and is commonly used for different concepts, e.g. depending on the abstraction level of the statement in question :-)