Asked by: Mike Stay  Asked: 11/16/2023  Last edited by: Mike Stay  Updated: 11/16/2023  Views: 57
What is the size limit on Mistral 7B training samples?
Q:
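While following this tutorial using this colab, I tried replacing the default dataset with one I generated from the text of the book "The Terraformers", to try to teach the model the book's contents. The original runs fine on a GPU as small as a T4 (modulo changing … to …).

I generated some training data of my own. Even with just a single line containing one large sample (about 400 tokens), memory usage explodes. It consumed 40 GB on an A100: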
{"text":"<s>[INST]Please quote part of \"The Terraformers\" that would answer the following question: What is Destry's reaction when she sees someone tending a fire at the edge of the boreal forest?[/INST]\nDestry could smell the smoke long before she saw its improbable source. There was some kind of person---possibly Homo sapiens---tending a fire at the edge of the boreal forest. She squinted, trying to make out details from half a klick away. The person's skin was so pale she guessed it had hardly met real sunlight, which meant they were definitely not a stray worker from one of the construction camps. When the intruder crouched next to the flames, she caught a glimpse of red beard merging into a tangle of hair. In their hands, a hare was speared and cooking on an expensive alloy spit. The sight was horrifying, and Destry flinched back reflexively.\n\n\"Let's stop,\" she whispered to her mount, a thick-barreled moose with red-brown fur and a crown of antlers spreading from his forehead like a pair of massive, cupped hands. He flicked an ear in acknowledgement as she slid off his back and into his long shadow. Sinking down on one knee, Destry pressed her bare fingers into the soil, spreading them wide, establishing a high-bandwidth connection with the local ecosystem.\n\nThousands of sensors welcomed her into the planet's network, their collective perceptions knitting together from shards of cached memory, fragments of recorded sensation and perception. In this state, she too was a sensor, processing data through her eyes, nose, tongue, skin, and ears. What she perceived she shared with the ecosystem. She could feel the sensors collaboratively reviewing the scene from her perspective, learning that she wanted to know more about the mammal at the edge of the forest. It was like her body had become the land. Her awareness stretched forward, racing through root systems and over insects, tasting acid levels in the soil. The person's feet on the ground registered as pressure on her back, and she smelled redox reactions in the fire. Each sensor's evaluation joined the swelling chorus in her ears as the tiny machines voted on what their data points might mean: polymer, hair, carnivore, unprocessed excrement, dead trees, carbon cycle perturbation, predator, metal, fur, synthetic microbiome. As Destry's data surged across the field and into the forest, the sensors could see what she did, and their analysis coalesced into a strong probability: Homo sapiens in the region for eight days, causally linked to tree loss, small mammal loss, excrement buildup, complex toxins.</s>"}
What is the token size limit on a training sample? Where is it configured?
A:
0 votes
Eduard
11/29/2023
#1
The error is not related to the dataset.
optim = "paged_adamw_32bit"
Replace it with:
optim = "paged_adamw_8bit"
That should solve the OOM problem.
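For a rough sense of why the optimizer choice affects memory, here is a back-of-the-envelope sketch. It assumes AdamW keeps two moment estimates per trainable parameter, stored at 4 bytes each by the 32-bit variant and roughly 1 byte each by the 8-bit variant (ignoring quantization metadata and paging). The parameter counts are hypothetical, not taken from the tutorial, and the function names are made up for illustration.

```python
# Rough optimizer-state arithmetic; a sketch under stated assumptions,
# not a measurement of any real training run.

# Assumption: AdamW stores two moment estimates per trainable parameter.
ADAM_STATES_PER_PARAM = 2


def adam_state_bytes(trainable_params: int, bytes_per_state: float) -> float:
    """Approximate bytes held by AdamW's moment estimates."""
    return trainable_params * ADAM_STATES_PER_PARAM * bytes_per_state


def to_gib(n_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


# Hypothetical trainable-parameter counts (assumptions, not from the post):
# training every weight of a roughly 7e9-parameter model vs. a small adapter.
scenarios = {
    "all weights trainable (~7e9 params)": 7_000_000_000,
    "adapter only (hypothetical 40e6 params)": 40_000_000,
}

for name, n_params in scenarios.items():
    state_32bit = to_gib(adam_state_bytes(n_params, 4))  # 4 bytes per state
    state_8bit = to_gib(adam_state_bytes(n_params, 1))   # ~1 byte per state
    print(f"{name}: 32-bit ~{state_32bit:.2f} GiB, 8-bit ~{state_8bit:.2f} GiB")
```

Under these assumptions the 8-bit states take about a quarter of the space, and the absolute saving scales with the number of trainable parameters. If the tutorial passes `optim` to Hugging Face `TrainingArguments` (an assumption; the post does not say), `"paged_adamw_8bit"` is one of the values that field accepts.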