Is there a reason why DataInputStream.read() only reads the first few bytes of really big arrays (>100,000 bytes)?

Asked by: Archonic  Asked: 12/5/2022  Last edited by: Archonic  Updated: 12/6/2022  Views: 42

Q:

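I'm trying to write software that sends a set of data (part of a video game) in different formats (chunked, compressed, raw) and measures the speed of each one. However, I've run into a problem while putting together the CHUNKED method. I've found that when reading byte arrays larger than about 140,000 bytes, the client starts reading only about 131,072 bytes, no matter how much bigger the array actually is. Is there a reason for this, or is there perhaps a better way to do it? My code is shown below. I'm using the read() method of DataInputStream (and its return value).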

Server

/**
 *
 * @return Time taken to complete transfer.
 */
public int start(String mode, int length) throws IOException, InterruptedException {
    if(mode.equals("RAW")){
        byte[] all = new ByteCollector(ServerMain.FILES, length).collect();
        output.writeUTF("SENDING " + mode + " " + all.length);
        expect("RECEIVING " + mode);
        long start = System.currentTimeMillis();
        echoSend(all);
        return (int) (System.currentTimeMillis() - start);
    }else if(mode.equals("CHUNKED")){ /*the important part*/
        //split into chunks
        byte[] all = new ByteCollector(ServerMain.FILES, length).collect();
        int chunks = maxChunks(all);
        output.writeUTF("SENDING " + mode + " " + chunks);
        System.out.println("Expecting RECEIVING " + chunks + "...");
        expect("RECEIVING " + chunks);
        int ms = 0;
        for(int i = 0; i<chunks; i++){
            byte[] currentChunk = getChunk(i, all);
            System.out.println("My chunk length is " + currentChunk.length);
            long start = System.currentTimeMillis();
            System.out.println("Sending...");
            echoSend(currentChunk);
            ms += System.currentTimeMillis() - start;
        }
        if(chunks == 0) expect("0"); //still need to confirm, even though no data was sent
        return ms;
    }else if(mode.equals("COMPRESSED")){
        byte[] compressed = new ByteCollector(ServerMain.FILES, length).collect();
        compressed = ExperimentUtils.compress(compressed);
        output.writeUTF("SENDING " + mode + " " + compressed.length);
        expect("RECEIVING " + mode);
        long start = System.currentTimeMillis();
        echoSend(compressed, length);
        return (int) (System.currentTimeMillis() - start);
    }
    return -1;
}

public static void main(String[] args) throws IOException,InterruptedException{
    FILES = Files.walk(Paths.get(DIRECTORY)).filter(Files::isRegularFile).toArray(Path[]::new);
    SyncServer server = new SyncServer(new ServerSocket(12222).accept());
    System.out.println("--------[CH UNK ED]--------");
    short[] chunkedSpeeds = new short[FOLDER_SIZE_MB + 1/*for "zero" or origin*/];
    for(int i = 0; i<=FOLDER_SIZE_MB; i++){
        chunkedSpeeds[i] = (short) server.start("CHUNKED", i * MB);
        System.out.println(i + "MB, Chunked: " + chunkedSpeeds[i]);
    }
    short[] compressedSpeeds = new short[FOLDER_SIZE_MB + 1];
    for(int i = 0; i<=FOLDER_SIZE_MB; i++){
        compressedSpeeds[i] = (short) server.start("COMPRESSED", i * MB);
    }
    short[] rawSpeeds = new short[FOLDER_SIZE_MB + 1];
    for(int i = 0; i<=FOLDER_SIZE_MB; i++){
        rawSpeeds[i] = (short) server.start("RAW", i * MB);
    }
    System.out.println("Raw speeds: " + Arrays.toString(rawSpeeds));
    System.out.println("\n\nCompressed speeds: " + Arrays.toString(compressedSpeeds));
    System.out.println("\n\nChunked speeds: " + Arrays.toString(chunkedSpeeds));
}

Client

public static void main(String[] args) throws IOException, InterruptedException {
    Socket socket = new Socket("localhost", 12222);
    DataInputStream input = new DataInputStream(socket.getInputStream());
    DataOutputStream output = new DataOutputStream(socket.getOutputStream());
    while(socket.isConnected()){
        String response = input.readUTF();
        String[] content = response.split(" ");
        if(response.startsWith("SENDING CHUNKED")){
            int chunks = Integer.parseInt(content[2]);
            System.out.println("Read chunk amount of " + chunks);
            output.writeUTF("RECEIVING " + chunks);
            for(int i = 0; i<chunks; i++){
                byte[] chunk = new byte[32 * MB];
                System.out.println("Ready to receive...");
                int read = input.read(chunk);
                System.out.println("Echoing read length of " + read);
                output.writeUTF(String.valueOf(read));
            }
            if(chunks == 0) output.writeUTF("0");
        }else if(response.startsWith("SENDING COMPRESSED")){
            byte[] compressed = new byte[Integer.parseInt(content[2])];
            output.writeUTF("RECEIVING " + compressed.length);
            input.read(compressed);
            decompress(compressed);
            output.writeInt(decompress(compressed).length);
        }else if(response.startsWith("SENDING RAW")){
            int length = Integer.parseInt(content[2]);
            output.writeUTF("RECEIVING " + length);
            byte[] received = new byte[length];
            input.read(received);
            output.writeInt(received.length);
        }
    }
}
public static byte[] decompress(byte[] in) throws IOException {
    try {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        InflaterOutputStream infl = new InflaterOutputStream(out);
        infl.write(in);
        infl.flush();
        infl.close();

        return out.toByteArray();
    } catch (Exception e) {
        System.out.println("Error decompressing byte array with length " + in.length);
        throw e;
    }
}

Using SDK 17.

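I tried varying the number of bytes and found that the cutoff is right where I said above. I even reproduced it in a stripped-down test client/server project (found here) and found that the cutoff was even lower! I really hope this isn't an actual problem with Java...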

Tags: Java, arrays, sockets, IO, DataInputStream

Comments

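3 upvotes, access violation, 12/5/2022
If this is TCP, are you aware that there is no correlation between the number of bytes in a single send and the number of bytes delivered to a single recv? At the socket level, a read completes as soon as there are "some" bytes available to read.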
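2 upvotes, President James K. Polk, 12/5/2022
So, what do you want to happen? The method is working the way it should, the way the documentation says it should. Perhaps you want readFully() rather than read().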
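To make the difference between read() and readFully() concrete, here is a minimal sketch that needs no network. TrickleInputStream is a hypothetical helper (not part of the JDK) that stands in for a socket by handing out at most a fixed number of bytes per call; the 131,072 cap is only chosen to mirror the number reported in the question, not a constant of Java or TCP.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ShortReadDemo {
    /** Hypothetical stand-in for a socket stream: returns at most maxPerRead bytes per call. */
    static class TrickleInputStream extends FilterInputStream {
        private final int maxPerRead;

        TrickleInputStream(InputStream in, int maxPerRead) {
            super(in);
            this.maxPerRead = maxPerRead;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Deliver only part of what was asked for, as a socket may do.
            return super.read(b, off, Math.min(len, maxPerRead));
        }
    }

    private static DataInputStream open(byte[] payload, int maxPerRead) {
        return new DataInputStream(
                new TrickleInputStream(new ByteArrayInputStream(payload), maxPerRead));
    }

    /** A single read(): returns however many bytes the first call delivered. */
    static int readOnce(byte[] payload, int maxPerRead) throws IOException {
        byte[] buffer = new byte[payload.length];
        return open(payload, maxPerRead).read(buffer);
    }

    /** readFully(): keeps reading until the whole buffer is filled (or throws EOFException). */
    static byte[] readAll(byte[] payload, int maxPerRead) throws IOException {
        byte[] buffer = new byte[payload.length];
        open(payload, maxPerRead).readFully(buffer);
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[200_000];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }
        System.out.println("read() returned " + readOnce(payload, 131_072));
        System.out.println("readFully() filled " + readAll(payload, 131_072).length);
    }
}
```

With these assumed numbers, the single read() reports 131072 while readFully() fills all 200000 bytes. Note that readFully() only helps when the receiver already knows how many bytes to expect.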
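0 upvotes, access violation, 12/5/2022
In general, you just need to read in a loop until you've read everything. How do you know when that is? You need a protocol to tell you; typically there is either a length (like HTTP's Content-Length header) or you read until the connection ends.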
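The "read in a loop" advice could look roughly like the sketch below, assuming the expected length is already known from whatever protocol is in use. readExactly is a hypothetical helper name; the SequenceInputStream in main is only there to provoke a short read without a real socket.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;

class ReadLoop {
    /** Reads exactly length bytes, looping because each read() may deliver only part of them. */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int total = 0;
        while (total < length) {
            // Ask for the remainder; the stream may hand back less.
            int n = in.read(buffer, total, length - total);
            if (n == -1) {
                throw new EOFException(
                        "Stream ended after " + total + " of " + length + " bytes");
            }
            total += n;
        }
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        // Two back-to-back streams: a single read() stops at the boundary,
        // which imitates data arriving in separate pieces.
        InputStream in = new SequenceInputStream(
                new ByteArrayInputStream(data, 0, 300),
                new ByteArrayInputStream(data, 300, 700));
        System.out.println("Received " + readExactly(in, data.length).length + " bytes");
    }
}
```

This is essentially what DataInputStream.readFully() does internally. On Java 9 and later, InputStream.readNBytes(byte[], int, int) runs a similar loop but returns the count read instead of throwing at end of stream.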
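0 upvotes, Archonic, 12/6/2022
Thanks for the input, everyone! It seems I misunderstood some fundamentals. I thought Java's API had some magic way of knowing how many bytes were sent at once. At least my question will help some other poor soul racking their brain over the same thing...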

A:

1 upvote, Archonic, 12/6/2022, #1

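DataInputStream's read() method does not correspond one-to-one with DataOutputStream's write() method. If you want to know how many bytes were sent in a single method call, the server has to tell the client explicitly.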

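This is because read() does not work toward a set length. The size of the array you pass in is only an upper bound, so read() considers its job done as soon as it has read some bytes; it has no way of knowing how many bytes you actually want.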
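One common way for the server to tell the client how much is coming is to prefix each chunk with its length. The following is a minimal sketch of that idea, not the code from the question: writeChunk and readChunk are hypothetical helper names, and a ByteArrayOutputStream stands in for the socket so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class LengthPrefixedChunks {
    /** Sender side: announce the chunk's size as a 4-byte int, then send its bytes. */
    static void writeChunk(DataOutputStream out, byte[] chunk) throws IOException {
        out.writeInt(chunk.length);
        out.write(chunk);
        out.flush();
    }

    /** Receiver side: read the announced size, then block until that many bytes have arrived. */
    static byte[] readChunk(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative chunk length: " + length);
        }
        byte[] chunk = new byte[length];
        in.readFully(chunk); // loops internally until the buffer is full
        return chunk;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the socket's output stream (an assumption for this demo).
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeChunk(out, new byte[140_000]);
        writeChunk(out, new byte[5]);

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println("First chunk: " + readChunk(in).length + " bytes");
        System.out.println("Second chunk: " + readChunk(in).length + " bytes");
    }
}
```

With a scheme like this the client can size its buffer from the header rather than allocating a fixed 32 MB array per chunk, and the value it echoes back is the real chunk length rather than whatever a single read() happened to deliver.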