Asked by: Johnny Maelstrom | Asked: 11/22/2008 | Last edited by: Ramesh R, Johnny Maelstrom | Updated: 11/15/2023 | Views: 2638222
How do I read / convert an InputStream into a String in Java?
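Q:
If you have a java.io.InputStream object, how should you process that object and produce a String?
Suppose I have an InputStream that contains text data, and I want to convert it to a String so that, for example, I can write it to a log file.
What is the easiest way to take the InputStream and convert it to a String?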
public String convertStreamToString(InputStream is) {
// ???
}
A:
A nice way to do this is to use Apache Commons IOUtils to copy the InputStream into a StringWriter, something like:
StringWriter writer = new StringWriter();
IOUtils.copy(inputStream, writer, encoding);
String theString = writer.toString();
or even
// NB: does not close inputStream, you'll have to use try-with-resources for that
String theString = IOUtils.toString(inputStream, encoding);
Alternatively, you could use ByteArrayOutputStream if you don't want to mix your streams and writers.
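The ByteArrayOutputStream alternative mentioned above could look roughly like the following. This is a minimal sketch using only the JDK; the class and method names are hypothetical, and UTF-8 is an assumed encoding rather than something the answer specifies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Apache Commons or the JDK.
class StreamToStringBaos {
    // Copies every byte of the stream into memory, then decodes once.
    // The caller remains responsible for closing the stream.
    static String toString(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        // Decoding after all bytes are collected means a multi-byte
        // character can never be split across two chunks.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Only bytes are handled until the final decoding step, so no Reader or Writer is involved.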
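When dealing with a file, first get a java.io.Reader instance. This can then be read and appended to a StringBuilder (we don't need StringBuffer if we are not accessing it from multiple threads, and StringBuilder is faster). The trick here is that we work in blocks, so no other buffering stream is needed. The block size is parameterized for run-time performance optimization.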
public static String slurp(final InputStream is, final int bufferSize) {
final char[] buffer = new char[bufferSize];
final StringBuilder out = new StringBuilder();
try (Reader in = new InputStreamReader(is, "UTF-8")) {
for (;;) {
int rsz = in.read(buffer, 0, buffer.length);
if (rsz < 0)
break;
out.append(buffer, 0, rsz);
}
}
catch (UnsupportedEncodingException ex) {
/* ... */
}
catch (IOException ex) {
/* ... */
}
return out.toString();
}
Apache Commons allows:
String myString = IOUtils.toString(myInputStream, "UTF-8");
Of course, you could choose other character encodings besides UTF-8.
See also: (documentation)
Use:
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.IOException;
public static String readInputStreamAsString(InputStream in)
throws IOException {
BufferedInputStream bis = new BufferedInputStream(in);
ByteArrayOutputStream buf = new ByteArrayOutputStream();
int result = bis.read();
while(result != -1) {
byte b = (byte)result;
buf.write(b);
result = bis.read();
}
return buf.toString();
}
If you can't use Commons IO (FileUtils, IOUtils and CopyUtils), here is an example that uses a BufferedReader to read the file line by line:
public class StringFromFile {
public static void main(String[] args) /*throws UnsupportedEncodingException*/ {
InputStream is = StringFromFile.class.getResourceAsStream("file.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(is/*, "UTF-8"*/));
final int CHARS_PER_PAGE = 5000; //counting spaces
StringBuilder builder = new StringBuilder(CHARS_PER_PAGE);
try {
for(String line=br.readLine(); line!=null; line=br.readLine()) {
builder.append(line);
builder.append('\n');
}
}
catch (IOException ignore) { }
String text = builder.toString();
System.out.println(text);
}
}
Or, if you want raw speed, I'd propose a variation on what Paul de Vrieze suggested (which avoids using a StringWriter, which uses a StringBuffer internally):
public class StringFromFileFast {
public static void main(String[] args) /*throws UnsupportedEncodingException*/ {
InputStream is = StringFromFileFast.class.getResourceAsStream("file.txt");
InputStreamReader input = new InputStreamReader(is/*, "UTF-8"*/);
final int CHARS_PER_PAGE = 5000; //counting spaces
final char[] buffer = new char[CHARS_PER_PAGE];
StringBuilder output = new StringBuilder(CHARS_PER_PAGE);
try {
for(int read = input.read(buffer, 0, buffer.length);
read != -1;
read = input.read(buffer, 0, buffer.length)) {
output.append(buffer, 0, read);
}
} catch (IOException ignore) { }
String text = output.toString();
System.out.println(text);
}
}
If you are using Google-Collections/Guava, you could do the following:
InputStream stream = ...
String content = CharStreams.toString(new InputStreamReader(stream, Charsets.UTF_8));
Closeables.closeQuietly(stream);
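Note that the second parameter (i.e. Charsets.UTF_8) isn't required by InputStreamReader, but it is generally a good idea to specify the encoding if you know it (which you should!).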
Here's a way that uses only the standard Java library (note that the stream is not closed; your mileage may vary).
static String convertStreamToString(java.io.InputStream is) {
java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
return s.hasNext() ? s.next() : "";
}
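I learned this trick from the "Stupid Scanner tricks" article. It works because Scanner iterates over tokens in the stream, and in this case we separate tokens using the "beginning of the input boundary" (\A), which gives us only one token for the entire contents of the stream.
Note that if you need to be specific about the input stream's encoding, you can pass a second argument to the Scanner constructor that indicates which character set to use (e.g. "UTF-8").
Hat tip also goes to Jacob, who once pointed me to the said article.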
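The charset note above can be sketched with the two-argument Scanner constructor, Scanner(InputStream, String). The class and method names are hypothetical. Unlike the answer's original snippet, this variant closes the Scanner, which also closes the underlying stream, so use it only when that is what you want.

```java
import java.io.InputStream;
import java.util.Scanner;

// Hypothetical helper class illustrating the Scanner trick with an explicit charset.
class ScannerWithCharset {
    static String convert(InputStream is, String charsetName) {
        // "\\A" matches only at the start of input, so the whole stream is one token.
        // try-with-resources closes the Scanner and, with it, the wrapped stream.
        try (Scanner s = new Scanner(is, charsetName).useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }
}
```

An empty stream has no token at all, which is why the hasNext() check is needed before calling next().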
Use:
InputStream in = /* Your InputStream */;
StringBuilder sb = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String read;
while ((read=br.readLine()) != null) {
//System.out.println(read);
sb.append(read);
}
br.close();
return sb.toString();
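Comment:
readLine() removes the line breaks, so the resulting String will contain no line breaks unless you add a line separator between the lines appended to the builder.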
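To illustrate the comment above, here is a small sketch that re-inserts a separator between lines. The class name is hypothetical, UTF-8 is an assumed encoding, and the separator is fixed to "\n", so the original line endings (for example "\r\n") and any trailing newline are not preserved.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not taken from any of the answers.
class ReadLinesKeepingBreaks {
    static String read(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            boolean first = true;
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() strips the terminator, so put one back between lines.
                if (!first) {
                    sb.append('\n');
                }
                sb.append(line);
                first = false;
            }
        }
        return sb.toString();
    }
}
```

A boolean flag is used instead of checking the builder's length so that an empty first line still gets its separator.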
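I ran some timing tests, because time always matters.
I attempted to get the response into a String in 3 different ways (shown below).
I left out the try/catch blocks for the sake of readability.
To give context, this is the preceding code for all 3 approaches: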
String response;
String url = "www.blah.com/path?key=value";
GetMethod method = new GetMethod(url);
int status = client.executeMethod(method);
1)
response = method.getResponseBodyAsString();
2)
InputStream resp = method.getResponseBodyAsStream();
InputStreamReader is=new InputStreamReader(resp);
BufferedReader br=new BufferedReader(is);
String read = null;
StringBuffer sb = new StringBuffer();
while((read = br.readLine()) != null) {
sb.append(read);
}
response = sb.toString();
3)
InputStream iStream = method.getResponseBodyAsStream();
StringWriter writer = new StringWriter();
IOUtils.copy(iStream, writer, "UTF-8");
response = writer.toString();
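So, after running 500 tests on each approach with the same request/response data, here are the numbers. Once again, these are my findings and yours may not be exactly the same, but I wrote this up to give others some indication of the efficiency differences between these approaches.
Ranks:
Approach #1
Approach #3 - 2.6% slower than #1
Approach #2 - 4.3% slower than #1
Any of these approaches is an appropriate solution for grabbing a response and creating a String from it.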
If you are feeling adventurous, you could mix Scala and Java and end up with this:
scala.io.Source.fromInputStream(is).mkString("")
Mixing Java and Scala code and libraries has its benefits.
See the full description here: Idiomatic way to convert an InputStream to a String in Scala
Here is more or less Sampath's answer, cleaned up a bit and expressed as a function:
String streamToString(InputStream in) throws IOException {
StringBuilder out = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(in));
for (String line = br.readLine(); line != null; line = br.readLine())
out.append(line);
br.close();
return out.toString();
}
Comment:
br.close() must be done in a finally or 'try-with-resources' block.
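This is the best pure Java solution; it fits Android and any other JVM perfectly.
This solution works amazingly well... it is simple, fast, and works on small and large streams just the same! (See the benchmark above, No. 8.)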
public String readFullyAsString(InputStream inputStream, String encoding)
throws IOException {
return readFully(inputStream).toString(encoding);
}
public byte[] readFullyAsBytes(InputStream inputStream)
throws IOException {
return readFully(inputStream).toByteArray();
}
private ByteArrayOutputStream readFully(InputStream inputStream)
throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int length = 0;
while ((length = inputStream.read(buffer)) != -1) {
baos.write(buffer, 0, length);
}
return baos;
}
Quick and easy:
String result = (String)new ObjectInputStream( inputStream ).readObject();
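Comment:
This fails with java.io.StreamCorruptedException: invalid stream header. ObjectInputStream is about deserialization, and the stream has to respect the serialization protocol to work, which may not always be true in the context of this question.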
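The objection in the comment above can be reproduced with a short sketch. The class and method names are hypothetical. It only shows that plain text bytes are rejected when an ObjectInputStream is constructed over them, because the constructor reads and validates the serialization stream header.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.StreamCorruptedException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it exists only to illustrate the failure mode.
class ObjectInputStreamOnText {
    // Returns true if wrapping the plain text bytes is rejected with
    // StreamCorruptedException. Pass at least four characters of text:
    // shorter input ends before the header is read and fails with EOFException.
    static boolean rejectsPlainText(String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return false; // Reached only for a valid serialization header.
        } catch (StreamCorruptedException e) {
            return true;
        }
    }
}
```

The one-liner therefore works only for streams that were produced by an ObjectOutputStream writing a String.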
The code below worked for me.
URL url = MyClass.class.getResource("/" + configFileName);
BufferedInputStream bi = (BufferedInputStream) url.getContent();
byte[] buffer = new byte[bi.available() ];
int bytesRead = bi.read(buffer);
String out = new String(buffer);
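Please note that, according to the Java docs, the available() method might not work with InputStream, but always works with BufferedInputStream.
In case you don't want to use the available() method, we can always use the code below: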
URL url = MyClass.class.getResource("/" + configFileName);
BufferedInputStream bi = (BufferedInputStream) url.getContent();
File f = new File(url.getPath());
byte[] buffer = new byte[ (int) f.length()];
int bytesRead = bi.read(buffer);
String out = new String(buffer);
I am not sure whether there will be any encoding issues. Please comment if there is any problem with the code.
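One thing to watch for in both snippets above is that a single read(buffer) call is allowed to return fewer bytes than the buffer holds. The sketch below simulates that with a hypothetical stream that hands out at most 3 bytes per call, and shows a loop that keeps reading until the buffer is full or the stream ends. All names here are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not taken from the answer.
class SingleReadIsNotEnough {
    // Simulates a slow source: never returns more than 3 bytes per read call.
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    // Calls read repeatedly until the buffer is full or the stream ends,
    // and returns the number of bytes actually stored.
    static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        return total;
    }
}
```

On the encoding question: new String(buffer) decodes with the platform default charset, so passing an explicit charset such as StandardCharsets.UTF_8 to the String constructor is the safer choice when the encoding is known.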
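Here is how to do it using just the JDK with byte array buffers. This is actually how the commons-io IOUtils.copy() methods all work. You can replace byte[] with char[] if you are copying from a Reader instead of an InputStream.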
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
...
InputStream is = ....
ByteArrayOutputStream baos = new ByteArrayOutputStream(8192);
byte[] buffer = new byte[8192];
int count = 0;
try {
while ((count = is.read(buffer)) != -1) {
baos.write(buffer, 0, count);
}
}
finally {
try {
is.close();
}
catch (Exception ignore) {
}
}
String charset = "UTF-8";
String inputStreamAsString = baos.toString(charset);
Make sure to close the streams at the end if you use stream readers.
private String readStream(InputStream iStream) throws IOException {
// Build a Stream Reader, it can read character by character
InputStreamReader iStreamReader = new InputStreamReader(iStream);
// Build a buffered Reader, so that I can read whole line at once
BufferedReader bReader = new BufferedReader(iStreamReader);
String line = null;
StringBuilder builder = new StringBuilder();
while((line = bReader.readLine()) != null) { // Read till end
builder.append(line);
builder.append("\n"); // Append new line to preserve lines
}
bReader.close(); // Close all opened stuff
iStreamReader.close();
//iStream.close(); // Let the creator of the stream close it!
// some readers may auto close the inner stream
return builder.toString();
}
On JDK 7+, you can use the try-with-resources construct.
/**
* Reads the stream into a string
* @param iStream the input stream
* @return the string read from the stream
* @throws IOException when an IO error occurs
*/
private String readStream(InputStream iStream) throws IOException {
// Buffered reader allows us to read line by line
try (BufferedReader bReader =
new BufferedReader(new InputStreamReader(iStream))) {
StringBuilder builder = new StringBuilder();
String line;
while((line = bReader.readLine()) != null) { // Read till end
builder.append(line);
builder.append("\n"); // Append new line to preserve lines
}
return builder.toString();
}
}
Here is the most elegant pure Java (no library) solution I came up with after some experimentation:
public static String fromStream(InputStream in) throws IOException
{
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder out = new StringBuilder();
String newLine = System.getProperty("line.separator");
String line;
while ((line = reader.readLine()) != null) {
out.append(line);
out.append(newLine);
}
return out.toString();
}
Well, you can program it yourself... it's not complicated...
String Inputstream2String (InputStream is) throws IOException
{
final int PKG_SIZE = 1024;
byte[] data = new byte [PKG_SIZE];
StringBuilder buffer = new StringBuilder(PKG_SIZE * 10);
int size;
size = is.read(data, 0, data.length);
while (size > 0)
{
String str = new String(data, 0, size);
buffer.append(str);
size = is.read(data, 0, data.length);
}
return buffer.toString();
}
InputStream IS=new URL("http://www.petrol.si/api/gas_prices.json").openStream();
ByteArrayOutputStream BAOS=new ByteArrayOutputStream();
IOUtils.copy(IS, BAOS);
String d= new String(BAOS.toByteArray(),"UTF-8");
System.out.println(d);
InputStreamReader i = new InputStreamReader(s);
BufferedReader str = new BufferedReader(i);
String msg = str.readLine();
System.out.println(msg);
Here s is your InputStream object, which will be converted into a String.
A JDK 7/8 answer that closes the stream and still throws an IOException:
StringBuilder build = new StringBuilder();
byte[] buf = new byte[1024];
int length;
try (InputStream is = getInputStream()) {
while ((length = is.read(buf)) != -1) {
build.append(new String(buf, 0, length));
}
}
You can use Apache Commons.
In IOUtils you can find the toString method with three helpful implementations.
public static String toString(InputStream input) throws IOException {
return toString(input, Charset.defaultCharset());
}
public static String toString(InputStream input) throws IOException {
return toString(input, Charset.defaultCharset());
}
public static String toString(InputStream input, String encoding)
throws IOException {
return toString(input, Charsets.toCharset(encoding));
}
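Try these 4 statements..
As per the point recalled by Fred, it is not recommended to append to a String with the += operator, since every time a new char is appended to the existing String, a new String object is created and its address is assigned to st, while the old st object becomes garbage.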
public String convertStreamToString(InputStream is) throws IOException
{
int k;
StringBuffer sb=new StringBuffer();
while((k=is.read()) != -1)
{
sb.append((char)k);
}
return sb.toString();
}
Not recommended, but this is also a way:
public String convertStreamToString(InputStream is) throws IOException {
int k;
String st="";
while((k=is.read()) != -1)
{
st+=(char)k;
}
return st;
}
This snippet was found in \sdk\samples\android-19\connectivity\NetworkConnect\NetworkConnectSample\src\main\java\com\example\android\networkconnect\MainActivity.java, which is licensed under the Apache License, Version 2.0 and was written by Google.
/** Reads an InputStream and converts it to a String.
* @param stream InputStream containing HTML from targeted site.
* @param len Length of string that this method returns.
* @return String concatenated according to len parameter.
* @throws java.io.IOException
* @throws java.io.UnsupportedEncodingException
*/
private String readIt(InputStream stream, int len) throws IOException, UnsupportedEncodingException {
Reader reader = null;
reader = new InputStreamReader(stream, "UTF-8");
char[] buffer = new char[len];
reader.read(buffer);
return new String(buffer);
}
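I have written a class that does just that. Sometimes you don't want to add Apache Commons just for one thing, and you want something dumber than Scanner that doesn't examine the content.
Usage is as follows: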
// Read from InputStream
String data = new ReaderSink(inputStream, Charset.forName("UTF-8")).drain();
// Read from File
data = new ReaderSink(file, Charset.forName("UTF-8")).drain();
// Drain input stream to console
new ReaderSink(inputStream, Charset.forName("UTF-8")).drainTo(System.out);
Here is the code for ReaderSink:
import java.io.*;
import java.nio.charset.Charset;
/**
* A simple sink class that drains a {@link Reader} to a {@link String} or
* to a {@link Writer}.
*
* @author Ben Barkay
* @version 2/20/2014
*/
public class ReaderSink {
/**
* The default buffer size to use if no buffer size was specified.
*/
public static final int DEFAULT_BUFFER_SIZE = 1024;
/**
* The {@link Reader} that will be drained.
*/
private final Reader in;
/**
* Constructs a new {@code ReaderSink} for the specified file and charset.
* @param file The file to read from.
* @param charset The charset to use.
* @throws FileNotFoundException If the file was not found on the filesystem.
*/
public ReaderSink(File file, Charset charset) throws FileNotFoundException {
this(new FileInputStream(file), charset);
}
/**
* Constructs a new {@code ReaderSink} for the specified {@link InputStream}.
* @param in The {@link InputStream} to drain.
* @param charset The charset to use.
*/
public ReaderSink(InputStream in, Charset charset) {
this(new InputStreamReader(in, charset));
}
/**
* Constructs a new {@code ReaderSink} for the specified {@link Reader}.
* @param in The reader to drain.
*/
public ReaderSink(Reader in) {
this.in = in;
}
/**
* Drains the data from the underlying {@link Reader}, returning a {@link String} containing
* all of the read information. This method will use {@link #DEFAULT_BUFFER_SIZE} for
* its buffer size.
* @return A {@link String} containing all of the information that was read.
*/
public String drain() throws IOException {
return drain(DEFAULT_BUFFER_SIZE);
}
/**
* Drains the data from the underlying {@link Reader}, returning a {@link String} containing
* all of the read information.
* @param bufferSize The size of the buffer to use when reading.
* @return A {@link String} containing all of the information that was read.
*/
public String drain(int bufferSize) throws IOException {
StringWriter stringWriter = new StringWriter();
drainTo(stringWriter, bufferSize);
return stringWriter.toString();
}
/**
* Drains the data from the underlying {@link Reader}, writing it to the
* specified {@link Writer}. This method will use {@link #DEFAULT_BUFFER_SIZE} for
* its buffer size.
* @param out The {@link Writer} to write to.
*/
public void drainTo(Writer out) throws IOException {
drainTo(out, DEFAULT_BUFFER_SIZE);
}
/**
* Drains the data from the underlying {@link Reader}, writing it to the
* specified {@link Writer}.
* @param out The {@link Writer} to write to.
* @param bufferSize The size of the buffer to use when reader.
*/
public void drainTo(Writer out, int bufferSize) throws IOException {
char[] buffer = new char[bufferSize];
int read;
while ((read = in.read(buffer)) > -1) {
out.write(buffer, 0, read);
}
}
}
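Here is the complete method for converting an InputStream into a String without using any third-party library. Use StringBuilder in a single-threaded environment; otherwise use StringBuffer.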
public static String getString( InputStream is) throws IOException {
int ch;
StringBuilder sb = new StringBuilder();
while((ch = is.read()) != -1)
sb.append((char)ch);
return sb.toString();
}
I had Log4j available, so I was able to use the org.apache.log4j.lf5.util.StreamUtils.getBytes method to get the bytes, which I could then convert into a String using the String constructor:
String result = new String(StreamUtils.getBytes(inputStream));
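This one is nice because:
- It safely handles the charset.
- You control the read buffer size.
- You can provision the length of the builder, and it doesn't have to be an exact value.
- It is free from library dependencies.
- It works on Java 7 or higher.
How to do it: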
public static String convertStreamToString(InputStream is) throws IOException {
StringBuilder sb = new StringBuilder(2048); // Define a size if you have an idea of it.
char[] read = new char[128]; // Your buffer size.
try (InputStreamReader ir = new InputStreamReader(is, StandardCharsets.UTF_8)) {
for (int i; -1 != (i = ir.read(read)); sb.append(read, 0, i));
}
return sb.toString();
}
For JDK 9:
public static String inputStreamString(InputStream inputStream) throws IOException {
try (inputStream) {
return new String(inputStream.readAllBytes(), StandardCharsets.UTF_8);
}
}
I'd use some Java 8 tricks.
public static String streamToString(final InputStream inputStream) throws Exception {
// buffering optional
try
(
final BufferedReader br
= new BufferedReader(new InputStreamReader(inputStream))
) {
// parallel optional
return br.lines().parallel().collect(Collectors.joining("\n"));
} catch (final IOException e) {
throw new RuntimeException(e);
// whatever.
}
}
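Essentially the same as some of the other answers, except more succinct.
Here is an answer adapted from the org.apache.commons.io.IOUtils source code, for those who want the Apache implementation but do not want the entire library.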
private static final int BUFFER_SIZE = 4 * 1024;
public static String inputStreamToString(InputStream inputStream, String charsetName)
throws IOException {
StringBuilder builder = new StringBuilder();
InputStreamReader reader = new InputStreamReader(inputStream, charsetName);
char[] buffer = new char[BUFFER_SIZE];
int length;
while ((length = reader.read(buffer)) != -1) {
builder.append(buffer, 0, length);
}
return builder.toString();
}
InputStream is = Context.openFileInput(someFileName); // whatever format you have
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] b = new byte[8192];
for (int bytesRead; (bytesRead = is.read(b)) != -1;) {
bos.write(b, 0, bytesRead);
}
String output = bos.toString(someEncoding);
The following doesn't address the original question, but rather some of the responses.
Several responses suggest loops of the form
String line = null;
while((line = reader.readLine()) != null) {
// ...
}
or
for(String line = reader.readLine(); line != null; line = reader.readLine()) {
// ...
}
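The first form pollutes the namespace of the enclosing scope by declaring a variable "line" in the enclosing scope that will not be used for anything outside the loop. The second form duplicates the readLine() call.
Here is a much cleaner way of writing this sort of loop in Java. It turns out that the first clause in a for loop doesn't require an actual initializer value. This keeps the scope of the variable "line" within the body of the for loop. Much more elegant! I haven't seen anybody using this form anywhere (I randomly discovered it one day years ago), but I use it all the time.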
for (String line; (line = reader.readLine()) != null; ) {
//...
}
Kotlin users simply do:
println(InputStreamReader(is).readText())
where readText() is a built-in extension method of the Kotlin standard library.
A pure Java solution using Streams; works since Java 8.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.stream.Collectors;
// ...
public static String inputStreamToString(InputStream is) throws IOException {
try (BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
return br.lines().collect(Collectors.joining(System.lineSeparator()));
}
}
As Christoffer Hammarström mentioned below another answer, it is safer to explicitly specify the Charset. That is, the InputStreamReader constructor can be changed as follows:
new InputStreamReader(is, Charset.forName("UTF-8"))
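To show why the explicit charset matters, here is a small sketch that takes the charset as a parameter. The class name and the sample text are assumptions made for illustration. Decoding UTF-8 bytes with the wrong charset silently produces different characters rather than an error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.stream.Collectors;

// Hypothetical helper class; the Streams-based body mirrors the answer above.
class ExplicitCharset {
    static String read(InputStream is, Charset charset) throws IOException {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(is, charset))) {
            // Line terminators are normalized to "\n" by this approach.
            return br.lines().collect(Collectors.joining("\n"));
        }
    }
}
```

For example, the UTF-8 bytes of "caf\u00e9" come back unchanged when read with StandardCharsets.UTF_8, but turn into "caf\u00c3\u00a9" when read with StandardCharsets.ISO_8859_1, since each of the two bytes of the accented letter is then decoded as a separate character.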
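Guava provides a much shorter, efficient, autoclosing solution for the case where the input stream comes from a classpath resource (which seems to be a popular task):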
byte[] bytes = Resources.toByteArray(classLoader.getResource(path));
or
String text = Resources.toString(classLoader.getResource(path), StandardCharsets.UTF_8);
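There is also the general concept of ByteSource and CharSource, which gently take care of both opening and closing the stream.
So, for example, instead of explicitly opening a small file to read its contents: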
String content = Files.asCharSource(new File("robots.txt"), StandardCharsets.UTF_8).read();
byte[] data = Files.asByteSource(new File("favicon.ico")).read();
or just
String content = Files.toString(new File("robots.txt"), StandardCharsets.UTF_8);
byte[] data = Files.toByteArray(new File("favicon.ico"));
Here is my Java 8 based solution, which uses the new Stream API to collect all lines from an InputStream:
public static String toString(InputStream inputStream) {
BufferedReader reader = new BufferedReader(
new InputStreamReader(inputStream));
return reader.lines().collect(Collectors.joining(
System.getProperty("line.separator")));
}
For completeness, here is the Java 9 solution:
public static String toString(InputStream input) throws IOException {
return new String(input.readAllBytes(), StandardCharsets.UTF_8);
}
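This uses the readAllBytes method, which was added in Java 9.
Comment:
The task is to read an InputStream into a String. Of course, you can split it into multiple steps that do not read the whole stream, but that is pointless when all you do with those steps is reassemble the partial results into a single result after reading the entire stream. So no solution to this particular task can return before the entire stream has been read.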
Note: this is probably not a good idea. This method uses recursion and will therefore hit a StackOverflowError very quickly:
public String read (InputStream is) throws IOException {
int next = is.read(); // read() returns an int so that -1 can signal end of stream
return next == -1 ? "" : (char) next + read(is); // Recursive part: reads next byte recursively
}
Based on the second part of the accepted Apache Commons answer, but with the small gap filled in so that the stream is always closed:
String theString;
try {
theString = IOUtils.toString(inputStream, encoding);
} finally {
IOUtils.closeQuietly(inputStream);
}
In terms of reduce and concat, it can be expressed in Java 8 as:
String fromFile = new BufferedReader(new
InputStreamReader(inputStream)).lines().reduce(String::concat).get();
Use java.io.InputStream.transferTo(OutputStream), supported since Java 9, and ByteArrayOutputStream.toString(String), which takes the charset name:
public static String gobble(InputStream in, String charsetName) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
in.transferTo(bos);
return bos.toString(charsetName);
}
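To summarize the other answers, I found 11 main ways to do this (see below), and I wrote some performance tests (see the results below):
Ways to convert an InputStream to a String:
1. Using IOUtils.toString (Apache Utils)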
String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
2. Using CharStreams (Guava)
String result = CharStreams.toString(new InputStreamReader( inputStream, Charsets.UTF_8));
3. Using Scanner (JDK)
Scanner s = new Scanner(inputStream).useDelimiter("\\A"); String result = s.hasNext() ? s.next() : "";
4. Using the Stream API (Java 8). Warning: this solution converts different line breaks (like \r\n) to \n.
String result = new BufferedReader(new InputStreamReader(inputStream)) .lines().collect(Collectors.joining("\n"));
5. Using the parallel Stream API (Java 8). Warning: this solution converts different line breaks (like \r\n) to \n.
String result = new BufferedReader(new InputStreamReader(inputStream)) .lines().parallel().collect(Collectors.joining("\n"));
6. Using InputStreamReader and StringBuilder (JDK)
int bufferSize = 1024; char[] buffer = new char[bufferSize]; StringBuilder out = new StringBuilder(); Reader in = new InputStreamReader(stream, StandardCharsets.UTF_8); for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) { out.append(buffer, 0, numRead); } return out.toString();
7. Using StringWriter and IOUtils.copy (Apache Commons)
StringWriter writer = new StringWriter(); IOUtils.copy(inputStream, writer, "UTF-8"); return writer.toString();
8. Using ByteArrayOutputStream and inputStream.read (JDK)
ByteArrayOutputStream result = new ByteArrayOutputStream(); byte[] buffer = new byte[1024]; for (int length; (length = inputStream.read(buffer)) != -1; ) { result.write(buffer, 0, length); } // StandardCharsets.UTF_8.name() > JDK 7 return result.toString("UTF-8");
9. Using BufferedReader (JDK). Warning: this solution converts different line breaks (like \n\r) to the line.separator system property (for example, to "\r\n" on Windows).
String newLine = System.getProperty("line.separator"); BufferedReader reader = new BufferedReader( new InputStreamReader(inputStream)); StringBuilder result = new StringBuilder(); for (String line; (line = reader.readLine()) != null; ) { if (result.length() > 0) { result.append(newLine); } result.append(line); } return result.toString();
10. Using BufferedInputStream and ByteArrayOutputStream (JDK)
BufferedInputStream bis = new BufferedInputStream(inputStream); ByteArrayOutputStream buf = new ByteArrayOutputStream(); for (int result = bis.read(); result != -1; result = bis.read()) { buf.write((byte) result); } // StandardCharsets.UTF_8.name() > JDK 7 return buf.toString("UTF-8");
11. Using inputStream.read() and StringBuilder (JDK). Warning: this solution has problems with Unicode, for example with Russian text (it works correctly only with non-Unicode text).
StringBuilder sb = new StringBuilder(); for (int ch; (ch = inputStream.read()) != -1; ) { sb.append((char) ch); } return sb.toString();
Warning:
Solutions 4, 5 and 9 convert different line breaks to one.
Solution 11 can't work correctly with Unicode text.
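The Unicode warning for solution 11 can be illustrated with a short comparison. This is a sketch with hypothetical names, assuming the input is UTF-8 encoded. Casting each byte to a char treats every byte as its own character, so any character encoded as more than one byte is mangled, while decoding through a Reader keeps it intact.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing the two approaches.
class ByteCastPitfall {
    // Solution 11 style: one char per byte, no decoding at all.
    static String byteByByte(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int ch; (ch = in.read()) != -1; ) {
            sb.append((char) ch);
        }
        return sb.toString();
    }

    // Solution 6 style: the Reader decodes the bytes as UTF-8.
    static String viaReader(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            for (int n; (n = r.read(buf)) != -1; ) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

For a six-letter Russian word, which UTF-8 encodes as 12 bytes, byteByByte returns a 12-character String of unrelated characters, whereas viaReader returns the original six letters.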
Performance tests
Performance tests for a small String (length = 175), URL in GitHub (mode = Average Time, system = Linux, score 1,343 is the best):
Benchmark Mode Cnt Score Error Units
8. ByteArrayOutputStream and read (JDK) avgt 10 1,343 ± 0,028 us/op
6. InputStreamReader and StringBuilder (JDK) avgt 10 6,980 ± 0,404 us/op
10. BufferedInputStream, ByteArrayOutputStream avgt 10 7,437 ± 0,735 us/op
11. InputStream.read() and StringBuilder (JDK) avgt 10 8,977 ± 0,328 us/op
7. StringWriter and IOUtils.copy (Apache) avgt 10 10,613 ± 0,599 us/op
1. IOUtils.toString (Apache Utils) avgt 10 10,605 ± 0,527 us/op
3. Scanner (JDK) avgt 10 12,083 ± 0,293 us/op
2. CharStreams (guava) avgt 10 12,999 ± 0,514 us/op
4. Stream Api (Java 8) avgt 10 15,811 ± 0,605 us/op
9. BufferedReader (JDK) avgt 10 16,038 ± 0,711 us/op
5. parallel Stream Api (Java 8) avgt 10 21,544 ± 0,583 us/op
Performance tests for a big String (length = 50100), URL in GitHub (mode = Average Time, system = Linux, score 200,715 is the best):
Benchmark Mode Cnt Score Error Units
8. ByteArrayOutputStream and read (JDK) avgt 10 200,715 ± 18,103 us/op
1. IOUtils.toString (Apache Utils) avgt 10 300,019 ± 8,751 us/op
6. InputStreamReader and StringBuilder (JDK) avgt 10 347,616 ± 130,348 us/op
7. StringWriter and IOUtils.copy (Apache) avgt 10 352,791 ± 105,337 us/op
2. CharStreams (guava) avgt 10 420,137 ± 59,877 us/op
9. BufferedReader (JDK) avgt 10 632,028 ± 17,002 us/op
5. parallel Stream Api (Java 8) avgt 10 662,999 ± 46,199 us/op
4. Stream Api (Java 8) avgt 10 701,269 ± 82,296 us/op
10. BufferedInputStream, ByteArrayOutputStream avgt 10 740,837 ± 5,613 us/op
3. Scanner (JDK) avgt 10 751,417 ± 62,026 us/op
11. InputStream.read() and StringBuilder (JDK) avgt 10 2919,350 ± 1101,942 us/op
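Graphs (performance tests depending on input stream length, on a Windows 7 system)
Performance test (Average Time) depending on input stream length, on a Windows 7 system: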
length 182 546 1092 3276 9828 29484 58968
test8 0.38 0.938 1.868 4.448 13.412 36.459 72.708
test4 2.362 3.609 5.573 12.769 40.74 81.415 159.864
test5 3.881 5.075 6.904 14.123 50.258 129.937 166.162
test9 2.237 3.493 5.422 11.977 45.98 89.336 177.39
test6 1.261 2.12 4.38 10.698 31.821 86.106 186.636
test7 1.601 2.391 3.646 8.367 38.196 110.221 211.016
test1 1.529 2.381 3.527 8.411 40.551 105.16 212.573
test3 3.035 3.934 8.606 20.858 61.571 118.744 235.428
test2 3.136 6.238 10.508 33.48 43.532 118.044 239.481
test10 1.593 4.736 7.527 20.557 59.856 162.907 323.147
test11 3.913 11.506 23.26 68.644 207.591 600.444 1211.545
A method to convert an InputStream to a String:
public static String getStringFromInputStream(InputStream inputStream) {
BufferedReader bufferedReader = null;
StringBuilder stringBuilder = new StringBuilder();
String line;
try {
bufferedReader = new BufferedReader(new InputStreamReader(
inputStream));
while ((line = bufferedReader.readLine()) != null) {
stringBuilder.append(line);
}
} catch (IOException e) {
logger.error(e.getMessage());
} finally {
if (bufferedReader != null) {
try {
bufferedReader.close();
} catch (IOException e) {
logger.error(e.getMessage());
}
}
}
return stringBuilder.toString();
}
InputStream inputStream = null; // Assign the stream you want to read here
BufferedReader bufferedReader = null;
try {
bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
StringBuilder stringBuilder = new StringBuilder();
String content;
while((content = bufferedReader.readLine()) != null) {
stringBuilder.append(content);
}
System.out.println("content of file::" + stringBuilder.toString());
}
catch (IOException e) {
e.printStackTrace();
}
finally {
if(bufferedReader != null) {
try {
bufferedReader.close();
}
catch(IOException ex) {
ex.printStackTrace();
}
}
}
Also, you can get an InputStream from a specified resource path:
public static InputStream getResourceAsStream(String path)
{
InputStream myiInputStream = ClassName.class.getResourceAsStream(path);
if (null == myiInputStream)
{
mylogger.info("Can't find path = ", path);
}
return myiInputStream;
}
To get the resource URL for a specific path (from which an InputStream can then be opened):
public static URL getResource(String path)
{
URL myURL = ClassName.class.getResource(path);
if (null == myURL)
{
mylogger.info("Can't find resource path = ", path);
}
return myURL;
}
Another one, for all Spring users:
import java.nio.charset.StandardCharsets;
import org.springframework.util.FileCopyUtils;
public String convertStreamToString(InputStream is) throws IOException {
return new String(FileCopyUtils.copyToByteArray(is), StandardCharsets.UTF_8);
}
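The utility methods in org.springframework.util.StreamUtils are similar to the ones in FileCopyUtils, but they leave the stream open when done.
The simplest way in the JDK is with the following code snippet.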
String convertToString(InputStream in) {
String resource = new Scanner(in).useDelimiter("\\Z").next();
return resource;
}
public String read(InputStream in) throws IOException {
try (BufferedReader buffer = new BufferedReader(new InputStreamReader(in))) {
return buffer.lines().collect(Collectors.joining("\n"));
}
}
Raghu K Nair was the only one using a Scanner. The code I use is a little different:
String convertToString(InputStream in){
Scanner scanner = new Scanner(in);
scanner.useDelimiter("\\A");
boolean hasInput = scanner.hasNext();
if (hasInput) {
return scanner.next();
} else {
return null;
}
}
Concerning delimiters: How do I use a delimiter in Java Scanner?
You can use Cactoos:
String text = new TextOf(inputStream).asString();
UTF-8 is the default encoding. If you need another one:
String text = new TextOf(inputStream, "UTF-16").asString();
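This is not the simplest solution to this question, but since NIO streams and channels have not been mentioned, here is a version that uses NIO channels and a ByteBuffer to convert a stream into a String.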
public static String streamToStringChannel(InputStream in, String encoding, int bufSize) throws IOException {
ReadableByteChannel channel = Channels.newChannel(in);
ByteBuffer byteBuffer = ByteBuffer.allocate(bufSize);
ByteArrayOutputStream bout = new ByteArrayOutputStream();
WritableByteChannel outChannel = Channels.newChannel(bout);
while (channel.read(byteBuffer) > 0 || byteBuffer.position() > 0) {
byteBuffer.flip(); //make buffer ready for write
outChannel.write(byteBuffer);
byteBuffer.compact(); //make buffer ready for reading
}
channel.close();
outChannel.close();
return bout.toString(encoding);
}
Here is an example of how to use it:
try (InputStream in = new FileInputStream("/tmp/large_file.xml")) {
String x = streamToStringChannel(in, "UTF-8", 1);
System.out.println(x);
}
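The performance of this method should be good for large files.
I benchmarked 14 distinct answers here (sorry for not giving credit, but there are too many duplicates).
The results are very surprising. It turns out that Apache IOUtils is the slowest and ByteArrayOutputStream is the fastest solution:
So first, here is the best method: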
public String inputStreamToString(InputStream inputStream) throws IOException {
try(ByteArrayOutputStream result = new ByteArrayOutputStream()) {
byte[] buffer = new byte[1024];
int length;
while ((length = inputStream.read(buffer)) != -1) {
result.write(buffer, 0, length);
}
return result.toString(UTF_8);
}
}
Benchmark results, 20 MB of random bytes in 20 cycles
Time in milliseconds
- ByteArrayOutputStreamTest: 194
- NioStream: 198
- Java9ISTransferTo: 201
- Java9ISReadAllBytes: 205
- BufferedInputStreamVsByteArrayOutputStream: 314
- ApacheStringWriter2: 574
- GuavaCharStreams: 589
- ScannerReaderNoNextTest: 614
- ScannerReader: 633
- ApacheStringWriter: 1544
- StreamApi: Error
- ParallelStreamApi: Error
- BufferReaderTest: Error
- InputStreamAndStringBuilder: Error
Benchmark source code
import com.google.common.io.CharStreams;
import org.apache.commons.io.IOUtils;
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
/**
* Created by Ilya Gazman on 2/13/18.
*/
public class InputStreamToString {
private static final String UTF_8 = "UTF-8";
public static void main(String... args) {
log("App started");
byte[] bytes = new byte[1024 * 1024];
new Random().nextBytes(bytes);
log("Stream is ready\n");
try {
test(bytes);
} catch (IOException e) {
e.printStackTrace();
}
}
private static void test(byte[] bytes) throws IOException {
List<Stringify> tests = Arrays.asList(
new ApacheStringWriter(),
new ApacheStringWriter2(),
new NioStream(),
new ScannerReader(),
new ScannerReaderNoNextTest(),
new GuavaCharStreams(),
new StreamApi(),
new ParallelStreamApi(),
new ByteArrayOutputStreamTest(),
new BufferReaderTest(),
new BufferedInputStreamVsByteArrayOutputStream(),
new InputStreamAndStringBuilder(),
new Java9ISTransferTo(),
new Java9ISReadAllBytes()
);
String solution = new String(bytes, "UTF-8");
for (Stringify test : tests) {
try (ByteArrayInputStream inputStream = new ByteArrayInputStream(bytes)) {
String s = test.inputStreamToString(inputStream);
if (!s.equals(solution)) {
log(test.name() + ": Error");
continue;
}
}
long startTime = System.currentTimeMillis();
for (int i = 0; i < 20; i++) {
try (ByteArrayInputStream inputStream = new ByteArrayInputStream(bytes)) {
test.inputStreamToString(inputStream);
}
}
log(test.name() + ": " + (System.currentTimeMillis() - startTime));
}
}
private static void log(String message) {
System.out.println(message);
}
interface Stringify {
String inputStreamToString(InputStream inputStream) throws IOException;
default String name() {
return this.getClass().getSimpleName();
}
}
static class ApacheStringWriter implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
StringWriter writer = new StringWriter();
IOUtils.copy(inputStream, writer, UTF_8);
return writer.toString();
}
}
static class ApacheStringWriter2 implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
return IOUtils.toString(inputStream, UTF_8);
}
}
static class NioStream implements Stringify {
@Override
public String inputStreamToString(InputStream in) throws IOException {
ReadableByteChannel channel = Channels.newChannel(in);
ByteBuffer byteBuffer = ByteBuffer.allocate(1024 * 16);
ByteArrayOutputStream bout = new ByteArrayOutputStream();
WritableByteChannel outChannel = Channels.newChannel(bout);
while (channel.read(byteBuffer) > 0 || byteBuffer.position() > 0) {
byteBuffer.flip(); //make buffer ready for write
outChannel.write(byteBuffer);
byteBuffer.compact(); //make buffer ready for reading
}
channel.close();
outChannel.close();
return bout.toString(UTF_8);
}
}
static class ScannerReader implements Stringify {
@Override
public String inputStreamToString(InputStream is) throws IOException {
java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
return s.hasNext() ? s.next() : "";
}
}
static class ScannerReaderNoNextTest implements Stringify {
@Override
public String inputStreamToString(InputStream is) throws IOException {
java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
return s.next();
}
}
static class GuavaCharStreams implements Stringify {
@Override
public String inputStreamToString(InputStream is) throws IOException {
return CharStreams.toString(new InputStreamReader(
is, UTF_8));
}
}
static class StreamApi implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
return new BufferedReader(new InputStreamReader(inputStream))
.lines().collect(Collectors.joining("\n"));
}
}
static class ParallelStreamApi implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
return new BufferedReader(new InputStreamReader(inputStream)).lines()
.parallel().collect(Collectors.joining("\n"));
}
}
static class ByteArrayOutputStreamTest implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
try(ByteArrayOutputStream result = new ByteArrayOutputStream()) {
byte[] buffer = new byte[1024];
int length;
while ((length = inputStream.read(buffer)) != -1) {
result.write(buffer, 0, length);
}
return result.toString(UTF_8);
}
}
}
static class BufferReaderTest implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
String newLine = System.getProperty("line.separator");
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
StringBuilder result = new StringBuilder(UTF_8);
String line;
boolean flag = false;
while ((line = reader.readLine()) != null) {
result.append(flag ? newLine : "").append(line);
flag = true;
}
return result.toString();
}
}
static class BufferedInputStreamVsByteArrayOutputStream implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
BufferedInputStream bis = new BufferedInputStream(inputStream);
ByteArrayOutputStream buf = new ByteArrayOutputStream();
int result = bis.read();
while (result != -1) {
buf.write((byte) result);
result = bis.read();
}
return buf.toString(UTF_8);
}
}
static class InputStreamAndStringBuilder implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
int ch;
StringBuilder sb = new StringBuilder(UTF_8);
while ((ch = inputStream.read()) != -1)
sb.append((char) ch);
return sb.toString();
}
}
static class Java9ISTransferTo implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
inputStream.transferTo(bos);
return bos.toString(UTF_8);
}
}
static class Java9ISReadAllBytes implements Stringify {
@Override
public String inputStreamToString(InputStream inputStream) throws IOException {
return new String(inputStream.readAllBytes(), UTF_8);
}
}
}
Using Okio:
String result = Okio.buffer(Okio.source(inputStream)).readUtf8();
I have written this code, and it works. No external plug-ins are required.
It converts a String to a Stream and a Stream back to a String:
import java.io.ByteArrayInputStream;
import java.io.InputStream;
public class STRINGTOSTREAM {
public static void main(String[] args)
{
String text = "Hello Bhola..!\nMy Name Is Kishan ";
InputStream strm = new ByteArrayInputStream(text.getBytes()); // Convert String to Stream
String data = streamTostring(strm);
System.out.println(data);
}
static String streamTostring(InputStream stream)
{
String data = "";
try
{
StringBuilder stringbuld = new StringBuilder();
int i;
while ((i=stream.read())!=-1)
{
stringbuld.append((char)i);
}
data = stringbuld.toString();
}
catch(Exception e)
{
data = "No data Streamed.";
}
return data;
}
}
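ISO-8859-1
Here is a very performant way to do this if you know that your input stream's encoding is ISO-8859-1 or ASCII. It (1) avoids the unnecessary synchronization present in StringWriter's internal StringBuffer, (2) avoids the overhead of InputStreamReader, and (3) minimizes the number of times StringBuilder's internal char array must be copied.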
public static String iso_8859_1(InputStream is) throws IOException {
StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
byte[] buffer = new byte[4096];
int n;
while ((n = is.read(buffer)) != -1) {
for (int i = 0; i < n; i++) {
chars.append((char)(buffer[i] & 0xFF));
}
}
return chars.toString();
}
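As a quick sanity check on the ISO-8859-1 approach, the sketch below compares the manual byte-to-char widening against the JDK's own ISO-8859-1 decoder for all 256 byte values. The class name Latin1Check and its helper methods are made up for illustration; the loop itself is the one from the method above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class Latin1Check {
    // Same loop as iso_8859_1() above: each byte becomes the char with the same numeric value.
    static String widen(InputStream is) throws IOException {
        StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
        byte[] buffer = new byte[4096];
        int n;
        while ((n = is.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                chars.append((char) (buffer[i] & 0xFF));
            }
        }
        return chars.toString();
    }

    // Hypothetical helper: every possible byte value, 0x00 through 0xFF.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) throws IOException {
        byte[] all = allByteValues();
        String manual = widen(new ByteArrayInputStream(all));
        String viaJdk = new String(all, StandardCharsets.ISO_8859_1);
        System.out.println(manual.equals(viaJdk)); // prints true
    }
}
```

The two results agree because ISO-8859-1 maps each byte value directly to the Unicode code point with the same number, which is exactly what the cast does.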
UTF-8
The same general strategy may be used for a stream that is encoded with UTF-8:
public static String utf8(InputStream is) throws IOException {
StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
byte[] buffer = new byte[4096];
int n;
int state = 0;
while ((n = is.read(buffer)) != -1) {
for (int i = 0; i < n; i++) {
if ((state = nextStateUtf8(state, buffer[i])) >= 0) {
chars.appendCodePoint(state);
} else if (state == -1) { //error
state = 0;
chars.append('\uFFFD'); //replacement char
}
}
}
return chars.toString();
}
where the nextStateUtf8() function is defined as follows:
/**
* Returns the next UTF-8 state given the next byte of input and the current state.
* If the input byte is the last byte in a valid UTF-8 byte sequence,
* the returned state will be the corresponding unicode character (in the range of 0 through 0x10FFFF).
* Otherwise, a negative integer is returned. A state of -1 is returned whenever an
* invalid UTF-8 byte sequence is detected.
*/
static int nextStateUtf8(int currentState, byte nextByte) {
switch (currentState & 0xF0000000) {
case 0:
if ((nextByte & 0x80) == 0) { //0 trailing bytes (ASCII)
return nextByte;
} else if ((nextByte & 0xE0) == 0xC0) { //1 trailing byte
if (nextByte == (byte) 0xC0 || nextByte == (byte) 0xC1) { //0xC0 & 0xC1 are overlong
return -1;
} else {
return nextByte & 0xC000001F;
}
} else if ((nextByte & 0xF0) == 0xE0) { //2 trailing bytes
if (nextByte == (byte) 0xE0) { //possibly overlong
return nextByte & 0xA000000F;
} else if (nextByte == (byte) 0xED) { //possibly surrogate
return nextByte & 0xB000000F;
} else {
return nextByte & 0x9000000F;
}
} else if ((nextByte & 0xFC) == 0xF0) { //3 trailing bytes
if (nextByte == (byte) 0xF0) { //possibly overlong
return nextByte & 0x80000007;
} else {
return nextByte & 0xE0000007;
}
} else if (nextByte == (byte) 0xF4) { //3 trailing bytes, possibly undefined
return nextByte & 0xD0000007;
} else {
return -1;
}
case 0xE0000000: //3rd-to-last continuation byte
return (nextByte & 0xC0) == 0x80 ? currentState << 6 | nextByte & 0x9000003F : -1;
case 0x80000000: //3rd-to-last continuation byte, check overlong
return (nextByte & 0xE0) == 0xA0 || (nextByte & 0xF0) == 0x90 ? currentState << 6 | nextByte & 0x9000003F : -1;
case 0xD0000000: //3rd-to-last continuation byte, check undefined
return (nextByte & 0xF0) == 0x80 ? currentState << 6 | nextByte & 0x9000003F : -1;
case 0x90000000: //2nd-to-last continuation byte
return (nextByte & 0xC0) == 0x80 ? currentState << 6 | nextByte & 0xC000003F : -1;
case 0xA0000000: //2nd-to-last continuation byte, check overlong
return (nextByte & 0xE0) == 0xA0 ? currentState << 6 | nextByte & 0xC000003F : -1;
case 0xB0000000: //2nd-to-last continuation byte, check surrogate
return (nextByte & 0xE0) == 0x80 ? currentState << 6 | nextByte & 0xC000003F : -1;
case 0xC0000000: //last continuation byte
return (nextByte & 0xC0) == 0x80 ? currentState << 6 | nextByte & 0x3F : -1;
default:
return -1;
}
}
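For comparison, the JDK's built-in UTF-8 decoding handles malformed input in a similar way: new String(bytes, StandardCharsets.UTF_8) substitutes U+FFFD instead of throwing. The sketch below is a standard-library alternative, not the hand-written state machine above, and the exact number of replacement characters produced for a malformed multi-byte sequence may differ between the two. It assumes Java 9+ for InputStream.readAllBytes(); the class name Utf8Lenient is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class Utf8Lenient {
    // Reads the whole stream, then decodes it as UTF-8.
    // Malformed byte sequences are replaced with U+FFFD rather than raising an error.
    static String decode(InputStream is) throws IOException {
        return new String(is.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] input = {'A', (byte) 0xFF, 'B'};
        String decoded = decode(new ByteArrayInputStream(input));
        System.out.println(decoded.equals("A\uFFFDB")); // prints true
    }
}
```

This trades the tight single-pass loop for an extra byte array holding the whole input, so it is a baseline to compare against rather than a faster replacement.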
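Auto-detect encoding
If your input stream was encoded using ASCII, ISO-8859-1, or UTF-8, but you are not sure which, you can use a method similar to the previous one, with an additional encoding-detection step that guesses the encoding before returning the string.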
public static String autoDetect(InputStream is) throws IOException {
StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
byte[] buffer = new byte[4096];
int n;
int state = 0;
boolean ascii = true;
while ((n = is.read(buffer)) != -1) {
for (int i = 0; i < n; i++) {
if ((state = nextStateUtf8(state, buffer[i])) > 0x7F)
ascii = false;
chars.append((char)(buffer[i] & 0xFF));
}
}
if (ascii || state < 0) { //probably not UTF-8
return chars.toString();
}
//probably UTF-8
int pos = 0;
char[] charBuf = new char[2];
for (int i = 0, len = chars.length(); i < len; i++) {
if ((state = nextStateUtf8(state, (byte)chars.charAt(i))) >= 0) {
boolean hi = Character.toChars(state, charBuf, 0) == 2;
chars.setCharAt(pos++, charBuf[0]);
if (hi) {
chars.setCharAt(pos++, charBuf[1]);
}
}
}
return chars.substring(0, pos);
}
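The same "try UTF-8, otherwise fall back to ISO-8859-1" idea can also be sketched with the standard CharsetDecoder in strict mode. This is a different technique from the single-pass state machine above: it buffers all bytes first and may decode twice. The class and method names are hypothetical, and readAllBytes() assumes Java 9+.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8OrLatin1 {
    // Decodes as UTF-8 if the bytes are well-formed UTF-8, otherwise as ISO-8859-1.
    static String decode(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // throw instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException notUtf8) {
            // Every byte sequence is valid ISO-8859-1, so this fallback cannot fail.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    static String read(InputStream is) throws IOException {
        return decode(is.readAllBytes());
    }

    public static void main(String[] args) {
        byte[] utf8 = {0x68, (byte) 0xC3, (byte) 0xA9};   // "h" + U+00E9 encoded as UTF-8
        byte[] latin1 = {0x68, (byte) 0xE9, 0x6C};        // "h" + U+00E9 + "l" encoded as ISO-8859-1
        System.out.println(decode(utf8).length());   // prints 2
        System.out.println(decode(latin1).length()); // prints 3
    }
}
```

Like any such heuristic, this misreads ISO-8859-1 text whose bytes happen to form valid UTF-8; pure ASCII input decodes identically either way.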
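If your input stream has an encoding that is neither ISO-8859-1 nor ASCII nor UTF-8, then I defer to the other answers already present.
I suggest using the StringWriter class for this problem.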
StringWriter wt= new StringWriter();
IOUtils.copy(inputStream, wt, encoding);
String st= wt.toString();
This code is for new Java learners:
private String textDataFromFile;
public String getFromFile(InputStream myInputStream) throws FileNotFoundException, IOException {
BufferedReader bufferReader = new BufferedReader(new InputStreamReader(myInputStream));
StringBuilder stringBuilder = new StringBuilder();
String eachStringLine;
while ((eachStringLine = bufferReader.readLine()) != null) {
stringBuilder.append(eachStringLine).append("\n");
}
textDataFromFile = stringBuilder.toString();
return textDataFromFile;
}
String inputStreamToString(InputStream inputStream, Charset charset) throws IOException {
try (
final StringWriter writer = new StringWriter();
final InputStreamReader reader = new InputStreamReader(inputStream, charset)
) {
reader.transferTo(writer);
return writer.toString();
}
}
- Pure Java standard library solution, no external libraries
- Available since Java 10: Reader#transferTo(java.io.Writer)
- Loopless solution
- No newline handling
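To illustrate the last bullet, here is a small usage sketch of the same method, wrapped in a hypothetical class TransferToDemo. Because Reader#transferTo copies characters as they are, mixed line endings come through unchanged. It assumes Java 10 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class TransferToDemo {
    // Same approach as above: decode with the given charset and copy everything into a StringWriter.
    static String inputStreamToString(InputStream inputStream, Charset charset) throws IOException {
        try (StringWriter writer = new StringWriter();
             InputStreamReader reader = new InputStreamReader(inputStream, charset)) {
            reader.transferTo(writer);
            return writer.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "line one\r\nline two\n";
        InputStream in = new ByteArrayInputStream(original.getBytes(StandardCharsets.UTF_8));
        String copy = inputStreamToString(in, StandardCharsets.UTF_8);
        System.out.println(copy.equals(original)); // prints true: "\r\n" and "\n" are both preserved
    }
}
```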
If you need to convert the stream to a String using a specific character set, without external libraries:
public String convertStreamToString(InputStream is) throws IOException {
try (ByteArrayOutputStream baos = new ByteArrayOutputStream();) {
is.transferTo(baos);
return baos.toString(StandardCharsets.UTF_8);
}
}
The simplest way, in one line:
public static void main(String... args) throws IOException {
System.out.println(new String(Files.readAllBytes(Paths.get("csv.txt"))));
}
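Note that this one-liner reads a file by path rather than an arbitrary InputStream, and new String(byte[]) decodes with the platform default charset (UTF-8 by default only since JDK 18). A minimal variation that names the charset explicitly might look like the following; the class name, method name, and temporary file are made up for the demonstration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReadFileExplicitCharset {
    // Reads the whole file into memory and decodes it as UTF-8, independent of the platform default.
    static String readUtf8(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt"); // scratch file, only for this example
        try {
            Files.write(tmp, "caf\u00e9".getBytes(StandardCharsets.UTF_8));
            System.out.println(readUtf8(tmp).length()); // prints 4
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```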
If you are using the AWS SDK v2, call IoUtils.toUtf8String():
public String convertStreamToString(InputStream is) throws IOException {
return IoUtils.toUtf8String(is);
}
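I see a lot of answers that are just snippets, with zero explanation of why the snippet is a good approach. I'm going to restrict myself to plain Java. The key things to understand about Java:
- java.io.InputStream is for reading bytes.
- java.io.Reader is for reading character data.
You shouldn't read the InputStream yourself and convert it to a String, because the world we live in is no longer single-byte centric. This is where java.nio.charset.Charset objects come into play. Charset classes convert bytes -> characters. They bridge java.io.InputStream and java.io.Reader. You must convert your InputStream to a Reader.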
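To make the role of the Charset concrete, the sketch below decodes the same two bytes through an InputStreamReader with two different charsets. The class name CharsetMatters and the small buffer size are arbitrary choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMatters {
    // Bridges bytes -> chars with the given charset and collects the characters.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        char[] buffer = new char[1024];
        StringBuilder builder = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int length;
            while ((length = reader.read(buffer, 0, buffer.length)) >= 0) {
                builder.append(buffer, 0, length);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9}; // U+00E9 (e with acute accent) encoded as UTF-8
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());      // prints 1
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

Decoded as UTF-8 the two bytes form one character; decoded as ISO-8859-1 they become the two characters U+00C3 and U+00A9, which is the classic mojibake seen when the wrong charset is assumed.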
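I'm going to do this in a way that keeps memory usage low (well, until you read everything into a single String, which takes all the memory anyway, but it meets the requirement ;-)
Here is the code: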
public String readString(InputStream inStream) throws IOException {
char[] buffer = new char[1 << 16]; // 65,536 chars; Java has no ** operator
StringBuilder builder = new StringBuilder(); // declared outside the try block so it is in scope for the return
try (Reader reader = new InputStreamReader(inStream, "UTF-8")) {
int length;
while ((length = reader.read(buffer, 0, buffer.length)) >= 0) {
builder.append(buffer, 0, length);
}
}
return builder.toString();
}
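The key here is using InputStreamReader to bridge InputStream -> Reader, from which you can read character data. From there, use a StringBuilder. The other important part is specifying the character encoding on the InputStreamReader. In this case I used "UTF-8", but it could be "ISO-8859-1", "UTF-16", and so on.
A word to the wise: I'm not using BufferedInputStream or BufferedReader to wrap these. Those classes are overused. If you are calling Reader.read(char[], int start, int len) or one of the other array read methods, you are providing the buffer. Your usage is already mostly optimized (the size of the buffer is debatable), and there is no need to have BufferedInputStream create another buffer for you. You have already allocated a buffer, so wrapping the InputStream in a BufferedInputStream adds no advantage. Now, if you are calling InputStream.read() or Reader.read(), those classes do provide a speed-up by turning single byte/char reads into buffered reads. But most of the time, people aren't reading byte by byte.
The only exception to that advice is if you want to use readLine, in which case BufferedReader is your friend, so go ahead and use it. One caveat: if you are reading a 1 GB single-line file, it will take more than 1 GB of memory to read it, which may not be what you want.
This is the tightest loop, using the least memory, for reading the actual data. That is why reading the full data into a String may not always be the best approach, but this code can be adapted to other cases. If you write the data out to a java.io.Writer that streams it to external storage, the loop stays very tight.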
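As a rough sketch of that adaptation, the same loop can write each chunk to a java.io.Writer instead of appending to a StringBuilder, so only one buffer's worth of characters is held at a time. The class name StreamCopy, the copy method, and its return value are assumptions made for this example; the StringWriter in main is only a stand-in for a Writer backed by a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StreamCopy {
    // Decodes the stream with the given charset and forwards each chunk to the writer.
    // Returns the number of chars copied. Closes the input; the caller still owns the writer.
    static long copy(InputStream in, Charset charset, Writer out) throws IOException {
        char[] buffer = new char[1 << 16];
        long total = 0;
        try (Reader reader = new InputStreamReader(in, charset)) {
            int length;
            while ((length = reader.read(buffer, 0, buffer.length)) >= 0) {
                out.write(buffer, 0, length);
                total += length;
            }
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        StringWriter out = new StringWriter(); // stand-in; a FileWriter would stream to disk instead
        System.out.println(copy(in, StandardCharsets.UTF_8, out)); // prints 5
    }
}
```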
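How do I read / convert an InputStream into a String in Java?
When performance is the main concern, another possible solution is to read the InputStream line by line with a BufferedReader instead of byte by byte, which improves speed. Here is the code: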
public String convertStreamToString(InputStream inputStream) throws IOException {
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
StringBuilder stringBuilder = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
stringBuilder.append(line).append("\n");
}
return stringBuilder.toString();
}
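Compared with reading the input byte by byte, this method buffers the data and reads it in chunks, which can greatly improve performance.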
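One thing to keep in mind with this approach: readLine() strips the line terminator, so the loop rewrites every line ending as "\n" and adds a trailing newline even if the input had none. The sketch below shows that effect. The charset parameter is an addition for illustration (the method above relies on the platform default), and the class name LineReaderDemo is hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LineReaderDemo {
    // Line-by-line variant of the method above, with an explicit charset.
    static String convertStreamToString(InputStream inputStream, Charset charset) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, charset));
        StringBuilder stringBuilder = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            stringBuilder.append(line).append("\n");
        }
        return stringBuilder.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "a\r\nb".getBytes(StandardCharsets.UTF_8); // Windows line ending, no final newline
        String result = convertStreamToString(new ByteArrayInputStream(input), StandardCharsets.UTF_8);
        System.out.println(result.equals("a\nb\n")); // prints true
    }
}
```

If the exact original bytes matter, a chunked read that does not interpret line endings is the safer choice.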