Asked by: plaidshirt Asked: 11/6/2018 Last edited by: plaidshirt Updated: 11/7/2018 Views: 163
Unable to set character encoding in java.util.Scanner
Q:
I use Apache Tika to detect the encoding of a file.
FileInputStream fis = new FileInputStream(my_file);
final AutoDetectReader detector = new AutoDetectReader(fis);
fis.close();
System.out.println("Encoding:" + detector.getCharset().toString());
I use Scanner to read values from the file.
Scanner scanner = new Scanner(my_file, detector.getCharset().toString());
Map<String, String> values = new HashMap<>();
String line, key = null, value = null;
while (scanner.hasNextLine()) {
    line = scanner.nextLine();
    if (line.contains(":")) {
        if (key != null) {
            values.put(key, value.trim());
            key = null;
            value = null;
        }
        int indexOfColon = line.indexOf(":");
        key = line.substring(0, indexOfColon);
        value = line.substring(indexOfColon + 1);
    } else {
        value += " " + line;
    }
}
Scanner is unable to read text from a windows-1252 encoded file; I get an empty string.
Update 2018.11.07: I have the same problem with BufferedReader.
Map<String, String> values = new HashMap<>();
String line, key = null, value = null;
FileInputStream is = new FileInputStream(my_file);
InputStreamReader isr = new InputStreamReader(is, getEncoding(my_file));
BufferedReader buffReader = new BufferedReader(isr);
while (buffReader.readLine() != null) {
    line = buffReader.readLine();
    if (line.contains(":")) {
        if (key != null) {
            values.put(key, value.trim());
            key = null;
            value = null;
        }
        int indexOfColon = line.indexOf(":");
        key = line.substring(0, indexOfColon);
        value = line.substring(indexOfColon + 1);
    } else {
        value += " " + line;
    }
}
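One thing worth checking in the BufferedReader loop above, independent of the encoding: readLine() is called twice per iteration, so every other line is discarded and the second call can return null. The last key/value pair is also never stored, and a continuation line before any key would append to a null value. Below is a minimal sketch of the same parsing with one read per iteration, assuming the same "key: value" layout. The class name KeyValueParser and the method parse are hypothetical names used only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyValueParser {

    // Parses "key: value" lines; a line without a colon continues the previous value.
    static Map<String, String> parse(BufferedReader reader) throws IOException {
        Map<String, String> values = new LinkedHashMap<>();
        String key = null;
        StringBuilder value = new StringBuilder();
        String line;
        // Read exactly once per iteration and test that same line for null.
        while ((line = reader.readLine()) != null) {
            int indexOfColon = line.indexOf(':');
            if (indexOfColon >= 0) {
                if (key != null) {
                    values.put(key, value.toString().trim());
                }
                key = line.substring(0, indexOfColon);
                value.setLength(0);
                value.append(line.substring(indexOfColon + 1));
            } else if (key != null) {
                // Continuation line: append it to the current value.
                value.append(' ').append(line);
            }
        }
        // Store the last pair, which no later "key:" line would flush.
        if (key != null) {
            values.put(key, value.toString().trim());
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        String sample = "name: test\nnote: first line\nsecond line\n";
        System.out.println(parse(new BufferedReader(new StringReader(sample))));
    }
}
```

The reader passed to parse can be the same InputStreamReader-backed BufferedReader as in the question, so the charset handling stays unchanged.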
A:
0 votes
gi097
11/6/2018
#1
Instead of reading lines, I would try reading the characters with the following approach:
ByteArrayOutputStream line = new ByteArrayOutputStream();
Scanner scanner = new Scanner(my_file);
while (scanner.hasNextInt()) {
    int c = 0;
    // read every line
    while (c != newline) { // TODO: Check for a newline char
        c = scanner.nextInt();
        line.write((byte) c);
    }
    byte[] array = line.toByteArray();
    String output = new String(array, "Windows-1252"); // This should do the trick
    // We have a string here, do your logic
    line.reset();
}
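This approach is ugly, but using new String gives you the ability to specify a particular encoding. I have not tested or run this code at all, but it will at least tell you whether any content is actually being read correctly.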
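Note that Scanner.nextInt() parses integer tokens from the text rather than returning raw bytes, so hasNextInt() is only true when the next token is a number. A minimal sketch of the underlying idea (read raw bytes, then decode them with an explicitly named charset) might look like the following. The class name ExplicitDecode and the method readLines are hypothetical, and the sketch assumes the file is small enough to load into memory.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class ExplicitDecode {

    // Reads the whole file as bytes, then decodes it with the named charset.
    static List<String> readLines(Path file, String charsetName) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        // new String(byte[], Charset) substitutes a replacement character for
        // malformed input instead of throwing, so some text always comes back.
        String text = new String(raw, Charset.forName(charsetName));
        // \R matches any line break (\r\n, \n or \r).
        return Arrays.asList(text.split("\\R"));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".txt");
        // 0xE9 is "e with acute accent" in windows-1252.
        Files.write(tmp, new byte[] {(byte) 0xE9, ':', ' ', '1', '\r', '\n', 'b', ':', '2'});
        for (String line : readLines(tmp, "windows-1252")) {
            System.out.println(line);
        }
        Files.delete(tmp);
    }
}
```

If the decoded lines look right here but Scanner still returns nothing, the problem is more likely in how the Scanner is constructed or in the detected charset name than in the file itself.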
Comments
0 votes
plaidshirt
11/7/2018
It has the same effect; the string is empty. I also tried: Scanner scanner = new Scanner(new FileInputStream(my_file), detector.getCharset().toString());
0 votes
gi097
11/7/2018
Ah, that's a shame! Does hasNextInt() return true?
0 votes
plaidshirt
11/7/2018
No, it doesn't, but I replaced it with the hasNextLine() method.
0 votes
gi097
11/7/2018
I see, but I suggest you check whether the file has any content and whether .nextInt() returns anything valid.
0 votes
plaidshirt
12/5/2018
The file has content; I can read it when I open it in a text editor.
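Since a text editor can read the file, two further checks might narrow this down. Scanner does not propagate an IOException from its underlying source; it treats it as end of input and keeps the exception available through ioException(). The first bytes of the file can also show a byte order mark (for example FF FE for UTF-16LE or EF BB BF for UTF-8) that would contradict a windows-1252 guess. The sketch below is only a diagnostic aid under those assumptions; the class name ScannerDiagnostics and both method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.util.Scanner;

public class ScannerDiagnostics {

    // Hex dump of up to `limit` leading bytes, e.g. to spot a byte order mark.
    static String firstBytesHex(File file, int limit) throws IOException {
        byte[] buf = new byte[limit];
        int n = 0;
        try (InputStream in = Files.newInputStream(file.toPath())) {
            int r;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (n < limit && (r = in.read(buf, n, limit - n)) > 0) {
                n += r;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", buf[i] & 0xFF));
        }
        return sb.toString();
    }

    // Counts lines with Scanner and rethrows any IOException the Scanner swallowed.
    static int countLines(File file, String charsetName) throws IOException {
        int count = 0;
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
            IOException swallowed = scanner.ioException();
            if (swallowed != null) {
                throw swallowed;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        System.out.println("First bytes: " + firstBytesHex(file, 16));
        System.out.println("Lines read:  " + countLines(file, args[1]));
    }
}
```

Running it with the file path and the charset name reported by Tika as arguments prints the leading bytes and either a line count or the exception that Scanner had hidden.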