Asked by: Maria · Asked: 1/25/2023 · Last edited by: Maria · Updated: 1/28/2023 · Views: 262
Job/jar in Apache Flink doesn't have permission to access a file in Docker
Q:
I have an Apache Flink job that parses a CSV file. It works correctly in IntelliJ IDEA on Windows, but when I put the job (jar) into an Apache Flink Docker container, I get a file permission problem when using FileSource.forRecordStreamFormat(...). Inside the container I have the file /opt/flink/data/test2.csv. The permissions look fine (I can even modify the file from my job). As fileName I tried /opt/flink/data/test2.csv, //opt/flink/data/test2.csv, and ///opt/flink/data/test2.csv.
Permissions:
# pwd
/opt/flink/data
# ls -ls
total 16088
1204 -rwxrwxrwx 1 root root 1231979 Jan 24 15:54 test2.csv
14876 -rwxrwxrwx 1 root root 15231523 Jan 22 19:24 test3.csv
8 -rwxrwxrwx 1 root root 6623 Jan 24 14:32 test_Home.xlsx
Docker-compose:
version: "2.2"
services:
  jobmanager:
    image: flink:1.16-java8
    ports:
      - "8081:8081"
    command: jobmanager
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
    volumes:
      - /c/Users/MGubina/Desktop/data:/opt/flink/data
  taskmanager:
    image: flink:1.16-java8
    depends_on:
      - jobmanager
    command: taskmanager
    scale: 1
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 2
Part of the job code:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
CsvReaderFormat<Product> csvFormat = CsvReaderFormat.forPojo(Product.class);
FileSource<Product> csvSource =
        // FileSource.forRecordStreamFormat(csvFormat, Path.fromLocalFile(file)).build(); // first version
        FileSource.forRecordStreamFormat(csvFormat, new Path(fileName)).build(); // second version
DataStream<Product> csvInputStream = env.fromSource(csvSource, WatermarkStrategy.noWatermarks(), "csv-source");
...
Exception log:
Caused by: java.io.FileNotFoundException: File file:/opt/flink/data/test2.csv does not exist or the user running Flink ('flink') has insufficient permissions to access it.
at org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:106)
at org.apache.flink.connector.file.src.impl.StreamFormatAdapter.openStream(StreamFormatAdapter.java:157)
at org.apache.flink.connector.file.src.impl.StreamFormatAdapter.createReader(StreamFormatAdapter.java:70)
at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.checkSplitOrStartNext(FileSourceSplitReader.java:112)
at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:65)
at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:142)
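I tried different ways of obtaining the Path, but had no luck.
Since the exception says "File file:/opt/flink/data/test2.csv does not exist", I suspect that the local file system inside Docker (Unix-like) needs a path of the form file:///.
What can I do? Maybe I'm missing something?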
A:
0 votes
Dominik Wosiński
1/28/2023
#1
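The problem seems to be that you are deploying two separate containers, taskmanager and jobmanager, but the file is only available on jobmanager and not on taskmanager. Could you try adding the correct mount to the taskmanager as well and retry?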
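The suggestion above could look roughly like this in the compose file from the question. This is a sketch, not a verified fix. It assumes the same host directory (/c/Users/MGubina/Desktop/data) should be mounted into the taskmanager at the same container path, so that the source readers, which run on the TaskManager (the stack trace fails in FileSourceSplitReader), see /opt/flink/data/test2.csv.

```yaml
version: "2.2"
services:
  jobmanager:
    image: flink:1.16-java8
    ports:
      - "8081:8081"
    command: jobmanager
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
    volumes:
      - /c/Users/MGubina/Desktop/data:/opt/flink/data
  taskmanager:
    image: flink:1.16-java8
    depends_on:
      - jobmanager
    command: taskmanager
    scale: 1
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 2
    # Added (assumption): the same mount as on the jobmanager, so the
    # taskmanager can open files under /opt/flink/data
    volumes:
      - /c/Users/MGubina/Desktop/data:/opt/flink/data
```

After changing the file, the containers need to be recreated (for example with docker-compose up -d) for the new mount to take effect. If the error persists, note that the message also names the user 'flink', so checking from inside the taskmanager container that this user can read the file would be a reasonable next step.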