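Strings are used everywhere in Java programs and are among the objects a program handles most often. Treating strings as objects makes them more convenient and flexible to work with. Java provides the String class for creating and manipulating strings, and when we read data from a file we often need to convert an InputStream to a String before the next processing step.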

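Duck Bro (鸭哥) recently interviewed an intern and asked him to explain how to convert an InputStream to a String. To my surprise, he had never done this routine operation. So, drawing on my own work experience, I have collected a whole arsenal of ways to convert an InputStream to a String, to give you a hand as you make your way through the Java world.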

1. Using InputStreamReader and StringBuilder (JDK)

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class InputStream2String {
    public static void main(String[] args) {

        try {
            InputStream inputStream = new FileInputStream("E:/duckAndJava/IO/testFile.txt");    // change the path to where the file lives on your machine

            char[] buffer = new char[1024];    // choose the buffer size to suit your needs
            StringBuilder out = new StringBuilder();
            Reader in = new InputStreamReader(inputStream, "UTF-8");
            for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) {
                out.append(buffer, 0, numRead);
            }
            String myString = out.toString();

            System.out.println("myString = " + myString);

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
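
The example above never closes the stream. As a minimal sketch, the same read loop can be wrapped in a reusable helper that uses try-with-resources, so the reader (and the stream underneath it) is closed even when an exception is thrown. The class name `StreamToString` and the method `readAll` are hypothetical names chosen for illustration, and an in-memory stream stands in for the file:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamToString {
    // Hypothetical helper: decodes the whole stream with the given charset, then closes it.
    static String readAll(InputStream inputStream, Charset charset) throws IOException {
        try (Reader in = new InputStreamReader(inputStream, charset)) {
            char[] buffer = new char[1024];
            StringBuilder out = new StringBuilder();
            for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) != -1; ) {
                out.append(buffer, 0, numRead);
            }
            return out.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream is used here instead of a file, purely for the demo.
        byte[] data = "hello, \u4e16\u754c".getBytes(StandardCharsets.UTF_8);
        String myString = readAll(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println("myString = " + myString);
    }
}
```

Passing a `Charset` instead of a charset name also avoids the checked `UnsupportedEncodingException` that the string-based constructor can throw.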

2. Using inputStream.read() and StringBuilder

StringBuilder sb = new StringBuilder();
for (int ch; (ch = inputStream.read()) != -1; ) {
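    // Casting a byte to char is only correct for single-byte text such as ASCII or ISO-8859-1;
    // multi-byte encodings such as UTF-8 will come out garbled.
    sb.append((char) ch);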
}
String myString = sb.toString();
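
One caveat with this approach: it turns every byte into one char, so it only round-trips text whose characters each fit in a single byte. The sketch below illustrates this with an in-memory stream; the class name `ByteCastPitfall` and the helper names are hypothetical, chosen only for the demo:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ByteCastPitfall {
    // Approach 2 as shown above: each byte read is cast to one char.
    static String readByteByByte(InputStream inputStream) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int ch; (ch = inputStream.read()) != -1; ) {
            sb.append((char) ch);
        }
        return sb.toString();
    }

    // Demo helper: wraps the UTF-8 bytes of a string in an in-memory stream.
    static InputStream utf8Stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String ascii = "duck";
        String accented = "caf\u00e9"; // "cafe" with an acute accent; the last letter takes two bytes in UTF-8

        System.out.println(ascii.equals(readByteByByte(utf8Stream(ascii))));       // prints true
        System.out.println(accented.equals(readByteByByte(utf8Stream(accented)))); // prints false
        System.out.println(readByteByByte(utf8Stream(accented)).length());         // prints 5, not 4
    }
}
```

If the input may contain anything beyond ASCII, one of the approaches that decodes with an explicit charset is the safer choice.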

3. Using ByteArrayOutputStream and inputStream.read

 ByteArrayOutputStream result = new ByteArrayOutputStream();
 byte[] buffer = new byte[1024];
 for (int length; (length = inputStream.read(buffer)) != -1; ) {
     result.write(buffer, 0, length);
 }
 String myString = result.toString("UTF-8");

4. Using BufferedInputStream and ByteArrayOutputStream

BufferedInputStream bis = new BufferedInputStream(inputStream);
ByteArrayOutputStream buf = new ByteArrayOutputStream();
for (int result = bis.read(); result != -1; result = bis.read()) {
    buf.write((byte) result);
}
String myString = buf.toString("UTF-8");

5. Using BufferedReader

 String newLine = System.lineSeparator();
 BufferedReader reader = new BufferedReader(
         new InputStreamReader(inputStream, StandardCharsets.UTF_8));    // explicit charset instead of the platform default
 StringBuilder result = new StringBuilder();
 for (String line; (line = reader.readLine()) != null; ) {
     if (result.length() > 0) {
         result.append(newLine);
     }
     result.append(line);
 }
 String myString = result.toString();

6. Using the Stream API and the parallel Stream API

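 // Sequential version
 String myString = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))
    .lines().collect(Collectors.joining("\n"));

 // Parallel version (an alternative to the one above; use one or the other)
 String myString = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))
     .lines().parallel().collect(Collectors.joining("\n"));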
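
Note that the line-based approaches (5 and 6) do not reproduce the input byte for byte: the line terminators are stripped while reading and then re-inserted by the join. A minimal sketch of the effect, using an in-memory stream; the class name `LineJoinBehavior` and the method `viaLines` are hypothetical:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class LineJoinBehavior {
    // Same pipeline as approach 6, with the charset made explicit.
    static String viaLines(InputStream inputStream) {
        return new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))
                .lines().collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Input uses a Windows line ending and ends with a trailing newline.
        byte[] data = "a\r\nb\n".getBytes(StandardCharsets.UTF_8);
        String result = viaLines(new ByteArrayInputStream(data));

        // "\r\n" has become "\n" and the trailing newline is gone.
        System.out.println(result.equals("a\nb")); // prints true
    }
}
```

If the exact original line endings matter, the byte-oriented or char-buffer approaches keep them intact.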

7. Using StringWriter and IOUtils.copy (Apache Commons)

 StringWriter writer = new StringWriter();
 IOUtils.copy(inputStream, writer, "UTF-8");
 return writer.toString();

You can even do it directly in a single call:

 String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);

8. Using CharStreams (Google Guava)

String result = CharStreams.toString(new InputStreamReader(
       inputStream, Charsets.UTF_8));

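Duck Bro also ran these methods through JMH, a widely used benchmarking tool, to compare their performance. For how to use JMH, see Duck Bro's earlier articles.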

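The tests were run separately for different string lengths. The number in front of each benchmark name is the benchmark's own label, not the section number used in this article, and the Scanner variant that shows up in the results is not covered above.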

For a short string (length=175), the benchmark results are as follows:

    Benchmark                                   Mode  Cnt   Score   Error  Units
 8. ByteArrayOutputStream and read (JDK)        avgt   10   1.343 ± 0.028  us/op
 6. InputStreamReader and StringBuilder (JDK)   avgt   10   6.980 ± 0.404  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   7.437 ± 0.735  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10   8.977 ± 0.328  us/op
 7. StringWriter and IOUtils.copy (Apache)      avgt   10  10.613 ± 0.599  us/op
 1. IOUtils.toString (Apache Utils)             avgt   10  10.605 ± 0.527  us/op
 3. Scanner (JDK)                               avgt   10  12.083 ± 0.293  us/op
 2. CharStreams (guava)                         avgt   10  12.999 ± 0.514  us/op
 4. Stream Api (Java 8)                         avgt   10  15.811 ± 0.605  us/op
 9. BufferedReader (JDK)                        avgt   10  16.038 ± 0.711  us/op
 5. parallel Stream Api (Java 8)                avgt   10  21.544 ± 0.583  us/op

For a long string (length=50100), the benchmark results are as follows:

    Benchmark                                   Mode  Cnt     Score      Error  Units
 8. ByteArrayOutputStream and read (JDK)        avgt   10   200.715 ±   18.103  us/op
 1. IOUtils.toString (Apache Utils)             avgt   10   300.019 ±    8.751  us/op
 6. InputStreamReader and StringBuilder (JDK)   avgt   10   347.616 ±  130.348  us/op
 7. StringWriter and IOUtils.copy (Apache)      avgt   10   352.791 ±  105.337  us/op
 2. CharStreams (guava)                         avgt   10   420.137 ±   59.877  us/op
 9. BufferedReader (JDK)                        avgt   10   632.028 ±   17.002  us/op
 5. parallel Stream Api (Java 8)                avgt   10   662.999 ±   46.199  us/op
 4. Stream Api (Java 8)                         avgt   10   701.269 ±   82.296  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   740.837 ±    5.613  us/op
 3. Scanner (JDK)                               avgt   10   751.417 ±   62.026  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10  2919.350 ± 1101.942  us/op

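To make this easier to see, I tabulated the average time each method takes against the string length (testN corresponds to benchmark N in the results above):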

length  182     546     1092    3276    9828     29484    58968

test8   0.38    0.938   1.868   4.448   13.412   36.459   72.708
test4   2.362   3.609   5.573   12.769  40.74    81.415   159.864
test5   3.881   5.075   6.904   14.123  50.258   129.937  166.162
test9   2.237   3.493   5.422   11.977  45.98    89.336   177.39
test6   1.261   2.12    4.38    10.698  31.821   86.106   186.636
test7   1.601   2.391   3.646   8.367   38.196   110.221  211.016
test1   1.529   2.381   3.527   8.411   40.551   105.16   212.573
test3   3.035   3.934   8.606   20.858  61.571   118.744  235.428
test2   3.136   6.238   10.508  33.48   43.532   118.044  239.481
test10  1.593   4.736   7.527   20.557  59.856   162.907  323.147
test11  3.913   11.506  23.26   68.644  207.591  600.444  1211.545

The same data shown as a chart:

[Chart: average time versus string length for each test (20210518182853.png)]

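That is all Duck Bro has to say about converting an InputStream to a String. There are many methods, but once you truly understand one of them, the others differ only in the classes and methods they call; the idea is the same.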

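In practice, you still need to adjust details such as the character encoding and line-break handling for your environment, and pick the approach that best fits your project.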
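
To see why the encoding deserves attention, here is a small sketch that reads the same bytes with approach 3 twice, once with the charset the data was written in and once with a different one. The class name `CharsetMatters` and the helper `readAs` are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class CharsetMatters {
    // Approach 3 with the charset name passed in as a parameter.
    static String readAs(InputStream inputStream, String charsetName) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        for (int length; (length = inputStream.read(buffer)) != -1; ) {
            result.write(buffer, 0, length);
        }
        return result.toString(charsetName);
    }

    public static void main(String[] args) throws IOException {
        String original = "\u9e2d"; // the character for "duck"; three bytes in UTF-8
        byte[] data = original.getBytes(StandardCharsets.UTF_8);

        String right = readAs(new ByteArrayInputStream(data), "UTF-8");
        String wrong = readAs(new ByteArrayInputStream(data), "ISO-8859-1");

        System.out.println(right.equals(original)); // prints true
        System.out.println(wrong.length());         // prints 3: each byte became its own character
    }
}
```

The bytes are identical in both calls; only the charset used for decoding decides whether the text survives.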

Do you have another handy way to do this conversion? Leave a comment and share it with everyone.
