且构网

How can I efficiently write a large text file in C#?

Updated: 2023-11-27 23:18:04

File I/O operations are generally well optimized in modern operating systems. You shouldn't try to assemble the entire string for the file in memory ... just write it out piece by piece. The FileStream will take care of buffering and other performance considerations.

You can make this change easily by moving:

using (StreamWriter outfile = new StreamWriter(filePath)) {

to the top of the function, getting rid of the StringBuilder, and writing directly to the file instead.
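The original function is not shown here, so the following is only a minimal sketch of what the change might look like. The class name ReportWriter, the method WriteRecords, and the records parameter are hypothetical stand-ins for whatever the real function builds its output from.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReportWriter
{
    // Hypothetical helper: writes each record straight to the file
    // instead of accumulating everything in a StringBuilder first.
    public static void WriteRecords(string filePath, IEnumerable<string> records)
    {
        // The writer is opened first, at the top of the function.
        using (StreamWriter outfile = new StreamWriter(filePath))
        {
            foreach (string record in records)
            {
                // Each line goes into the writer's internal buffer, which
                // is written out to the underlying stream as it fills.
                outfile.WriteLine(record);
            }
        } // Leaving the using block flushes any remaining data and closes the file.
    }
}
```

With this shape, a call such as ReportWriter.WriteRecords(path, lines) never holds more than one record plus the writer's buffer in memory at a time.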

There are several reasons to avoid building up large strings in memory:

  1. It can actually perform worse, because the StringBuilder has to increase its capacity as you write to it, resulting in reallocation and copying of memory.
  2. It may require more memory than you can physically allocate, which may result in the use of virtual memory (the swap file), which is much slower than RAM.
  3. For truly large files (> 2 GB) you will run out of address space (on 32-bit platforms) and the operation will never complete.
  4. To write the StringBuilder contents to a file you have to call ToString(), which effectively doubles the memory consumption of the process, since both copies must be in memory for a period of time. This operation may also fail if your address space is sufficiently fragmented that a single contiguous block of memory cannot be allocated.
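One way to sidestep all four problems is to produce the output lazily and hand each piece to the writer as soon as it exists. The sketch below is illustrative only: LargeFileDemo, GenerateLines, and WriteLargeFile are hypothetical names, and the 64 KiB buffer size is an assumed value rather than a measured optimum (the default buffer is often adequate).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LargeFileDemo
{
    // Hypothetical data source: yields one line at a time, so only the
    // current line is alive in memory, never the whole file.
    public static IEnumerable<string> GenerateLines(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return "row " + i;
        }
    }

    public static void WriteLargeFile(string filePath, int lineCount)
    {
        // Assumed buffer size for illustration; tune only after measuring.
        const int bufferSize = 64 * 1024;

        // append: false overwrites any existing file; UTF-8 without a BOM.
        using (var outfile = new StreamWriter(filePath, false, new UTF8Encoding(false), bufferSize))
        {
            foreach (string line in GenerateLines(lineCount))
            {
                // No StringBuilder and no ToString() copy: memory use stays
                // roughly constant regardless of how large the file becomes.
                outfile.WriteLine(line);
            }
        }
    }
}
```

Because nothing larger than the buffer is ever allocated, there is no capacity growth to pay for, no second copy of the content, and no need for a large contiguous block of address space.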