Daniel Tkach
7,607 Points

No \r, only \n

Hello folks, my file only has \n, and I can use the Split method with '\n' as the parameter and it works just fine. I was wondering why I don't have the \r and whether there are any implications. I guess before splitting a CSV file I should add some code to check whether the \r is present? What do you think?

3 Answers

Steven Parker
177,888 Points

It wouldn't hurt to play it safe.

Depending on the OS, the standard for line ending might be just "\n" (Linux or Mac), or "\r\n" (Windows).

Just to be safe, you might want to do a Replace("\r\n", "\n") before the split.
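
Here is a minimal sketch of that idea, not the code from the video. The sample strings and the SplitLines helper are made up for illustration, and it is written with top-level statements (C# 9 or later).

```csharp
using System;

// Normalize Windows line endings to "\n", then split on that one character.
static string[] SplitLines(string text) =>
    text.Replace("\r\n", "\n").Split('\n');

string windowsStyle = "a,1\r\nb,2\r\nc,3"; // made-up sample data
string unixStyle = "a,1\nb,2\nc,3";        // same data with "\n" only

Console.WriteLine(SplitLines(windowsStyle).Length); // 3
Console.WriteLine(SplitLines(unixStyle).Length);    // 3
```

If the text has no "\r\n" in it, Replace finds nothing to change, so the call is harmless on a file that only uses "\n".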

Actually, since the signature is Replace(string oldValue, string newValue), it should be the other way round: Replace("\n", "\r\n").

Steven Parker
177,888 Points

I think the point was that only "\n" is needed here so the replace suggestion removes the "\r", and only if it exists.

If you reverse it, then if you start with "\r\n" you will end up with "\r\r\n".
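
A small sketch of the difference, using a made-up two-line string that already has Windows line endings:

```csharp
using System;

string windowsText = "one\r\ntwo"; // made-up sample that already uses "\r\n"

// Reversed replace: every "\n" gains a "\r", including ones that already had one.
string reversed = windowsText.Replace("\n", "\r\n");
Console.WriteLine(reversed == "one\r\r\ntwo"); // True

// Suggested replace: removes the "\r" only where "\r\n" occurs.
string normalized = windowsText.Replace("\r\n", "\n");
Console.WriteLine(normalized == "one\ntwo"); // True
```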

For splitting the CSV file into an array of substrings, I agree that a single delimiter character makes things easier, so replacing \r\n with just \n suits the task perfectly. All I am saying is that, for those who want to follow closely along with the video, the CSV file provided seems to use only \n as the newline character. Splitting on \n alone therefore never produces the empty strings in the resulting array. Anyone who wants to see that problem first hand could replace \n with \r\n instead, to get a clear understanding of why StringSplitOptions.RemoveEmptyEntries is added at the end to solve it.

Also, when creating text files that will be read on many Windows systems, \r\n is the safer default newline, since it remains compatible with older Windows systems that only recognize the two characters together as a line break.

Steven Parker
177,888 Points

We must be looking at different videos. The one I saw definitely does have an issue with two delimiters and a problem with empty strings, as demonstrated around time index 3:50. It gets handled (around 5:00) by the addition of StringSplitOptions.RemoveEmptyEntries as an extra argument to "Split". The use of "Replace" originally suggested here would be an alternative fix.
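
For anyone who wants to see the empty-string issue without the video, here is a rough sketch with a made-up string (the video's actual file and variable names are not reproduced here):

```csharp
using System;

string text = "a,1\r\nb,2\r\nc,3"; // made-up sample data with "\r\n" endings

// Splitting on both characters leaves an empty string between each "\r" and "\n".
string[] withEmpties = text.Split(new[] { '\r', '\n' });
Console.WriteLine(withEmpties.Length); // 5

// RemoveEmptyEntries drops those empty strings.
string[] cleaned = text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(cleaned.Length); // 3
```

One difference between the two fixes: RemoveEmptyEntries also discards lines that are genuinely blank, while Replace followed by Split('\n') keeps them as empty strings.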

And for creating output files (not part of the video exercise), I agree with your suggestion if the file is likely to be used on multiple platforms. But in the more likely case where the output will remain on the same system as where it is created, I recommend sticking with the convention used on that system.
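
As a sketch of the two choices when building output text (the lines array is made up, and nothing here comes from the video):

```csharp
using System;

string[] lines = { "a,1", "b,2", "c,3" }; // made-up sample data

// Environment.NewLine is "\r\n" on Windows and "\n" on Linux and macOS,
// so this follows the convention of the system the code runs on.
string localConvention = string.Join(Environment.NewLine, lines);

// Hard-coding "\r\n" instead, for a file that is meant to be opened on Windows.
string windowsConvention = string.Join("\r\n", lines);

Console.WriteLine(windowsConvention == "a,1\r\nb,2\r\nc,3"); // True
```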

I believe using "\r\n" is a good thing for backwards compatibility with older Windows systems that require both the carriage return and line feed characters. If a text file has only "\n", calling Replace("\n", "\r\n") replaces every "\n" in it with "\r\n", making the file compatible with those older systems. But if it's only for the purposes of the tutorial, then "\n" alone will do just fine.

Steven Parker
177,888 Points

We must be talking about different purposes for the code. In the video, the string containing multiple lines is being split into separate strings for each line. So converting to just a single line terminating character makes this job easier to do, no matter what kind of system the code is being used on. The resulting strings have no terminating characters so the concern about "backwards compatibility" would not apply here.