Line endings and encoding in Python print

This one is going to be seared into my memory.

I use my own shell (not that the shell matters here). I had Python code that looked essentially like this:

import requests

response = requests.get("https://myapi.com")
print(response.text)

The raw response here was a CSV file with DOS line endings (\r\n).

However, when I redirected the output of this code to a file on Windows, the file had duplicate carriage returns: every line ended in \r\r\n. I didn't get the same issue on WSL.
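A quick way to confirm the symptom is to look at the raw bytes of the redirected file. This is only a minimal sketch: the helper name and the sample bytes are made up for illustration, and real use would read the file in binary mode so that nothing gets translated on the way in.

```python
def count_doubled_carriage_returns(data: bytes) -> int:
    """Count line endings that came out as CR CR LF instead of CR LF."""
    return data.count(b"\r\r\n")


# Stand-in for the contents of a redirected file (hypothetical data).
# Real use: data = open(path, "rb").read()
sample = b"id,name\r\r\n1,Ada\r\r\n2,Grace\r\r\n"
doubled = count_doubled_carriage_returns(sample)
print(doubled)  # 3
```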

I have now been enlightened: the sys.stdout object that print writes to by default is opened in text mode, using the system's default encoding and line endings.

On Windows, the newline setting for sys.stdout is None.

From the documentation for class io.TextIOWrapper, this means:

When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep.

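Apparently, this translation is not smart. It is a straight-up find/replace on the '\n' characters, so a '\n' that already has a '\r' in front of it still gets replaced with '\r\n', and \r\n comes out as \r\r\n.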
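The doubling can be reproduced on any platform by wrapping an in-memory byte buffer in io.TextIOWrapper. In this sketch I pass newline="\r\n" explicitly instead of None, so the result does not depend on os.linesep. That substitution is my assumption about what matches the Windows case; the helper name is hypothetical.

```python
import io


def write_through_text_layer(text: str, newline: str = "\r\n") -> bytes:
    """Write text through a TextIOWrapper and return the bytes that come out."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()          # push the text layer's buffer down to raw
    data = raw.getvalue()
    wrapper.detach()         # keep the wrapper from closing raw later
    return data


# Text that already has DOS line endings, like the CSV response.
translated = write_through_text_layer("a,b\r\n1,2\r\n")
print(repr(translated))  # b'a,b\r\r\n1,2\r\r\n'
```

Every '\n' is replaced with '\r\n' regardless of what comes before it, which is exactly the \r\r\n I was seeing.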

On Python 3.7 and later, you can change the encoding and newline settings with the sys.stdout.reconfigure() method.
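Here is a rough sketch of the reconfigure() route. Passing newline="" turns off translation on write. The helper name is my own, and the demo uses an in-memory stream rather than the real sys.stdout so it behaves the same everywhere. The getattr check is there because sys.stdout can be replaced by objects that have no reconfigure().

```python
import io


def disable_newline_translation(stream) -> bool:
    """Turn off newline translation on a text stream if it supports reconfigure()."""
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    reconfigure(newline="")  # "" means: write '\n' as-is, no translation
    return True


# Demo on a stream that would otherwise turn '\n' into '\r\n'.
raw = io.BytesIO()
demo = io.TextIOWrapper(raw, encoding="utf-8", newline="\r\n")
disable_newline_translation(demo)
demo.write("a,b\r\n1,2\r\n")
demo.flush()
demo_bytes = raw.getvalue()
print(repr(demo_bytes))  # b'a,b\r\n1,2\r\n'
```

In a real script this would be disable_newline_translation(sys.stdout) before the first print. Note that print still appends its own '\n' unless you pass end="".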

To avoid all of this and dump out the literal bytes from your response, do:

import sys

response = requests.get("https://myapi.com")
sys.stdout.buffer.write(response.content)

This is almost always what I want.

I was shocked by how little was returned in searches/AI prompts for “Duplicate carriage returns in Python print”. Hopefully now this gets sucked up into the training data and helps someone else.