Source: Reading and Writing Files in Python (Guide)
At its core, a file is a contiguous set of bytes used to store data. This data is organized in a specific format and can be anything as simple as a text file or as complicated as a program executable. Ultimately, these bytes are represented as binary 1s and 0s so that the computer can process them.
Files on most modern file systems are composed of three main parts:
The file path is a string that represents the location of a file. It’s broken up into three major parts:
/
│
├── path/
│   │
│   ├── to/
│   │   └── cats.gif
│   │
│   └── dog_breeds.txt
│
└── animals.csv
The ASA standard states that line endings should use the sequence of the Carriage Return (CR or \r) and the Line Feed (LF or \n) characters (CR+LF or \r\n). The ISO standard, however, allowed for either the CR+LF characters or just the LF character.
Another common problem that you may face is the encoding of the byte data. An encoding is a translation from byte data to human-readable characters. This is typically done by assigning a numerical value to represent a character. The two most common encodings are ASCII and Unicode. ASCII can only represent 128 characters, while Unicode can represent up to 1,114,112 characters.
ASCII is actually a subset of Unicode (UTF-8), meaning that the 128 ASCII characters map to the same numerical values in both. It’s important to note that parsing a file with the incorrect character encoding can lead to failures or misrepresentation of characters. For example, if a file was created using the UTF-8 encoding and you try to parse it using the ASCII encoding, then an error will be raised as soon as the parser meets a character outside of those 128 values.
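As a rough illustration of that failure mode, the sketch below encodes a short string as UTF-8 and then tries to decode the bytes as ASCII. The sample string and variable names are made up for this example.

```python
# A string with one character ("é") that falls outside the 128 ASCII values.
text = "café"
utf8_bytes = text.encode("utf-8")

# Pure-ASCII text produces identical bytes under both encodings.
same_bytes = "cafe".encode("ascii") == "cafe".encode("utf-8")

# Decoding the UTF-8 bytes as ASCII fails on the first byte above 127.
try:
    utf8_bytes.decode("ascii")
    decode_failed = False
except UnicodeDecodeError as error:
    decode_failed = True
    print(f"Could not decode as ASCII: {error}")

# Decoding with the encoding the data was written in succeeds.
round_trip = utf8_bytes.decode("utf-8")
```

The same applies to files: passing an explicit encoding argument to open() that matches how the file was written avoids this class of error.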
When you want to work with a file, the first thing to do is to open it. This is done by invoking the open() built-in function. open() has a single required argument, the path to the file, and returns a single value, the file object.
It’s important to remember that it’s your responsibility to close the file. In most cases, upon termination of an application or script, a file will be closed eventually. However, there is no guarantee when exactly that will happen. This can lead to unwanted behavior including resource leaks. It’s also a best practice within Python (Pythonic) to make sure that your code behaves in a way that is well defined and reduces any unwanted behavior.
reader = open('dog_breeds.txt')
try:
    # Further file processing goes here
    pass
finally:
    reader.close()
Notice the open command at the top and the close command at the bottom.
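The same guarantee can be had with less code by using a with statement, which closes the file when the block exits, even if an exception is raised inside it. The sketch below writes a throwaway file in a temporary directory first so that it does not depend on dog_breeds.txt already existing; the file contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dog_breeds.txt")

    # Create a small file to read back.
    with open(path, "w") as writer:
        writer.write("Pug\n")

    # No explicit close() call: the with block closes the file on exit.
    with open(path) as reader:
        first_line = reader.readline()

    is_closed = reader.closed

print(first_line.strip(), is_closed)
```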
Other options for modes are fully documented online, but the most commonly used ones are the following:
Character | Meaning |
---|---|
'r' | Open for reading (default) |
'w' | Open for writing, truncating (overwriting) the file first |
'rb' or 'wb' | Open in binary mode (read/write using byte data) |
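To see how these modes differ in practice, here is a minimal sketch that writes a file twice with 'w' and then reads it back with 'r' and 'rb'. The temporary directory and file contents are assumptions made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dog_breeds.txt")

    # 'w' creates the file, or truncates it if it already exists.
    with open(path, "w", encoding="utf-8") as writer:
        writer.write("Pug\nBeagle\n")
    with open(path, "w", encoding="utf-8") as writer:
        writer.write("Boxer\n")  # the earlier contents are gone

    # 'r' (the default) decodes the bytes and returns str objects.
    with open(path, "r", encoding="utf-8") as reader:
        text = reader.read()

    # 'rb' returns the raw bytes with no decoding or newline translation.
    with open(path, "rb") as reader:
        raw = reader.read()

print(repr(text), repr(raw))
```

On Windows the raw bytes would typically end in \r\n rather than \n, because text mode translates line endings on write.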
Once you’ve opened up a file, you’ll want to read or write to the file. First off, let’s cover reading a file. There are multiple methods that can be called on a file object to help you out:
Method | What It Does |
---|---|
.read(size=-1) | This reads at most size characters from the file (bytes in binary mode). If no argument is passed or None or -1 is passed, then the entire file is read. |
.readline(size=-1) | This reads at most size characters from the current line and stops at the end of the line, including the newline character if it is reached. If no argument is passed or None or -1 is passed, then the entire line (or rest of the line) is read. |
.readlines() | This reads the remaining lines from the file object and returns them as a list. |
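The three methods in the table can be tried out without touching the disk. In this sketch, io.StringIO stands in for a file opened in text mode, and the sample text is made up.

```python
import io

sample = "Pug\nJack Russell Terrier\nEnglish Springer Spaniel\n"
reader = io.StringIO(sample)  # behaves like a file opened with mode 'r'

first_two = reader.read(2)        # 'Pu'
rest_of_line = reader.readline()  # 'g\n' (the rest of the first line)
partial = reader.readline(4)      # 'Jack' (stops after 4 characters)
remaining = reader.readlines()    # every line that is left, as a list
at_eof = reader.read()            # '' once the end of the file is reached
```

Each call picks up where the previous one stopped, which is why readlines() starts in the middle of the second line here.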
Method | What It Does |
---|---|
.write(string) | This writes the string to the file. |
.writelines(seq) | This writes the sequence to the file. No line endings are appended to each sequence item. It’s up to you to add the appropriate line ending(s). |
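The note about line endings is easy to miss, so here is a small sketch of both write methods. It uses io.StringIO as a stand-in for a file opened in 'w' mode, and the list of breeds is made up.

```python
import io

breeds = ["Pug", "Beagle", "Boxer"]

# .write() returns the number of characters written.
writer = io.StringIO()
count = writer.write("Dog breeds\n")

# .writelines() does not add line endings, so the items run together.
writer.writelines(breeds)
without_newlines = writer.getvalue()

# Adding the line ending yourself gives one item per line.
writer = io.StringIO()
writer.writelines(f"{breed}\n" for breed in breeds)
with_newlines = writer.getvalue()
```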
You can actually open that file in Python and examine the contents! Since the .png file format is well defined, the header of the file is 8 bytes broken up like this:
Value | Interpretation |
---|---|
0x89 | A “magic” number to indicate that this is the start of a PNG |
0x50 0x4E 0x47 | PNG in ASCII |
0x0D 0x0A | A DOS style line ending \r\n |
0x1A | A DOS style EOF character |
0x0A | A Unix style line ending \n |
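The table above can be turned into a simple check. The sketch below builds the 8 signature bytes in memory instead of reading a real image, so the data variable is a stand-in for what open(path, 'rb').read() would return, and looks_like_png is a hypothetical helper name.

```python
# The 8-byte PNG signature, taken from the table above.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def looks_like_png(data: bytes) -> bool:
    """Return True if data starts with the PNG signature."""
    return data[:8] == PNG_SIGNATURE


# Stand-in for the contents of a binary file: signature plus a dummy payload.
data = PNG_SIGNATURE + b"dummy payload"

magic_number = data[0]                   # 0x89
format_name = data[1:4].decode("ascii")  # 'PNG'
dos_line_ending = data[4:6]              # b'\r\n'
dos_eof = data[6]                        # 0x1A
unix_line_ending = data[7:8]             # b'\n'

print(looks_like_png(data), format_name)
```

Indexing a bytes object returns an int, while slicing returns bytes, which is why the single-byte fields above come back as numbers.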
Sometimes, you may want to append to a file or start writing at the end of an already populated file. This is easily done by using the 'a' character for the mode argument:
with open('dog_breeds.txt', 'a') as a_writer:
    a_writer.write('\nBeagle')
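To confirm that 'a' keeps the existing contents, the following sketch creates a one-line file in a temporary directory, appends to it, and reads the result back. The starting contents are an assumption made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dog_breeds.txt")

    # Start with a file that already has some content.
    with open(path, "w") as writer:
        writer.write("Pug")

    # 'a' positions every write at the end of the file.
    with open(path, "a") as a_writer:
        a_writer.write("\nBeagle")

    with open(path) as reader:
        lines = reader.read().splitlines()

print(lines)
```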
There are common situations that you may encounter while working with files. Most of these cases can be handled using other modules. Two common file types you may need to work with are .csv and .json. Real Python has already put together some great articles on how to handle these:
Reading and Writing CSV Files in Python
Working With JSON Data in Python
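As a quick taste of what those modules do, here is a minimal sketch that round-trips the same records through the standard library's csv and json modules. The records are made up, and io.StringIO stands in for a file on disk.

```python
import csv
import io
import json

rows = [
    {"name": "Pug", "size": "small"},
    {"name": "Boxer", "size": "medium"},
]

# Write the rows as CSV, then parse them back into dictionaries.
csv_buffer = io.StringIO(newline="")
csv_writer = csv.DictWriter(csv_buffer, fieldnames=["name", "size"])
csv_writer.writeheader()
csv_writer.writerows(rows)
csv_buffer.seek(0)
parsed_rows = [dict(row) for row in csv.DictReader(csv_buffer)]

# Serialize the same rows to a JSON string and load them again.
json_text = json.dumps(rows)
parsed_json = json.loads(json_text)
```

When working with a real CSV file, the csv documentation recommends opening it with newline='' so that the module can handle line endings itself.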
Additionally, there are built-in libraries out there that you can use to help you:
There are plenty more out there. Additionally, there are even more third-party tools available on PyPI. Some popular ones are the following:
Source: Python Exceptions: An Introduction
Syntax errors occur when the parser detects an incorrect statement.
>>> print( 0 / 0 ))
  File "<stdin>", line 1
    print( 0 / 0 ))
                  ^
SyntaxError: invalid syntax
The arrow indicates where the parser ran into the syntax error.
We can use raise to throw an exception if a condition occurs. The statement can be complemented with a custom exception.
If you want to throw an error when a certain condition occurs using raise, you could go about it like this:
x = 10
if x > 5:
    raise Exception('x should not exceed 5. The value of x was: {}'.format(x))
When you run this code, the output will be the following:
Traceback (most recent call last):
  File "<input>", line 4, in <module>
Exception: x should not exceed 5. The value of x was: 10
The program comes to a halt and displays our exception on the screen, offering clues about what went wrong.
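The custom exception mentioned earlier could look something like the sketch below. ValueTooLargeError and check_limit are hypothetical names invented for this example, and the raise is wrapped in a try block only so that the snippet runs to completion.

```python
class ValueTooLargeError(Exception):
    """Hypothetical custom exception for values above a limit."""


def check_limit(x, limit=5):
    """Return x unchanged, or raise if it exceeds limit."""
    if x > limit:
        raise ValueTooLargeError(
            f"x should not exceed {limit}. The value of x was: {x}"
        )
    return x


try:
    check_limit(10)
except ValueTooLargeError as error:
    message = str(error)
    print(message)
```

A dedicated exception class lets calling code react to this specific problem without also catching unrelated errors.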
Instead of waiting for a program to crash midway, you can also start by making an assertion in Python. We assert that a certain condition is met. If this condition turns out to be True, then that is excellent! The program can continue. If the condition turns out to be False, you can have the program throw an AssertionError exception.
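A minimal sketch of an assertion in practice might look like this. The average function and its message are made up for illustration.

```python
def average(values):
    """Return the mean of a non-empty list of numbers."""
    # The text after the comma becomes the AssertionError message.
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


result = average([2, 4, 6])  # the condition is True, so execution continues

try:
    average([])  # the condition is False, so AssertionError is raised
except AssertionError as error:
    assertion_message = str(error)
    print(assertion_message)
```

Keep in mind that assert statements are skipped when Python runs with the -O option, so they are better suited to catching programming mistakes than to validating user input.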
The try and except block in Python is used to catch and handle exceptions. Python executes code following the try statement as a “normal” part of the program. The code that follows the except statement is the program’s response to any exceptions in the preceding try clause.
The assert in this function will throw an AssertionError exception if you call it on an operating system other than Linux.
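The function itself is not shown here, so the sketch below uses a hypothetical stand-in called linux_interaction. It takes the platform name as a parameter (an assumption made so that the example behaves the same on every operating system) and shows how try and except respond to the failed assertion.

```python
import sys


def linux_interaction(platform=sys.platform):
    """Hypothetical stand-in for a function that only works on Linux."""
    assert "linux" in platform, "Function can only run on Linux systems."
    return "Doing something."


# Simulate a call on Windows by passing its platform name explicitly.
try:
    outcome = linux_interaction(platform="win32")
except AssertionError as error:
    # This block is the program's response to the exception.
    outcome = f"Caught: {error}"

print(outcome)
```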
In Python, using the else statement, you can instruct a program to execute a certain block of code only in the absence of exceptions.
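For instance, the else clause in the sketch below runs only when the conversion in the try block succeeds. The to_int helper and the events list are invented for this example.

```python
events = []


def to_int(text):
    """Convert text to an int, recording what happened in events."""
    try:
        number = int(text)
    except ValueError:
        events.append(f"could not convert {text!r}")
        return None
    else:
        # Reached only if the try block raised no exception.
        events.append(f"converted {text!r}")
        return number


good = to_int("42")
bad = to_int("forty-two")
print(events)
```

Keeping the follow-up work in else, rather than in the try block, means that an exception raised by that follow-up work is not caught by accident.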
Imagine that you always had to implement some sort of action to clean up after executing your code. Python enables you to do so using the finally clause.
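A short sketch of that behavior follows. The divide function and the log list are made up, and the "cleanup" entry stands in for whatever real cleanup your code needs, such as closing a file.

```python
log = []


def divide(a, b):
    """Divide a by b, logging a cleanup step whatever happens."""
    try:
        return a / b
    except ZeroDivisionError:
        log.append("division by zero")
        return None
    finally:
        # Runs after the try or except block, even though both return.
        log.append("cleanup")


ok = divide(6, 3)
failed = divide(1, 0)
print(log)
```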
After seeing the difference between syntax errors and exceptions, you learned about various ways to raise, catch, and handle exceptions in Python. In this article, you saw the following options: