How to Read and Write Binary Files in Python?
Opening and Closing Files
To open a file, use open(). To open it in binary mode, pass "rb", "wb", "ab", or "xb" as the mode argument. To close the file, use close().
f = open("test.dat", "wb")
f.close()
The modes differ as follows.
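Mode | Description |
---|---|
"rb" | Reading. The file must already exist. |
"wb" | Writing from the start. An existing file is truncated. |
"ab" | Appending. Writes go to the end of the file. |
"xb" | Creating a new file for writing. A FileExistsError is raised if the file already exists. |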
When you perform a series of read and write operations and want the file closed once they finish, the with statement is convenient. The file is closed automatically when the block ends, even if an exception is raised inside it.
with open("test.dat", "rb") as f:
    a = f.read()
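As a rough illustration of how the four modes behave, the sketch below writes, appends, reads, and then attempts an exclusive creation. The temporary directory and the file name test.dat are only stand-ins so that no real file is touched.

```python
import os
import tempfile

# Work inside a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.dat")  # hypothetical file name

    with open(path, "wb") as f:   # create (or truncate) and write
        f.write(b"abc")

    with open(path, "ab") as f:   # append to the end of the file
        f.write(b"def")

    with open(path, "rb") as f:   # read everything back
        data = f.read()

    try:
        with open(path, "xb") as f:  # exclusive creation: the file already exists
            f.write(b"never written")
        created = True
    except FileExistsError:
        created = False

print(data)     # b'abcdef'
print(created)  # False
```

Because every open() call sits in a with block, each file is closed before the next one is opened.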
Random Access
Getting Current Position
Use tell() to get the current position in the file, measured in bytes from the head of the file.
# f is file object
pos = f.tell()
Moving Pointer
Use seek() to move the file pointer. The first argument is the offset in bytes, and the optional second argument selects the reference point the offset is measured from. The values for the reference point are defined in the os module. If it is omitted, os.SEEK_SET is used.
Value | Reference Point | Offset |
---|---|---|
os.SEEK_SET | Head of file | Zero or positive |
os.SEEK_CUR | Current position | Positive or negative |
os.SEEK_END | End of file | Usually zero or negative |
# import os module
import os
# f is file object
f.seek( 5, os.SEEK_SET) # 5 bytes after the head of file
f.seek(-3, os.SEEK_CUR) # 3 bytes before the current position
f.seek(-5, os.SEEK_END) # 5 bytes before the end of file
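To see how the three reference points interact with tell(), here is a small sketch that uses io.BytesIO as an in-memory stand-in for a file opened in binary mode. The ten-byte content is made up for the example.

```python
import io
import os

f = io.BytesIO(b"0123456789")  # in-memory stand-in for a binary file

f.seek(5, os.SEEK_SET)   # 5 bytes after the head of the file
pos_set = f.tell()       # 5

f.seek(-3, os.SEEK_CUR)  # 3 bytes before the current position
pos_cur = f.tell()       # 2

f.seek(-5, os.SEEK_END)  # 5 bytes before the end of the file
pos_end = f.tell()       # 5

tail = f.read()          # reads from position 5 to the end: b'56789'
```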
Reading and Writing
To read a byte sequence from a file, use read(). To write one, use write(). Strings and numbers cannot be read or written directly in binary mode, so you convert them to and from byte sequences.
# f is file object
b = f.read(32) # reading 32 bytes
b = b'123'
f.write(b) # writing byte sequences
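One detail worth illustrating: read(n) returns at most n bytes, and an empty bytes object once the end of the file is reached. The sketch below relies on that to read in fixed-size chunks. The 70-byte payload and the use of io.BytesIO instead of a real file are assumptions made for the example.

```python
import io

f = io.BytesIO()                 # in-memory stand-in for a binary file
written = f.write(b"x" * 70)     # write() returns the number of bytes written
f.seek(0)                        # go back to the head before reading

chunk_sizes = []
while True:
    chunk = f.read(32)           # up to 32 bytes per call
    if not chunk:                # b'' means end of file
        break
    chunk_sizes.append(len(chunk))

print(written)      # 70
print(chunk_sizes)  # [32, 32, 6]
```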
Strings
To convert between strings and byte sequences, use decode() and encode(). Both use UTF-8 unless you pass another encoding. You can also use the struct module, described later.
# f is file object
b = f.read(10)
s = b.decode() # converting byte sequences to string
s = "12345"
b = s.encode() # converting string to byte sequences
f.write(b)
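The number of bytes a string occupies depends on the encoding, which matters when a file format reserves a fixed number of bytes. A short sketch with a made-up sample string:

```python
text = "héllo"  # sample string with one non-ASCII character

utf8 = text.encode("utf-8")      # 'é' takes two bytes in UTF-8
latin1 = text.encode("latin-1")  # 'é' takes one byte in Latin-1

print(len(utf8), len(latin1))    # 6 5

# Decoding must use the same encoding that was used for encoding.
round_trip = utf8.decode("utf-8")

# Decoding with the wrong encoding can fail outright.
try:
    utf8.decode("ascii")
    decoded_as_ascii = True
except UnicodeDecodeError:
    decoded_as_ascii = False
```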
Numerics
Integers
To convert between integers and byte sequences, use int.from_bytes() and to_bytes(). You can also use the struct module.
# sys.byteorder is the native byte order of the platform ("little" or "big")
from sys import byteorder
# f is file object
b = f.read(4)
val = int.from_bytes(b, byteorder) # converting 4-byte sequences to integer
val = 12345
b = val.to_bytes(4, byteorder) # converting integer to 4-byte sequences
f.write(b)
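If the file is meant to be read on other machines, it may be safer to name the byte order explicitly instead of relying on the native one. The sketch below shows both orders for the same value, plus the signed parameter for negative numbers. The values are arbitrary.

```python
val = 12345  # 0x3039

little = val.to_bytes(4, "little")  # least significant byte first
big = val.to_bytes(4, "big")        # most significant byte first

print(little.hex())  # 39300000
print(big.hex())     # 00003039

# Negative integers need signed=True in both directions.
neg_bytes = (-2).to_bytes(4, "little", signed=True)
as_signed = int.from_bytes(neg_bytes, "little", signed=True)  # -2
as_unsigned = int.from_bytes(neg_bytes, "little")             # 4294967294
```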
Floating Point Values
To convert floating point values to and from byte sequences, use the struct module. unpack() always returns a tuple, even for a single value, so unpack the result (for example with a trailing comma, as in val, = ...).
# requirement for using struct module
from struct import unpack, pack
# f is file object
b = f.read(8)
val, = unpack('d', b) # converting 8-byte sequences to floating point value
val = 12.345
b = pack('f', val) # converting floating point value to 4-byte sequences
f.write(b)
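A Python float is double precision, so packing it with 'f' rounds it to single precision, while 'd' preserves it exactly. The sketch below compares the two. The '<' prefix (little-endian, standard sizes) is an addition for the example, not something the code above uses.

```python
from struct import pack, unpack

val = 12.345

as_double = pack("<d", val)  # 8 bytes
as_single = pack("<f", val)  # 4 bytes, rounded to single precision

back_double, = unpack("<d", as_double)  # unpack() returns a tuple
back_single, = unpack("<f", as_single)

print(len(as_double), len(as_single))  # 8 4
print(back_double == val)              # True
print(back_single == val)              # False: precision was lost
```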
The format characters that can be passed to unpack() and pack() include the following:
Character | Description |
---|---|
'f' | Single-precision floating point (4 bytes) |
'd' | Double-precision floating point (8 bytes) |
'q' | Signed integer (8 bytes) |
'Q' | Unsigned integer (8 bytes) |
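'i' , 'l' | Signed integer (4 bytes; in native mode 'l' is a C long and may be 8 bytes) |
'I' , 'L' | Unsigned integer (4 bytes; in native mode 'L' is a C unsigned long and may be 8 bytes) |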
'h' | Signed integer (2 bytes) |
'H' | Unsigned integer (2 bytes) |
'c' | Character (byte string of length 1) |
'b' | Signed integer (1 byte) |
'B' | Unsigned integer (1 byte) |
's' , 'p' | Fixed-length byte string (specified with length, e.g., '10s' ). 'p' stores a Pascal string, whose first byte holds the length. |
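The sizes in the table can be checked with calcsize(). The sketch below uses the '<' prefix, which selects little-endian byte order with standard sizes and no padding. Without a prefix, struct uses the platform's native sizes and alignment, which can differ.

```python
from struct import calcsize

# Standard sizes for each format character (the "<" prefix disables
# native sizes and alignment padding).
standard = {c: calcsize("<" + c) for c in "fdqQilILhHcbB"}
print(standard)

# A repeat count before "s" gives the length of the byte string.
print(calcsize("<10s"))  # 10

# Without a prefix the result is platform dependent, so it is only
# printed here, not assumed.
print(calcsize("l"))
```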
Multiple Data
To read and write multiple pieces of data at once, use the struct module with a format string that lists every field.
# requirement for struct module
from struct import unpack, pack, calcsize
# f is file object
fmt = "15sl10s" # named fmt so the built-in format() is not shadowed
size = calcsize(fmt) # calculating the buffer size from a format string
b = f.read(size)
s1, val, s2 = unpack(fmt, b)
s1 = s1.rstrip(b'\x00').decode() # removing trailing null padding
s2 = s2.rstrip(b'\x00').decode() # removing trailing null padding
s1 = "test"
val = 123
s2 = "abcdefghij"
b = pack(fmt, s1.encode(), val, s2.encode())
f.write(b)
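For a complete round trip, here is a minimal sketch that packs one record, writes it, and reads it back. It assumes a little-endian layout ("<15si10s", chosen for the example so the record size is the same on every platform) and uses io.BytesIO in place of a real file. The Struct class precompiles the format so the size and the pack/unpack calls share one definition.

```python
import io
from struct import Struct

# Hypothetical record layout: 15-byte string, 4-byte int, 10-byte string.
record = Struct("<15si10s")
print(record.size)  # 29 (no alignment padding with the "<" prefix)

f = io.BytesIO()  # in-memory stand-in for a file opened in binary mode

# Writing: "s" fields shorter than their length are padded with null bytes.
f.write(record.pack("test".encode(), 123, "abcdefghij".encode()))

# Reading: go back to the head and read exactly one record.
f.seek(0)
raw_s1, val, raw_s2 = record.unpack(f.read(record.size))

s1 = raw_s1.rstrip(b"\x00").decode()  # drop the null padding
s2 = raw_s2.rstrip(b"\x00").decode()

print(s1, val, s2)  # test 123 abcdefghij
```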