A character string (str) is a sequence of characters and is operated on one character at a time; a byte
string (bytes) is a sequence of bytes and is operated on one byte at a time.
Byte strings and strings behave in basically the same way, except that they operate on different units of data.
Both byte strings and strings are immutable sequences: once created, you cannot add, change or delete their data.
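The difference in units and the shared immutability can be sketched as follows. This is a minimal illustration, not part of the original example; the sample text "héllo" and the helper name can_assign are assumptions chosen for the demonstration.

```python
text = "héllo"                 # str: a sequence of characters (sample text is an assumption)
data = text.encode("utf-8")    # bytes: the same text as a sequence of bytes

# Indexing a str yields a one-character str; indexing bytes yields an int (0-255).
first_char = text[0]
first_byte = data[0]

# The lengths differ because "é" takes two bytes in UTF-8.
char_count = len(text)
byte_count = len(data)


def can_assign(seq):
    """Hypothetical helper: report whether item assignment works on seq."""
    try:
        seq[0] = seq[0]
    except TypeError:
        return False
    return True


print(first_char, first_byte, char_count, byte_count)
print(can_assign(text), can_assign(data), can_assign(bytearray(data)))
```

Both str and bytes reject item assignment, while bytearray, the mutable counterpart of bytes, accepts it.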
A bytes object is only responsible for storing data as a sequence of bytes (binary form). Put simply,
bytes just record the raw data in memory. How that data is interpreted is not the concern of bytes; the program
that uses the data decides what it means.
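To illustrate that bytes attach no meaning to the data they hold, the sketch below reads the same three bytes in three different ways. The byte values and variable names are assumptions picked for the example.

```python
raw = b"\xe4\xb8\xad"  # three raw bytes; the bytes object attaches no meaning to them

as_utf8 = raw.decode("utf-8")            # read as UTF-8 text: one character
as_latin1 = raw.decode("latin-1")        # read as Latin-1 text: three characters
as_number = int.from_bytes(raw, "big")   # read as a big-endian unsigned integer

print(len(as_utf8), len(as_latin1), as_number)
```

The stored bytes never change; only the interpretation chosen by the caller does.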
Therefore, the bytes type is well suited to transmitting data over the internet and is widely used in network
programming. Bytes can also be used to store pictures, audio, video and other binary files.
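As a rough sketch of storing binary data, the snippet below writes bytes to a temporary file in binary mode and reads them back. The payload text and the use of a temporary file are assumptions for illustration; sending the same bytes over a network socket follows the same encode-then-decode pattern.

```python
import os
import tempfile

payload = "Welcome to Free Learning Points".encode("utf-8")

# Write the bytes to a temporary file (opened in binary mode by default).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

# Read the file back in binary mode ("rb") to get the same bytes.
with open(path, "rb") as f:
    restored = f.read()

os.remove(path)  # clean up the temporary file

# The reader decodes with the same character set to recover the text.
message = restored.decode("utf-8")
print(message)
```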
Strings and bytes are closely related. You can create a bytes object from a string, that is,
convert a string to bytes. There are three ways to achieve this:
a. If the string contains only ASCII characters, you can create the bytes directly by prefixing
the string literal with the letter b.
b. You can call the bytes() constructor to convert a string into bytes according to a specified character
set. When the argument is a string, the character set must be given; omitting it raises a TypeError.
c. The string itself has an encode() method, which converts the string into the corresponding
byte string according to the specified character set. If no character set is specified, UTF-8 is used by default.
a1 = bytes()  # Create empty bytes with the constructor
a2 = b''  # Create empty bytes with an empty bytes literal
a3 = b'http://www.freelearningpoints.com/'  # Create bytes from ASCII text with the b prefix
print("a3:", a3)
# Specify the character set for the bytes() constructor
a4 = bytes('Welcome to Free Learning Points', encoding='UTF-8')
print("a4:", a4)
# Convert a string to bytes with the encode() method
a5 = "Welcome to Free Learning Points".encode('UTF-8')
print("a5:", a5)
The output is:
a3: b'http://www.freelearningpoints.com/'
a4: b'Welcome to Free Learning Points'
a5: b'Welcome to Free Learning Points'
The strings above contain only ASCII characters, so the bytes objects print as readable text. For non-ASCII characters, print shows the encoded byte values as hexadecimal escapes (such as \xe4), not the
character itself. In UTF-8, a non-ASCII character occupies two to four bytes, while a bytes object handles data one byte at a time, so each such character appears as several separate byte values.
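The behavior with non-ASCII characters can be seen in a small sketch like the one below. The sample text "café 中" is an assumption chosen to include one two-byte and one three-byte UTF-8 character.

```python
text = "café 中"                  # sample text (assumption): two non-ASCII characters
encoded = text.encode("utf-8")

# ASCII bytes print as characters; the other bytes print as \x hexadecimal escapes.
print(encoded)

# Six characters become nine bytes: "é" takes two bytes and "中" takes three.
print(len(text), len(encoded))

# Iterating over bytes yields one integer per byte, not one item per character.
byte_values = list(encoded)

# Decoding with the same character set restores the original string.
decoded = encoded.decode("utf-8")
print(decoded == text)
```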