In previous chapters, we learned to use open together with read (or readline, readlines) to read data from a single file. In some scenarios, however, a program needs to read data from several files in sequence, and opening and reading each file by hand quickly becomes tedious.
Fortunately, Python provides the fileinput module. With the input function in this module, we can specify several files at once and read their contents one after another.
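The syntax of the input() function in the fileinput module is as follows (this is the signature used since Python 3.10):
fileinput.input(files=None, inplace=False, backup='', *, mode='r', openhook=None, encoding=None, errors=None)
This function returns a FileInput object, which can be understood as a single file object formed by joining the specified files end to end. The meaning of each parameter is as follows:
files: a single file name, or a list or tuple of file paths such as ('filename1', 'filename2'). If it is omitted, the names are taken from sys.argv[1:], falling back to standard input;
inplace: whether to redirect standard output into the file being read, so that the file can be rewritten in place. The default value of this parameter is False;
backup: the extension of the backup file kept when inplace is True, for example '.bak';
bufsize: appears in older versions of this signature. It was deprecated in Python 3.6 and removed in Python 3.8, so it should no longer be passed;
mode: the mode in which files are opened, either 'r' (text, the default) or 'rb' (binary);
openhook: a function that takes a file name and a mode and returns an opened file object, which controls how files are opened (for example, with which encoding). It cannot be combined with inplace;
encoding, errors: available since Python 3.10 and passed on to open for each file.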
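The inplace and backup parameters are easiest to understand in action. The following is a minimal sketch, not a definitive recipe: the helper name number_lines_in_place, the file name notes.txt, and its contents are all assumptions made up for the illustration. It rewrites a temporary file so that every line is prefixed with its line number, and keeps the original as a .bak copy.

```python
import fileinput
import os
import tempfile


def number_lines_in_place(path):
    """Prefix every line of `path` with its line number (hypothetical helper)."""
    with fileinput.input(files=(path,), inplace=True, backup=".bak") as f:
        for line in f:
            # While inplace=True, print() writes into the file, not the terminal.
            print(f"{f.filelineno()}: {line}", end="")


# Demo on a throwaway file; the name and contents are made up.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "notes.txt")
with open(demo_path, "w") as fh:
    fh.write("alpha\nbeta\n")

number_lines_in_place(demo_path)

with open(demo_path) as fh:
    result = fh.read()      # the rewritten file
with open(demo_path + ".bak") as fh:
    original = fh.read()    # the untouched backup copy
```

Because a non-empty backup extension was given, the backup file is kept after the FileInput object is closed. With the default backup='', it would be deleted.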
Note that, unlike the open function, older versions of the input function have no encoding parameter, so files are decoded with the default encoding of the operating system. If a file is stored in a different encoding, the Python interpreter may raise a UnicodeDecodeError unless the file is read in binary mode. To choose the encoding explicitly, pass openhook=fileinput.hook_encoded('utf-8') or, in Python 3.10 and later, the encoding parameter.
Unlike the file object returned by the open function, the FileInput object does not need calls to functions such as read, readline, or readlines; the lines of all the files can be read directly with a for loop.
It is worth mentioning that the fileinput module also provides a number of helper functions (shown in the following table) that report on the reading progress and make common tasks easier.
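As an illustration of choosing the encoding explicitly, here is a small sketch that uses fileinput.hook_encoded as the openhook. The file name latin1.txt and its contents are assumptions created only for this demo.

```python
import fileinput
import os
import tempfile

# Create a throwaway file that is NOT encoded as UTF-8 (hypothetical content).
demo_dir = tempfile.mkdtemp()
latin_path = os.path.join(demo_dir, "latin1.txt")
with open(latin_path, "w", encoding="latin-1") as fh:
    fh.write("café\n")

# hook_encoded returns an opener that decodes every file with the given encoding.
# On Python 3.10+ the same effect is available via input(..., encoding="latin-1").
with fileinput.input(files=(latin_path,),
                     openhook=fileinput.hook_encoded("latin-1")) as f:
    lines = [line for line in f]

print(lines)
```

The same openhook applies to every file in the files argument, so this approach assumes that all the files share one encoding.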
Function Name | Function Description |
---|---|
fileinput.filename() | Returns the name of the file currently being read. |
fileinput.fileno() | Returns the file descriptor of the file currently being read. |
fileinput.lineno() | Returns the cumulative line number of the line that has just been read, counted across all files. |
fileinput.filelineno() | Returns the line number of the line that has just been read within the current file. |
fileinput.isfirstline() | Returns True if the line that has just been read is the first line of its file. |
fileinput.nextfile() | Closes the file currently being read and starts reading the next file. |
fileinput.close() | Closes the FileInput object. |
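The functions in the table can be combined as in the following sketch. It recreates the two sample files used in this chapter inside a temporary directory (the directory and the records list are assumptions for the demo) and prints a header whenever a new file starts.

```python
import fileinput
import os
import tempfile

# Recreate the chapter's two sample files in a throwaway directory.
demo_dir = tempfile.mkdtemp()
samples = (
    ("a.txt", "Welcome to:\nFree Learning Points Website\n"),
    ("file.txt", "Python Tutorial\nwww.freelearningpoints.com\n"),
)
paths = []
for name, text in samples:
    path = os.path.join(demo_dir, name)
    with open(path, "w") as fh:
        fh.write(text)
    paths.append(path)

records = []  # (cumulative line number, line number within the file)
with fileinput.input(files=paths) as f:
    for line in f:
        if fileinput.isfirstline():
            # A new file has just started; show its name.
            print(f"--- {os.path.basename(fileinput.filename())} ---")
        records.append((fileinput.lineno(), fileinput.filelineno()))
        print(f"{fileinput.lineno()} ({fileinput.filelineno()}): {line}", end="")
```

Note how lineno() keeps counting across files while filelineno() restarts at 1 for each file.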
Here is an example. Suppose we use the input function to read two files, a.txt and file.txt, which are in the same directory and contain the following:
# file.txt
Python Tutorial
www.freelearningpoints.com
# a.txt
Welcome to:
Free Learning Points Website
The following program shows how to use the input function to read these two files one by one:
    import fileinput

    # Use a for loop to iterate over the FileInput object
    for line in fileinput.input(files=('a.txt', 'file.txt')):
        # Output the line; it already ends with a newline
        print(line, end='')
    # Close the file stream
    fileinput.close()

The files are read in the order in which their names appear in the files argument of the input function.
Before using the input function, make sure to import the fileinput module.
In addition to reading files with the help of the fileinput module, Python also provides a linecache module. Unlike the former, the linecache module is good at reading specified lines in a specified file. In other words, if we want to read the data contained in a specified line in a file, we can use the linecache module.
It is worth mentioning that the linecache module is commonly used to read lines of Python source files, for example when tracebacks are formatted. It therefore decodes a file the way Python source code is decoded: as UTF-8, unless the file starts with a byte order mark or declares another encoding in a coding comment. If a file cannot be decoded, the result depends on the Python version: getline either returns an empty string or raises an exception such as SyntaxError or UnicodeDecodeError.
To use the linecache module, you need to know which functions it contains. Its commonly used functions are shown in the following table:
Function Basic Format | Features |
---|---|
linecache.getline(filename, lineno, module_globals=None) | Reads line lineno of the file filename and returns it, or an empty string if the line does not exist. Among the parameters, • filename specifies the file name, • lineno specifies the line number, starting from 1, • module_globals is the global namespace (a dictionary) of a module and is only needed to locate source code loaded through an import hook; it can be omitted when reading an ordinary file. Note that if filename is a relative path and is not found as given, the function also looks for it in the directories listed in sys.path. |
linecache.clearcache() | Clears the cache. Use it when the program no longer needs the lines previously read with the getline function. |
linecache.checkcache(filename=None) | Checks whether the cache is still valid. If a file read with the getline function has since changed on disk, its cached entry is discarded so that the next getline call reads the new data. If filename is omitted, all cached entries are checked. |
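The interplay of the three functions can be sketched as follows. The file name data.txt and its contents are assumptions created in a temporary directory purely for the demonstration.

```python
import linecache
import os
import tempfile

demo_dir = tempfile.mkdtemp()
path = os.path.join(demo_dir, "data.txt")  # hypothetical file

with open(path, "w", encoding="utf-8") as fh:
    fh.write("first\nsecond\n")

before = linecache.getline(path, 2)    # reads the file and caches its lines

# Modify the file on disk after it has been cached.
with open(path, "w", encoding="utf-8") as fh:
    fh.write("first\nsecond line, edited\n")

stale = linecache.getline(path, 2)     # still served from the cache

linecache.checkcache(path)             # discards the entry because the file changed
after = linecache.getline(path, 2)     # re-reads the file

missing = linecache.getline(path, 99)  # a line that does not exist

linecache.clearcache()                 # drop all cached data when done

print(repr(before), repr(stale), repr(after), repr(missing))
```

The sketch shows why checkcache matters: without it, getline keeps returning the old contents of a file that has changed.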
[Example:]
    import linecache
    import string

    # Read line 3 of the string module's source file
    print(linecache.getline(string.__file__, 3))
    # Read the second line of an ordinary file
    print(linecache.getline('a.txt', 2))
Before executing this program, make sure that a.txt is saved in UTF-8 encoding (the modules shipped with Python are normally encoded in UTF-8). The program then prints line 3 of the string module's source file, followed by the second line of a.txt.