Title: Working with Files
1Working with Files
- HI5100 Data Structures for BioInformatics
- Lesson 6
2OBJECTIVES
- In this lesson you will learn
- About basic file processing and techniques for
reading and writing text files.
3Lesson 6 Sections
- 1.1 File Processing
- Summary
4Files
- Many applications work with information that is
stored in files external to the application
- There are different types of files
- We are going to focus on text files for now
- A file is information that is allocated to memory
on a hard disk or other type of storage
- The file contains information that defines its
own boundaries in memory
5Programs Using Files
- When an application is used to manipulate data in
a file, the application has to contain code to
- Open the file
- So that info can be read from it
- Manipulate the file
- Read some or all of the information from the file
into RAM
- Write info into the file in RAM
- Close the file
- Put an end of file (EOF) marker at the end of the
file
- Save the file contents back to disk
6File Variables
- When a program is going to work with a file, a
variable must be created that represents the file
in the program
7Associate the File with a Variable
<filevar> = open(<path>, <mode>)
NameFile = open(r"C:\names.txt", "r")
- <mode> is either
- "r" for read mode
- "w" for write mode
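The open, use, and close steps with both modes can be sketched as follows. This is a minimal example assuming Python 3; the file name demo_names.txt, its contents, and the temporary folder are hypothetical and not from the slides.

```python
import os
import tempfile

# Hypothetical scratch folder so the example does not touch real files.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_names.txt")

# "w" opens the file for writing (the file is created if it does not exist).
outfile = open(path, "w")
outfile.write("April\nMay\nJune\n")
outfile.close()

# "r" opens the same file for reading.
infile = open(path, "r")
contents = infile.read()
infile.close()

print(contents)
```

The same file variable pattern applies in both cases: open, do the work, then close.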
8Read from a File
<filevar>.read()
<filevar>.readline()
<filevar>.readlines()
- The read function reads the entire file into one
(perhaps large) string
- The readline function reads the next line from
the file as a string
- The readlines function reads the lines remaining
in the file
- Newline characters separate one line from the next
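To compare the three read functions side by side, here is a small sketch assuming Python 3. The sample file and the names in it are made up for illustration.

```python
import os
import tempfile

# Hypothetical three-line sample file in a scratch folder.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
setup = open(path, "w")
setup.write("Huey\nDewey\nLouie\n")
setup.close()

# read: the entire file as one string, newlines included.
sample = open(path, "r")
whole = sample.read()
sample.close()

# readline: the next line only; readlines: whatever lines remain, as a list.
sample = open(path, "r")
first = sample.readline()
rest = sample.readlines()
sample.close()

print(repr(whole))
print(repr(first))
print(rest)
```

Note that each line returned by readline or readlines still ends with its newline character.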
9Create a Text File
- Use Microsoft Excel or a similar application to
create a comma delimited text file (.csv) of,
say, first names
- You can also create the file in Notepad
- Name the file names.csv
10A Program that Opens a File
open_and_print_file.py
def main():
    fname = "names.csv"
    namelist = open(fname, "r")    # open the file for reading
    namedata = namelist.read()     # read the whole file into one string
    print(namedata)
    namelist.close()

main()
- Enter the program
- Save it in the same folder as names.csv
- Run it
11File Contents are Output
12Read Function
- The information was read from the file with read
- Open names.csv in Notepad or Excel and rearrange
the names so that there are more rows
- Save the changes and run the program in
open_and_print_file.py again
13File Contents and Output
14Readline Function
open_and_print_file2.py
def main():
    fname = "names.csv"
    namelist = open(fname, "r")
    for i_val in range(3):              # read three lines, one per pass
        namedata = namelist.readline()
        print(namedata)
    namelist.close()

main()
- Modify the program to use readline instead of
read
- Save in the same folder as names.csv and run it
15Output Using Readline
- The program reads one line of the file at a time
- Then prints the line
- Stops after 3 lines. Why?
- What if the loop specifies more lines than there
are in the file?
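One way to explore the last question is to ask readline for more lines than the file holds. The sketch below assumes Python 3 and a hypothetical two-line file created just for the experiment.

```python
import os
import tempfile

# Hypothetical file with only two lines.
path = os.path.join(tempfile.mkdtemp(), "two_lines.txt")
outfile = open(path, "w")
outfile.write("April\nMay\n")
outfile.close()

infile = open(path, "r")
results = []
for i_val in range(4):                 # ask for more lines than the file has
    results.append(infile.readline())
infile.close()

print(results)
```

Once the end of the file is reached, readline returns an empty string instead of raising an error, so the extra calls simply produce empty strings.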
16Readlines Function
- Returns a list of lines
- Enables a program to iterate through the list
17Readlines Function
open_and_print_file3.py
def main():
    fname = "names.csv"
    namelist = open(fname, "r")
    for line_val in namelist.readlines():   # list of all lines in the file
        print(line_val)
    namelist.close()

main()
- Modify the program to use readlines instead of
readline
- Save in the same folder as names.csv and run it
18File and Output Using Readlines()
19You Can Loop Through Lines in the File Directly
- Without first reading the whole file into RAM
- Here is the modified file
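A sketch of what looping over the file variable directly might look like, modeled on the earlier programs and assuming Python 3. The scratch copy of names.csv and its contents are assumptions made so the example runs on its own.

```python
import os
import tempfile

# Hypothetical stand-in for names.csv, created in a scratch folder.
fname = os.path.join(tempfile.mkdtemp(), "names.csv")
setup = open(fname, "w")
setup.write("April,May\nJune\n")
setup.close()

def main():
    namelist = open(fname, "r")
    lines_seen = []
    for line_val in namelist:          # the file variable yields one line per pass
        print(line_val)
        lines_seen.append(line_val)
    namelist.close()
    return lines_seen

lines_seen = main()
```

Only one line is held in the loop variable at a time, which matters when the file is large.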
20Reading the File Directly Output
21Open a File for Writing
- If no file by the given name exists, a new file
is created
- If a file with the given name does exist, it will
be deleted and a new empty file is created
- Then the data is written into the new empty file
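The effect of opening an existing file for writing can be checked with a short experiment. This sketch assumes Python 3; the file name scratch.dat and its contents are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "scratch.dat")
existed_before = os.path.exists(path)   # the file does not exist yet

# First open in "w" mode creates the file.
first = open(path, "w")
first.write("old contents\n")
first.close()

# Second open in "w" mode discards what was there and starts empty.
second = open(path, "w")
second.write("new contents\n")
second.close()

check = open(path, "r")
final_text = check.read()
check.close()

print(final_text)
```

Only the text from the second write remains, so "w" mode should be used with care on files that hold data worth keeping.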
22Write to a File
- Open for writing
- Then use the write operation
<file-var>.write(<string>)
23Write Operation
- Takes a single value as an argument
- Argument must be a string
- Writes the string to a file
- If you want to start a new line in the file, you
must explicitly provide a newline character
- \n is the newline character in Python (and in
other languages)
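The string-only rule and the explicit newline can be illustrated with a small sketch, assuming Python 3. The file name write_demo.txt and the values written are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")
outfile = open(path, "w")

count = 3
rejected = False
try:
    outfile.write(count)           # an integer is not a string
except TypeError:
    rejected = True                # write refuses non-string arguments

outfile.write(str(count))          # convert to a string first
outfile.write("\n")                # the newline must be written explicitly
outfile.write("second line\n")
outfile.close()

infile = open(path, "r")
text = infile.read()
infile.close()

print(repr(text))
```

Converting with str (or building the text with string formatting) is a common way to write numbers to a text file.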
24Write Data to a File
write_to_file1.py
def main():
    namelist = open("namelist.dat", "w")     # open for writing
    namelist.write("Daisy Duck's nieces are named ")
    counter = 1
    namelist.write("April, ")
    counter = counter + 1
    namelist.write("May, and ")
    counter = counter + 1
    namelist.write("June.\n")                # \n ends the first line
    namelist.write("Daisy Duck has %d nieces." % counter)
    namelist.close()

main()
25Output
- Where is the output?
- The output is a file named namelist.dat written
to the same folder as the one containing the
write_to_file1.py program
- Open the output file in Notepad
26Open Output File in Notepad
27Output File namelist.dat
28Summary
- Now you know a little about file handling in
Python.
- You are ready for bigger and better things!
29End of Slides for Lesson 6