An Introduction to Python's os.walk()

Table of Contents

  • 1. Directory traversal with os.walk
    • 1.1. os.walk
    • 1.2. Examples
      • 1.2.1. Testing topdown
      • 1.2.2. Modifying the directories to traverse at run time
  • 2. References

Directory traversal with os.walk

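Every month there are a few days when I feel like slacking off, and today is one of them. So here is something I just used while working on directory traversal.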

os.walk

os.walk takes the following parameters:

os.walk(top, topdown=True, onerror=None, followlinks=False)

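Where:

  • top is the directory to traverse.
  • topdown controls the order of the traversal. When it is True, a directory is reported before its subdirectories (top-down); when it is False, a directory is reported after all of its subdirectories (bottom-up).
  • onerror sets a handler that is called when the traversal hits an error. The handler receives an OSError instance as its only argument. When onerror is None (the default), errors are ignored and the traversal carries on.
  • followlinks controls whether the traversal follows symbolic links that point to directories. Be careful: os.walk does not keep track of the directories it has already visited, so when it follows links, a link that points back to a parent directory can make it recurse forever.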
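
To make the onerror parameter concrete, here is a minimal sketch that collects the errors instead of letting them be ignored. The helper name collect_walk_errors is made up for this illustration and is not part of the os module.

```python
import os


def collect_walk_errors(top):
    """Walk `top`; return (visited_roots, errors) so failures are not silently skipped.

    Hypothetical helper, written only to illustrate the onerror parameter.
    """
    errors = []

    def remember(err):
        # os.walk calls this with an OSError instance when it cannot list a directory
        errors.append(err)

    roots = [root for root, dirs, files in os.walk(top, onerror=remember)]
    return roots, errors
```

Calling it on a directory that cannot be listed (for example one that does not exist) should give an empty list of roots and one recorded OSError, where the default onerror=None would have produced nothing at all.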

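os.walk returns a generator that yields a 3-tuple (root, dirs, files) for every directory it visits: the path of that directory, the list of subdirectory names in it, and the list of file names in it. Note that the entries of dirs and files are bare names, not paths. If you need a path that starts from root, use os.path.join(root, name).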
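
As a small sketch of the os.path.join point, the function below builds the path of every file under a directory. The name list_full_paths is an assumption for this example only.

```python
import os


def list_full_paths(top):
    """Return a path (starting from `top`) for every file found under `top`."""
    paths = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # `name` is only a file name; join it with root to get a usable path
            paths.append(os.path.join(root, name))
    return paths
```

The same join works for the entries of dirs when you need the paths of the subdirectories.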

Examples

Suppose we have the following files and directories:

➜  test_os_walk git:(master) ✗ tree
.
├── a.py
├── b.py
├── c.py
├── dir1
│   ├── dir4
│   │   ├── g.py
│   │   └── h.py
│   ├── dirx
│   │   ├── diry
│   │   │   └── k.py
│   │   └── z.py
│   ├── e.py
│   ├── f.py
│   └── g.py
├── dir2
│   ├── dira
│   │   └── dirb
│   │       └── dirc
│   │           └── aha.py
│   ├── k.py
│   ├── l.py
│   └── m.py
└── dir3
    ├── dir5
    │   └── z.py
    ├── x.py
    └── y.py

10 directories, 17 files

Testing topdown

When I traverse this directory with os.walk, the program and its output look like this:

import os

path = '/Users/nisen/Projects/python_advanced_class/test/test_os_walk'

# topdown=True (the default): each directory is reported before its subdirectories
for root, dirs, files in os.walk(path, topdown=True):
    print('root: %s' % root)
    print('dirs: %s' % dirs)
    print('files: %s' % files)
    print('')

The result is shown below. The root paths show that the traversal goes from the top down:

➜  test git:(master) ✗ python test11.py
root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
dirs: ['dir1', 'dir2', 'dir3']
files: ['a.py', 'b.py', 'c.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
dirs: ['dir4', 'dirx']
files: ['e.py', 'f.py', 'g.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
dirs: []
files: ['g.py', 'h.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
dirs: ['diry']
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
dirs: []
files: ['k.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
dirs: ['dira']
files: ['k.py', 'l.py', 'm.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
dirs: ['dirb']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
dirs: ['dirc']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
dirs: []
files: ['aha.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
dirs: ['dir5']
files: ['x.py', 'y.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
dirs: []
files: ['z.py']

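With topdown set to False, the result is as follows. This time the traversal goes from the bottom up: every directory is reported only after all of its subdirectories.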

➜  test git:(master) ✗ python test11.py
root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
dirs: []
files: ['g.py', 'h.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
dirs: []
files: ['k.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
dirs: ['diry']
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
dirs: ['dir4', 'dirx']
files: ['e.py', 'f.py', 'g.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
dirs: []
files: ['aha.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
dirs: ['dirc']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
dirs: ['dirb']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
dirs: ['dira']
files: ['k.py', 'l.py', 'm.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
dirs: []
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
dirs: ['dir5']
files: ['x.py', 'y.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
dirs: ['dir1', 'dir2', 'dir3']
files: ['a.py', 'b.py', 'c.py']
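
Bottom-up order matters when a directory can only be handled after its children. As one possible illustration (not something from the original example), the sketch below removes empty directories; remove_empty_dirs is a hypothetical helper name. Note that it deletes directories, so try it on a scratch tree first.

```python
import os


def remove_empty_dirs(top):
    """Delete every empty directory below `top` and return the removed paths.

    Hypothetical helper: relies on topdown=False so children come before parents.
    """
    removed = []
    for root, dirs, files in os.walk(top, topdown=False):
        # Re-list the directory: `dirs` was collected before the children
        # were processed, so it may name directories that are already gone.
        if root != top and not os.listdir(root):
            os.rmdir(root)
            removed.append(root)
    return removed
```

On the tree above this should remove nothing, since every leaf directory contains a file. In a chain of empty directories, the deepest one goes first, which is what makes its parent empty in turn.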

Modifying the directories to traverse at run time

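When topdown is True, you can modify the returned dirs list in place while processing it, and os.walk will then descend only into the directories that are still in dirs. With topdown set to False this has no effect, because the subdirectories have already been visited by then. The example below leaves "CVS" directories out of the traversal: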

import os
from os.path import join, getsize

for root, dirs, files in os.walk('python/Lib/email'):
    # total size of the files that sit directly inside root
    total = sum(getsize(join(root, name)) for name in files)
    print('%s consumes %d bytes in %d non-directory files' % (root, total, len(files)))
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
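
The important detail is that dirs must be changed in place. A minimal sketch, assuming you want to skip several directory names at once (walk_without and its skipped parameter are made up for this illustration):

```python
import os


def walk_without(top, skipped=("CVS",)):
    """Return the roots visited under `top`, never descending into `skipped` names."""
    roots = []
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Slice assignment edits the very list object that os.walk reads next.
        # A plain `dirs = [...]` would only rebind the local name and prune nothing.
        dirs[:] = [d for d in dirs if d not in skipped]
        roots.append(root)
    return roots
```

Methods such as dirs.remove() or del dirs[i] work for the same reason: they mutate the list rather than replace it.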

References

  • https://docs.python.org/2/library/os.html#os.walk
