Checking Directory Contents (6:09) with Kenneth Love
Once again, Python makes it easy to automate tasks. In this video let's take a look at how to search for files and directories.
One of the benefits that directories give us is a logical grouping of files. 0:00 We can keep all of our cat photos, 0:04 project files or favorite rock operas together in one place. 0:06 Often, when we're creating software for working with directories and files, 0:10 we wanna be able to search for particular files and directories. 0:13 Luckily, Python makes this pretty easy for us. 0:16 So I've imported os. 0:18 I'm gonna do os.listdir(), and you can see all of the files and 0:20 directories that are currently in this directory. 0:24 The listdir method gives us back everything that's in the directory. 0:27 By default, it uses the current working directory, 0:31 but we can provide it a path right here if we wanted to. 0:33 And it'll tell you everything that's in that path. 0:36 Slightly more useful, though, is the scandir method. 0:40 Now we're gonna pass this one to list because it gives us an iterable to consume 0:43 and we wouldn't see anything good if we didn't pass it to list. 0:48 So we can see here all of these DirEntry objects. 0:51 Each one of these DirEntry objects represents an entry in the directory, 0:53 so it's either a file or it's a directory or whatever. 0:57 What's cool about these is we can use them to get some basic information about their 1:02 equivalent entry without having to go back to the file system and 1:05 inspect a particular file. 1:09 For instance, let's look at this one here. 1:11 So, 0, 1, 2, 3. 1:15 Okay? 1:18 So, I'm going to say files = list(os.scandir()). 1:19 And then I'm gonna say files[3].name. 1:23 And I get bootstrap-3.3.7-dist.zip, which you can get off 1:27 the getbootstrap.com website. 1:31 So let's find out if files[3] is a file, with files[3].is_file(). 1:33 And it is, so cool. 1:37 So now I can get some statistics about the object by using the stat method. 1:39 So I can do files[3] and then I can say .stat(). 1:44 And I get this stat result. 1:48 Now the one of these that is the most useful, 1:50 the most interesting is this one over here, this st_size. 
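The listdir and scandir calls shown so far can be sketched roughly as follows. This is a minimal illustration, not the code typed in the video: the helper name summarize_entries is hypothetical, and the sketch wraps scandir in a with block rather than passing it to list.

```python
import os


def summarize_entries(path="."):
    """Return sorted (name, is_file, size_in_bytes) tuples for path.

    summarize_entries is a hypothetical helper, not part of the os module.
    """
    summary = []
    # The with block closes the scandir iterator when the loop finishes.
    with os.scandir(path) as entries:
        for entry in entries:
            # name, is_file() and stat() all come from the DirEntry object.
            # stat() can raise OSError for entries such as broken symlinks;
            # this sketch does not handle that case.
            summary.append((entry.name, entry.is_file(), entry.stat().st_size))
    return sorted(summary)


if __name__ == "__main__":
    # os.listdir returns plain names; with no argument it lists the
    # current working directory.
    print(os.listdir())
    for name, is_file, size in summarize_entries("."):
        kind = "file" if is_file else "not a regular file"
        print(f"{name}: {kind}, {size} bytes")
```

Each tuple carries the same three pieces of information looked up by hand in the video: the entry's name, whether it is a file, and its st_size.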
1:52 This is the size of the file in bytes. 1:55 This is really handy if you wanted to, for example, 1:58 flag files that are above a certain size. 2:01 Now one more thing to bring up. 2:04 scandir gives you a stream-like iterator, 2:05 like when you use the open function. 2:08 So if you're not consuming it right away and then ending the block it's in, 2:10 like a for loop or a function, 2:13 or you're not using it with a context manager like with, 2:15 you'll want to call the close method on it. 2:19 So for instance, if I had scanner = os.scandir(), I would 2:21 eventually want to do scanner.close(). 2:25 That would close out the scanner and free up memory. 2:29 Now, let me show you another way to work your way through directories. 2:32 We can use the os.walk method to step through all of the files and 2:35 directories in a particular directory. 2:39 This isn't exactly the same as using scandir, but 2:40 it gives us a handy way to explore file trees. 2:43 I've already started, but I'm going to finish, a script here called tree.py. 2:46 And when I say started, I mean I've created a file. 2:51 Now inside here, in this directory, I have a bootstrap directory, and 2:54 that's full of all the files that you get when you download Bootstrap. 3:00 So, I want to look through that. 3:02 So, I'm gonna import os, and 3:05 then I'm gonna make a new function named treewalker. 3:06 And it's going to start at some directory. 3:10 So, my total_size is 0. 3:12 And my total number of files, total_files, is 0. 3:15 So for the root, for the dirs and for 3:18 the files that are in os.walk(start), whatever the start directory is. 3:22 My subtotal is going to be equal to the sum of os.path.getsize(), 3:28 which is similar to doing the os.stat and then pulling out the st_size. 3:34 But getsize just pulls that off directly. 3:38 And I'm gonna use os.path.join, and the root, and 3:40 the name, and I'm gonna do that for every name that's in files. 
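The subtotal step described up to this point can be sketched on its own. The helper names directory_subtotal and sizes_agree are hypothetical and exist only for illustration; the second one checks the remark that getsize gives the same number as os.stat followed by st_size.

```python
import os


def directory_subtotal(root, files):
    """Sum the sizes, in bytes, of the named files inside root.

    Hypothetical helper mirroring the subtotal line in the transcript.
    """
    return sum(os.path.getsize(os.path.join(root, name)) for name in files)


def sizes_agree(path):
    """Check that os.path.getsize matches the st_size field from os.stat."""
    return os.path.getsize(path) == os.stat(path).st_size


if __name__ == "__main__":
    # os.walk yields one (root, dirs, files) tuple per directory it visits.
    for root, dirs, files in os.walk("."):
        print(root, directory_subtotal(root, files))
        break  # only look at the starting directory in this demo
```

os.path.join is what turns each bare file name from os.walk into a path that getsize can open, since the names in files are relative to root.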
3:44 And then I'm gonna say that total_size will have the subtotal added to it. 3:50 And I'm gonna say that file_count will be equal to len(files). 3:59 Total number of files over there. 4:07 And then total_files is going to plus-equal the file_count. 4:09 Now I could of course just combine those two lines, but 4:12 sometimes it's nice to print the whole thing out. 4:14 So, I'm gonna print root and then "consumes". 4:17 And I'm gonna end that with a space, and 4:24 then I'm gonna print the subtotal, and end that with a space. 4:27 And then I'm gonna print "bytes in", and then the file_count, 4:33 and I'm gonna say "non-directory files". 4:39 And then out here, outside of the for loop, I'm gonna print start, 4:44 "contains", and then total_files. 4:49 And then "files with a combined size of", 4:55 and then total_size, and then "bytes". 4:59 So a lot of stuff to do there. 5:06 But then I'm gonna call treewalker down here with bootstrap. 5:08 There we go, save that, come back over here, and 5:12 I'm gonna execute tree.py. 5:16 So bootstrap has zero bytes and zero non-directory files. 5:20 bootstrap/css consumes 1.3 million bytes in 8 non-directory files. 5:23 bootstrap/fonts contains 215,000 bytes. 5:30 bootstrap/js is 117,000. 5:34 bootstrap in total has 16 files with a combined size of 1.6 million bytes. 5:37 So that's pretty cool. 5:44 That's a really nice way just to kind of quickly see what's going on. 5:46 There's a lot more to the os module, and 5:49 we're going to explore some more of it in the next two sections. 5:51 Feel free, though, to explore it a bit more on your own. 5:55 It's one of the more interesting modules in Python, especially since dealing with 5:57 differences in operating systems is often such a thorny area. 6:00 All right, take a little break, have a snack or stretch, and then come back to 6:02 learn about manipulating files with Python, instead of just looking at them. 6:06
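Putting the dictated pieces together, tree.py might look roughly like the sketch below. It is reconstructed from the spoken description, so the exact layout of the print calls is a guess. Two details are assumptions rather than part of the video: the return value, added so the totals can be checked programmatically, and the lowercase directory name "bootstrap".

```python
import os


def treewalker(start):
    """Print per-directory sizes under start, then the overall totals."""
    total_size = 0
    total_files = 0

    for root, dirs, files in os.walk(start):
        # Size in bytes of the files sitting directly in this directory.
        subtotal = sum(
            os.path.getsize(os.path.join(root, name)) for name in files
        )
        total_size += subtotal
        file_count = len(files)
        total_files += file_count
        print(root, "consumes", end=" ")
        print(subtotal, end=" ")
        print("bytes in", file_count, "non-directory files")

    print(start, "contains", total_files,
          "files with a combined size of", total_size, "bytes")
    # Assumption: returning the totals is not in the video.
    return total_files, total_size


if __name__ == "__main__":
    # Assumption: the directory is named "bootstrap" relative to the script.
    treewalker("bootstrap")
```

If the start directory does not exist, os.walk yields nothing by default, so the function reports zero files and zero bytes instead of raising an error.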