This Chat is Bananas (11:00) with Kenneth Love
For our second plugin, let's create a bot that randomly replaces a noun in a random message with the word "banana".
You may find that you need to install some additional NLTK libraries before you can use the tokenizers and other parts of the library. You can do this in your Python shell with the following commands:
>>> import nltk
>>> nltk.download()
Then choose the appropriate components from the window that opens.
This last plugin that I want to make is simpler, 0:00 as in it has less code than the motivation one, but it uses some more powerful tools. 0:03 So, six of one, half a dozen of the other, I guess. 0:09 I want to have the bot randomly choose a message, break it into sentences, 0:12 pick a random sentence from all of them, and then break that sentence into words. 0:15 Then I'm gonna use a part-of-speech library to find all of the nouns and 0:20 pick a random noun and replace it with a word. 0:25 I think I'm gonna use banana, cuz I find that funny. 0:27 Then patch everything back up and post that message. 0:30 You could, of course, 0:33 use a word other than banana if you want to, but I'll leave that up to you. 0:34 So first things first. 0:38 I need to install a couple of libraries. 0:40 I need to install the NLTK library and the noun-hound library. 0:42 Now, if you're not familiar with NLTK, it's the Natural Language Toolkit. 0:48 And it has tons of tools, who'd have thunk, for working with text, 0:52 including finding common word pairings, parts of speech, and 0:56 tons and tons of other stuff. 1:00 Noun Hound is a specialized library that sits on top of NLTK and 1:02 it finds individual nouns and noun phrases in a bit of text. 1:07 Now I could just use NLTK to find the nouns. 1:11 But that's a bit more fiddly than I want to deal with in a simple Slack bot 1:15 when I can just use an already available library that does exactly what I want. 1:18 So I'm gonna pip install nltk and noun-hound. 1:22 I love the new pip with the progress bars, it's so nice. 1:34 Okay, so then of course I need to get my file and everything ready for the plugins. 1:37 So I'm gonna do a new folder that I'm gonna call banana. 1:42 And inside of there, a new file banana.py partly cuz I like the joke there. 1:47 One of my first [LAUGH] open source libraries was named banana pie. 1:54 And also because I'm putting in the word banana.
1:57 If you wanted to make this to where you could put in a random word or 2:00 you could configure the word with like a variable in here, 2:04 word_to_use = 'banana' kind of thing. 2:09 Then you could totally call this wordswap.py or something like that. 2:12 Call it whatever makes the most sense for you. 2:16 So, I'm gonna start with the crontable and 2:20 outputs variables again because it's good to have those. 2:23 It's good practice just to always put those in there for these plugins. 2:28 And again, you don't need the crontable one so feel free to leave that out. 2:31 But I like following formats. 2:34 So the process_message function is still going to receive data and 2:38 the first thing I want to do is I want to get the text of the message. 2:43 So let's just say message = data['text'], so 2:47 that way I have it in a variable I can hold on to. 2:51 This is where I need to start doing all the textual processing on all of 2:54 this stuff. 2:58 So I need to bring in a couple of other tools. 2:59 Since I want random words and sentences, I need to import random. 3:02 And then, I need to import stuff for doing the tokenizing and the noun finding. 3:09 So from nltk.tokenize 3:16 import sent_tokenize and 3:20 wordpunct_tokenize. 3:25 And then from noun_hound, import NounHound. 3:30 And then I'll make a new variable right down here that is nh = NounHound(). 3:35 So that way, I just have it available. 3:41 So now, these NLTK tokenizers, these two right here, 3:43 will let us break up text on specific tokens. 3:47 So, in this case the sent_tokenize tokenizer breaks up text on sentences so 3:51 it knows how to find sentences no matter what the ending punctuation is and 3:56 break your text apart on those sentences. 4:01 The wordpunct tokenizer breaks up a bit of text on its words and its punctuation.
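As a rough illustration of the word-and-punctuation split described here, this sketch uses only the standard library. The regular expression is an assumption meant to approximate what wordpunct_tokenize does (runs of word characters, or runs of punctuation), and the helper name rough_wordpunct is made up for this example; it is not part of NLTK.

```python
import re

def rough_wordpunct(text):
    """Approximate a word/punctuation split: runs of word characters
    or runs of non-space punctuation become separate tokens."""
    return re.findall(r"\w+|[^\w\s]+", text)

tokens = rough_wordpunct("Don't stop, well-made bots!")
print(tokens)
```

Note how the apostrophe and the hyphen each become their own token, which is why a plain space join later puts spaces around them.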
4:04 So you can get words without like periods or commas after them and 4:11 if a word has an apostrophe in it or something like that, 4:16 a hyphen or whatever, then it'll break that up into multiple words so 4:21 that you can get both sides of the hyphen. 4:25 They're both pretty fun to play with and 4:27 there are a couple of other tokenizers in NLTK. 4:28 So go check those out, read the docs, they're linked in the teacher's notes. 4:31 There's some neat stuff in there. 4:35 For now though, I need to break up the text into sentences and 4:37 get a random sentence. 4:40 So let's do sentences = sent_tokenize and 4:42 I'm gonna pass in message. 4:47 And then I'm gonna get a number, sentence_num, that is 4:50 random.randint(0, len(sentences)). 4:54 Now why get a number instead of just getting a random sentence? 4:59 Because I need to know where it is in the list of sentences so 5:05 that I can replace that sentence, so the message still makes sense, right? 5:08 And then finally let's get out that sentence which will be 5:12 sentences[sentence_num]. 5:17 Okay, so grabbing a random sentence and just holding on to that sentence. 5:20 Okay, so now I need to do something similar to this with the words that are in 5:27 that chosen sentence. 5:32 So let's do words = wordpunct_tokenize with the sentence. 5:34 So break up that sentence into words, and then I want to find all the nouns, 5:40 so I'm gonna do nouns = nh.process(sentence). 5:47 So what I've done here is I've created a list of all the words and punctuation. 5:51 And then here, I've gone and found all the nouns and noun phrases. 5:55 So nh.process will return a dictionary that has two keys, 5:59 one is noun phrases and one is nouns. 6:05 And the values for each of those keys is a list of words that are noun phrases or 6:08 nouns. 6:13 So a noun phrase would be the adjectives that surround a noun or refer to a noun, 6:13 and the nouns would be the individual nouns, right?
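The sentence-picking step can be sketched on its own. The sentences list here is hard-coded as a stand-in for the output of sent_tokenize, so the example runs without any NLTK data. One thing worth keeping in mind is that random.randint includes both endpoints, so this sketch uses len(sentences) - 1 as the upper bound.

```python
import random

# Stand-in for sent_tokenize(message); hard-coded so no NLTK data is needed.
sentences = ["The quick brown fox jumped.", "The dog slept.", "Nobody noticed."]

# randint is inclusive on both ends, so the highest valid index is len - 1.
sentence_num = random.randint(0, len(sentences) - 1)

# Keep the index around so the edited sentence can be put back in place later.
sentence = sentences[sentence_num]
print(sentence_num, sentence)
```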
6:18 So quick brown fox, you'd get quick brown fox as the noun phrase and 6:21 you'd get fox as the noun. 6:26 So now I need to get a random noun, and replace that. 6:29 Let's do replacement = random.choice 6:33 from the nouns key of the nouns dictionary. 6:37 And then I want to do words[words.index(replacement)] = 'banana'. 6:43 Now, here's a good spot to extend this script. 6:52 What I'm using is only going to replace the first instance of the noun 6:55 in the sentence. 6:58 It might make more sense to replace all of the instances of that particular noun, but 6:59 I'm gonna leave that up to you. 7:03 This is also where you could change the replacement word if you want to use 7:05 something other than banana or have it use a random replacement word or whatever. 7:09 Finally, I need to put the whole thing back together, so 7:14 sentences[sentence_num] = ' '.join(words). 7:19 And then I need to send that back out, so 7:26 outputs.append([data['channel'], 7:30 ' '.join(sentences)]). 7:35 I'm a firm believer in one space after periods. 7:38 Now one thing I will point out is that this right here, this ' '.join(words), 7:42 you will end up getting spaces before apostrophes inside of words. 7:47 You can make this a bit more intelligent in how it rejoins these 7:52 sentences if you want but I didn't feel like that was necessary for this example. 7:55 So now let's try restarting the bot and see what happens. 8:00 Okay, and now let's send a message, 8:08 this message has some nouns in it. 8:12 And you can see here we've got a thing about NLTK missing a library. 8:16 So we need to install that through the NLTK Downloader. 8:21 So let's go ahead and hop into Python, 8:24 import nltk and do nltk.download(). 8:30 And you can see this is looking for english.pickle. 8:36 So let's come here to All Packages and scroll down here. 8:39 I think it's in Models. 8:53 Punkt Tokenizer Models, I think that's the one we want to install.
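Circling back to the replacement step, here is a small sketch of it using plain lists, so it runs without NLTK or noun-hound. The words list and the nouns dictionary are hand-written stand-ins for what wordpunct_tokenize and nh.process would return, and the "nouns" key name follows the description in the video rather than checked documentation. The replace_all helper is hypothetical and shows one way to do the "replace every instance" extension.

```python
import random

# Hand-written stand-ins for wordpunct_tokenize(sentence) and nh.process(sentence).
words = ["The", "fox", "saw", "another", "fox", "."]
nouns = {"nouns": ["fox"]}  # assumed shape: a dict with a "nouns" list

replacement = random.choice(nouns["nouns"])

# First-instance version: list.index only finds the first match.
first_only = list(words)
first_only[first_only.index(replacement)] = "banana"

def replace_all(tokens, target, new_word="banana"):
    """Hypothetical extension: swap every token equal to target."""
    return [new_word if token == target else token for token in tokens]

everywhere = replace_all(words, replacement)

print(" ".join(first_only))
print(" ".join(everywhere))
```

Joining with a single space leaves a space before the final period, which is the same rejoining quirk mentioned for apostrophes.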
8:59 So I'll hit Download and then we can quit Python and 9:02 let's try running the bot. 9:07 The quick brown fox jumped over the lazy dog. 9:14 And we've got a thing here where it tried to pick an index past the end of the list. 9:27 So we actually need to do len(sentences) - 1, and 9:30 then let's try running this again and we'll send the same message. 9:35 And this right here, this nltk_data thing, 9:51 this is noun-hound downloading things that it needs. 9:54 And then we get the quick brown banana jumped over the lazy dog. 9:57 So that's cool but that's gonna run on every single message and 10:01 I probably don't want that to happen on every single message. 10:05 So what I'm gonna do is add in a bit of randomness. 10:08 So let's add a new variable up here that we'll call odds. 10:11 And it's gonna be 1 and then nine 0s inside of a list. 10:16 This is kind of a cool way to make a list that has multiple pieces to it. 10:21 And then, let's add a new thing in here. 10:25 And we'll say if random.choice(odds), and then we'll do all of this stuff. 10:28 So now it should only happen about once every ten messages, on average. 10:37 You can adjust this frequency of course by changing how many items are in this list 10:41 but that's it. 10:45 You should now have a bot plugin that in real time replaces a random noun in 10:46 a random sentence in a random message with the word banana. 10:50 Nothing more important has ever been coded but 10:54 now you can build all sorts of great Slack bots with the RTM API. 10:56
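The odds check can be sketched like this. Building the list as [1] + [0] * 9 is one plausible reading of "1 and then nine 0s inside of a list", not necessarily what was typed on screen, and the should_swap name is made up for this example.

```python
import random

# One truthy value and nine falsy ones: random.choice lands on the 1
# about one time in ten.
odds = [1] + [0] * 9

def should_swap():
    """Hypothetical helper: True roughly once per ten calls, on average."""
    return bool(random.choice(odds))

# Rough sanity check of the frequency; the exact count varies per run.
hits = sum(should_swap() for _ in range(10_000))
print(hits)
```

Adding more zeros to the list makes the swap rarer, and adding more ones makes it more frequent.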