Forum Contest: Reading and Displaying Data (Part 1)

Hello Everyone,

This time around we have a two-part Forum Contest: one part for the developer, and the other for the designer.

After you've watched the video, please read the details below carefully. We're looking forward to your entries!

  1. Download a copy of Ralph Waldo Emerson’s Essays, First Series in text format

  2. Programmatically read and analyze the text; we only want words

  3. Eliminate some of the most used words in the English language by using this list

  4. Count the occurrence of each word across the entire document

  5. Output to the screen in a readable fashion

    1. total word count after filtering
    2. highest occurring word
    3. longest word(s) and its / their length
    4. sort the list from most occurring to least occurring then output that data to the screen as an unordered list
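The steps above could be sketched roughly as follows in Python. This is only one possible reading of the rules, not a reference solution: `STOP_WORDS` is a tiny hypothetical stand-in for the contest's common-words list, `analyze` and `report` are made-up helper names, and the sample sentence stands in for the Emerson text.

```python
import re
from collections import Counter

# Hypothetical stand-in for the contest's list of common English words.
STOP_WORDS = {"a", "and", "in", "is", "it", "of", "the", "to"}


def analyze(text, stop_words=STOP_WORDS):
    """Return the filtered total, top word, longest word(s), and ranked counts."""
    # Step 2: keep only runs of letters, lowercased; punctuation and digits drop out.
    words = re.findall(r"[a-z]+", text.lower())
    # Step 3: discard the common words.
    kept = [w for w in words if w not in stop_words]
    # Step 4: count each remaining word across the whole text.
    counts = Counter(kept)
    max_len = max((len(w) for w in counts), default=0)
    return {
        "total": len(kept),
        "top": counts.most_common(1)[0] if counts else None,
        "longest": sorted(w for w in counts if len(w) == max_len),
        "longest_length": max_len,
        "ranked": counts.most_common(),  # most to least occurring
    }


def report(stats):
    """Step 5: format the results, ending with a plain-text unordered list."""
    lines = [
        f"Total words after filtering: {stats['total']}",
        f"Highest occurring word: {stats['top']}",
        f"Longest word(s): {', '.join(stats['longest'])} ({stats['longest_length']} letters)",
    ]
    lines += [f"  * {word}: {count}" for word, count in stats["ranked"]]
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(analyze("The cat saw the other cat and a dog.")))
```

The letters-only pattern splits hyphenated words and possessives into pieces, so the totals depend on how you decide to tokenize.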

How to Enter:

  1. Post your code to a public gist on github

  2. Paste the url of the gist below this post.

Due Date:

All entries must be submitted by April 6th at 11:45pm ET. Here's a timezone chart so you can see what time that is for your locale.

Prize:

The entries will be judged by Treehouse teachers based on both code quality and best overall data set. One winner will receive a free month of Treehouse Gold on us! :) We'll announce the winner on April 7th and reveal the next contest.

Bonus:

Eventually we will want to make this data JSON. This way we can give the findings over to next week's contest participants to analyze and display.
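For the JSON bonus, a minimal sketch in Python might look like the one below. The contest does not specify a layout, so the field names (`total`, `words`, `word`, `count`) and the helper name `to_json` are assumptions chosen only for illustration.

```python
import json
from collections import Counter


def to_json(counts):
    """Serialize word counts as JSON, ordered from most to least occurring."""
    # The key names below are an assumed layout, not one the contest prescribes.
    ranked = [{"word": word, "count": count} for word, count in counts.most_common()]
    return json.dumps({"total": sum(counts.values()), "words": ranked}, indent=2)


if __name__ == "__main__":
    print(to_json(Counter(["soul", "soul", "nature"])))
```

A list of objects keeps the ranking order explicit, which a plain word-to-count mapping would leave up to the consumer.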

Good Luck, and Have Fun!

This sounds fun. Who doesn't love a bit of data to play with?

nice

Question: When you say download a copy, does this mean I can add the text file to the project, or does the app need to download it from the URL given?

Robert Bojor
Courses Plus Student 29,439 Points

Are multiple entries with different languages taken into consideration?

Aaron Ackerman - You can either download it programmatically into your project and then read it, or you can download it manually and add it to the project. In this case the text is not going to change over time, but in other cases it could. Perhaps check whether the file online is different from your local copy before running the analysis? (Not a requirement, just a thought.)
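One way to sketch that optional freshness check in Python is to compare a digest of the local copy with a digest of the freshly downloaded bytes. The function names are hypothetical, and how you fetch `remote_bytes` is left open; comparing the bytes directly would work just as well for a file this size.

```python
import hashlib
from pathlib import Path


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def refresh_if_changed(local_path, remote_bytes):
    """Overwrite local_path when it differs from remote_bytes.

    Returns True if the file was written, False if it was already current.
    """
    path = Path(local_path)
    if path.exists() and digest(path.read_bytes()) == digest(remote_bytes):
        return False  # local copy matches; nothing to do
    path.write_bytes(remote_bytes)
    return True
```

The remote bytes could come from something like `urllib.request.urlopen(url).read()`, where `url` is whatever address you downloaded the text from.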

Robert Bojor - You can submit as many as you like, but the output is the most important.

Jeremy Germenis
29,854 Points

You require a total word count. Do you count hyphenated words like self-reliance as one or two words? By dictionary standards this is a single word.

Jeremy Germenis
29,854 Points

Also, what about possessive words like cat's? Does the possessive form need to be retained?

Jeremy Germenis - These are good points with regard to language. Ultimately we want to visually show words that might provide some insight into the text, so self-reliance would mean more to me than self and reliance apart. Play with the options.
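To play with those options, a tokenizer along these lines keeps hyphenated words whole and lets you choose whether to keep the possessive ending. It is a rough sketch in Python; `tokenize` and its `keep_possessive` flag are hypothetical names, and the pattern is only one reasonable choice.

```python
import re

# A word is a run of letters, optionally joined to more letters by single
# hyphens or apostrophes, so "self-reliance" and "cat's" each stay one token.
WORD = re.compile(r"[a-z]+(?:[-'][a-z]+)*")


def tokenize(text, keep_possessive=True):
    """Split text into lowercase words, keeping internal hyphens."""
    # Normalize the typographic apostrophe so both styles are handled alike.
    text = text.lower().replace("\u2019", "'")
    words = WORD.findall(text)
    if not keep_possessive:
        # Drop a trailing 's; note this also trims contractions such as "it's".
        words = [w[:-2] if w.endswith("'s") else w for w in words]
    return words


if __name__ == "__main__":
    print(tokenize("Self-reliance is the cat's aim -- truly."))
    print(tokenize("Self-reliance is the cat's aim -- truly.", keep_possessive=False))
```

Because a hyphen only counts when letters follow it, a dash standing alone between words is still treated as a separator.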

24 Answers

I'll give it a shot with Objective-C.

That sounds great!

Bring it on. It's going to be fun. #thumbs up

Stephen Mariano Cabrera
5,932 Points

For the purposes of judging, does it matter what language we use?

You can use any language...

Scott Evans
4,236 Points

You can do it in any language

Scott Evans
4,236 Points

I apologize, GitHub decided to put the files in the opposite order. Please find my output and code towards the bottom of the gist.

Jose Colella
3,526 Points

This is my version.

I chose Python as my language.

https://gist.github.com/josecolella/9911228

This assignment was fun

I was going to go with Python too because I still have a soft spot in my heart for it, but I went with Objective-C since I'm trying to do iOS development as a career.

Robert Bojor
Courses Plus Student 29,439 Points

Finally went with PHP for this.

I tried to keep it under 100 lines with no looping, but apparently there's no way to manipulate some arrays and extract from them without a few foreach loops.

https://gist.github.com/robertBraincache/9912392

Robert Bojor
Courses Plus Student 29,439 Points

Here's my second implementation of the same logic, this time in Objective-C.

https://gist.github.com/robertBraincache/9915763

The output goes to NSLog; hopefully that will suffice.

[self crossingFingers:YES]

The data is all that really matters. I will create the dataset based off the winner for the next stage.

Nice work Robert!

Robert Bojor
Courses Plus Student 29,439 Points

Thanks! Curious to see what others will come up with.

Mohamad El-Husseini
278 Points

Here is my Ruby solution. This is the same gist broken down by URL for ease of access:

The program: https://gist.github.com/abitdodgy/e3201365dd8328efbfd3#file-program-rb

And the program output: https://gist.github.com/abitdodgy/e3201365dd8328efbfd3#file-output-txt

One thing: For some odd reason the gist is eating my JSON, but I assure you the JSON is there. Running the program should output it.

Will we be penalised for code organisation? For example, I can split the classes into separate files, but for the sake of convenience, I'm keeping everything in the same file.

Mohamad El-Husseini - "Will we be penalised for code organisation?" Oh totally. That's huge... Seriously though, this is not an issue at all. For this contest it is way easier to keep it all in a single script. Looks good.

Brian McCall
3,281 Points

I used JavaScript and jQuery.

https://gist.github.com/mcshiz/9962729

Please give me some feedback! Anyone :-)

Jonathan Petersen
45,548 Points

Here is my Java version. It uses the Treehouse-hosted text files and outputs JSON after the stats. It is about as organized as I can get it with a single class. -> https://gist.github.com/jpete/9967105

By the way you have to click view raw to see the JSON output. It is all there. Here is the link to the raw post -> https://gist.githubusercontent.com/jpete/9967105/raw/98e718bea077ed3232a3c386b788b2ebfe54246f/TthParseReport+-+Output

Patrick Leary
2,812 Points

Here's one in Node.js / JavaScript. It depends on the 'request' npm package. It loads the files from their URLs right now and is all in a single file for simplicity.

Code: https://gist.github.com/pleary/9967501#file-waldo-js

Results: https://gist.github.com/pleary/9967501#file-results-txt

dan schmidt
2,576 Points

Cool to see another node.js implementation. :)

Enara L. Otaegi
13,107 Points

My entry with Objective-C. And the output.

It's my first time managing data with Objective-C and I'd love some feedback.

I'm not sure how we are supposed to display the output in a gist. I looked at a few of everyone else's output and I didn't know how to create it in the gist.

Jonathan Petersen
45,548 Points

At the bottom of your original gist, click add another file. Then copy and paste the output. I don't think we needed to post the output. When I posted the output, GitHub flagged my account for posting spam, and I had to contact support to get it reactivated.

Thanks Jonathan! I'll just leave it as is, but I went in and saw how I'd add another document.

dan schmidt
2,576 Points

Here is mine, in node.js.

I did a command-line app with an optional parameter for JSON.

https://gist.github.com/DanSchmidt/10000777

Hi!

Here is my implementation:

https://gist.github.com/BeatrizEugenia/32437611d66a722b15ac

I'm using JavaScript (and jQuery to get the file and display the results). I'm treating hyphenated words and possessives as single words.

I'm displaying the JSON in my HTML file.

Congrats to Oleg Drobin, who is our winner for this week's forum contest!

Here is his winning answer using JavaScript + jQuery + Underscore.js:

Code

Output

This data will be used in this week's contest, which will be posted soon.

I would like Oleg to please take the time to go back and comment his code so that other students may learn from what he has done.

Please post that as another gist, as a comment to this answer.

Great work everyone! There were so many awesome submissions with such a great range of languages.

Mohamad El-Husseini
278 Points

If anyone is interested, I bundled my program into a gem and gave it a much-needed improvement. You can find the source code here: https://github.com/abitdodgy/words_counted

Cheers!