
Dylan Carter
4,780 Points

www.xxxxxxx.com/robots.txt... "allow" & "disallow"?

So I found that page for a few of my favorite websites, and of the paths they list, most say Disallow and a few say Allow.

The Disallow paths always go to a "page not found" page, but still custom to their website design, not one of those generic black and white ones...

The Allow paths, on the other hand, were just pages on the website anyone could visit, but there were very few Allow entries compared to all the possible pages you could visit.

So my question is: what does "Allow" mean in this context, how do you do it, and why? Also, when and who?

Thanks

1 Answer

Ed Everett
1,566 Points

Hey there Dylan,

The robots.txt file is a file that search engine bots check before crawling a website to index its content. Disallow lists paths the site operator does not want crawlers to visit, and Allow is the opposite: it explicitly permits crawling, most often to carve out an exception inside an otherwise disallowed directory. It does not prevent you, as a user, from visiting those pages - it only asks search engine crawlers to stay out of those areas of the site.
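For example, a minimal robots.txt using both directives might look like this (the paths here are made up for illustration):

```
# Rules below apply to all crawlers
User-agent: *

# Ask crawlers to stay out of the whole /private/ directory...
Disallow: /private/

# ...but explicitly permit this one page inside it
Allow: /private/help.html
```

You place this file at the root of your site (e.g. www.example.com/robots.txt), which is why you could find it on your favorite websites just by adding /robots.txt to the domain. Keep in mind it is a request, not a security measure: well-behaved crawlers honor it, but anyone can still type the URLs in directly.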

As for the custom 404 pages you are seeing, it really depends on the site you are visiting. It could be that the site redirects you to a custom 404 page when you are not logged in, or when you don't have high enough privileges to access the page you are trying to visit.