In the previous page it says "Captchas can be worked around w/ various technologies". Can someone elaborate a bit?

Question

How exactly do scrapers get around captchas? I am particularly interested in the limitations of captchas from a defensive standpoint, as someone defending against certain types of bots. So don't hate me! Anyhow, at this point it's purely for my own interest, as I'm not defending any particular pages at the moment.

Thank you!

Answer 1 · 2020-06-18T16:06:43Z


Captcha stands for "Completely Automated Public Turing test for telling Computers and Humans Apart."

However, computers and programs become smarter all the time, so captchas need to be constantly updated to keep up with the AI that tries to solve them.

Captchas are also often used to train AI. It's no coincidence that modern captchas are usually related to traffic (click all the images containing a bus, car, crosswalk, sidewalk, traffic light, etc.): the answers people give are reportedly used as labeled data for image recognition, such as the kind self-driving cars rely on.

At some point, the AI will be able to solve these captchas just as well as, or better than, most humans, and then new captchas will need to be made.

Most of the time, problems like these are handled with self-learning AI; in other words, the AI gets trained. It just guesses randomly at first, but it gets increasingly good at guessing as it receives feedback on which guesses are right and which are wrong. It starts to learn which information is relevant and which isn't. Given enough rounds, it eventually figures out the trick.
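To make the "guess, get feedback, improve" idea a bit more concrete, here is a minimal sketch in Python. It is a toy perceptron on made-up numbers, not how any real captcha solver works; the function names (`make_samples`, `predict`, `accuracy`, `train`) and the data are all assumptions for illustration. Feature 0 decides the label and feature 1 is pure noise, so the learner has to work out from feedback alone which input to rely on.

```python
import random


def make_samples(n, rng):
    """Build toy data: feature 0 decides the label, feature 1 is pure noise."""
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Leave a gap between the classes so a straight-line rule can separate them.
        relevant = rng.uniform(0.6, 1.0) if label else rng.uniform(0.0, 0.4)
        noise = rng.uniform(0.0, 1.0)
        samples.append(((relevant, noise), label))
    return samples


def predict(features, weights, bias):
    """Guess 1 or 0 from a weighted sum of the features."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def accuracy(samples, weights, bias):
    """Fraction of samples the current weights classify correctly."""
    correct = sum(predict(f, weights, bias) == label for f, label in samples)
    return correct / len(samples)


def train(samples, rng, lr=0.1, max_epochs=1000):
    """Perceptron rule: nudge the weights only when a guess was wrong."""
    # Start from random weights, so the first guesses are essentially random.
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in samples:
            # Feedback: 0 if the guess was right, +1 or -1 if it was wrong.
            error = label - predict(features, weights, bias)
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * x for w, x in zip(weights, features)]
                bias += lr * error
        if mistakes == 0:
            break  # a full pass with no wrong guesses: nothing left to correct
    return weights, bias


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_samples(200, rng)
    weights, bias = train(data, rng)
    print("accuracy after training:", accuracy(data, weights, bias))
    print("weights (relevant, noise):", weights)
```

The loop stops once a full pass over the data produces no wrong guesses. Real systems use far larger models and real images, but the feedback loop has the same shape.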


Josh Gold


Zimri Leijen
