Most of us have by now accepted that we will be tracked regularly as we go about our daily lives. Virtually everything we do on digital devices, especially when connected to the internet, is followed by someone somewhere. Unless we take specific measures to prevent it, being tracked is simply a given.
Being tracked is nothing new. For as long as the internet has been around, websites have been able to monitor their users. The range of techniques available today, however, is truly mind-blowing. Fingerprinting, user agent analysis, IP-based location tracking and many other methods are used to profile us and snoop on what we do online.
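To make that concrete, here is a minimal sketch of how a site might derive a crude fingerprint from nothing more than a few request headers. It is illustrative only: the header choice and hashing scheme are my own assumptions, and real fingerprinting scripts mix in many more signals, such as canvas rendering, installed fonts and screen dimensions.

```python
import hashlib

def crude_fingerprint(headers: dict[str, str]) -> str:
    """Hash a few request headers into a crude, cookie-free visitor ID."""
    signal_headers = ["User-Agent", "Accept-Language", "Accept-Encoding"]
    signals = "|".join(headers.get(h, "") for h in signal_headers)
    return hashlib.sha256(signals.encode("utf-8")).hexdigest()

# Two visits with identical headers yield the same ID, letting a site
# link them together even if the user blocks or clears cookies.
visit = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
}
print(crude_fingerprint(visit))
```

The point is that none of this requires your cooperation: the headers are sent along with every request whether you like it or not.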
In fact, there is something of a paradox at play here. Genuine internet users are now subject to much greater scrutiny than before, largely because of the rise in bots and other non-human internet users.
You can understand why so many online services and websites want to keep non-human users out. Every visit consumes resources, however few. For websites that attract a large volume of traffic, those individually tiny costs add up to something very significant, and in some cases, blocking automated users and bots can substantially reduce the strain a service is under.
Why Is This Happening?
Despite the concerted effort to remove bots from many online platforms, they have continued to multiply at a remarkable rate. Anyone who uses social media heavily will probably encounter bots, whether they realize it or not. And even if you avoid social media entirely, few websites or online services escape being accessed by bots in some capacity.
For example, websites that provide price comparisons or content aggregation need bots to scrape other websites for the data they present to their users. Without bots, search engines like Google could not function. However, the term bot has acquired a number of negative connotations in recent years, in large part because bots were used to spread Russian propaganda during the 2016 US elections.
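For illustration, this is roughly what such a scraper looks like at its simplest, using only Python's standard library. The URL and the bot's User-Agent string are placeholders; a real aggregator would go on to parse prices or content out of the returned HTML.

```python
import urllib.request

def fetch_page(url: str) -> str:
    """Fetch a page the way a simple, honest aggregator bot might."""
    request = urllib.request.Request(
        url,
        # Well-behaved bots often identify themselves in the User-Agent.
        headers={"User-Agent": "example-price-bot/1.0"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

html = fetch_page("https://example.com/product/123")  # hypothetical URL
print(html[:200])
```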
Because bots are now under such scrutiny, and there is such a concerted effort to keep them out of online services, bot developers have had to become much smarter about evading detection. There are a number of techniques they can use to make their creations harder to spot.
Detecting Bots
Older and less sophisticated bots are fairly easy to detect. Many of them don't employ even the basic techniques that bot developers now use to avoid detection. But gone are the days when simply looking at an IP address was enough to tell whether a user was human. Bots today employ far more sophisticated tactics than their predecessors ever could.
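To see how low the old bar was, here is a sketch of the kind of heuristics early detectors could get away with: a telltale User-Agent string, or an inhuman request rate from a single IP. The keyword list and threshold are illustrative assumptions, not any particular service's rules.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS_PER_WINDOW = 30  # illustrative threshold
BOT_UA_KEYWORDS = ("bot", "crawler", "spider", "curl", "python-requests")

recent_requests: dict[str, deque] = defaultdict(deque)

def looks_like_bot(ip: str, user_agent: str) -> bool:
    """Flag a request via a bot-like User-Agent or an inhuman request rate."""
    if any(keyword in user_agent.lower() for keyword in BOT_UA_KEYWORDS):
        return True
    now = time.monotonic()
    window = recent_requests[ip]
    window.append(now)
    # Drop timestamps that have aged out of the rate window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_REQUESTS_PER_WINDOW

print(looks_like_bot("203.0.113.7", "python-requests/2.31"))           # True
print(looks_like_bot("203.0.113.8", "Mozilla/5.0 (Windows NT 10.0)"))  # False
```

Checks like these fall apart the moment a bot lies about its User-Agent and spreads its requests across many IP addresses – which is exactly what modern bots do.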
Not only do modern bots disguise their IP addresses; they also spoof their user agents, throw off fingerprinting efforts, and respond to cookies just as a human user's browser would. All of these techniques make bots much harder to detect.
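Here is a hedged sketch of the first two tricks, again in standard-library Python: presenting a browser-like User-Agent and keeping cookies across requests so the session looks like one continuing human visit. The User-Agent string and target URL are just examples, and real evasion tooling goes considerably further.

```python
import urllib.request
from http.cookiejar import CookieJar

# A browser-like User-Agent; the exact value here is illustrative.
SPOOFED_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)

def make_browserlike_opener() -> urllib.request.OpenerDirector:
    """Build an opener that spoofs a browser User-Agent and keeps cookies."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    opener.addheaders = [
        ("User-Agent", SPOOFED_UA),
        ("Accept-Language", "en-US,en;q=0.9"),
    ]
    return opener

opener = make_browserlike_opener()
# Hypothetical target; cookies set by one response are replayed on the next,
# just as a human user's browser would do.
with opener.open("https://example.com/") as response:
    print(response.status)
```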
For now, bot developers continue to outwit most online services; historically, they have been good at staying one step ahead of detection. But that doesn't stop websites from employing ever stricter and more thorough detection methods, which ultimately means they gather more information from every user – whether human or not.
Caught in the Crossfire
As bot developers grow more sophisticated and find new ways around the detection measures currently in place, websites are becoming much more aggressive in their efforts to unmask bots and keep them off their platforms. Unfortunately, this more aggressive monitoring is also applied to regular users who have done nothing wrong.
While most of us accept that bots are a part of the web, and that websites have a right to keep them out if they so choose, there remains a very important question of informed consent. Websites routinely track us in ways they never did before, and some of these methods can infringe on our privacy.
In general, there is no way to find out exactly what defenses a website has without first visiting it. That makes it impossible to see in advance what detection methods a site uses and decide whether we are happy to be subjected to them.
About the Author: Ebbe Kernel
Ebbe Kernel is a Swedish data-gathering analyst and consultant. He has worked in the field of data intelligence for over ten years and has advised hundreds of data providers and large companies on in-house data acquisition solutions. Read more about him at ebbekernel.com.