Vulnerability detection in 410k WordPress websites

Summary: I’ve scraped and analyzed 409943 WordPress websites for security vulnerabilities; 42.3% contained at least one. These potentially vulnerable websites get a combined 14.6 billion page views per day. Even though this estimate is significantly higher than what can be exploited in practice, it’s clear that outdated software is a serious problem within the WordPress ecosystem.

Introduction

Some time ago I came up with an idea for a newly structured store for WordPress plugins & themes, attempting to solve many of the existing problems. While writing the proposal, a recurring theme was security. There seems to be some wild number throwing here and there (suggesting every WP installation is hacked on average once a month, for example), but I couldn’t find any satisfactory research. Moreover, people keep throwing around percentages of market share as the primary form of analysis, which imo gives a very skewed, if not wrong, view of the current state of WordPress. So I had two main questions:

  • How popular is WordPress really?
  • What is the state of security of WordPress “in the wild”?

Down the rabbit hole I went, and a week, 3800 vCPU hours and some abuse emails (sorry) later, here we are.

It’s important to note this analysis is far from perfect. There are quite a few biases in the data collection and processing that might skew the results, and it’s hard to say in which direction. I did the best I could, however, and I think this report does a reasonable job of estimating the answer to my two questions. In short: don’t take any number down here too literally.

Scraping & popularity

During the first week of December 2019, I analyzed about 4 million domains (top-level sites only, no subdomains). They were taken from 3 public “popular” domain data sets: majestic million, alexa million and the open rank 10 million. 2181967 of these domains gave a 2XX HTTP response (meaning they worked and did not end in a redirect; I followed a maximum of 3 redirects). 673933 (30.9%) of these turned out to be detectable WordPress installations, 560244 of which I tried to analyze.
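The “2XX after at most 3 redirects” rule can be sketched roughly as follows. This is not the original scraper (which is private); the `fetch` callable and its `(status, location)` return shape are assumptions made up for this example, so that it runs without touching the network.

```python
from typing import Callable, Optional, Tuple

# (status code, Location header or None): a hypothetical shape for this sketch
Response = Tuple[int, Optional[str]]


def final_status(url: str, fetch: Callable[[str], Response],
                 max_redirects: int = 3) -> Optional[int]:
    """Return the final 2XX status, or None if the domain does not count as working."""
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        if 200 <= status < 300:
            return status
        if 300 <= status < 400 and location:
            url = location  # follow the redirect and try again
            continue
        return None  # 4XX/5XX, or a redirect without a target
    return None  # still redirecting after max_redirects hops


# Toy "network": a.example redirects twice before answering 200
fake = {
    "http://a.example": (301, "https://a.example"),
    "https://a.example": (302, "https://www.a.example"),
    "https://www.a.example": (200, None),
}
print(final_status("http://a.example", lambda u: fake[u]))
```

With the limit lowered to 1, the same toy domain would be dropped, which shows how the redirect cap influences which domains end up in the data set.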

Defining popularity

This brings us to the first main question; these numbers don’t mean a lot by themselves. A very popular website (e.g. Amazon) clearly has more impact than my personal blog. Simple “market share” statistics do not take this into account, and assume both websites are equal.

To be able to calculate more informed numbers, I came up with an impact score for each website. Based on data from hypestat.com and the “domain rank” from the domain data sets mentioned above, I estimated the daily number of page views for a website. It’s the most accurate estimate I could make without paying hundreds of dollars (for a data set which is just someone else’s estimation), but it’s far from perfect. It’s still informative, but please keep in mind that the error on this score is pretty high. That being said:

The impact score is the estimated daily page views of a website, in millions. Example: an impact score of 5.3 means a website (or group of websites) gets an estimated 5.3 million page views per day.

It is way more useful to talk in these kinds of numbers, since they take into account the estimated popularity of a website. To give an idea how the statistic behaves: the impact score for google.com is 3483 (meaning 3.5 billion page views per day), which roughly matches real-world estimations. Cnn.com gets a score of 13.5, ycombinator.com 0.66 etc. The mean score is 0.1 and the median 0.00096. If you get a little lost here, no worries, I’ll try to guide you through!
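For readers who prefer code to prose, the conversion is a single multiplication. The helper name below is made up for this example; the scores are the ones quoted above.

```python
def page_views_per_day(impact_score: float) -> float:
    """The impact score is expressed in millions of estimated daily page views."""
    return impact_score * 1_000_000


for site, score in [("google.com", 3483), ("cnn.com", 13.5), ("ycombinator.com", 0.66)]:
    print(f"{site}: ~{page_views_per_day(score):,.0f} page views per day")
```

By the same conversion, the median score of 0.00096 corresponds to roughly 960 page views per day.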

So how popular is WordPress actually?

We already found out that WordPress’ market share is around 30%, mostly in line with what other sources say. But what is the actual impact? For that we simply sum the individual impact scores of all the WordPress websites and get a score of 45938. So all combined WordPress sites of our data set get 46 billion page views, per day. That is, even for the most widely used CMS, huge!

When putting that in perspective though, it’s “only” 4.37% of the total impact score in the data set, which is significantly lower. And it makes sense: very few of the top websites use WordPress (I found the first WordPress website somewhere around position 50 in Alexa’s top million list), and that is where most of the internet traffic goes. The image below shows that quite well; it plots the number of WordPress installations (y axis) relative to the position in Alexa’s list.
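The difference between the two ways of measuring popularity can be illustrated with a small sketch. The site records below are invented toy data, not taken from the data set, and the field names are assumptions for this example.

```python
def count_share(sites):
    """Fraction of sites that run WordPress (the classic market share)."""
    return sum(1 for s in sites if s["wordpress"]) / len(sites)


def impact_share(sites):
    """Fraction of the total impact score that lands on WordPress sites."""
    total = sum(s["impact"] for s in sites)
    wordpress = sum(s["impact"] for s in sites if s["wordpress"])
    return wordpress / total


# Made-up toy data: one huge non-WordPress site and a few smaller ones
sites = [
    {"domain": "big.example", "wordpress": False, "impact": 900.0},
    {"domain": "shop.example", "wordpress": True, "impact": 40.0},
    {"domain": "blog.example", "wordpress": True, "impact": 10.0},
    {"domain": "news.example", "wordpress": False, "impact": 50.0},
]
print(f"count share:  {count_share(sites):.1%}")
print(f"impact share: {impact_share(sites):.1%}")
```

In this toy example half of the sites run WordPress, yet they receive only a small part of the traffic, which is the same effect as the 30% versus 4.37% gap described above.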

Hopefully this gave a clear introduction and impression of the impact score. With the easy part being over, let’s get down to business.

Security Analysis

Updating plugins and WordPress itself is quite an effort, and one of the major problems with the current ecosystem. Websites keep using old versions with wide open (and publicly known) security gaps. I’ve used all security issues listed on cve.mitre.org, and successfully scanned 409943 WordPress installations for known security problems (113689 scans failed). I’ve excluded themes from this security analysis for practical reasons, but will give you a pretty graph with the most popular ones (number of installs) to make up for it.

CVE’s 101

To identify security issues, I’ve used the CVE database from https://cve.mitre.org. CVE stands for “Common Vulnerabilities and Exposures”, and an entry is usually a description of a vulnerability in a piece of software before a certain version number. One of the latest in WordPress land for example, CVE-2019-17675, describes a number of security issues in WordPress 5.2.4 and lower. When a CVE is present in a website it does not automatically mean it’s exploitable; quite a few vulnerabilities require some extra condition, e.g. a level of privileges, or some PHP setting to be enabled. Those conditions are not available in a structured way, and are very hard to test without actually trying to exploit a vulnerability. I was hoping not to end up in jail for this, so I didn’t attempt a single exploit. As a result, the numbers discussed below are upper bounds, and the real numbers are probably significantly lower. I wouldn’t be surprised if less than 10% would actually be exploitable.

After processing, 3159 CVE’s were left for analysis.

Core

WordPress itself (the “core” as you might say) contains quite a few security issues, but they are generally fixed quickly. The 10 most detected WordPress versions (major.minor) are shown below. This is in agreement with statistics from other sources at the time of writing.

WordPress uses a semver-like versioning scheme. Essentially:

  • major version updates (e.g. from 4.x to 5.0) contain major feature updates
  • minor version updates (4.1 to 4.2) contain (as you might guess) minor feature updates
  • patch updates (e.g. 4.0.0 to 4.0.1) only fix bugs and security issues. Performing these patch updates is usually safe, and most site owners seem to do them (WordPress actually does them automatically nowadays).

To make life a little easier in analyzing core vulnerabilities, I assume that vulnerability fixes in a patch release affected all the patch versions in the same minor release, but no other. So a CVE fixed in 4.7.2 only affected version 4.7.1 and 4.7.0, but not 4.5.x.
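A minimal sketch of this simplifying rule, not the original analysis code: the function names are made up, and how fixes that landed in a .0 release were handled is not stated in the post, so under this sketch such a fix simply flags nothing.

```python
def version_tuple(version: str) -> tuple:
    """Turn '4.7.2' into (4, 7, 2); a missing patch number counts as 0."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def core_affected(installed: str, fixed_in: str) -> bool:
    """Flag only earlier patch versions within the same major.minor release."""
    inst, fix = version_tuple(installed), version_tuple(fixed_in)
    same_minor = inst[:2] == fix[:2]
    return same_minor and inst[2] < fix[2]


for installed in ["4.7.0", "4.7.1", "4.7.2", "4.5.3"]:
    print(installed, core_affected(installed, "4.7.2"))
```

For a CVE fixed in 4.7.2, this flags 4.7.0 and 4.7.1 but neither 4.7.2 itself nor anything in the 4.5.x line, matching the example above.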

20810 sites (5%) have one or more potential vulnerabilities in their core installation, totaling an impact score of 2229. The “most popular” website with a core vulnerability is higher than 1000 on Alexa’s rank list. Worst is version 5.5.2, with an impact score of 772.

To put it more directly: an estimated 2.2 billion daily page views happen on WordPress installations that might be vulnerable because the WordPress core is not updated.

Plugins

In the scanned 410k websites, I found 590109 plugin vulnerabilities. The most vulnerabilities were found for the Yoast plugin (53182 times, total impact score of 4610), followed by jetpack (27529 found, impact score of 2377). Below is a plot of the top 20. Interesting to note is that Woocommerce (a plugin that facilitates web stores, and might cause a WordPress site to hold significant user & financial data) is pretty high.

Below is a simple breakdown of the statistics by vulnerability category. The A-numbers refer to the OWASP Top 10 (2017) categories.

Vuln. type | # websites with this vuln. type | Impact score
A1 (Injection) | 49855 | 4311
A2 (Broken authentication) | 41992 | 2899
A4 (XXE) | 1698 | 211
A6 (Security misconfig) | 982 | 73
A7 (XSS) | 113380 | 9289
A8 (Insecure deserialization) | 14104 | 1241
Remote code execution | 35624 | 3041
Upload attack | 5497 | 460

Please contemplate these numbers for a second (even though they are rough estimations); there are 4.3 billion daily page views on websites with injection vulnerabilities, & 9.3 billion daily page views with potential XSS vulnerabilities.

This is only for the analyzed 410k websites. There are as many as 27 million WordPress websites live right now, and even though they form the long tail, together they probably receive a significant amount of traffic.

Finally, I found 2405 configuration file backups (e.g. wp-config.php.bak, wp-config.php~ etc.) available for downloading, directly exposing the MySQL credentials and hash keys.
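If you maintain a site yourself, a quick local check for such leftovers could look like the sketch below. It scans a directory on disk rather than a remote site. The first two file names come from the examples above; the other two are assumed common editor and backup variants, and the function name is made up.

```python
from pathlib import Path

# The first two names are from the post; the last two are assumed variants
BACKUP_NAMES = {
    "wp-config.php.bak",
    "wp-config.php~",
    "wp-config.php.old",
    "wp-config.php.save",
}


def find_config_backups(webroot):
    """Return leftover wp-config backups anywhere below the given web root."""
    root = Path(webroot)
    return sorted(p for p in root.rglob("wp-config.php*") if p.name in BACKUP_NAMES)
```

Running `find_config_backups` on your own web root and deleting (or moving outside the web root) anything it returns closes this particular hole; the real `wp-config.php` is deliberately not reported.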

Conclusion

The data set itself is quite interesting, and a lot more can be done than the relatively simple analysis here. The result is enough, however, to answer the two questions asked in the introduction, and I can quite confidently say that

WordPress has a traffic-weighted market share of ~4% (versus ~30% when simply counting websites)

WordPress websites are surprisingly vulnerable, simply due to a lack of maintenance. Billions of page views happen daily on WordPress installations that have potentially exploitable security vulnerabilities.

It’s no wonder most of the websites being hacked are WordPress. I expected some pretty large numbers before, but I’m honestly surprised at the sheer size and impact. Combine a scrape bot like the one I used with existing tools (like metasploit, exploit-db etc.) and a week or 2, and even a self-made HTML-hacker can do huge damage. Since I do not want to encourage anybody to actually do this, I’ve for now decided to keep this data set & code private.

I’ll leave the social impact and potential solutions for another time. I hope you enjoyed reading. And if you happen to maintain WordPress sites, please go and do some updates! 🙂

Notes

Disclaimer: I’m a freelance software- & data engineer, not a security specialist. I didn’t attempt a single exploitation, and did this project purely from a personal interest.

Assumptions & biases

There are so many silent assumptions & biases in this project that listing and discussing them would probably make a post longer than the post itself. The bottom line, however, is the same: the internet is a messy place, and getting hard, consistent data from there is impossible. We have to make do with estimations. That being said, some notes:

  • Data scraping was messy, and was done from different locations, mostly with VPN’s. A lot of the failures were in fact sites blocking the scrapers, meaning a (significant) part of the failed scrapes are in fact well-secured WordPress installations.
  • The scraper didn’t analyze WordPress.com sites, subdomains, or in general any domain with more than 1 dot in it. WordPress detection in itself was straightforward, so plugins that are used to hide a WordPress installation would have easily fooled the scraper.
  • It’s unclear how WordPress usage distributes over the long tail (there are over 1.4 billion existing websites) and how generalizable these conclusions are.
  • As briefly mentioned before, the identification of a “vulnerability” was quite generous. I made simple matches: let’s say plugin A has a vulnerability solved in v4.3; I then flagged a vulnerability for every website that has plugin A with a version lower than v4.3 (so even v0.1), even though the vulnerable code might only have been introduced in v4.2, or a specific, uncommon PHP setting is required, or an attacker already needs admin privileges to escalate. I’m no security expert, but most CVE’s I looked into seemed to have some extra requirements like these. The actual exploitability will be far lower than the numbers mentioned. How much lower is impossible to say without actually attempting to exploit. I wouldn’t be surprised if the “exploitable impact score” is an order of magnitude lower.
  • The calculation of the impact score is pretty ugly: it’s an estimation of an estimation, allowing errors to grow wildly. I did expect a pretty direct relationship between hypestat page views and the alexa list rank position of a website, and although there is a general curve, it’s far from perfect.
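The generous plugin matching from the notes above boils down to a plain version comparison. The sketch below is an illustration under that assumption, not the original code; the function names and the handling of non-numeric version parts are made up.

```python
def version_tuple(version: str) -> tuple:
    """'v4.3' or '4.3' -> (4, 3); non-numeric parts are treated as 0 (an assumption)."""
    return tuple(int(p) if p.isdigit() else 0 for p in version.lstrip("vV").split("."))


def flagged(installed: str, fixed_in: str) -> bool:
    """Generous match: anything below the fixed version counts as vulnerable."""
    a, b = version_tuple(installed), version_tuple(fixed_in)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '4' and '4.0' compare as equal
    b += (0,) * (width - len(b))
    return a < b


for installed in ["0.1", "4.2", "4.3", "4.3.1"]:
    print(installed, flagged(installed, "v4.3"))
```

Note that v0.1 is flagged just like v4.2, regardless of when the vulnerable code was introduced, which is exactly why the reported numbers should be read as upper bounds.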

Impact score estimator

For the curious: the impact score estimator is an LGBMRegressor. The target variable was the square of the scraped daily page views from hypestat.com (for about 3000 websites), with 12 inputs. Inputs were created by taking the 4 different scores (alexa rank, open score rank, open score score & majestic million rank) and for each taking the score itself, the sqrt(score), and np.exp(1/score). I’ve failed to properly document the error of the model (sue me 😉); it was large, but not unreasonable (significantly less than an order of magnitude).
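The 12 inputs can be reconstructed roughly as in the sketch below. The column names are hypothetical, the example row is invented, and the standard library `math` module is used instead of NumPy only to keep the sketch dependency-free.

```python
import math

# Hypothetical names for the 4 scores mentioned above
SCORES = ["alexa_rank", "open_rank", "open_score", "majestic_rank"]


def build_features(row: dict) -> list:
    """3 transforms per score: score, sqrt(score), exp(1/score) -> 12 inputs."""
    features = []
    for name in SCORES:
        s = row[name]  # must be > 0, otherwise 1/s and sqrt(s) are undefined
        features += [s, math.sqrt(s), math.exp(1 / s)]
    return features


# Invented example row, not taken from the data set
row = {"alexa_rank": 100, "open_rank": 250, "open_score": 4.0, "majestic_rank": 400}
x = build_features(row)
print(len(x), x[:3])
```

Feature vectors like `x`, stacked for the roughly 3000 websites with known page views, would then be what gets passed to the regressor’s `fit` method together with the target.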