Background
For the past 8 years or so, the Browsers report at analytics.wikimedia.org has shown an "Other" share of 10-12%.
In my view, this continuously calls into question the validity of the data and makes it difficult to trust. This lack of trust arises because the following assumptions are difficult to believe in their totality. The assumptions below are based solely on my own interpretations, and I expect them to be incorrect or incomplete. Having said that, I've shared these interpretations many times over the years, and have yet to encounter an alternate explanation, including in conversations with PMs, Directors, and engineers working on the measurement methods and underlying datasets.
- 100% = only from "user" type (excludes known bots and spiders).
- 100% = only page views (excludes page loads during other actions such as edit, history, search, special pages, and non-HTML requests; regardless of user agent string).
- Other = other browser families, representing a long tail of lesser known and "fake" browsers.
I have no trouble believing there are thousands or even millions of lesser known browser families or distinct (unparsable) user agent strings seen in pageview traffic.
The part that's unbelievable is that a combined 12% of (assumed-human) page view traffic worldwide comes from lesser known browsers. If true, this would be a very significant finding to talk about publicly and widely, to help organisations like Open Web Advocacy speak up for browser diversity. And we should then strive to publish some kind of dataset that provides more insight into what some of the "biggest" of the smallest browsers are.
What's also hard to believe is that the market share of the main browsers is as high or as low as reported, based on other information available.
Impact
When we make product decisions around which part of our global audience we can support at a certain level (Basic/Modern, Grade C/A, as per mw:Compatibility#Browsers), we have a theoretical maximum of 88%. That's a pretty low ceiling.
Based on stats.wikimedia.org, that's more than 2 billion page views every month (of 24 billion), and 190 million unique devices (of 1.6 billion), that we can't account for.
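As a rough check on the numbers above, here is a minimal sketch of the arithmetic. It assumes the 12% "Other" share from the Browsers report applies uniformly to both monthly page views and unique devices, which is my own simplification; the constant and function names are made up for illustration.

```python
# Monthly totals as quoted above from stats.wikimedia.org.
MONTHLY_PAGE_VIEWS = 24_000_000_000
MONTHLY_UNIQUE_DEVICES = 1_600_000_000

# "Other" share from the Browsers report (upper end of the 10-12% range).
OTHER_SHARE = 0.12


def unaccounted(total: int, share: float = OTHER_SHARE) -> int:
    """Return the part of `total` that would fall into the "Other" bucket.

    Assumption: the share applies uniformly to whatever `total` measures.
    """
    return round(total * share)


other_page_views = unaccounted(MONTHLY_PAGE_VIEWS)
other_devices = unaccounted(MONTHLY_UNIQUE_DEVICES)

# Highest support level we could claim with confidence from this data.
support_ceiling = 1 - OTHER_SHARE

print(f"Unaccounted page views per month: {other_page_views:,}")
print(f"Unaccounted unique devices:       {other_devices:,}")
print(f"Support ceiling:                  {support_ceiling:.0%}")
```

Under that assumption, the sketch gives about 2.9 billion page views and about 192 million devices, in line with the rounded figures quoted above.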
Examples:
- Nov 2016: Resort to stat collection from pageviews with JavaScript to make up for missing browser data. We can't meet goals like "make sure 99% keeps working" if we have a 12% gap in the data. For T141344: Remove JSON polyfill, data at T141344#2784065
- Aug 2021: T288287: Remove IE11 from Basic support ("Grade C")
- March 2023: Resort to stat collection with JavaScript for T178356: Raise Grade A JavaScript requirement from ES5 (2009) to ES6 (2015), data at T178356#8709512.
- March 2017: T128115: Drop support for ES3 javascript browsers in MediaWiki, data at T128115#3066522
Compared to other data
https://en.wikipedia.org/wiki/Usage_share_of_web_browsers
Source | Chrome | Firefox | Other |
---|---|---|---|
analytics.wikimedia.org (Week of 2023-07-09) | 48% | 3.3% | 12% |
StatCounter (Worldwide: June 2023) | 62.58% | 2.81% | 1.37% |
W3Counter (Dec 2022) | 71.2% | 3.0% | 5.7% |
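To make the discrepancy easier to see, here is a small sketch that computes, per column, how far the analytics.wikimedia.org figures sit from each external source. The numbers are copied from the table above; the `gaps` helper is hypothetical, and the sources likely define "Other" differently, so treat the output as indicative only.

```python
# Percentages copied from the comparison table above.
shares = {
    "analytics.wikimedia.org": {"Chrome": 48.0, "Firefox": 3.3, "Other": 12.0},
    "StatCounter": {"Chrome": 62.58, "Firefox": 2.81, "Other": 1.37},
    "W3Counter": {"Chrome": 71.2, "Firefox": 3.0, "Other": 5.7},
}

BASELINE = "analytics.wikimedia.org"


def gaps(column: str) -> dict[str, float]:
    """Percentage-point difference (baseline minus source) for one column."""
    base = shares[BASELINE][column]
    return {
        source: round(base - row[column], 2)
        for source, row in shares.items()
        if source != BASELINE
    }


for column in ("Chrome", "Firefox", "Other"):
    print(column, gaps(column))
```

A positive gap means our report shows a larger share than the external source; a negative gap means a smaller one.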
Investigation so far
I've raised this numerous times internally, including to Jon Katz (then-Director in WMF Product) in 2020, in the hopes someone could analyze this.
The issue is also raised regularly when I publicly share Wikimedia's browser data (example 1). Most recently, Sime Vidas (of WebPlatformNews fame) raised it again on social media (example 2).
So, I'll try to investigate it now and report my findings here.