Web Development Data

This group intends to analyse web development data from around the world and publish monthly reports. We hope to build this as an open source project, using open source tools.

We are just getting started. In the meantime, you can grab:

NEW: latest data set 2018-01-05 (1.3 GB), approx. 120,000 pages.
data set 2016-04-13 (1.3 GB), approx. 120,000 pages.
data set 2015-01-08 (780 MB), 87,000 pages.
data set 2013-10-30 (780 MB), 78,000 pages.
data set 2013-09-01 (980 MB), 102,000 pages.
data set 2013-06-18 (484 MB). See the Read me for details of this data set.

The raw data from March 2012 (121 MB), consisting of approximately 8,000 home pages from the top 10,000 most popular web sites.
A subset of the raw data from December 2012 (102 MB), consisting of approximately 7,300 home pages that use the HTML5 doctype, from the top 50,000 most popular web sites.
The complete raw data from December 2012 (518 MB), consisting of 35,830 home pages from the top 50,000 most popular web sites.
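
As an illustration of how a subset such as the HTML5-doctype one could be selected from the raw pages, here is a minimal Python sketch. The function name uses_html5_doctype and the regular expression are assumptions made for this example, not the group's actual tooling. It only recognises the short form <!DOCTYPE html> at the start of a page and does not handle the legacy-compat variant or leading comments.

```python
import re

# Assumption: a page "uses the HTML5 doctype" if its source begins with
# <!DOCTYPE html> (case-insensitive), allowing leading whitespace.
HTML5_DOCTYPE = re.compile(r"\s*<!doctype\s+html\s*>", re.IGNORECASE)


def uses_html5_doctype(page_source: str) -> bool:
    """Return True if the page source starts with the short HTML5 doctype."""
    # Strip a leading byte order mark, if present, before matching.
    text = page_source.lstrip("\ufeff")
    return HTML5_DOCTYPE.match(text) is not None


if __name__ == "__main__":
    # Hypothetical sample pages, used only to demonstrate the check.
    samples = {
        "html5": "<!DOCTYPE html><html></html>",
        "xhtml": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
                 '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html></html>',
        "none": "<html></html>",
    }
    for name, source in samples.items():
        print(name, uses_html5_doctype(source))
```

Applied to each downloaded home page in turn, a check like this would keep only the pages whose source passes the test.
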
If you’d like to join the community group, head over to our W3C page.