From: Zachary Vance
Date: Fri, 20 Nov 2015 06:54:26 +0000 (-0800)
Subject: Typos
X-Git-Url: https://git.za3k.com/?a=commitdiff_plain;h=c2ab99ad295b11fc4c236d8364a62582de5ca0bb;p=za3k.git

Typos
---

diff --git a/github.html b/github.html
index c9d39b6..fa3fd19 100644
--- a/github.html
+++ b/github.html
@@ -28,8 +28,8 @@ done

The Events Timeline is emphemeral, and being successfully recorded by githubarchive.org. A second person running the same program in case of downtime would be a plus.

Estimates on archiving repositories

-I selected 1000 random respoitories from the above list, removing 427 forks. I then checked out all repositories. The total size was 4.3G, with or without compression. It was around 3 GB for a shallow checkout. If we assume forks take no space, this means an average github repository takes up 4.3M. Omitting the largest repositories may improve this estimate, but I didn't run further tests. I haven't checked, but the issue taken up by metadata like issues should be very small in comparison.
-If there are 35,000,000 repositories on github at an average size of 4.3M each, that multiplies out to around 150TB data total.
+I selected 1000 random repositories from the above list, removing 427 forks. I then checked out all repositories. The total size was 4.3G, with or without compression. It was around 3 GB for a shallow checkout. If we assume forks take no space, this means an average github repository takes up 4.3M. Omitting the largest repositories may improve this estimate, but I didn't run further tests.
+If there are 35,000,000 repositories on github at an average size of 4.3M each, that multiplies out to around 150TB data total for the git repositories.

Additional information