Software Heritage
Software Heritage is a non-profit multi-stakeholder initiative unveiled in 2016 by Inria,[1] and supported by UNESCO.[2][3][4]
Formation | June 30, 2016 |
---|---|
Founders | Roberto Di Cosmo, Stefano Zacchiroli |
Headquarters | Inria |
Location | |
Scientific Advisors | Gérard Berry, Jean-François Abramatic, Serge Abiteboul |
Affiliations | Inria |
Staff | 13 |
Website | softwareheritage |
Overview
The stated mission of Software Heritage is to collect, preserve and share all software that is publicly available in source code form, with the goal of building a common, shared infrastructure at the service of industry, research, culture and society as a whole.[5]
Software source code is collected by crawling code hosting platforms, such as GitHub, GitLab.com or Bitbucket, and package archives, such as npm or PyPI. It is then ingested into a special data structure, a Merkle DAG, that is the core of the Software Heritage archive.[6] Each artifact in the archive is associated with an identifier, called a SWHID.[7]
To increase the chances of preserving the Software Heritage archive over the long term, a mirror program was put in place in 2018. As of October 2020, ENEA[8] and FossID[9] had joined it.
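As an illustration of how such an identifier can be derived from the artifact itself, the following minimal sketch computes an identifier for a single file's content. It assumes the publicly documented version 1 SWHID scheme, in which a content object is identified by `swh:1:cnt:` followed by the Git-compatible SHA-1 of the file's bytes. The function name `swhid_for_content` is hypothetical and is not part of any Software Heritage library.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return a version 1 content SWHID for raw file bytes (illustrative sketch).

    Assumption: the digest is the Git blob hash, i.e. SHA-1 over the
    header b"blob <length>\\0" followed by the file bytes.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


# Identical bytes always map to the same identifier, wherever they were found,
# which is what lets a Merkle DAG store each unique file only once.
first_copy = swhid_for_content(b"int main(void) { return 0; }\n")
second_copy = swhid_for_content(b"int main(void) { return 0; }\n")
print(first_copy == second_copy)
print(swhid_for_content(b""))
```

Because the identifier depends only on the bytes, two projects that contain the same file share one node in the graph. Directories, revisions and releases are identified in a similar hash-based way that this sketch does not cover.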
History
Software Heritage has been under development at Inria since early 2015, under the direction of computer scientists Roberto Di Cosmo and Stefano Zacchiroli,[10] and was officially announced to the public on June 30, 2016.[1][11]
In 2017, Inria signed an agreement with UNESCO for the long-term preservation of software source code and for making it widely available, in particular through the Software Heritage initiative.[12]
In June 2018, the Software Heritage Archive[6] was opened at UNESCO Headquarters.[2]
On July 4, 2018, Software Heritage was included in the French National Plan for Open Science.[13]
In October 2018, the strategy and vision underlying the mission of Software Heritage were published in Communications of the ACM.[5]
In November 2018, Inria and UNESCO convened a group of 40 international experts,[14] leading to the publication in February 2019 of the Paris Call on Software Source Code.[15]
In November 2019, GitHub signed an agreement with Inria to improve the archival process of GitHub hosted projects in the Software Heritage archive.[16]
As of October 2020, Software Heritage's repository held over 143 million software projects, with an archive of over 9.1 billion unique source files.[6]
Funding
Software Heritage is a non-profit organization, funded largely by donations from sponsors, which include private companies, public bodies and academic institutions.[17]
Software Heritage also seeks funding for third parties interested in contributing to its mission. A grant from NLnet[18] funded the work of Octobus[19] and Tweag[20] that led to the rescue of 250,000 Mercurial repositories phased out from Bitbucket.[21]
A grant from the Alfred P. Sloan Foundation funds experts to develop new connectors that expand the coverage of the Software Heritage Archive.[22]
Awards
In 2016, Software Heritage received the best community project award at the Paris Open Source Summit.[23][24]
In 2019, Software Heritage received the Academic Initiative award from Pôle Systematic.[25]
References
- "Collect, organise, preserve and share the Software Heritage of mankind" (PDF). Software Heritage. 30 June 2016. Retrieved 26 July 2016.
- UNESCO. "Software Heritage". Retrieved 2 November 2020.
- Brown, Paul (30 June 2016). "Software Heritage: Creating a safe haven for software". Boing Boing. Retrieved 26 July 2016.
- Jost, Clémence (1 July 2016). "Open source: lancement de Software Heritage, la plus grande bibliothèque de codes source de la planète". Archimag. Retrieved 27 July 2016.
- Abramatic, Jean-François; Di Cosmo, Roberto; Zacchiroli, Stefano (1 October 2018). "Building the Universal Archive of Source Code". Communications of the ACM. Retrieved 2 November 2020.
- "Software Heritage Archive". Retrieved 2 November 2020.
- "Software Heritage Persistent Identifiers". Software Heritage. Retrieved 2 November 2020.
- "At ENEA the first institutional mirror of Software Heritage". ENEA. Retrieved 2 November 2020.
- "FossID establishes first independent mirror of world's largest source code archive". FossID. Retrieved 2 November 2020.
- Moody, Glyn (30 June 2016). "Software Heritage, the "Library of Alexandria of software," launches today". Ars Technica. Retrieved 26 July 2016.
- Brogan, Jacob (30 June 2016). "Introducing Software Heritage, the Library of Alexandria for Code". Slate. Retrieved 26 July 2016.
- UNESCO. Director-General, 2009-2017 (Bokova, I.G.) (3 April 2020). "Discours de la Directrice générale de l'UNESCO, Irina Bokova, à l'occasion de la signature de l'accord entre l'UNESCO et INRIA portant sur la préservation et le partage du patrimoine logiciel" (Press release) (in French). Paris: UNESCO. Retrieved 3 November 2020.
- "National Plan for Open Science" (PDF). Ouvrir La Science. Retrieved 2 November 2020.
- "Experts call for greater recognition of software source code as heritage for sustainable development" (Press release). Paris: UNESCO. 16 November 2020. Retrieved 2 November 2020.
- "Paris Call on software source code as heritage for sustainable development". Paris: UNESCO. February 2019. Retrieved 2 November 2020.
- "GitHub Archive Program". November 2019. Retrieved 2 November 2020.
- "Software Heritage Sponsors". Retrieved 2 November 2020.
- "NLNet Software Heritage grant". Retrieved 2 November 2020.
- "Augmenting Software Heritage archiving capabilities". Retrieved 2 November 2020.
- "Long-term reproducibility with Nix and Software HERITAGE". Retrieved 2 November 2020.
- "Announcing the Mercurial public Bitbucket archive". Retrieved 2 November 2020.
- Sloan Foundation. "Excited to support Software Heritage". Retrieved 2 November 2020.
- Les Acteurs du Libre - Précédents Lauréats at the Wayback Machine (archived 18 January 2019)
- "Paris Open Source Summit 2016 : Prix Acteurs du Libre : et les gagnants sont..." Programmez! (in French). 17 November 2016. Retrieved 28 June 2019.
- @Pole_Systematic (27 June 2019). "Convention @Pole_Systematic le Trophée Prix Initiative académique est remis @SWHeritage" (Tweet) – via Twitter.