From time to time I have helped companies run open source audits of their own source code. Basically, this means auditing their code to find any open source code in it.
These code audits are particularly important during software releases and M&A events. I've helped companies do this for releases and have been on both sides of M&A-driven audits.
If the developers have kept the attributions with any open source code they have re-used, then grep is a fine tool for auditing. However, this is a big IF. If your developers are sloppy and do not keep the attributions (i.e. copyright and license notices) with code they lift from open source, you have a problem. In that case a software tool is needed to scan the corporate source for matches against open source repositories.
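As a rough sketch of the grep-style approach, assuming attributions are still present in the code, something like the following could walk a source tree and list lines that look like copyright or license notices. The pattern, the file extensions and the function name `find_notices` are my own illustrative choices, not part of any particular audit tool, and a real audit would tune them.

```python
import re
from pathlib import Path

# Hypothetical pattern: a few phrases that commonly appear in attribution notices.
NOTICE_PATTERN = re.compile(
    r"copyright\s+(\(c\)|©|\d{4})|licensed under|\bGPL\b|\bMIT license\b|\bApache License\b",
    re.IGNORECASE,
)

# Hypothetical default set of source file extensions to scan.
SOURCE_EXTENSIONS = (".c", ".cpp", ".h", ".java", ".py", ".js", ".php")


def find_notices(root, extensions=SOURCE_EXTENSIONS):
    """Return (path, line_number, line) tuples for lines that look like notices."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        # errors="replace" keeps the scan going on files with odd encodings.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if NOTICE_PATTERN.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_notices("path/to/source")` gives a list to review by hand. It only finds code whose notices survived, which is exactly the limitation described above.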
There are at least three companies providing software to do this:
Ideally the outcome of this process is as follows:
- A clear company policy is set on which open source licenses are allowed and how developers can use open source code or components.
- The corporate code is cleanly annotated with any third party attributions (see below).
- Open Source code with licenses unsuitable for commercial use is identified and removed before release.
- A Bill of Materials is created for each release listing third-party software in the release.
- Necessary copyright or other notices appear in About dialogs, manuals or product websites.
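The policy and removal steps in the list above could be partly automated. Here is a minimal sketch, assuming each third-party component is recorded as a dict with a `name` and a `license` field. The allowed set, the SPDX-style identifiers and the function name `check_policy` are hypothetical examples, not a recommendation of which licenses to permit.

```python
# Hypothetical company policy: licenses developers may use without review.
ALLOWED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


def check_policy(components, allowed=ALLOWED_LICENSES):
    """Return the components whose license is missing or not on the allowed list."""
    return [c for c in components if c.get("license") not in allowed]


if __name__ == "__main__":
    # Illustrative input; "somelib" is a made-up component name.
    components = [
        {"name": "tinyjson", "license": "MIT"},
        {"name": "somelib", "license": "GPL-3.0-only"},
    ]
    for component in check_policy(components):
        print("needs review:", component["name"], component["license"])
```

Anything the check flags still needs a human decision. The point is only to make sure nothing slips into a release unreviewed.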
Example comment block:
/*
* XYZ.com Third-party or Open Source Declaration
* Name: Bart Simpson
* Date of first commit: 04/25/2009
* Release: 3.5 “The Summer Lager Release”
* Component: tinyjson
* Description: C++ JSON object serializer/deserializer
* Homepage: http://blog.beef.de/projects/tinyjson/
* License: MIT style license
* Copyright: Copyright (c) 2008 Thomas Jansen (thomas@beef.de)
* Note: See below for original declarations from the code
*/
If the above were upgraded to a javadoc-style comment, then a tool could be built to auto-magically generate a Bill of Materials for each release.
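Even without javadoc, the plain comment format above is regular enough to parse. The following is a minimal sketch of such a tool, assuming every declaration block contains the phrase "Third-party or Open Source Declaration", uses one `Key: value` field per line and ends with `*/`. The function names and the choice of Bill of Materials columns are mine.

```python
import re

# Phrase that opens a declaration block in the example comment format.
DECLARATION_MARKER = "Third-party or Open Source Declaration"
# Matches lines such as " * Component: tinyjson" and splits at the first colon.
FIELD_PATTERN = re.compile(r"^\s*\*\s*([A-Za-z][A-Za-z \-]*?):\s*(.+?)\s*$")


def parse_declarations(source_text):
    """Return one dict of fields per declaration comment block in source_text."""
    declarations = []
    current = None
    for line in source_text.splitlines():
        if DECLARATION_MARKER in line:
            current = {}  # start collecting fields for a new block
        elif current is not None:
            if "*/" in line:  # end of the comment block
                declarations.append(current)
                current = None
                continue
            match = FIELD_PATTERN.match(line)
            if match:
                current[match.group(1)] = match.group(2)
    return declarations


def format_bom(declarations, columns=("Component", "License", "Copyright", "Homepage")):
    """Render the declarations as tab-separated text with a header row."""
    rows = ["\t".join(columns)]
    for declaration in declarations:
        rows.append("\t".join(declaration.get(column, "") for column in columns))
    return "\n".join(rows)
```

Running `parse_declarations` over every file in a release and passing the combined list to `format_bom` yields a simple table that can be pasted into release notes or checked against the license policy.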
There is one grey area in all this: how to handle developers using code from discussion sites like PHP.net, CodeProject, StackOverflow and similar sites. Generally, code posted in these types of forums has no license stated alongside it (though some sites set one for user contributions in their terms of service, so check those first). Where there is no license, the code is copyrighted by either the site or the author of the post, and developers should not use it without getting an explicit license. However, developers generally feel that people put the code up there to share. This conflict means the company policy on usage of this type of code must be clearly communicated to all developers.
This is a nice review article of other considerations for open source auditing:
Posted via email from nealrichter's posterous