Growing Open Source Software Communities Around Academic Projects

Authored by: Markus Krötzsch. From the book Open Advice. Translated by: Fahad Al-Saeedi
Markus Krötzsch is a postdoctoral researcher in the Department of Computer Science at the University of Oxford. He received his PhD from the Institute of Applied Informatics and Formal Description Methods at the Karlsruhe Institute of Technology in 2010. His research interest is the intelligent automatic processing of information, ranging from the foundations of formal knowledge representation to applications such as the Semantic Web. He is the lead developer of the successful Semantic Web application platform Semantic MediaWiki, a co-editor of the W3C OWL 2 specification, chief maintainer of the semanticweb.org community portal, and a co-author of the book Foundations of Semantic Web Technologies.

Academic researchers develop a large amount of software, whether to verify a hypothesis, to illustrate a new approach, or simply to assist in some area of study. In most cases, a small, focused prototype does the job and is quickly discarded once the focus of the research shifts to other topics. Now and then, however, a new approach or technology emerges that has the potential to truly change the way a problem is solved. This promises professional prestige, commercial success, and the personal satisfaction of realizing the full potential of a new idea. The researcher who makes such a discovery is then tempted to move beyond the prototype stage toward a real product, and is confronted with a whole new set of practical problems.

Fear of the user

In one of his most famous essays on software engineering, Frederick P. Brooks gave a good picture of the effort involved in maintaining real software, and warned us about the user:
“The total cost of maintaining a widely used program is typically 40 percent or more of the cost of developing it. Surprisingly, this cost is strongly affected by the number of users. More users find more bugs.” (1)

While the figures might look slightly different in today’s environment, the basic observation still holds, and instant global communication may even have aggravated it. Worse, more users not only find more bugs, they are also more likely to request improvements. Whether it concerns a genuine bug, a feature request, or just a basic misunderstanding of how the software works, a user request is often far from an accurate technical report of the problem. This forces developers to review each request individually, consuming valuable time that is then unavailable for writing actual code.

The analytical mind of a researcher anticipates this problem, and a natural reluctance to face a bleak future in customer service may grow into a real fear of the user. In the worst case, this leads to a decision against the project altogether; in milder cases, the researcher may deliberately hide amazing software from potential users. I have heard more than one researcher say, “We don’t need more advertising, we get enough mail!” There are indeed many cases where the effort of supporting a software tool exceeds what the researcher can invest without neglecting his main job.

Often, however, these tragic outcomes could easily be prevented. Brooks could hardly have foreseen this. When he wrote his essays, users were customers, and software maintenance was part of the product they purchased. There had to be a balance between development effort, market size, and price. This is still the case for many commercial programs today, but it has little to do with the realities of small-scale open-source software development. Users of open-source software typically do not pay for the service they receive. They are therefore not in the position of demanding customers but usually of grateful, enthusiastic supporters. No small part of the art of successful free software maintenance is converting this enthusiasm into much-needed support, so that increasing user interest is balanced by increasing user participation.

Understanding that open-source software users are not just “non-paying customers” is an important realization, but it should not lead us to overestimate their potential. The optimistic counterpart of the irrational fear of the user is the belief that active and supportive open-source communities grow by themselves, based solely on the license under which the source code is released. This fatal error in judgment is still surprisingly widespread and has led to the failure of many attempts to create open communities.

Planting and harvesting

The plural of “user” is not “community.” While the former grows in numbers by itself, the latter either does not grow at all or grows wildly without giving the project the support it hoped for. The task of a project manager who aspires to harness the untapped energy of users is like that of a farmer who needs to prepare fertile ground, plant seeds, water them, and perhaps prune unwanted branches before he can reap the fruits. Compared to the reward, the overall effort is small, but it is vital to do the right things at the right time.

Preparing the technical ground

Building a community begins even before the first user appears. The programming language chosen already determines how many people will be able to deploy and debug our code. Objective Caml may be a beautiful language, but using Java instead will increase the number of potential users and contributors by orders of magnitude. Developers therefore have to compromise, since the most popular technologies are rarely the most efficient or elegant. In practice, this is often a difficult step for researchers, who usually prefer superior language designs. When I work on the Semantic MediaWiki project, I often ask myself why on earth we are using PHP when server-side Java would be cleaner and more efficient. Comparing the size of the Semantic MediaWiki community to that of similar Java projects may answer this question. This example also illustrates that the target audience determines the best choice of underlying technology. The developer must have the acumen to make the decision that is most appropriate for the situation.

Preparing the ground thoroughly

A related issue is to create readable and well-documented source code from the start. In an academic environment, some software projects are handled by a series of temporary contributors, and the turnover of staff and students can ruin the quality of the code. I remember a small software project at TU Dresden that had been well managed by a student assistant. After he graduated, it turned out that his code was documented entirely in Turkish. Researchers are usually only part-time programmers, so it takes discipline to enforce the additional work needed to make the code accessible. The reward is a greater chance of useful bug reports, useful fixes, or even external developers at a later stage.

Spreading the seeds of communities

Inexperienced open-source developers often think that releasing their code openly is a big step. In reality, no one else will notice. To attract users and contributors alike, someone has to spread the word. Public communication for a real project should at least include an announcement of each new release. Mailing lists are probably the best channel for this. It takes some social skill to find the balance between spamming and timid understatement. Projects driven by a strong belief that they will help users solve real problems should be easy to advertise well, and users will easily notice the difference between blunt advertising and useful information. Obviously, active announcements should be postponed until the project is ready. This includes not only the actual code but also the main website and documentation for basic usage.

Over the life of the project, it should be mentioned in all appropriate places, including websites (starting with your own!), presentations, scientific papers, and online discussions. One should never underestimate the power of a single link that leads a future major contributor to their first visit to the project page. Researchers should not forget to publicize their programs beyond their immediate academic communities. Other researchers are rarely the best base for an active community.

Providing room for growth

A very simple but often neglected task for project managers is to provide spaces for communication in which a community can grow. If a project does not have a dedicated mailing list, all support requests will be sent privately to the manager. If there is no public bug tracker, bug reports will be few and far between. Without an editable wiki for user documentation, the developer will have to extend and revise the documentation all by himself. If the current development version of the source code is not accessible, users cannot check the latest state before complaining about bugs. If the code repository is strictly closed, it is impossible to admit other contributors. All of this infrastructure is available free of charge from many providers.

Not all forms of interaction are desirable; for example, there may be reasons to keep a developer group closed. But it would be foolish to expect support from a community without setting up the basic spaces for that support.

Encouraging and controlling growth

Inexperienced developers often worry that opening up mailing lists, forums, and wikis to users will require additional maintenance effort. This is rarely the case, but some basic activities are essential. It starts with strictly enforcing public communication. Users should be taught to ask questions in public, to look for answers in the documentation before asking, and to report bugs in the bug tracker rather than via email. I tend to decline all private support requests or to forward my answers to the public mailing list. This also ensures that solutions are available on the web for future users to find. In every case, users should be thanked for all contributions; a healthy community needs plenty of enthusiastic people who are willing to help.

Once a certain density of users is reached, users will start to support one another. This is usually a magical moment in a project and a sure sign that it is on the right track. Ideally, the primary project leader should continue to provide support for complex questions, but at some point individual users will take the lead in discussions, and it is important to thank them personally and to involve them more closely in the project. Conversely, unhealthy developments should be stopped where possible, as aggressive behavior can be a real danger to the growth of a community. Likewise, not all eagerness to help is productive, and it is often necessary to say no, in a friendly but clear manner, to prevent future problems.

The future is open

Building the initial community around a project is the most important part of turning a research prototype into a mature open-source program. If this part is successful, there are many options for the future development of the project, depending on the goals of the project manager and the community. Some typical directions include:

– Continuing to grow and develop the project and its community, expanding the core team of developers and maintainers, and eventually becoming independent of its academic origins. This may require additional community activities (such as dedicated events) and perhaps establishing organizational support.

– Founding a company that builds on the project, for example through dual licensing or a consulting model. A mature tool and a vibrant community are key assets for any startup and can benefit many business strategies without taking anything away from the open-source product.

– Withdrawing from the project. There are many reasons why one may be unable to maintain a close relationship with a project. Having created a healthy open community maximizes the project’s chances of surviving on its own. In any case, it is more respectful to announce the withdrawal explicitly than to let the project die quietly, starving it through inactivity until it is too weak to find a new maintainer.

The community will look different depending on which of these options you pursue. In each case, however, the role of the researcher in the project changes: the scientist and programmer may become a manager or a technical director. In this sense, the main difference between an effective open-source project and an ongoing research prototype is not so much the amount of work but the kind of work required to succeed. Understanding this is one part of success; the only other part is a great piece of software.
