August 30, 2012
Configuration management for a database is more complex than usual, due to the nature of a database and the complexity of delivering database upgrades. This makes the organization of work even more important. Given the high cost of database merges, and of resolving merge conflicts in particular, it is important to pay close attention to how work is organized in order to make the configuration management process more efficient. Here are some recommendations:
- Form teams of about 5-10 developers who work close to each other and can communicate verbally to ask questions or resolve any doubts that arise. Good communication with prompt answers is of key importance for complex processes.
- Divide all aspects of database development into a set of areas according to functionality and technical difficulty.
- Make sure that there are about 3-4 areas per developer. Assuming teams of 5-10 people, this divides a database into about 15-40 areas. Then assign 3-4 areas of specialization to every developer, communicate the map of specialization, and make it available to the team.
- Don’t change the assignments too frequently; keep them stable so that developers can learn and become experts in their assigned areas.
- Make sure that every area has at least two assignees: one primary and one backup. That opens more possibilities for assigning tasks and helps resolve issues, e.g. when somebody is on vacation.
- Assign work to developers according to their areas of expertise. When possible, avoid assigning developers tasks in areas they are not familiar with, unless it is part of a learning process.
- Avoid scheduling multiple changes in the same area/component (e.g. package or table) at the same time. In other words, from a global perspective, the changes made to the same area should follow a serial order.
- On the other hand, changes that belong to different areas can and should be coded in parallel when possible, that is, when the changes do not conflict in technical terms.
- Merge changes to other branches as soon as all changes in the given area are coded and verified. Admittedly, this is risky and would cause a lot of work if the changes had to be recalled. Fortunately, that doesn’t happen very frequently. On the other hand, it is more likely that late merges will expose a project to merge conflicts that cost the team much more work and time.
- Having different developers make multiple changes to the same component simultaneously in parallel branches is not recommended. The cost of merges might be much higher than the cost of making the changes serially. This cost is not specific to a particular tool or repository (e.g. SVN or Git); it stems mostly from the nature of a database. So even if two developers could work simultaneously on two different changes in the same area, it is better to make the changes serially to save merging costs.
- Should there be many changes in the same area, some of which should be delivered sooner than the rest, put the more urgent changes in a separate branch. Start the work in that branch first, and schedule work on the remaining changes in another branch after the urgent changes are coded and verified. This way, the urgent changes will be delivered first.
- The goal is to avoid merge conflicts, so that merges become as easy as copying the changed components from one branch to another. Assuming that developers are experts in their dedicated areas, they can be trusted to make the merges themselves. This saves a lot of the configuration manager’s time, which can be used for something else, e.g. for reviewing the consistency of changes. Again, prompt communication is the key aspect here.
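The assignment rules above can be checked mechanically. Below is a minimal sketch, assuming the map of specialization is kept as a simple Python dictionary. The area names, developer names, the `check_area_map` helper and the limit on primary areas are all hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical specialization map: area -> (primary, backup).
# Area and developer names are invented for illustration.
AREA_MAP = {
    "billing_pkg": ("anna", "ben"),
    "customer_tables": ("ben", "carl"),
    "reporting_views": ("carl", "anna"),
    "audit_triggers": ("anna", "carl"),
}


def check_area_map(area_map, max_primary_areas=4):
    """Return a list of rule violations found in the map (empty if none)."""
    problems = []
    primary_load = Counter()
    for area, (primary, backup) in sorted(area_map.items()):
        if not primary or not backup:
            problems.append(f"{area}: needs both a primary and a backup")
        elif primary == backup:
            problems.append(f"{area}: primary and backup are the same person")
        if primary:
            primary_load[primary] += 1
    # Assumption: the 3-4 areas per developer are counted as primary areas.
    for dev, count in sorted(primary_load.items()):
        if count > max_primary_areas:
            problems.append(
                f"{dev}: primary for {count} areas (limit {max_primary_areas})"
            )
    return problems


if __name__ == "__main__":
    for line in check_area_map(AREA_MAP) or ["map is consistent"]:
        print(line)
```

A check like this could be run whenever the map changes, so that an area without a backup is noticed before somebody goes on vacation.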
The table on the left shows changes waiting to be assigned. Every change is associated with its area (indicated by its own color). The cost is expressed in units of time. The table also lists the project that a given change belongs to.
The diagram on the right shows the ordering of the changes in time within each project. First the tasks are assigned to projects, and then they are globally ordered according to areas of expertise. One unit of time is dedicated to the merge process (marked as m). The merges are triggered from Project I, as shown by dotted lines. The symbol Ø represents idle windows of time, when no work is being performed in a given area.
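The ordering idea can also be sketched in code. The following is a simplified model, not a reconstruction of the diagram: it assumes that changes in the same area run strictly one after another in their input order, that different areas are independent, and that a one-unit merge slot is needed whenever the next change in an area belongs to a different project. The `Change` type, the `schedule` function and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple

MERGE_COST = 1  # one unit of time per merge, like the "m" slots


class Change(NamedTuple):
    name: str
    area: str
    project: str
    cost: int  # units of time


def schedule(changes):
    """Place changes on a timeline: serial within an area, parallel across areas.

    Returns {area: [(start, end, label), ...]}. Changes keep their input
    order within an area. A merge slot "m" is inserted whenever the next
    change in the same area belongs to a different project (branch).
    """
    timeline = defaultdict(list)
    area_free_at = defaultdict(int)  # first free time unit per area
    last_project = {}                # project that last touched each area
    for ch in changes:
        start = area_free_at[ch.area]
        if ch.area in last_project and last_project[ch.area] != ch.project:
            timeline[ch.area].append((start, start + MERGE_COST, "m"))
            start += MERGE_COST
        end = start + ch.cost
        timeline[ch.area].append((start, end, f"{ch.name} ({ch.project})"))
        area_free_at[ch.area] = end
        last_project[ch.area] = ch.project
    return dict(timeline)


if __name__ == "__main__":
    sample = [
        Change("c1", "billing", "Project I", 2),
        Change("c2", "billing", "Project II", 3),
        Change("c3", "reports", "Project II", 1),
    ]
    for area, slots in schedule(sample).items():
        print(area, slots)
```

In this model, the time during which Project II waits for the billing change of Project I and its merge corresponds to an idle window (Ø) for that area in Project II.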
The bottom line
Resolving merge conflicts is costly, especially within a database project. It is worth considering how tasks can be scheduled in order to avoid conflicts in a project. Without thinking ahead, the costs of merging database changes could rise so high that they degrade the overall efficiency of teams working on parallel projects, rendering the parallel strategy uneconomical. Experience shows that partially parallel projects, with changes in the same area scheduled serially, are much more economical, even with some “idle” time. And that idle time is not lost, as it can be used for other important tasks such as training.