Organization of work

The process of configuration management for a database is more complex due to the nature of a database and the complexity of delivering database upgrades. It makes the organization of work even more important. Considering the high cost of resolving database merge and especially merging conflicts it makes important to pay close attention to organization of work in order to make the configuration management process more efficient. Here are some recommendations:

  • Make teams of about 5-10 developers that work close to each other and can communicate verbally when having a question or can resolve any doubts that might arise. Good communication with promptly answers is of key importance for complex processes.
  • Divide all aspects of a database development into a set of areas according to the functionality and technical difficulty.
  • Make sure, that there is about 3-4 areas per developer. Assuming 5-10 person teams it divides a database into about 15-40 areas.  Then assign to every developer 3-4 areas of specialization, communicate and make the map of specialization available for the team.
  • Don’t change the assignments too frequently but keep it so the developers can learn and become experts in the assigned areas.
  • Make sure, that every area has at least two assignee: one the primary and another for backup. That will open more possibilities to assign tasks and help resolve issues e.g. when somebody is on vacations.
  • Assign work to developers according to the areas of expertise. When possible avoid assigning new tasks to developers that they are not familiar with unless it is a learning process.
  • Avoid scheduling multiple changes in the same area/component (e.g. package or table) at the same time.  In another words, the global perspective of changes made to the same area should reassemble serial order.
  • On the other hand changes that belong to different areas can and should be coded parallel when possible, that is when the changes aren’t conflicting – speaking in technical terms.
  • Perform merges of changes to other branches as soon as all changes in the given area are coded and verified.  Sure that is risky and would cause much work in case the changes should be recalled. Fortunately that doesn’t happened very frequently. On the other hand, it is more probable, that late merges will expose a project to potential merge conflicts that will cost the team much more work and time.

The consequences

  • It is not recommended, to make multiple changes to the same component simultaneously  in parallel branches by different developers.  The cost of merges might be much higher than the cost of making the changes in serial ordering. And the cost of merges is not specific only to a particular tool or repository (e.g. SVN or GIT) but mostly to the nature of a database. So even if there are two developers who potentially can work simultaneously on two different changes regarding the same area it is better to make the changes serial to save merging costs.
  • Should there be many changes in the same area and some of them should delivered sooner than the rest, make the more urgent changes a separate branch.  Start the work in the branch first and schedule work on the remaining changes in another branch after the urgent changes are coded and verified. This way, the preferred changes will be delivered first.
  • The goal is to avoid merge conflicts so merges become as easy as making copies of the changed components from one branch to another.  Assuming that developers are experts in their dedicated areas they can be trusted to make the merges themselves.  It will save a lot of config manager’s time, that can be used for something else, e.g. for reviewing the consistency of changes.  Again prompt communication is the key aspect here.

An example

The table on the left shows changes waiting to be assigned. Every change is associated with it’s area (by it’s own color). The cost is expressed in units of time. And there is the project that a given change belongs to.

The diagram on the right shows ordering of the changes in time withing each project. First the tasks are assigned to projects and then are globally ordered according to areas of expertise. There is one unit of time dedicated for the merge process (marked as m). The merges are triggered from Project I as shown by dotted lines. The symbol Ø represents idle windows of time, when no work is being performed in a given area.

The bottom line

Merging conflicts is costly, especially within a database project. It is worth to consider how tasks can be scheduled in order to avoid conflicts in a project. Without thinking ahead the costs of merging database changes could raise so high, that it will degraded the overall efficiency of teams working on parallel projects rendering the parallel strategy uneconomical. Experience is teaching that partially parallel projects with changes regarding the same area scheduled serially is much more economical even with some “idle” time. And the available time is not lost as it can be used for other important tasks like trainings.

CMToolBox RC1 Trial

The SCMSupport.com published the CMToolBox plugin for PL/SQL Developer on April 24, 2012. It is a public and free RC1 Trial version available from http://www.scmsupport.com/cmtoolbox/index.html.

The CMToolBox plugin allows to automatically save objects from database to local SVN branch and intermediates between SVN server and the local SVN directories in order to provide SVN operations: update, commit, browse, diff, status, log and add.

Functionality:

  • Supports multiple branches/databases representing multiple versions and projects.
  • Provides automated option to saves database objects to local SVN directories.
  • Incorporates easy access to commit, update, browse, diff, status log and add functions by TortoiseSVN.
  • Handles: FUNCTION, PROCEDURE, TRIGGER, PACKAGE, PACKAGE BODY, TYPE, TYPE BODY and VIEW types.
  • Automatically concatenates PACKAGE with PACKAGE BODY and TYPE with TYPE BODY respectively when saving database objects.
  • Optionally reduces diacritic characters to ASCII equivalents for compatibility with UTF-8.
  • Straightforward installation, configuration and easy to use pop-up menu and toolbar icons.

Here is a demo from YouTube:

 

Database test models

It will be nice to have automated continues tests for a database. Unfortunately testing a database is not as simple as testing an applications. Let’s consider there different models for testing a database.

Directly calling database procedures/functions and verifying responses.

This model is suitable for unit tests as it allows direct access to database objects. On the other hand testing complex use cases might require a series of calls that makes it more difficult.

Calling business application that is connected to database and verifying responses.

This model is suitable for testing business functionality through the business applications. But it is not suitable for unit tests or testing asynchronies processes as the responses might not always be available from the application or might not include all expected details about how the database performed a request.

Calling business application that is connected to database and verifying database changes/events.

It is the most complex model. It uses business application to request calls to the database and it assumes that the database has an agent that detects specific database events. The events are stored in event log and based on that a database monitor can process the events and forward it to the test engine for matching. The model is suitable for both low-level unit tests of elementary database objects as well as for testing complex business functions including asynchronies processes. On the other hand the model requires additional resources to run the agent, log and database monitor. And it requires more configuration to define and detect database events and additional configuration to match calls with events that might arrive asynchronously. One technique that helps to simplified the matching process might be adding distinguish identifier to each test if the application allows it and then storing the identifier with each change within the database. A practical example of an identifier could an USER id. To run a test in this model it will required a configuration of series of tests, and each test must include four configuration elements:

  • Definition of a call to application.
  • List of events that the Database Event Agent should watch for and then store to Event Log (both positive and negative).
  • List of events that Database Monitor should forward to the Test Engine.
  • Matching rules for the Test Engine that will define success and fail results.

The model might be suitable for continues integration of a complex database systems.