Agent and Repository

Problem

Many problems involve a large set of data that is operated on by many independent tasks. How can such access be supported efficiently and correctly?

Context

Consider a large shared data structure. This structure might contain source code, a set of learned clauses in a Boolean satisfiability solver, or a full-blown, general-purpose database. The computation requires that many independent tasks be able to perform operations on this structure. Because the tasks are independent, there is a good deal of task-level parallelism. However, the data structure is shared, which can lead to bottlenecks or to incorrect behavior caused by data races.

Forces

  • The agents want to perform operations on the data structure as quickly as possible.
  • The data structure is large, meaning that there is some pull towards distributing it across different memory pools.
  • Consistency must be maintained across the data structure, regardless of the sequence of operations that are performed on the data structure.
  • Consistency must be maintained, regardless of how the data structure may be distributed.
  • Enforcing consistency creates some pull towards centralization and serialization.
  • Some operations may be small in scope, while others may touch every part of the data structure, leading to load balancing issues or, more seriously, starvation. Constant forward progress for all tasks must be guaranteed.

Solution

The software architecture is defined in terms of the following components:

  1. A large data structure (repository) that is shared among all agents.
  2. Agents that operate on the repository.
  3. A manager that maintains data consistency.

The repository is usually a large data structure. It could be stored in a central location, or it could be distributed.

The agents operate on the repository. They are independent of each other: no agent is aware of the others. Usually, an agent acts on a small piece of the repository's data. For efficiency, it might be better for an agent to copy its working data locally and write it back to the repository after applying all of its operations, but care must be taken to ensure that the repository remains in a consistent state. Each agent believes it controls all of the repository it has access to, so operations on the repository must be controlled to ensure that the repository behaves consistently with this premise.

The manager controls repository accesses by the agents. Its main purpose is to maintain data consistency, meaning that all agents reading the same part of the repository must see the same data, and writes by agents to the same part of the repository must be managed so that data is not corrupted. The manager could be:

  1. Part of the repository, e.g. databases
  2. Part of the agents (intelligent agents)
  3. A separate entity, e.g. shared file systems
  4. Hardware-based, e.g. cache coherence protocols
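
As a rough illustration of the three components, the sketch below models the manager as a separate entity that guards the whole repository with a single readers-writer lock. Everything in it is an assumption made for illustration: the names ReadWriteLock, Manager, agent and run_agents are hypothetical, the repository is a plain Python dictionary, and because Python's standard library has no readers-writer lock, a small one is built on threading.Condition.

```python
import threading


class ReadWriteLock:
    """Hypothetical helper: allows many concurrent readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of agents currently reading
        self._writing = False   # True while one agent holds the write lock

    def acquire_read(self):
        with self._cond:
            while self._writing:            # readers wait only for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()     # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()           # writers need exclusive access
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


class Manager:
    """Separate entity that mediates every access to the repository."""

    def __init__(self, repository):
        self._repo = repository
        self._lock = ReadWriteLock()

    def read(self, key):
        self._lock.acquire_read()
        try:
            return self._repo.get(key)
        finally:
            self._lock.release_read()

    def update(self, key, fn):
        # Read-modify-write performed under the exclusive lock.
        self._lock.acquire_write()
        try:
            self._repo[key] = fn(self._repo.get(key))
        finally:
            self._lock.release_write()


def agent(manager, increments):
    """An agent that knows only the manager, not the other agents."""
    for _ in range(increments):
        manager.update("counter", lambda value: (value or 0) + 1)


def run_agents(num_agents=4, increments=1000):
    manager = Manager({})
    threads = [threading.Thread(target=agent, args=(manager, increments))
               for _ in range(num_agents)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return manager.read("counter")
```

Note that this simple lock favors readers: a steady stream of overlapping readers can keep a writer waiting indefinitely, which is the starvation risk listed under Forces. A real manager would likely need a fairer admission policy.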

Concurrency control can be achieved in a variety of ways.

  • The simplest way to implement the Agent and Repository pattern is to have global read and write locks for the repository. Any agent that wishes to read the repository can be given access if the write lock is not active. Any agent that wishes to write can be given access only if neither the read lock nor the write lock is active. This method is acceptable if there are only a few agents, the agents require only occasional access to the repository, and they do not hold locks for a long time. It is not acceptable if a large number of agents constantly require access to the repository or may lock it for a considerable time.
  • The pattern can be implemented more efficiently if the repository is divisible into several partitions and the agents usually require access to only one (or a few) of the partitions at any time. In that case, each partition can have its own read/write locks. This method works well if the agents usually operate on distinct portions of the repository. Deadlock is a real risk whenever an agent needs more than one partition, so deadlock avoidance needs to be implemented.
  • More sophisticated implementations can merge concurrent changes. The manager itself can merge intelligently if the ways in which the repository can change are known. For example, if all agents are selectively adding entries to a table, auto-merging can combine the entries produced by all the agents into a single table. If this is not possible, the manager should be able to detect conflicts and notify the agents, which should be able to undo their actions or produce alternative, non-conflicting actions.
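
The partitioned approach and its deadlock problem can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: PartitionedRepository, transact, transfer, transfer_many and run_transfers are hypothetical names, each partition holds a single integer, plain mutexes stand in for read/write locks, and deadlock is avoided by one common technique, always acquiring partition locks in ascending index order.

```python
import threading


class PartitionedRepository:
    """Hypothetical repository split into partitions, one lock per partition."""

    def __init__(self, num_partitions, initial=0):
        self._values = [initial] * num_partitions
        self._locks = [threading.Lock() for _ in range(num_partitions)]

    def transact(self, partitions, operation):
        """Lock the requested partitions, run operation(values), then unlock.

        Locks are always taken in ascending index order, so two agents that
        need the same pair of partitions can never wait on each other in a
        cycle. The operation is trusted to touch only the partitions it named.
        """
        ordered = sorted(set(partitions))
        for p in ordered:
            self._locks[p].acquire()
        try:
            operation(self._values)
        finally:
            for p in reversed(ordered):
                self._locks[p].release()

    def snapshot(self):
        """Return a consistent copy by locking every partition."""
        result = []
        self.transact(range(len(self._values)),
                      lambda values: result.extend(values))
        return result


def transfer(repo, src, dst, amount):
    """Move `amount` from partition src to partition dst atomically."""
    def operation(values):
        values[src] -= amount
        values[dst] += amount
    repo.transact([src, dst], operation)


def transfer_many(repo, src, dst, rounds):
    for _ in range(rounds):
        transfer(repo, src, dst, 1)


def run_transfers(rounds=1000):
    repo = PartitionedRepository(2, initial=100)
    # The two agents name the partitions in opposite orders; without the
    # sorted acquisition in transact() this could deadlock.
    a = threading.Thread(target=transfer_many, args=(repo, 0, 1, rounds))
    b = threading.Thread(target=transfer_many, args=(repo, 1, 0, rounds))
    a.start()
    b.start()
    a.join()
    b.join()
    return repo.snapshot()
```

Agents that touch disjoint partitions never contend here, which is where the partitioned scheme gains over a single global lock. An operation that needs every partition, such as snapshot, still serializes against all other agents.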

Not every agent needs to be given access to all of the data in the repository. Some systems, such as databases and version control systems, associate strict permissions with each agent that govern which data it may access and modify.
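
A permission check of this kind might look like the sketch below. It is only an assumption about how such a manager could be organized: PermissionedManager and PermissionDenied are hypothetical names, permissions are modeled as the strings "r" and "rw" per agent and per part of the repository, and a single mutex stands in for whatever concurrency control the manager really uses.

```python
import threading


class PermissionDenied(Exception):
    """Raised when an agent lacks the right to perform an operation."""


class PermissionedManager:
    """Hypothetical manager that checks permissions before each access."""

    def __init__(self, repository, permissions):
        self._repo = repository          # part name -> value
        self._permissions = permissions  # agent -> {part name: "r" or "rw"}
        self._lock = threading.Lock()

    def _check(self, agent, part, mode):
        # An agent with no entry for this part has no rights at all.
        granted = self._permissions.get(agent, {}).get(part, "")
        if mode not in granted:
            raise PermissionDenied(f"{agent} may not {mode!r} {part}")

    def read(self, agent, part):
        self._check(agent, part, "r")
        with self._lock:
            return self._repo[part]

    def write(self, agent, part, value):
        self._check(agent, part, "w")
        with self._lock:
            self._repo[part] = value
```

For example, with permissions {"alice": {"docs": "rw"}, "bob": {"docs": "r"}}, a call to write("bob", "docs", ...) raises PermissionDenied while read("bob", "docs") succeeds.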

Invariant

Pre-condition: A repository large enough to accommodate all the data created by the agents, a set of independent agents acting on the data, a manager to control the data accesses by the agents to the repository.

Invariants: Data consistency is maintained at all times, i.e. reads of the same part of the repository return the same values, and writes to the same part of the repository are consistent. Tasks cannot be allowed to starve due to contention over shared resources.

Examples

A source version control system relies on the Agent and Repository pattern. The repository stores all the source files shared among several groups. The agents are the users who use the repository to develop programs. The manager defines the permissions of the agents and decides which parts of the repository are accessible to whom. Agents copy data locally, operate on it, and upload it back to the main repository. An agent can query whether any other agents have modified the parts of the data it is working on and, if so, update its local copy. Conflicts have to be resolved by the agents before they update the repository.
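
The copy, modify and write-back cycle described here can be sketched with a revision check at commit time. This is a simplified illustration and not the protocol of any particular version control system: VersionedRepository, Conflict, checkout, commit and demo are hypothetical names, each file carries a single integer revision, and a commit is rejected whenever the agent's base revision is out of date.

```python
import threading


class Conflict(Exception):
    """Raised when an agent tries to commit on top of a stale revision."""


class VersionedRepository:
    """Hypothetical repository that detects conflicting write-backs."""

    def __init__(self):
        self._files = {}               # path -> (revision, content)
        self._lock = threading.Lock()  # the manager's consistency guard

    def checkout(self, path):
        """Return (revision, content) so the agent can work on a local copy."""
        with self._lock:
            return self._files.get(path, (0, ""))

    def commit(self, path, base_revision, new_content):
        """Accept the change only if nobody committed since base_revision."""
        with self._lock:
            current_revision, _ = self._files.get(path, (0, ""))
            if current_revision != base_revision:
                raise Conflict(f"{path}: based on r{base_revision}, "
                               f"repository is at r{current_revision}")
            self._files[path] = (current_revision + 1, new_content)
            return current_revision + 1


def demo():
    repo = VersionedRepository()
    repo.commit("main.c", 0, "v1")

    # Two agents copy the same revision locally.
    rev_a, text_a = repo.checkout("main.c")
    rev_b, text_b = repo.checkout("main.c")

    repo.commit("main.c", rev_a, text_a + "+a")      # agent A wins the race
    try:
        repo.commit("main.c", rev_b, text_b + "+b")  # agent B is now stale
        conflicted = False
    except Conflict:
        conflicted = True
        # Agent B updates its local copy, redoes its work, and retries.
        rev_b, text_b = repo.checkout("main.c")
        repo.commit("main.c", rev_b, text_b + "+b")
    return conflicted, repo.checkout("main.c")
```

Calling demo() returns (True, (3, "v1+a+b")): the second agent's first commit is rejected, and its retry lands on top of the first agent's change. Here resolving the conflict is trivial because the agent simply reapplies its edit; real conflicts may need manual merging.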

Known Uses

  • Version Control Systems e.g. CVS, SVN
  • Databases e.g. MySQL, IBM, Oracle
  • Shared File Systems e.g. NFS
  • Wiki software packages

Related Patterns

The following PLPP patterns are related to Agent and Repository:

  • File System
  • Authority
  • Task parallelism
  • Shared queue

Authors

Bryan Catanzaro, Narayanan Sundaram, Bor-Yiing S