Data parallel algorithms constitute a large class of algorithms, which differ in the details of how data is shared as operations are applied concurrently to that data. If the sharing is minimal, or if it can be handled by well-defined collective operations (e.g. parallel prefix or shift-and-mask operations), it may be possible to solve the problem with a single stream of instructions applied to the data elements concurrently. In other words, the concurrency is expressed entirely as a single stream of instructions applied to parallel data structures.
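As an illustration of this idea, the following is a minimal sketch of an inclusive parallel prefix sum written as a single stream of whole-array steps, in the style of a Hillis-Steele scan. The function names `shift_right` and `parallel_prefix_sum` are hypothetical, chosen only for this example. The list comprehensions stand in for elementwise operations that a data parallel machine would apply to all elements at once; plain Python executes them sequentially.

```python
def shift_right(values, offset, fill=0):
    """Collective shift: position i receives the value from position i - offset.

    Positions with no source element receive `fill`.
    (Hypothetical helper, named for this sketch only.)
    """
    n = len(values)
    offset = min(offset, n)
    return [fill] * offset + values[:n - offset]


def parallel_prefix_sum(values):
    """Inclusive prefix sum expressed as a sequence of whole-array steps.

    Each loop iteration is one step of the single instruction stream:
    shift, build a mask, then add elementwise under the mask.
    """
    result = list(values)
    n = len(result)
    offset = 1
    while offset < n:
        # Step 1: collective shift of the whole array.
        shifted = shift_right(result, offset)
        # Step 2: mask selecting the elements that have a partner
        # `offset` positions to their left.
        mask = [i >= offset for i in range(n)]
        # Step 3: one elementwise masked add, conceptually applied
        # to every element concurrently.
        result = [x + s if m else x
                  for x, s, m in zip(result, shifted, mask)]
        offset *= 2
    return result


if __name__ == "__main__":
    print(parallel_prefix_sum([3, 1, 4, 1, 5, 9, 2, 6]))
```

Note that no element-level control flow appears in the loop body: every step is stated once and applies to the entire array, and the only data sharing between elements happens inside the shift. For an array of n elements the loop runs about log2(n) times.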