Concepts
The library defines several building blocks for processing data streams. It is recommended that you familiarize yourself with the concepts described in this document before reading about how to use and extend the library.
Stream source
A stream source (or collection) is any iterable: either an array or an
instance of \Traversable.
Each stream source emits values indexed by a key. The key is usually an int
or string, as PHP arrays are commonly used. However, the library assumes any
iterable, including, but not limited to:
\Generator, which may emit values with any type of key; \WeakMap, which emits objects as keys; and so on.
The library treats every stream source as non-rewindable. Generators, for example, cannot be rewound; you cannot iterate over them twice. For this reason, even when the source is an array (or any other rewindable iterable), the library assumes that it can be iterated only once.
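The two properties described above can be observed with plain PHP, without the library. The following is a minimal sketch; the function name pairs() and the variable names are made up for illustration.

```php
<?php

// A generator may yield keys of any type; here each key is an array.
function pairs(): \Generator
{
    yield ['a', 1] => 'first';
    yield ['b', 2] => 'second';
}

$keys = [];
foreach (pairs() as $key => $value) {
    $keys[] = $key; // $key is an array, not an int or a string
}

// Consume a generator completely.
$source = pairs();
foreach ($source as $value) {
    // nothing to do, we only want to exhaust the generator
}

// A second foreach over the same generator throws an \Exception.
$rewindFailed = false;
try {
    foreach ($source as $value) {
        // never reached
    }
} catch (\Exception $e) {
    $rewindFailed = true;
}
```

Because a failure like this only shows up at runtime, treating every source as single-pass is the safer default.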
Data stream, or stream wrapper
A data stream (or stream wrapper) is the
RunOpenCode\Component\Dataset\Stream class, which wraps a stream source and
provides stream processing using operators, reducers, collectors, and
aggregators (which will be discussed later in the document).
Using an object-oriented approach, you can apply various operations to your data source through the fluent API provided by the data stream instance.
<?php

use RunOpenCode\Component\Dataset\Stream;

Stream::create(/* ... */)
    ->map(/* ... */)
    ->tap(/* ... */)
    ->takeUntil(/* ... */)
    ->finally(/* ... */);
With PHP 8.5 in mind, the library also provides functions to support a functional approach using the pipe operator.
<?php

use function RunOpenCode\Component\Dataset\stream;
use function RunOpenCode\Component\Dataset\map;
use function RunOpenCode\Component\Dataset\tap;
use function RunOpenCode\Component\Dataset\takeUntil;
use function RunOpenCode\Component\Dataset\finally;

stream(/* ... */)
    |> map(/* ... */)
    |> tap(/* ... */)
    |> takeUntil(/* ... */)
    |> finally(/* ... */);
A data stream is, of course, iterable, and none of the operators are applied until the stream is iterated.
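Deferred execution of this kind is what PHP generators provide out of the box. The sketch below uses a hypothetical lazyMap() function as a stand-in (it is not part of the library) to show that the callback does not run until iteration starts.

```php
<?php

// Hypothetical stand-in for a lazy map operator; not the library's implementation.
function lazyMap(iterable $source, callable $fn): \Generator
{
    foreach ($source as $key => $value) {
        yield $key => $fn($value);
    }
}

$calls = 0;
$mapped = lazyMap([1, 2, 3], function (int $x) use (&$calls): int {
    $calls++; // count how many times the callback actually runs

    return $x * 10;
});

// Creating the generator does not execute its body, so nothing has been mapped yet.
$callsBeforeIteration = $calls;

$result = [];
foreach ($mapped as $key => $value) {
    $result[$key] = $value;
}

// The callback ran once per element, only while iterating.
$callsAfterIteration = $calls;
```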
Operators
Operators are used to perform specific operations on a data stream. They process each yielded value one by one and yield the result of their operation.
The library provides a set of commonly used operators, such as map(),
filter(), take(), and others. However, you can extend the available set
of operators by implementing your own.
The general idea behind operators is to execute various operations that read from and/or modify the original stream as it is being iterated.
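To make the idea concrete, here is a minimal sketch of an operator written against plain PHP only. The class name EveryNth is hypothetical, and the library's actual operator contract may differ; the sketch only illustrates reading values one by one and yielding a result while preserving keys.

```php
<?php

// Hypothetical operator: passes through every n-th element, starting with the first.
// This is an illustration only and does not implement the library's operator interface.
final class EveryNth implements \IteratorAggregate
{
    public function __construct(
        private readonly iterable $source,
        private readonly int $n,
    ) {
    }

    public function getIterator(): \Traversable
    {
        $position = 0;

        foreach ($this->source as $key => $value) {
            // Positions 0, n, 2n, ... are yielded; the original key is kept.
            if (0 === $position++ % $this->n) {
                yield $key => $value;
            }
        }
    }
}

$picked = [];
$source = ['a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5];

foreach (new EveryNth($source, 2) as $key => $value) {
    $picked[$key] = $value;
}
```

Because the operator is itself iterable, it can in turn serve as the source for another operator.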
Reducers
Reducers iterate over a data stream and reduce all elements into a single value
of any kind. Common examples of reducers include sum(), average(),
min(), and max(), all of which are provided by this library.
Reducers are also designed to be iterable, so they can be applied as aggregators (a concept introduced by this library, explained later in this document). This allows you to apply a reducer to a stream while still being able to iterate over it and obtain the reduced value at the same time.
<?php

use RunOpenCode\Component\Dataset\Stream;
use RunOpenCode\Component\Dataset\Reducer\Sum;

echo Stream::create([1, 3, 2, 5])
    ->reduce(Sum::class); // prints 11
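Conceptually, a reducer is a fold over the stream. The following sketch shows the idea in plain PHP; the function fold() is hypothetical and is not how the library implements its reducers.

```php
<?php

// Hypothetical fold: combines all values of an iterable into a single value.
function fold(iterable $source, callable $step, mixed $initial): mixed
{
    $carry = $initial;

    foreach ($source as $value) {
        $carry = $step($carry, $value);
    }

    return $carry;
}

// Sum: start from 0 and add each value.
$sum = fold([1, 3, 2, 5], fn (int $carry, int $value): int => $carry + $value, 0);

// Max: start from null and keep the larger value seen so far.
$max = fold(
    [1, 3, 2, 5],
    fn (?int $carry, int $value): int => null === $carry ? $value : max($carry, $value),
    null,
);
```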
Collectors
When operators (and aggregators) are applied to a stream, you can access the stream data simply by iterating over it.
Sometimes, however, you may want to collect all the data into a specific data structure for further processing using other methods.
The library supports this concept and provides common collectors, such as
RunOpenCode\Component\Dataset\Collector\ArrayCollector, which collects all
items into an array, or
RunOpenCode\Component\Dataset\Collector\ListCollector, which collects items
into a sequentially indexed array (a list), and more.
<?php

use RunOpenCode\Component\Dataset\Stream;
use RunOpenCode\Component\Dataset\Collector\ArrayCollector;

$generator = function(): iterable {
    yield 1;
    yield 3;
    yield 2;
    yield 5;
};

$collected = Stream::create($generator())
    ->collect(ArrayCollector::class);

var_dump($collected->value); // prints [0 => 1, 1 => 3, 2 => 2, 3 => 5]
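The difference between collecting with keys and collecting into a list matters when a source emits duplicate keys. The sketch below illustrates this with PHP's built-in iterator_to_array(); it makes no claim about how ArrayCollector or ListCollector handle this case, and the readings() function is made up for the example.

```php
<?php

// Yields the key 'a' twice, which generators allow.
function readings(): \Generator
{
    yield 'a' => 1;
    yield 'b' => 2;
    yield 'a' => 3;
}

// Each call to readings() creates a fresh generator, since one cannot be reused.

// Keys preserved: the later value for 'a' overwrites the earlier one.
$keyed = iterator_to_array(readings(), true);

// Keys discarded: all three values survive, reindexed from 0.
$list = iterator_to_array(readings(), false);
```

If the keys of your source are not guaranteed to be unique, a list-style collection avoids silently losing values.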
Aggregators
Aggregators are a concept introduced by this library. The general idea is that you can iterate over a stream with applied operators while simultaneously calculating a reduced value in a single pass.
This is useful, for example, when rendering a table of financial data and you want to display totals, averages, or similar summary values at the bottom of the table.
Aggregators are essentially reducers “attached” to a stream; their values can be accessed once the stream has been fully iterated.
<?php

use RunOpenCode\Component\Dataset\Stream;
use RunOpenCode\Component\Dataset\Reducer\Sum;

$stream = Stream::create([1, 3, 2, 5])
    ->aggregate('sum', Sum::class);

foreach ($stream as $item) {
    echo $item;
    echo "\n";
}

echo $stream->aggregated['sum'];
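The single-pass idea can be sketched in plain PHP as an iterable that forwards each item and updates a running total as a side effect. The class RunningSum is hypothetical and only illustrates the concept; it is not the library's aggregator implementation.

```php
<?php

// Hypothetical single-pass aggregator: yields every item unchanged
// and accumulates their sum while the consumer iterates.
final class RunningSum implements \IteratorAggregate
{
    public int|float $total = 0;

    public function __construct(private readonly iterable $source)
    {
    }

    public function getIterator(): \Traversable
    {
        $this->total = 0; // start from zero each time iteration begins

        foreach ($this->source as $key => $value) {
            $this->total += $value;

            yield $key => $value;
        }
    }
}

$rows = new RunningSum([1, 3, 2, 5]);

$seen = [];
foreach ($rows as $value) {
    $seen[] = $value; // e.g. render a table row here
}

// The total is complete only after the loop has finished.
$total = $rows->total;
```

Reading the total before the loop finishes would give a partial sum, which is why the aggregated value should be accessed only after full iteration.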
With an understanding of the concepts used in this library, you can now proceed with the rest of the documentation.