Aggregation Builder
On this page
- Overview
- Add the Aggregation Builder Dependency
- Create Aggregation Stages
- Sample Documents
- Match Stage Example
- Group Stage Example
- Sort Stage Example
- Project Stage Example
- Build Aggregation Pipelines
- Filter and Group Example
- Unwind Embedded Arrays Example
- Single Equality Join Example
- Create a Custom Operator Factory
Overview
In this guide, you can learn how to perform aggregations and construct pipelines by using the Laravel Integration aggregation builder. The aggregation builder lets you use a type-safe syntax to construct a MongoDB aggregation pipeline.
An aggregation pipeline is a data processing pipeline that sequentially performs transformations and computations on data from the MongoDB database, then outputs the results as a new document or set of documents.
An aggregation pipeline is composed of aggregation stages. Aggregation stages use operators to process input data and produce data that the next stage uses as its input.
The Laravel MongoDB aggregation builder lets you build aggregation stages and aggregation pipelines. The following sections show examples of how to use the aggregation builder to create the stages of an aggregation pipeline:
Tip
The aggregation builder feature is available only in Laravel MongoDB versions 4.3 and later. To learn more about running aggregations without using the aggregation builder, see Aggregations in the Query Builder guide.
Add the Aggregation Builder Dependency
The aggregation builder is part of the mongodb/builder package. You must add this package as a dependency to your project to use it. Run the following command to add the aggregation builder dependency to your application:
composer require mongodb/builder:^0.2
When the installation completes, verify that the composer.json
file
includes the following line in the require
object:
"mongodb/builder": "^0.2",
Create Aggregation Stages
To start an aggregation pipeline, call the Model::aggregate()
method.
Then, chain aggregation stage methods and specify the necessary
parameters for the stage. For example, you can call the sort()
operator method to build a $sort
stage.
The aggregation builder includes the following namespaces that you can import to build aggregation stages:
MongoDB\Builder\Accumulator
MongoDB\Builder\Expression
MongoDB\Builder\Query
MongoDB\Builder\Type
Tip
To learn more about builder classes, see the mongodb/mongodb-php-builder GitHub repository.
This section features the following examples that show how to use common aggregation stages:
To learn more about MongoDB aggregation operators, see Aggregation Stages in the Server manual.
Sample Documents
The following examples run aggregation pipelines on a collection represented
by the User
model. You can add the sample data by running the following
insert()
method:
User::insert([ ['name' => 'Alda Gröndal', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('2002-01-01'))], ['name' => 'Francois Soma', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1998-02-02'))], ['name' => 'Janet Doe', 'occupation' => 'designer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1987-03-03'))], ['name' => 'Eliud Nkosana', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1984-04-04'))], ['name' => 'Bran Steafan', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1998-05-05'))], ['name' => 'Ellis Lee', 'occupation' => 'designer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1996-06-06'))], ]);
Match Stage Example
You can chain the match()
method to your aggregation pipeline to specify
a query filter. If you omit this stage, the aggregate()
method outputs
all the documents in the model's collection for the following stage.
This aggregation stage is often placed first to retrieve the data by using available indexes and reduce the amount of data the subsequent stages process.
Tip
If you omit the match()
method, the aggregation pipeline matches all
documents in the collection that correspond to the model before other
aggregation stages.
This example constructs a query filter for a match aggregation stage by
using the MongoDB\Builder\Query
builder. The match stage includes the
following criteria:
Returns results that match either of the query filters by using the
Query::or()
functionMatches documents that contain an
occupation
field with a value of"designer"
by using theQuery::query()
andQuery::eq()
functionsMatches documents that contain a
name
field with a value of"Eliud Nkosana"
by using theQuery::query()
andQuery::eq()
functions
Click the VIEW OUTPUT button to see the documents returned by running the code:
$pipeline = User::aggregate() ->match(Query::or( Query::query(occupation: Query::eq('designer')), Query::query(name: Query::eq('Eliud Nkosana')), )); $result = $pipeline->get();
[ { "_id": ..., "name": "Janet Doe", "occupation": "designer", "birthday": { "$date": { "$numberLong": "541728000000" } } }, { "_id": ..., "name": "Eliud Nkosana", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "449884800000" } } }, { "_id": ..., "name": "Ellis Lee", "occupation": "designer", "birthday": { "$date": { "$numberLong": "834019200000" } } } ]
Tip
The Query::or()
function corresponds to the $or
MongoDB query operator.
To learn more about this operator, see $or
in the Server manual.
Group Stage Example
You can chain the group()
method to your aggregation pipeline to modify the
structure of the data by performing calculations and grouping it by common
field values.
This aggregation stage is often placed immediately after a match stage to reduce the data subsequent stages process.
This example uses the MongoDB\Builder\Expression
builder to define the group keys in a
group aggregation stage. The group stage specifies the following grouping
behavior:
Sets the value of the group key, represented by the
_id
field, to the field value defined by theExpression
builderReferences the document values in the
occupation
field by calling theExpression::fieldPath()
function
Click the VIEW OUTPUT button to see the documents returned by running the code:
$pipeline = User::aggregate() ->group(_id: Expression::fieldPath('occupation')); $result = $pipeline->get();
[ { "_id": "engineer" }, { "_id": "designer" } ]
Tip
This example stage performs a similar task as the distinct()
query
builder method. To learn more about the distinct()
method, see the
Retrieve Distinct Field Values usage example.
Sort Stage Example
You can chain the sort()
method to your aggregation pipeline to specify
the documents' output order.
You can add this aggregation stage anywhere in the pipeline. It is often placed after the group stage since it can depend on the grouped data. We recommend placing the sort stage as late as possible in the pipeline to limit the data it processes.
To specify an sort, set the field value to the Sort::Asc
enum for an
ascending sort or the Sort::Desc
enum for a descending sort.
This example shows a sort()
aggregation pipeline stage that sorts the
documents by the name
field to Sort::Desc
, corresponding to reverse
alphabetical order. Click the VIEW OUTPUT button to see
the documents returned by running the code:
$pipeline = User::aggregate() ->sort(name: Sort::Desc); $result = $pipeline->get();
[ { "_id": ..., "name": "Janet Doe", "occupation": "designer", "birthday": { "$date": { "$numberLong": "541728000000" } } }, { "_id": ..., "name": "Francois Soma", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "886377600000" } } }, { "_id": ..., "name": "Ellis Lee", "occupation": "designer", "birthday": { "$date": { "$numberLong": "834019200000" } } }, { "_id": ..., "name": "Eliud Nkosana", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "449884800000" } } }, { "_id": ..., "name": "Bran Steafan", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "894326400000" } } }, { "_id": ..., "name": "Alda Gröndal", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "1009843200000" } } } ]
Project Stage Example
You can chain the project()
method to your aggregation pipeline to specify
which fields from the documents to display by this stage.
To specify fields to include, pass the name of a field and a truthy value,
such as 1
or true
. All other fields are omitted from the output.
Alternatively, to specify fields to exclude, pass each field name and
a falsy value, such as 0
or false
. All other fields are included in
the output.
Tip
When you specify fields to include, the _id
field is included by default.
To exclude the _id
field, explicitly exclude it in the projection stage.
This example shows how to use the project()
method aggregation stage to
include only the name
field and exclude all other fields from the output.
Click the VIEW OUTPUT button to see the data returned by
running the code:
$pipeline = User::aggregate() ->project(_id: 0, name: 1); $result = $pipeline->get();
[ { "name": "Alda Gröndal" }, { "name": "Francois Soma" }, { "name": "Janet Doe" }, { "name": "Eliud Nkosana" }, { "name": "Bran Steafan" }, { "name": "Ellis Lee" } ]
Build Aggregation Pipelines
To build an aggregation pipeline, call the Model::aggregate()
method,
then chain the aggregation stages in the sequence you want them to
run. The examples in this section are adapted from the Server manual.
Each example provides a link to the sample data that you can insert into
your database to test the aggregation operation.
This section features the following examples, which show how to use common aggregation stages:
Filter and Group Example
This example uses the sample data given in the Calculate Count,
Sum, and Average
section of the $group
stage reference in the Server manual.
The following code example calculates the total sales amount, average sales quantity, and sale count for each day in the year 2014. To do so, it uses an aggregation pipeline that contains the following stages:
$match stage to filter for documents that contain a
date
field in which the year is 2014$group stage to group the documents by date and calculate the total sales amount, average sales quantity, and sale count for each group
$sort stage to sort the results by the total sale amount for each group in descending order
Click the VIEW OUTPUT button to see the data returned by running the code:
$pipeline = Sale::aggregate() ->match( date: [ Query::gte(new UTCDateTime(new DateTimeImmutable('2014-01-01'))), Query::lt(new UTCDateTime(new DateTimeImmutable('2015-01-01'))), ], ) ->group( _id: Expression::dateToString(Expression::dateFieldPath('date'), '%Y-%m-%d'), totalSaleAmount: Accumulator::sum( Expression::multiply( Expression::numberFieldPath('price'), Expression::numberFieldPath('quantity'), ), ), averageQuantity: Accumulator::avg( Expression::numberFieldPath('quantity'), ), count: Accumulator::sum(1), ) ->sort( totalSaleAmount: Sort::Desc, );
[ { "_id": "2014-04-04", "totalSaleAmount": { "$numberDecimal": "200" }, "averageQuantity": 15, "count": 2 }, { "_id": "2014-03-15", "totalSaleAmount": { "$numberDecimal": "50" }, "averageQuantity": 10, "count": 1 }, { "_id": "2014-03-01", "totalSaleAmount": { "$numberDecimal": "40" }, "averageQuantity": 1.5, "count": 2 } ]
Unwind Embedded Arrays Example
This example uses the sample data given in the Unwind Embedded Arrays
section of the $unwind
stage reference in the Server manual.
The following code example groups sold items by their tags and calculates the total sales amount for each tag. To do so, it uses an aggregation pipeline that contains the following stages:
$unwind stage to output a separate document for each element in the
items
array$unwind stage to output a separate document for each element in the
items.tags
arrays$group stage to group the documents by the tag value and calculate the total sales amount of items that have each tag
Click the VIEW OUTPUT button to see the data returned by running the code:
$pipeline = Sale::aggregate() ->unwind(Expression::arrayFieldPath('items')) ->unwind(Expression::arrayFieldPath('items.tags')) ->group( _id: Expression::fieldPath('items.tags'), totalSalesAmount: Accumulator::sum( Expression::multiply( Expression::numberFieldPath('items.price'), Expression::numberFieldPath('items.quantity'), ), ), );
[ { "_id": "school", "totalSalesAmount": { "$numberDecimal": "104.85" } }, { "_id": "electronics", "totalSalesAmount": { "$numberDecimal": "800.00" } }, { "_id": "writing", "totalSalesAmount": { "$numberDecimal": "60.00" } }, { "_id": "office", "totalSalesAmount": { "$numberDecimal": "1019.60" } }, { "_id": "stationary", "totalSalesAmount": { "$numberDecimal": "264.45" } } ]
Single Equality Join Example
This example uses the sample data given in the Perform a Single
Equality Join with $lookup
section of the $lookup
stage reference in the Server manual.
The following code example joins the documents from the orders
collection with the documents from the inventory
collection by using
the item
field from the orders
collection and the sku
field
from the inventory
collection.
To do so, the example uses an aggregation pipeline that contains a $lookup stage that specifies the collection to retrieve data from and the local and foreign field names.
Click the VIEW OUTPUT button to see the data returned by running the code:
$pipeline = Order::aggregate() ->lookup( from: 'inventory', localField: 'item', foreignField: 'sku', as: 'inventory_docs', );
[ { "_id": 1, "item": "almonds", "price": 12, "quantity": 2, "inventory_docs": [ { "_id": 1, "sku": "almonds", "description": "product 1", "instock": 120 } ] }, { "_id": 2, "item": "pecans", "price": 20, "quantity": 1, "inventory_docs": [ { "_id": 4, "sku": "pecans", "description": "product 4", "instock": 70 } ] }, { "_id": 3, "inventory_docs": [ { "_id": 5, "sku": null, "description": "Incomplete" }, { "_id": 6 } ] } ]
Create a Custom Operator Factory
When using the aggregation builder to create an aggregation pipeline, you can define operations or stages in a custom operator factory. A custom operator factory is a function that returns expressions or stages of an aggregation pipeline. You can create these functions to improve code readability and reuse.
This example shows how to create and use a custom operator factory that returns expressions that extract the year from a specified date field.
The following function accepts the name of a field that contains a date and returns an expression that extracts the year from the date:
public function yearFromField(string $dateFieldName): YearOperator { return Expression::year( Expression::dateFieldPath($dateFieldName), ); }
The example aggregation pipeline includes the following stages:
addFields()
, which calls the custom operator factory function to extract the year from thebirthday
field and assign it to thebirth_year
fieldproject()
, which includes only thename
andbirth_year
fields in its output
Click the VIEW OUTPUT button to see the data returned by running the code:
$pipeline = User::aggregate() ->addFields(birth_year: $this->yearFromField('birthday')) ->project(_id: 0, name: 1, birth_year: 1);
[ { "name": "Alda Gröndal", "birth_year": 2002 }, { "name": "Francois Soma", "birth_year": 1998 }, { "name": "Janet Doe", "birth_year": 1987 }, { "name": "Eliud Nkosana", "birth_year": 1984 }, { "name": "Bran Steafan", "birth_year": 1998 }, { "name": "Ellis Lee", "birth_year": 1996 } ]