.. Copyright (C) 2014 Champs Libres Cooperative SCRLFS Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License". Exports ******* Export is an important issue within the Chill software : users should be able to : - compute statistics about their activity ; - list "things" which are a part of their activities. The `main bundle`_ provides a powerful framework to build custom queries with re-usable parts across differents bundles. .. contents:: Table of content :local: .. seealso:: `The issue where this framework was discussed `_ Provides some information about the pursued features and architecture. Concepts ======== Some vocabulary: 3 "Export elements" ------------------------------------ Four terms are used for this framework : Exports provide some basic operation on the data. Two kinds of exports are available : - computed data : it may be "the number of people", "the number of activities", "the duration of activities", ... - list data : it may be "the list of people", "the list of activities", ... Filters The filters create a filter on the data: it removes some information the user doesn't want to introduce in the computation done by the export. Example of a filter: "people under 18 years olds", "activities between the 1st of June and the 31st December", ... Aggregators The aggregator aggregates the data into some group (some software use the term 'bucket'). Example of an aggregator : "group people by gender", "group people by nationality", "group activity by type", ... Formatters The formatters format the data into a :class:`Symfony\Component\HttpFoundation\Response`, which will be returned "as is" by the controller to the web client. Example of a formatter: "format data as CSV", "format data as an ods spreadsheet", ... Anatomy of an export --------------------- An export can be thought of as a sentence where each part of this sentence refers to one or multiple export elements. Examples : **Example 1**: Count the number of people having at least one activity in the last 12 month, and group them by nationality and gender, and format them in a CSV spreadsheet. Here : - *count the number of people* is the export part - *having at least one activity* is the filter part - *group them by nationality* is the aggregator part - *group them by gender* is a second aggregator part - *format the date in a CSV spreadsheet* is the formatter part Note that : - Aggregators, filters, exports and formatters are cross-bundle. Here the bundle *activity* provides a filter which is applied on an export provided by the person bundle ; - Multiple aggregator or filter for one export may exist. Currently, only one export is allowed. The result might be : +-----------------------+----------------+---------------------------+ | Nationality | Gender | Number of people | +=======================+================+===========================+ | Russian | Male | 12 | +-----------------------+----------------+---------------------------+ | Russian | Female | 24 | +-----------------------+----------------+---------------------------+ | France | Male | 110 | +-----------------------+----------------+---------------------------+ | France | Female | 150 | +-----------------------+----------------+---------------------------+ **Example 2**: Count the average duration of an activity with type "meeting", which occurs between the 1st of June and the 31st of December, group them by week, and format the data in an OpenDocument spreadsheet. Here : - *count the average duration of an activity* is the export part - *activity with type meeting* is a filter part - *activity which occurs between the 1st of June and the 31st of December* is a filter - *group them by week* is the aggregator part - *format the date in an OpenDocument spreadsheet* is the formatter part The result might be : +-----------------------+----------------------+ | Week | Number of activities | +=======================+======================+ | 2015-10 | 10 | +-----------------------+----------------------+ | 2015-11 | 12 | +-----------------------+----------------------+ | 2015-12 | 10 | +-----------------------+----------------------+ | 2015-13 | 9 | +-----------------------+----------------------+ Authorization and exports ------------------------- Exports, filters and aggregators should not show data the user is not allowed to see within the application. In other words, developers are required to take care of user authorization for each export. There should be a specific role that grants permission to users who are allowed to build exports. For more simplicity, this role should apply on a center, and should not require special circles. How does the magic work ? =========================== To build an export, we rely on the capacity of the database to execute queries with aggregate (i.e. GROUP BY) and filter (i.e. WHERE) instructions. An export is an SQL query which is initiated by an export, and modified by aggregators and filters. .. note:: **Example**: Count the number of people having at least one activity in the last 12 month, and group them by nationality and gender 1. The report initiates the query .. code-block:: SQL SELECT count(people.*) FROM people 2. The filter adds a where and join clause : .. code-block:: SQL SELECT count(people.*) FROM people RIGHT JOIN activity WHERE activity.date IS BETWEEN now AND 6 month ago 3. The aggregator "nationality" adds a GROUP BY clause and a column in the SELECT statement: .. code-block:: sql SELECT people.nationality, count(people.*) FROM people RIGHT JOIN activity WHERE activity.date IS BETWEEN now AND 6 month ago GROUP BY nationality 4. The aggregator "gender" does the same job as the nationality aggregator : it adds a GROUP BY clause and a column in the SELECT statement : .. code-block:: sql SELECT people.nationality, people.gender, count(people.*) FROM people RIGHT JOIN activity WHERE activity.date IS BETWEEN now AND 6 month ago GROUP BY nationality, gender Each filter, aggregator and filter may collect parameters from the user through a form. This form is appended to the export form. Here is an example. .. figure:: /_static/screenshots/development/export_form-fullpage.png The screenshot shows the export form for ``CountPeople`` (Nombre de personnes). The filter by date of birth is checked (*Filtrer par date de naissance de la personne*), which triggers a subform, which is provided by the :class:`Chill\PersonBundle\Export\Filter\BirthdateFilter`. The other unchecked filter does not show the subform. Two aggregators are also checked : by Country of birth (*Aggréger les personnes par pays de naissance*, the corresponding class is :class:`Chill\PersonBundle\Export\Aggregator\CountryOfBirthAggregator`, which also triggers a subform. The aggregator by gender (*Aggréger les personnes par genre*) is also checked, but there is no corresponding subform. The Export Manager ------------------ The Export manager (:class:`Chill\MainBundle\Export\ExportManager` is the central class which registers all exports, aggregators, filters and formatters. The export manager is also responsible for orchestrating the whole export process, producing a :class:`Symfony\FrameworkBundle\HttpFoundation\Request` for each export request. The export form step -------------------- The form step allows you to build a form, combining different parts of the module. The building of forms is split into different subforms, where each one is responsible for rendering their part of the form (aggregators, filters, and export). .. figure:: /_static/puml/exports/form_steps.png :scale: 40% The formatter form step ----------------------- The formatter form is processed *after* the user filled the export form. It is built the same way, but receives the data entered by the user on the previous step as parameters (i.e. export form). It may then adapt it accordingly (example: show a list of columns selected in aggregators). Processing the export --------------------- The export process can be explained by this schema : .. figure:: /_static/puml/exports/processing_export.png :scale: 40% (Click to enlarge) Export, formatters and filters explained ======================================== Exports ------- This is an example of the ``CountPerson`` export : .. literalinclude:: /_static/code/exports/CountPerson.php :language: php :linenos: * **Line 36**: the ``getType`` function returns a string. This string will be used to find the aggregtors and filters which will apply to this export. * **Line 41**: a simple description to help users understand what your export does. * **Line 46**: The title of the export. A summary of what your export does. * **Line 51**: The list of roles required to execute this export. * **Line 56**: We initiate the query here... * **Line 59**: We have to filter the query with centers the users checked in the form. We process the $acl variable to get all ``Center`` objects in one array * **Line 63**: We create the query with a query builder. * **Line 74**: We return the result, but make sure to hydrate the results as an array. * **Line 103**: return the list of formatter types which are allowed to be applied on this filter Filters ------- This is an example of the *filter by birthdate*. This filter asks some information through a form (`buildForm` is not empty), and this form must be validated. To perform this validation, we implement a new Interface: :class:`Chill\MainBundle\Export\ExportElementValidatedInterface`: .. literalinclude:: /_static/code/exports/BirthdateFilter.php :language: php .. todo:: Continue to explain the export framework .. _main bundle: https://git.framasoft.org/Chill-project/Chill-Main With many-to-* relationship, why should we set WHERE clauses in an EXISTS subquery instead of a JOIN ? `````````````````````````````````````````````````````````````````````````````````````````````````````` As we described above, the doctrine builder is converted into a sql query. Let's see how to compute the "number of course which count at least one activity type with the id 7". For the purpose of this demonstration, we will restrict this on two accompanying period only: the ones with id 329 and 334. Let's see the list of activities associated with those accompanying period: .. code-block:: sql SELECT id, accompanyingperiod_id, type_id FROM activity WHERE accompanyingperiod_id IN (329, 334) AND type_id = 7 ORDER BY accompanyingperiod_id; We see that we have 6 activities for the accompanying period with id 329, and only one for the 334's one. .. csv-table:: :header: id, accompanyingperiod_id, type_id 990,329,7 986,329,7 987,329,7 993,329,7 991,329,7 992,329,7 1000,334,7 Let's calculate the average duration for those accompanying periods, and the number of period: .. code-block:: sql SELECT AVG(age(COALESCE(closingdate, CURRENT_DATE), openingdate)), COUNT(id) from chill_person_accompanying_period WHERE id IN (329, 334); The result of this query is: .. csv-table:: :header: AVG, COUNT 2 years 2 mons 21 days 12 hours 0 mins 0.0 secs,2 Now, we count the number of accompanying period, adding a :code:`JOIN` clause which make a link to the :code:`activity` table, and add a :code:`WHERE` clause to keep only the accompanying period which contains the given activity type: .. code-block:: sql SELECT COUNT(chill_person_accompanying_period.id) from chill_person_accompanying_period JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7; What are the results here ? .. csv-table:: :header: COUNT 7 :code:`7` ! Why this result ? Because the number of lines is duplicated for each activity. Let's see the list of rows which are taken into account for the computation: .. code-block:: sql SELECT chill_person_accompanying_period.id, activity.id from chill_person_accompanying_period JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7; .. csv-table:: :header: accompanyingperiod.id, activity.id 329,993 334,1000 329,987 329,990 329,991 329,992 329,986 For each activity, a row is created and, as we count the number of non-null :code:`accompanyingperiod.id` columns, we count one entry for each activity (actually, we count the number of activities). So, let's use the :code:`DISTINCT` keyword to count only once the equal ids: .. code-block:: SELECT COUNT(DISTINCT chill_person_accompanying_period.id) from chill_person_accompanying_period JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7; Now, it works again... .. csv-table:: :header: COUNT 2 But, for the average duration, this won't work: the duration which are equals (because the :code:`openingdate` is the same and :code:`closingdate` is still :code:`NULL`, for instance) will be counted only once, which will give unexpected result. The solution is to move the condition "having an activity with activity type with id 7" in a :code:`EXISTS` clause: .. code-block:: sql SELECT COUNT(chill_person_accompanying_period.id) from chill_person_accompanying_period WHERE chill_person_accompanying_period.id IN (329, 334) AND EXISTS (SELECT 1 FROM activity WHERE type_id = 7 AND accompanyingperiod_id = chill_person_accompanying_period.id); The result is correct without :code:`DISTINCT` keyword: .. csv-table:: :header: COUNT 2 And we can now compute the average duration without fear: .. code-block:: sql SELECT AVG(age(COALESCE(closingdate, CURRENT_DATE), openingdate)) from chill_person_accompanying_period WHERE chill_person_accompanying_period.id IN (329, 334) AND EXISTS (SELECT 1 FROM activity WHERE type_id = 7 AND accompanyingperiod_id = chill_person_accompanying_period.id); Give the result: .. csv-table:: :header: AVG 2 years 2 mons 21 days 12 hours 0 mins 0.0 secs