Package eu.cdevreeze.tqa.base

Value Members

  1. package common

    Classes that are common to the DOM and model packages, typically "enumerations".

  2. package dom

    Type-safe XBRL taxonomy DOM API. This contains DOM-like elements in taxonomy documents. It offers the yaidom query API and more, and wraps an underlying element that itself offers the yaidom query API, whatever the underlying element implementation. This package does not offer any taxonomy abstractions that cross any document boundaries, except for the light-weight TaxonomyBase abstraction at the type-safe DOM level.

    This package mainly contains:

    This package has no knowledge of XPath processing and no dependency on it.

    Usage

    Suppose we have an eu.cdevreeze.tqa.base.dom.XsdSchema called schema. Then we can find all global element declarations in this schema as follows:

    import scala.reflect.classTag
    import eu.cdevreeze.tqa.ENames
    import eu.cdevreeze.tqa.base.dom.GlobalElementDeclaration
    
    // Low level yaidom query, returning the result XML elements as TaxonomyElem elements
    val globalElemDecls1 = schema.filterChildElems(_.resolvedName == ENames.XsElementEName)
    
    // Higher level yaidom query, querying for the type GlobalElementDeclaration
    // Prefer this to the lower level yaidom query above
    val globalElemDecls2 = schema.findAllChildElemsOfType(classTag[GlobalElementDeclaration])
    
    // The following query would have given the same result, because all global element declarations
    // are child elements of the schema root. Instead of child elements, we now query for all
    // descendant-or-self elements that are global element declarations
    val globalElemDecls3 = schema.findAllElemsOrSelfOfType(classTag[GlobalElementDeclaration])
    
    // We can query the schema for global element declarations directly, so let's do that
    val globalElemDecls4 = schema.findAllGlobalElementDeclarations

    Global element declarations in isolation do not know if they are (item or tuple) concept declarations. In order to turn them into concept declarations, we need a SubstitutionGroupMap as context. For example:

    import eu.cdevreeze.tqa.base.dom.ConceptDeclaration
    
    // One-time creation of a ConceptDeclaration builder
    // (substitutionGroupMap is assumed to be in scope, for example obtained from a taxonomy)
    val conceptDeclBuilder = new ConceptDeclaration.Builder(substitutionGroupMap)
    
    val globalElemDecls = schema.findAllGlobalElementDeclarations
    
    val conceptDecls = globalElemDecls.flatMap(decl => conceptDeclBuilder.optConceptDeclaration(decl))
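
    To illustrate why a substitution group map is needed as context, here is a minimal, self-contained sketch that does not use TQA itself. It assumes that element names are plain strings and that the map goes from an element name to its substitution group head; SubstitutionGroupSketch, ConceptKind and conceptKind are hypothetical names, not part of the TQA API.

```scala
object SubstitutionGroupSketch {
  sealed trait ConceptKind
  case object Item extends ConceptKind
  case object Tuple extends ConceptKind

  val XbrliItem = "xbrli:item"
  val XbrliTuple = "xbrli:tuple"

  // Follows the substitutionGroup chain until it reaches xbrli:item or xbrli:tuple.
  // Returns None for a non-concept, a chain that ends elsewhere, or a cycle.
  def conceptKind(
      substitutionGroupOption: Option[String],
      parents: Map[String, String]): Option[ConceptKind] = {

    @annotation.tailrec
    def loop(current: Option[String], visited: Set[String]): Option[ConceptKind] =
      current match {
        case Some(XbrliItem) => Some(Item)
        case Some(XbrliTuple) => Some(Tuple)
        case Some(sg) if !visited.contains(sg) => loop(parents.get(sg), visited + sg)
        case _ => None
      }

    loop(substitutionGroupOption, Set.empty)
  }
}
```

    A declaration whose substitution group is a custom element can only be recognized as an item or tuple if the map knows how that custom element derives from xbrli:item or xbrli:tuple, possibly via declarations in other documents.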

    However, most TQA client code does not start with this package. It works with entire taxonomies rather than individual type-safe DOM trees, and with relationships rather than the underlying XLink arcs.

    Leniency

    See the remarks on leniency for type TaxonomyElem and its sub-types. This type-safe XBRL DOM model has been designed to be very lenient on instantiation of the model. Therefore this TQA type-safe DOM model can also be used for validating potentially erroneous taxonomy documents.

    On the other hand, if the instantiated model cannot be trusted to be schema-valid, one should be careful in choosing the API calls that can safely be made on schema-invalid taxonomy content. Yaidom API level query methods that return collections or options are typically safe to use on potentially schema-invalid taxonomy content.

    Other remarks

    To get the model right, there are many sources to look at for inspiration. First of all, for schema content there are APIs like the Xerces schema API (https://xerces.apache.org/xerces2-j/javadocs/xs/org/apache/xerces/xs/XSModel.html). Also have a look at http://www.datypic.com/sc/xsd/s-xmlschema.xsd.html for the schema of XML Schema itself. Moreover, there are many XBRL APIs that model (instance and) taxonomy data.

    On the other hand, this API (and the entirety of TQA) has its own design. Briefly, it starts bottom-up with yaidom, and gradually offers higher level (partial) abstractions on top of that. It does not hide yaidom, however.

    TODO Support for built-in schema types, and built-in XBRL types.

  3. package queryapi

    Traits offering parts of a taxonomy query API. They can be assembled into "taxonomy classes". There are purely abstract query API traits, and partial implementations of those traits.

    Examples of such traits are traits for querying schema content, for querying inter-concept relationships, for querying dimensional relationships in particular, etc. These traits combined form the eu.cdevreeze.tqa.base.queryapi.TaxonomyApi query API. The partial implementations combined form the eu.cdevreeze.tqa.base.queryapi.TaxonomyLike trait, which implements most of the TaxonomyApi query API.

    Most query API methods are quite forgiving when the taxonomy is incomplete or incorrect. They just return the queried data to the extent that it is found. Only the getXXX methods that expect precisely one result will throw an exception if no (single) result is found.
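
    The difference between forgiving and strict query methods can be sketched as follows. This is a toy model under stated assumptions, not TQA code: ToyTaxonomy, Decl, filterDecls, findDecl and getDecl are hypothetical names that only mirror the naming convention described above.

```scala
object LenientQuerySketch {
  final case class Decl(targetEName: String)

  final class ToyTaxonomy(decls: Seq[Decl]) {
    private val byName: Map[String, Decl] =
      decls.map(d => d.targetEName -> d).toMap

    // Forgiving: returns a collection, which is empty if nothing matches
    def filterDecls(p: Decl => Boolean): Seq[Decl] = decls.filter(p)

    // Forgiving: returns an optional result
    def findDecl(name: String): Option[Decl] = byName.get(name)

    // Strict: expects precisely one result, and throws an exception otherwise
    def getDecl(name: String): Decl =
      findDecl(name).getOrElse(sys.error(s"Missing declaration: $name"))
  }
}
```

    On incomplete taxonomy content, the collection-returning and option-returning methods degrade gracefully, whereas the strict variant is only appropriate when the result is known to exist.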

    Ideally, the taxonomy query API is very easy to use for XBRL taxonomy scripting tasks in a Scala REPL. It must also be easy to mix taxonomy query API traits, and to compose taxonomy implementations that know about specific relationship types (such as in formulas or tables), or that store specific data that is queried quite often.

    TQA (except the "richtaxonomy" namespace) has no knowledge about XPath, so any XPath in the taxonomy is just text, as far as TQA is concerned.

    This package unidirectionally depends on the eu.cdevreeze.tqa.base.relationship and eu.cdevreeze.tqa.base.dom packages.

    Usage

    In the following examples, assume that we have a taxonomy of type eu.cdevreeze.tqa.base.queryapi.TaxonomyApi, for example an eu.cdevreeze.tqa.base.taxonomy.BasicTaxonomy. It may or may not be closed under "DTS discovery rules". The examples show how the taxonomy query API, along with the types in packages eu.cdevreeze.tqa.base.relationship and eu.cdevreeze.tqa.base.dom, can be used to query XBRL taxonomies.

    Suppose we want to query the taxonomy for all English verbose concept labels, grouped by the concept target EName. Note that the "target EName" of a concept declaration is the name attribute combined with the target namespace of the schema, as a yaidom EName. Here is how we can get the concept labels:

    import scala.reflect.classTag
    import eu.cdevreeze.yaidom.core.EName
    import eu.cdevreeze.tqa.base.relationship.ConceptLabelRelationship
    
    val concepts: Set[EName] =
      taxonomy.findAllConceptDeclarations.map(_.targetEName).toSet
    
    val conceptLabelRelationshipsByConceptEName = (concepts.toIndexedSeq map { conceptEName =>
      val conceptLabelRels =
        taxonomy.filterOutgoingConceptLabelRelationships(conceptEName) { rel =>
    
          rel.language == "en" && rel.resourceRole == "http://www.xbrl.org/2003/role/verboseLabel"
        }
    
      (conceptEName -> conceptLabelRels)
    }).toMap
    
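    // Using map rather than mapValues, so that a strict Map is returned (also on Scala 2.13)
    val verboseEnConceptLabels: Map[EName, Set[String]] =
      conceptLabelRelationshipsByConceptEName map { case (conceptEName, rels) =>
        conceptEName -> rels.map(_.labelText).toSet
      }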

    In the example above, each concept should have at most one English verbose label, unless relationship prohibition/overriding is used. Validating this is left as an exercise for the reader. Note that the example mainly relies on the following knowledge about XBRL taxonomies: we get concept labels by querying for concept-label relationships, which are standard relationships.
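
    One possible shape for such a validation is sketched below. It is a generic helper over plain Scala collections and assumes nothing about TQA; the name keysWithMultipleValues is hypothetical.

```scala
object LabelCheckSketch {
  // Returns the keys that have more than one value, e.g. the concepts
  // that have more than one label relationship for a given language and role
  def keysWithMultipleValues[K, V](valuesByKey: Map[K, Seq[V]]): Set[K] =
    valuesByKey.collect { case (key, values) if values.size > 1 => key }.toSet
}
```

    Applied to a map like conceptLabelRelationshipsByConceptEName, an empty result would mean that no concept has more than one English verbose label. Counting relationships rather than distinct label texts avoids hiding duplicates that happen to have the same text.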

    Now suppose we want to find all English terse concept labels, grouped by the concept target EName, but only for concrete primary items. So we exclude abstract concepts, tuples, hypercubes and dimensions. Here is how:

    val concepts: Set[EName] =
      taxonomy.filterPrimaryItemDeclarations(_.isConcrete).map(_.targetEName).toSet
    
    val conceptLabelRelationshipsByConceptEName = (concepts.toIndexedSeq map { conceptEName =>
      val conceptLabelRels =
        taxonomy.filterOutgoingConceptLabelRelationships(conceptEName) { rel =>
    
          rel.language == "en" && rel.resourceRole == "http://www.xbrl.org/2003/role/terseLabel"
        }
    
      (conceptEName -> conceptLabelRels)
    }).toMap
    
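    // Using map rather than mapValues, so that a strict Map is returned (also on Scala 2.13)
    val terseEnConceptLabels: Map[EName, Set[String]] =
      conceptLabelRelationshipsByConceptEName map { case (conceptEName, rels) =>
        conceptEName -> rels.map(_.labelText).toSet
      }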

    To simulate how TQA retrieves concrete primary item declarations, we could write more verbosely:

    import eu.cdevreeze.tqa.base.dom.ConceptDeclaration
    import eu.cdevreeze.tqa.base.dom.PrimaryItemDeclaration
    
    val substitutionGroupMap = taxonomy.substitutionGroupMap
    val conceptDeclarationBuilder = new ConceptDeclaration.Builder(substitutionGroupMap)
    
    val concepts: Set[EName] =
      taxonomy.filterGlobalElementDeclarations(_.isConcrete).
        flatMap(decl => conceptDeclarationBuilder.optConceptDeclaration(decl)).
        collect({ case decl: PrimaryItemDeclaration => decl }).map(_.targetEName).toSet

    To simulate how TQA filters the concept label relationships we are interested in, we could write more verbosely:

    import eu.cdevreeze.tqa.ENames
    
    // Falling back to more general method filterOutgoingStandardRelationshipsOfType
    
    val conceptLabelRelationshipsByConceptEName = (concepts.toIndexedSeq map { conceptEName =>
      val conceptLabelRels =
        taxonomy.filterOutgoingStandardRelationshipsOfType(
          conceptEName,
          classTag[ConceptLabelRelationship]) { rel =>
    
          rel.resolvedTo.resolvedElem.attribute(ENames.XmlLangEName) == "en" &&
            rel.resolvedTo.resolvedElem.attributeOption(ENames.XLinkRoleEName).contains("http://www.xbrl.org/2003/role/terseLabel")
        }
    
      (conceptEName -> conceptLabelRels)
    }).toMap
    
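    // Using map rather than mapValues, so that a strict Map is returned (also on Scala 2.13)
    val terseEnConceptLabels: Map[EName, Set[String]] =
      conceptLabelRelationshipsByConceptEName map { case (conceptEName, rels) =>
        conceptEName -> rels.map(_.resolvedTo.resolvedElem.text).toSet
      }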

    Suppose we want to query the taxonomy for the parent-child presentation hierarchies in some custom ELR (extended link role). Note that the result can come from multiple linkbase documents, and TQA takes care of that if we query for parent-child relationships instead of querying for the underlying XLink presentation arcs. Here is how we get the top two levels, assuming that customElr holds the URI of the ELR as a string:

    import scala.collection.immutable
    import eu.cdevreeze.tqa.base.relationship.ParentChildRelationship
    
    val parentChildRelationships =
      taxonomy.filterParentChildRelationships(_.elr == customElr)
    
    val topLevelConcepts: Set[EName] =
      parentChildRelationships.map(_.sourceConceptEName).toSet.diff(
        parentChildRelationships.map(_.targetConceptEName).toSet)
    
    val topLevelParentChildren: Map[EName, immutable.IndexedSeq[EName]] =
      (topLevelConcepts.toIndexedSeq map { conceptEName =>
        val parentChildren =
          taxonomy.filterOutgoingParentChildRelationships(conceptEName)(_.elr == customElr)
    
        val childENames = parentChildren.sortBy(_.order).map(_.targetConceptEName)
    
        (conceptEName -> childENames)
      }).toMap

    These examples only scratch the surface of what is possible. Dimensional relationship queries, for instance, are typically more interesting than the queries shown above. See eu.cdevreeze.tqa.base.queryapi.DimensionalRelationshipContainerApi for the dimensional query API that is part of eu.cdevreeze.tqa.base.queryapi.TaxonomyApi.

    Notes on performance

    The performance characteristics of the eu.cdevreeze.tqa.base.queryapi.TaxonomyApi trait and its implementations partially depend on the concrete "taxonomy" class used. Still, we can say in general that:

    • Querying for concept declarations and schema components in general, based on the "target EName", is very fast.
    • Other than that, querying for concept declarations and schema components in general is slow.
    • Querying for outgoing and incoming standard relationships, given a concept EName, is very fast.
    • Other than that, querying for relationships is in general slow.

    In other words, in "inner loops", do not query for taxonomy content other than querying based on specific concept ENames! Note that in the examples above, we started with a slow query, and used fast queries based on concept ENames after that. Keep in mind that in taxonomies with millions of relationships the slow queries may have to process collections of all those relationships.
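
    The difference between the two kinds of queries can be illustrated with a small, self-contained sketch. It assumes a toy Rel type with string names and says nothing about how TQA indexes its data internally; IndexingSketch, outgoingByScan and indexBySource are hypothetical names.

```scala
object IndexingSketch {
  final case class Rel(source: String, target: String)

  // "Slow" query: scans the entire relationship collection on every call
  def outgoingByScan(rels: Seq[Rel], source: String): Seq[Rel] =
    rels.filter(_.source == source)

  // One-time indexing step, after which lookups by source concept are cheap
  def indexBySource(rels: Seq[Rel]): Map[String, Seq[Rel]] =
    rels.groupBy(_.source)
}
```

    Calling outgoingByScan once per concept costs time proportional to the number of concepts times the number of relationships. Building the index once and then looking up each concept in it avoids that, which is the reason to keep "inner loops" restricted to queries by concept EName.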

  4. package relationship

    This package contains relationships. Relationships are like their underlying arcs, but with the locators resolved. Note that one arc may represent more than one relationship.
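
    Why one arc can yield several relationships is sketched below. This is a simplified model, not TQA code: it assumes that an arc refers to XLink labels, that several locators or resources in the same extended link may share a label, and that each label has already been resolved to plain strings. The names ArcExpansionSketch, Arc, Rel and relationships are hypothetical.

```scala
object ArcExpansionSketch {
  final case class Arc(from: String, to: String)
  final case class Rel(source: String, target: String)

  // resolvedByLabel maps an XLink label to everything that carries that label
  // (for example resolved concept names, or label resources).
  // The arc represents one relationship per (source, target) combination.
  def relationships(arc: Arc, resolvedByLabel: Map[String, Seq[String]]): Seq[Rel] =
    for {
      source <- resolvedByLabel.getOrElse(arc.from, Seq.empty)
      target <- resolvedByLabel.getOrElse(arc.to, Seq.empty)
    } yield Rel(source, target)
}
```

    With two locators sharing the "from" label and one resource carrying the "to" label, this single arc expands into two relationships. An arc referring to an unknown label expands into none.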

    This package mainly contains:

    Relationship factories extract relationships from an eu.cdevreeze.tqa.base.dom.TaxonomyBase. They can be used directly, but typically they are used implicitly when creating an eu.cdevreeze.tqa.base.taxonomy.BasicTaxonomy.

    This package has no knowledge of XPath processing and no dependency on it.

    For the usage of this API, see packages eu.cdevreeze.tqa.base.queryapi and eu.cdevreeze.tqa.base.taxonomy.

    This package unidirectionally depends on the eu.cdevreeze.tqa.base.dom package.

  5. package taxonomy

    Taxonomy classes, containing type-safe DOM trees, and mixing in taxonomy query API traits. In particular, the eu.cdevreeze.tqa.base.queryapi.TaxonomyApi trait is mixed in as taxonomy query API. See package eu.cdevreeze.tqa.base.queryapi for more information about how to query XBRL taxonomy content.

    The term taxonomy is used here in a very general sense, namely as a collection of taxonomy documents.

    Various scenarios are supported. Taxonomies that are not closed (and not validated in any way) must be supported in order for TQA to be useful for taxonomy validation. Closed taxonomies are supported for reliable taxonomy querying. Also supported are taxonomies that model networks of relationships, specific taxonomies that know about formulas and/or tables, and extension taxonomies.

    Some important operations on taxonomies are prohibition/overriding resolution (to find networks of relationships), combining taxonomies (for building extension taxonomies, for example), and filtering relationships (to ignore relationships that we are not interested in).
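
    The idea behind prohibition/overriding resolution can be sketched on a toy model. This is a deliberate simplification and not the TQA implementation: relationship equivalence is reduced here to equal source, target and ELR, whereas XBRL 2.1 also compares arc attributes. The names NetworkSketch, Rel and resolve are hypothetical.

```scala
object NetworkSketch {
  final case class Rel(
      source: String,
      target: String,
      elr: String,
      priority: Int,
      prohibited: Boolean)

  // Simplified equivalence key (an assumption of this sketch)
  private def key(rel: Rel): (String, String, String) =
    (rel.source, rel.target, rel.elr)

  // Per group of equivalent relationships, only the highest priority counts.
  // If a prohibiting relationship has that priority, nothing survives;
  // otherwise one relationship with that priority survives.
  def resolve(rels: Seq[Rel]): Seq[Rel] =
    rels.groupBy(key).values.toSeq.flatMap { group =>
      val maxPriority = group.map(_.priority).max
      val winners = group.filter(_.priority == maxPriority)
      if (winners.exists(_.prohibited)) Nil else winners.take(1)
    }
}
```

    The order of the result is unspecified in this sketch, because groupBy does not preserve the order of the groups.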

    Each taxonomy class has at least the following state (directly or indirectly): a collection of taxonomy DOM root elements, and a collection of relationships. The underlying arcs, locators and resources of those relationships must exist in the collection of taxonomy DOM trees, or else the taxonomy is corrupt.

    TQA has no knowledge about XPath, so any XPath in taxonomies is just text, as far as TQA is concerned.

    This package unidirectionally depends on the eu.cdevreeze.tqa.base.queryapi, eu.cdevreeze.tqa.base.relationship and eu.cdevreeze.tqa.base.dom packages.

  6. package taxonomybuilder

    TQA bootstrapping. It works both in the JVM and in JavaScript runtime environments.

    First of all, bootstrapping needs a DocumentBuilder. Next we need a discovery strategy for obtaining the root elements of the taxonomy, in the form of a DocumentCollector. This is typically DTS discovery (the details of which can be somewhat tweaked). Finally we need a RelationshipFactory (and maybe an arc filter) to create a BasicTaxonomy.
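
    How a document builder and a discovery strategy cooperate can be sketched with plain Scala stand-ins. This is not the TQA API: Doc, DocBuilder and collectDocs are hypothetical names, documents are reduced to a URI plus the URIs they refer to, and real DTS discovery follows specific kinds of references (such as schema imports, linkbase references and locators) rather than an abstract set.

```scala
object DiscoverySketch {
  final case class Doc(uri: String, referencedUris: Set[String])

  // Stand-in for a document builder: turns a URI into a parsed document
  type DocBuilder = String => Doc

  // Stand-in for a document collector: starting from the entry points,
  // keeps following references until no new documents are found
  def collectDocs(entryPoints: Set[String], build: DocBuilder): Set[Doc] = {
    @annotation.tailrec
    def loop(toVisit: Set[String], found: Map[String, Doc]): Map[String, Doc] = {
      val newUris = toVisit.diff(found.keySet)

      if (newUris.isEmpty) found
      else {
        val newDocs = newUris.toSeq.map(uri => uri -> build(uri)).toMap
        loop(newDocs.values.flatMap(_.referencedUris).toSet, found ++ newDocs)
      }
    }

    loop(entryPoints, Map.empty).values.toSet
  }
}
```

    In this picture the relationship factory would be the third step, turning the arcs in the collected documents into relationships.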

    Once a BasicTaxonomy is created, it can be used as the basis for wrapper taxonomy objects that know about networks of relationships, tables/formulas, etc.

    The DocumentCollector and DocumentBuilder abstractions play well with XBRL Taxonomy Packages.

    Specific DocumentCollectors and DocumentBuilders can be backed by thread-safe (Google Guava) caches in order to prevent re-computations of the same data.
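
    The caching idea can be sketched as a wrapper around any builder function. This sketch is an assumption-laden stand-in: it uses a TrieMap from the Scala standard library instead of a Guava cache, and CachingBuilder is a hypothetical name rather than a TQA class.

```scala
import scala.collection.concurrent.TrieMap

object CachingSketch {
  // Wraps a builder function and remembers its results per URI.
  // TrieMap is thread-safe, but under contention the underlying function
  // may still be called more than once for the same URI.
  final class CachingBuilder[A](underlying: String => A) extends (String => A) {
    private val cache = TrieMap.empty[String, A]

    def apply(uri: String): A = cache.getOrElseUpdate(uri, underlying(uri))
  }
}
```

    Unlike a Guava cache, this stand-in never evicts entries, so it only suits cases where all built documents fit in memory.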

    The bootstrapping process is inherently flexible in supporting the loading of more or less broken taxonomies. For example, backing element builders can be made to post-process broken input before the taxonomy DOM is instantiated. As another example, relationship resolution can be as lenient as desired.

    This package unidirectionally depends on the eu.cdevreeze.tqa.base.taxonomy, eu.cdevreeze.tqa.base.queryapi, eu.cdevreeze.tqa.base.relationship and eu.cdevreeze.tqa.base.dom packages.
