StatsCriteria

Instance Constructors

new StatsCriteria(triples: DataSet[Triple])

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
val env: ExecutionEnvironment
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val logger: Logger

Attributes
protected
Definition Classes
Logging
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def stats: DataSet[String]

Compute distributed RDF dataset statistics.
returns
VoID description of the given dataset
def statsBlanksAsObject(): DataSet[Triple]

19. Blanks as object criterion
returns
DataSet of triples in which blank nodes are used as objects.
def statsBlanksAsSubject(): DataSet[Triple]

18. Blanks as subject criterion
returns
DataSet of triples in which blank nodes are used as subjects.
def statsClassUsageCount(): DataSet[(Node, Int)]

2. Class Usage Count Criterion
Counts how often each class of a dataset is used. The filter rule applied to each triple is the same as in the first criterion. As the action, a map is built with class IRIs as keys and their usage counts as values; whenever a triple conforms to the filter rule, the count for its class is increased by one.
Filter rule : ?p=rdf:type && isIRI(?o)
Action : M[?o]++
returns
DataSet of classes used in the dataset and their frequencies.
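The filter rule and action above can be sketched on plain Scala collections. This is an illustration only, not the library's Flink implementation: the `Term`, `Iri`, `Lit` and `Stmt` types and the `classUsageCount` helper are hypothetical stand-ins for Jena's `Node`/`Triple` and for the `DataSet` pipeline.

```scala
object ClassUsageSketch {
  // Hypothetical minimal RDF model (stand-in for Jena Node/Triple).
  sealed trait Term
  final case class Iri(value: String) extends Term
  final case class Lit(lexical: String) extends Term
  final case class Stmt(s: Term, p: Iri, o: Term)

  val RdfType = Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

  // Filter rule: ?p=rdf:type && isIRI(?o); action: M[?o]++
  def classUsageCount(triples: Seq[Stmt]): Map[Iri, Int] =
    triples
      .collect { case Stmt(_, RdfType, o: Iri) => o } // keep accepted objects
      .groupBy(identity)                              // one group per class IRI
      .map { case (cls, uses) => cls -> uses.size }   // group size = usage count
}
```

Triples whose predicate is not rdf:type, or whose object is not an IRI, never reach the counting step, which mirrors the role of the filter rule.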
def statsClassesDefined(): DataSet[Node]

3. Classes Defined Criterion
Gets the set of classes that are defined within a dataset. In RDF/S and OWL a class is usually defined by a triple with the predicate rdf:type and either rdfs:Class or owl:Class as object. The filter rule states the condition used to analyze each triple. If the triple is accepted by the rule, the IRI used as subject is added to the set of classes.
Filter rule : ?p=rdf:type && isIRI(?s) &&(?o=rdfs:Class||?o=owl:Class)
Action : S += ?s
returns
DataSet of classes defined in the dataset.
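As a rough sketch of this rule on in-memory data (assuming a hypothetical `Term`/`Stmt` model rather than Jena types, and a hypothetical `classesDefined` helper rather than the Flink-based method), the subject of every accepted triple is collected into a set:

```scala
object ClassesDefinedSketch {
  // Hypothetical minimal RDF model (stand-in for Jena Node/Triple).
  sealed trait Term
  final case class Iri(value: String) extends Term
  final case class Blank(id: String) extends Term
  final case class Stmt(s: Term, p: Iri, o: Term)

  val RdfType   = Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
  val RdfsClass = Iri("http://www.w3.org/2000/01/rdf-schema#Class")
  val OwlClass  = Iri("http://www.w3.org/2002/07/owl#Class")

  // Filter rule: ?p=rdf:type && isIRI(?s) && (?o=rdfs:Class || ?o=owl:Class)
  // Action: S += ?s
  def classesDefined(triples: Seq[Stmt]): Set[Iri] =
    triples.collect {
      case Stmt(s: Iri, RdfType, o) if o == RdfsClass || o == OwlClass => s
    }.toSet
}
```

A class declared on a blank node subject is skipped here because the rule requires isIRI(?s).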
def statsDataTypes(): DataSet[(String, Int)]

20. Datatypes criterion
returns
histogram of types used for literals.
def statsDistinctEntities(): DataSet[Triple]

16. Distinct entities
Count distinct entities of a dataset by collecting all IRIs that occur in its triples.
Filter rule : S+=iris({?s,?p,?o})
Action : S
returns
DataSet of distinct entities in the dataset.
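One way to read the rule S+=iris({?s,?p,?o}) is sketched below on plain collections. The model types and the `distinctEntities` helper are hypothetical, and the sketch returns the set of IRIs itself, whereas the documented method returns a DataSet[Triple].

```scala
object DistinctEntitiesSketch {
  // Hypothetical minimal RDF model (stand-in for Jena Node/Triple).
  sealed trait Term
  final case class Iri(value: String) extends Term
  final case class Blank(id: String) extends Term
  final case class Lit(lexical: String) extends Term
  final case class Stmt(s: Term, p: Iri, o: Term)

  // Gather every term of every triple, keep only IRIs, deduplicate.
  def distinctEntities(triples: Seq[Stmt]): Set[Iri] =
    triples
      .flatMap(t => Seq(t.s, t.p, t.o))
      .collect { case i: Iri => i }
      .toSet
}
```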
def statsDistinctObjects(): DataSet[Triple]

Distinct Objects
Count distinct objects within triples.
Filter rule : isURI(?o)
Action : M[?o]++
returns
DataSet of objects used in the dataset.
def statsDistinctSubjects(): DataSet[Triple]

Distinct Subjects
Count distinct subjects within triples.
Filter rule : isURI(?s)
Action : M[?s]++
returns
DataSet of subjects used in the dataset.
def statsLabeledSubjects(): DataSet[Node]

24. Labeled subjects criterion.
returns
list of labeled subjects.
def statsLanguages(): DataSet[(String, Int)]

21. Languages criterion
returns
histogram of languages used for literals.
def statsLinks(): DataSet[(String, String, Int)]

26. Links criterion.
Computes the frequencies of links between entities of different namespaces. This measure is directed, i.e. a link from ns1 -> ns2 is different from ns2 -> ns1.
returns
list of namespace combinations and their frequencies.
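The directed nature of this measure can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the model types, the `namespace` helper (which takes the IRI up to and including its last '#' or '/'), and the choice to count only triples whose subject and object are both IRIs in different namespaces. The library's actual namespace extraction may differ.

```scala
object LinksSketch {
  // Hypothetical minimal RDF model (stand-in for Jena Node/Triple).
  sealed trait Term
  final case class Iri(value: String) extends Term
  final case class Lit(lexical: String) extends Term
  final case class Stmt(s: Term, p: Iri, o: Term)

  // Assumed namespace rule: everything up to the last '#' or '/'.
  def namespace(iri: Iri): String = {
    val cut = math.max(iri.value.lastIndexOf('#'), iri.value.lastIndexOf('/'))
    iri.value.substring(0, cut + 1)
  }

  // Count (subject namespace, object namespace) pairs; order matters,
  // so ns1 -> ns2 and ns2 -> ns1 are separate keys.
  def links(triples: Seq[Stmt]): Map[(String, String), Int] =
    triples
      .collect { case Stmt(s: Iri, _, o: Iri) => (namespace(s), namespace(o)) }
      .filter { case (from, to) => from != to }
      .groupBy(identity)
      .map { case (pair, hits) => pair -> hits.size }
}
```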
def statsLiterals(): DataSet[Triple]

17. Literals criterion
returns
DataSet of triples that attach literals to subjects.
def statsObjectVocabularies(): AggregateDataSet[(String, Int)]

32. Object vocabularies
Compute object vocabularies/namespaces used throughout the dataset.
Filter rule : ns=ns(?o)
Action : M[ns]++
returns
DataSet of distinct object vocabularies used in the dataset and their frequencies.
def statsPredicateVocabularies(): AggregateDataSet[(String, Int)]

31. Predicate vocabularies
Compute predicate vocabularies/namespaces used throughout the dataset.
Filter rule : ns=ns(?p)
Action : M[ns]++
returns
DataSet of distinct predicate vocabularies used in the dataset and their frequencies.
def statsPropertiesDefined(): DataSet[Node]

Properties Defined
Count the defined properties within triples.
Filter rule : ?p=rdf:type && (?o=owl:ObjectProperty || ?o=rdf:Property)&& !isIRI(?s)
Action : M[?p]++
returns
DataSet of predicates defined in the dataset.
def statsPropertyUsage(): DataSet[(Node, Int)]

5. Property Usage Criterion
Count the usage of properties within triples. To do so, a DataSet is created with all property IRIs as keys; their frequencies are then computed.
Filter rule : none
Action : M[?p]++
returns
DataSet of predicates used in the dataset and their frequencies.
def statsSameAs(): DataSet[Triple]

25. SameAs criterion.
returns
list of triples with owl#sameAs as predicate
def statsSubjectVocabularies(): AggregateDataSet[(String, Int)]

30. Subject vocabularies
Compute subject vocabularies/namespaces used throughout the dataset.
Filter rule : ns=ns(?s)
Action : M[ns]++
returns
DataSet of distinct subject vocabularies used in the dataset and their frequencies.
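The subject, predicate and object vocabulary criteria share the same action (M[ns]++) and differ only in which triple position feeds ns(...). A minimal sketch of that idea follows, with hypothetical model types, a hypothetical `vocabularies` helper, and an assumed `namespace` rule (IRI up to its last '#' or '/'); non-IRI terms are simply skipped in this sketch.

```scala
object VocabularySketch {
  // Hypothetical minimal RDF model (stand-in for Jena Node/Triple).
  sealed trait Term
  final case class Iri(value: String) extends Term
  final case class Lit(lexical: String) extends Term
  final case class Stmt(s: Term, p: Iri, o: Term)

  // Assumed namespace rule: everything up to the last '#' or '/'.
  def namespace(iri: Iri): String = {
    val cut = math.max(iri.value.lastIndexOf('#'), iri.value.lastIndexOf('/'))
    iri.value.substring(0, cut + 1)
  }

  // `position` selects ?s, ?p or ?o; the counting step is identical.
  def vocabularies(triples: Seq[Stmt])(position: Stmt => Term): Map[String, Int] =
    triples
      .map(position)
      .collect { case i: Iri => namespace(i) }
      .groupBy(identity)
      .map { case (ns, hits) => ns -> hits.size }
}
```

With this shape, `vocabularies(data)(_.s)`, `vocabularies(data)(_.p)` and `vocabularies(data)(_.o)` correspond to the three criteria.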
def statsTypedSubjects(): DataSet[Node]

24. Typed subjects criterion.
returns
list of typed subjects.
def statsUsedClasses(): DataSet[Triple]

1. Used Classes Criterion
Creates a DataSet of the classes that are in use by instances of the analyzed dataset. An example of a triple accepted by the filter is sda:Gezim rdf:type distLODStats:Developer.
Filter rule : ?p=rdf:type && isIRI(?o)
Action : S += ?o
returns
DataSet of classes/instances
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Doc: package stats

implicit class StatsCriteria extends Logging
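
Because StatsCriteria is declared as an implicit class, its stats methods become available directly on the wrapped value once the class is in scope. The sketch below shows the same pattern on an ordinary `Seq` instead of a Flink `DataSet`; the `Stmt` type and the `StmtStats` wrapper are hypothetical, and the single method mirrors the Property Usage rule (no filter, M[?p]++).

```scala
object ImplicitStatsSketch {
  // Hypothetical simplified triple with string terms.
  final case class Stmt(s: String, p: String, o: String)

  // Wrapping Seq[Stmt] implicitly lets callers write data.statsPropertyUsage().
  implicit class StmtStats(triples: Seq[Stmt]) {
    // Filter rule: none; action: M[?p]++
    def statsPropertyUsage(): Map[String, Int] =
      triples.groupBy(_.p).map { case (p, ts) => p -> ts.size }
  }
}
```

After `import ImplicitStatsSketch._`, a call such as `data.statsPropertyUsage()` compiles as if the method were defined on `Seq[Stmt]` itself.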
