Package org.archive.modules.deciderules
Class PathologicalPathDecideRule
java.lang.Object
org.archive.modules.deciderules.DecideRule
org.archive.modules.deciderules.PathologicalPathDecideRule
- All Implemented Interfaces:
Serializable, HasKeyedProperties
This rule REJECTs any URI that contains an excessive number of identical,
consecutive path segments (e.g. http://example.com/a/a/a/boo.html has 3
consecutive '/a' segments).
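The check described above can be illustrated with a minimal sketch. It is not the Heritrix implementation: the class name RepeatedSegmentSketch and the methods longestRun and exceeds are hypothetical, and the sketch assumes that only the path component of the URI is inspected and that an empty segment (a doubled slash) breaks a run.

```java
import java.net.URI;

// Hypothetical helper, not part of Heritrix: counts identical, consecutive path segments.
final class RepeatedSegmentSketch {

    /** Returns the length of the longest run of identical, consecutive path segments. */
    static int longestRun(String uri) {
        // URI.create throws IllegalArgumentException for malformed input.
        String path = URI.create(uri).getPath();
        if (path == null) {
            return 0; // opaque URIs such as mailto: have no path
        }
        int longest = 0;
        int current = 0;
        String previous = null;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                // Assumption: an empty segment (leading or doubled slash) breaks any run.
                previous = null;
                current = 0;
                continue;
            }
            current = segment.equals(previous) ? current + 1 : 1;
            longest = Math.max(longest, current);
            previous = segment;
        }
        return longest;
    }

    /** True when a segment repeats consecutively more than maxRepetitions times. */
    static boolean exceeds(String uri, int maxRepetitions) {
        return longestRun(uri) > maxRepetitions;
    }
}
```

Under these assumptions, exceeds("http://example.com/a/a/a/boo.html", 2) is true, because the run of three 'a' segments is longer than the allowed two.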
- Author:
- gojomo
-
Field Summary
Fields inherited from class org.archive.modules.deciderules.DecideRule
comment, kp
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type        Method                                  Description
protected String         constructRegex(int rep)
int                      getMaxRepetitions()
protected DecideResult   innerDecide(CrawlURI uri)
void                     setMaxRepetitions(int maxRepetitions)   Number of times the pattern should be allowed to occur.
Methods inherited from class org.archive.modules.deciderules.DecideRule
accepts, decisionFor, getComment, getEnabled, getKeyedProperties, onlyDecision, setComment, setEnabled
-
Constructor Details
-
PathologicalPathDecideRule
public PathologicalPathDecideRule()
Constructs a new PathologicalPathDecideRule.
-
-
Method Details
-
getMaxRepetitions
public int getMaxRepetitions()
-
setMaxRepetitions
public void setMaxRepetitions(int maxRepetitions)
Number of times the pattern should be allowed to occur. This rule returns its decision (usually REJECT) if a path segment is repeated more than this number of times.
-
innerDecide
- Specified by:
innerDecide in class DecideRule
-
constructRegex
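The page gives no description for this method. As an illustration only, a regular expression for repeated path segments could be built from a repetition count using a backreference. The sketch below is an assumption and not the Heritrix source: the class RepeatedSegmentRegexSketch, the method names, and the exact pattern are hypothetical, and the pattern is applied to the whole URI string rather than to the path alone.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the Heritrix implementation.
final class RepeatedSegmentRegexSketch {

    /** Builds a pattern matching a segment followed by at least rep identical copies. */
    static String buildRegex(int rep) {
        if (rep < 1) {
            throw new IllegalArgumentException("rep must be at least 1");
        }
        // ([^/]+/) captures one segment together with its trailing slash;
        // \1{rep,} requires rep or more further copies immediately after it.
        return ".*/([^/]+/)\\1{" + rep + ",}.*";
    }

    /** True when the whole URI string matches the pattern built for rep. */
    static boolean matches(String uri, int rep) {
        return Pattern.compile(buildRegex(rep)).matcher(uri).matches();
    }
}
```

With rep set to 2, this sketch matches http://example.com/a/a/a/boo.html (one 'a/' plus two copies) but not a URI containing only two consecutive 'a/' segments. Because each copy must end with a slash, a final repeated segment without a trailing slash is not counted.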
-