it.unimi.dsi.mg4j.util.parser.callback |
MG4J: Managing Gigabytes for Java
Callbacks for the it.unimi.dsi.mg4j.util.parser.BulletParser.
|
Java Source File Name | Type | Comment |
AnchorExtractor.java | Class | A callback extracting anchor text. |
Callback.java | Interface | A callback for the BulletParser.
This interface is very loosely inspired by the SAX2 interface. |
ComposedCallbackBuilder.java | Class | A builder for composed callbacks. |
DebugCallbackDecorator.java | Class | A decorator that prints on standard error all calls to the underlying callback. |
DefaultCallback.java | Class | A default, do-nothing-at-all callback.
Callbacks can inherit from this class and forget about methods they are not interested in.
This class has a protected constructor. |
LinkExtractor.java | Class | A callback extracting links.
This callback extracts the links found in the web page. |
TextExtractor.java | Class | A callback extracting text and titles.
This callback extracts all text in the page, and the title.
The resulting text is available through TextExtractor.text, and the title through TextExtractor.title.
Note that TextExtractor.text and TextExtractor.title are never trimmed. |
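
The comments for DefaultCallback, DebugCallbackDecorator and TextExtractor describe three common patterns: a do-nothing base class, a decorator that logs calls to standard error, and a collector that accumulates untrimmed text. The following is a minimal, self-contained sketch of those patterns only. All names in it (SimpleCallback, DoNothingCallback, SimpleTextCollector, LoggingCallback, CallbackSketch) are hypothetical stand-ins, and the method signatures are simplified assumptions; they are not the MG4J API.

```java
// Hypothetical, simplified stand-in for a parser callback interface.
// It is NOT the real it.unimi.dsi.mg4j.util.parser.callback.Callback.
interface SimpleCallback {
    void startDocument();
    void startElement(String name);
    void characters(String text);
    void endDocument();
}

// Do-nothing base class: subclasses override only the methods they need.
// The constructor is protected, so the class is meant to be extended.
class DoNothingCallback implements SimpleCallback {
    protected DoNothingCallback() {}
    @Override public void startDocument() {}
    @Override public void startElement(String name) {}
    @Override public void characters(String text) {}
    @Override public void endDocument() {}
}

// Collects all character data exactly as received (no trimming).
class SimpleTextCollector extends DoNothingCallback {
    final StringBuilder text = new StringBuilder();

    @Override public void startDocument() {
        text.setLength(0); // reset so the callback can be reused
    }

    @Override public void characters(String s) {
        text.append(s);
    }
}

// Decorator: prints every call on standard error, then delegates.
class LoggingCallback implements SimpleCallback {
    private final SimpleCallback delegate;

    LoggingCallback(SimpleCallback delegate) {
        this.delegate = delegate;
    }

    @Override public void startDocument() {
        System.err.println("startDocument()");
        delegate.startDocument();
    }

    @Override public void startElement(String name) {
        System.err.println("startElement(" + name + ")");
        delegate.startElement(name);
    }

    @Override public void characters(String text) {
        System.err.println("characters(\"" + text + "\")");
        delegate.characters(text);
    }

    @Override public void endDocument() {
        System.err.println("endDocument()");
        delegate.endDocument();
    }
}

class CallbackSketch {
    public static void main(String[] args) {
        SimpleTextCollector collector = new SimpleTextCollector();
        SimpleCallback callback = new LoggingCallback(collector);

        // Simulate the events a parser might emit for a tiny page.
        callback.startDocument();
        callback.startElement("p");
        callback.characters(" Hello ");
        callback.characters("world ");
        callback.endDocument();

        // Leading and trailing whitespace is preserved.
        System.out.println("[" + collector.text + "]");
    }
}
```

Because the collector never trims, callers that need normalized text would have to trim the result themselves; the real TextExtractor documents the same behaviour for its text and title fields.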