Java Doc for HTMLScanner.java (NekoHTML, package org.cyberneko.html)



java.lang.Object
   org.cyberneko.html.HTMLScanner

HTMLScanner
public class HTMLScanner implements XMLDocumentScanner, XMLLocator, HTMLComponent
A simple HTML scanner. This scanner makes no attempt to balance tags or fix other problems in the source document — it just scans what it can and generates XNI document "events", ignoring errors of all kinds.

This component recognizes the following features:

  • http://cyberneko.org/html/features/augmentations
  • http://cyberneko.org/html/features/report-errors
  • http://apache.org/xml/features/scanner/notify-char-refs
  • http://apache.org/xml/features/scanner/notify-builtin-refs
  • http://cyberneko.org/html/features/scanner/notify-builtin-refs
  • http://cyberneko.org/html/features/scanner/fix-mswindows-refs
  • http://cyberneko.org/html/features/scanner/script/strip-cdata-delims
  • http://cyberneko.org/html/features/scanner/script/strip-comment-delims
  • http://cyberneko.org/html/features/scanner/style/strip-cdata-delims
  • http://cyberneko.org/html/features/scanner/style/strip-comment-delims
  • http://cyberneko.org/html/features/scanner/ignore-specified-charset
  • http://cyberneko.org/html/features/scanner/cdata-sections
  • http://cyberneko.org/html/features/override-doctype
  • http://cyberneko.org/html/features/insert-doctype

This component recognizes the following properties:

  • http://cyberneko.org/html/properties/names/elems
  • http://cyberneko.org/html/properties/names/attrs
  • http://cyberneko.org/html/properties/default-encoding
  • http://cyberneko.org/html/properties/error-reporter
  • http://cyberneko.org/html/properties/doctype/pubid
  • http://cyberneko.org/html/properties/doctype/sysid

See Also:   HTMLElements
See Also:   HTMLEntities
author:
   Andy Clark
author:
   Ahmed Ashour
version:
   $Id: HTMLScanner.java,v 1.19 2005/06/14 05:52:37 andyc Exp $

Inner Class: public interface Scanner
Inner Class: public static class CurrentEntity
Inner Class: public class ContentScanner implements Scanner
Inner Class: public class SpecialScanner implements Scanner
Inner Class: public static class PlaybackInputStream extends FilterInputStream
Inner Class: protected static class LocationItem implements HTMLEventInfo

Field Summary
final protected static String AUGMENTATIONS
     Include infoset augmentations.
final public static String CDATA_SECTIONS
     Scan CDATA sections.
final protected static boolean DEBUG_CALLBACKS
     Set to true to debug callbacks.
final protected static int DEFAULT_BUFFER_SIZE
     Default buffer size.
final protected static String DEFAULT_ENCODING
     Default encoding.
final protected static String DOCTYPE_PUBID
     Doctype declaration public identifier.
final protected static String DOCTYPE_SYSID
     Doctype declaration system identifier.
final protected static String ERROR_REPORTER
     Error reporter.
final public static String FIX_MSWINDOWS_REFS
     Fix Microsoft Windows® character entity references.
final public static String HTML_4_01_FRAMESET_PUBID
     HTML 4.01 frameset public identifier ("-//W3C//DTD HTML 4.01 Frameset//EN").
final public static String HTML_4_01_FRAMESET_SYSID
     HTML 4.01 frameset system identifier ("http://www.w3.org/TR/html4/frameset.dtd").
final public static String HTML_4_01_STRICT_PUBID
     HTML 4.01 strict public identifier ("-//W3C//DTD HTML 4.01//EN").
final public static String HTML_4_01_STRICT_SYSID
     HTML 4.01 strict system identifier ("http://www.w3.org/TR/html4/strict.dtd").
final public static String HTML_4_01_TRANSITIONAL_PUBID
     HTML 4.01 transitional public identifier ("-//W3C//DTD HTML 4.01 Transitional//EN").
final public static String HTML_4_01_TRANSITIONAL_SYSID
     HTML 4.01 transitional system identifier ("http://www.w3.org/TR/html4/loose.dtd").
final public static String IGNORE_SPECIFIED_CHARSET
     Ignore specified charset found in the <meta http-equiv='Content-Type' content='text/html;charset=…'> tag.
final public static String INSERT_DOCTYPE
     Insert document type declaration.
final protected static String NAMES_ATTRS
     Modify HTML attribute names: { "upper", "lower", "default" }.
final protected static String NAMES_ELEMS
     Modify HTML element names: { "upper", "lower", "default" }.
final protected static short NAMES_LOWERCASE
     Lowercase HTML names.
final protected static short NAMES_NO_CHANGE
     Don't modify HTML names.
final protected static short NAMES_UPPERCASE
     Uppercase HTML names.
final protected static String NORMALIZE_ATTRIBUTES
     Normalize attribute values.
final public static String NOTIFY_CHAR_REFS
     Notify character entity references (e.g. &#32;, &#x20;, etc).
final public static String NOTIFY_HTML_BUILTIN_REFS
     Notify handler of built-in entity references (e.g. &nobr;, &copy;, etc).
final public static String NOTIFY_XML_BUILTIN_REFS
     Notify handler of built-in entity references (e.g. &amp;, &lt;, etc).
final public static String OVERRIDE_DOCTYPE
     Override doctype declaration public and system identifiers.
final protected static String REPORT_ERRORS
     Report errors.
final public static String SCRIPT_STRIP_CDATA_DELIMS
     Strip XHTML CDATA delimiters ("<![CDATA[" and "]]>") from SCRIPT tag contents.
final public static String SCRIPT_STRIP_COMMENT_DELIMS
     Strip HTML comment delimiters ("<!--" and "-->") from SCRIPT tag contents.
final protected static short STATE_CONTENT
     State: content.
final protected static short STATE_END_DOCUMENT
     State: end document.
final protected static short STATE_MARKUP_BRACKET
     State: markup bracket.
final protected static short STATE_START_DOCUMENT
     State: start document.
final public static String STYLE_STRIP_CDATA_DELIMS
     Strip XHTML CDATA delimiters ("<![CDATA[" and "]]>") from STYLE tag contents.
final public static String STYLE_STRIP_COMMENT_DELIMS
     Strip HTML comment delimiters ("<!--" and "-->") from STYLE tag contents.
final protected static HTMLEventInfo SYNTHESIZED_ITEM
     Synthesized event info item.
protected boolean fAugmentations
     Augmentations.
protected int fBeginColumnNumber
     Beginning column number.
protected int fBeginLineNumber
     Beginning line number.
protected PlaybackInputStream fByteStream
     The playback byte stream.
protected boolean fCDATASections
     CDATA sections.
protected Scanner fContentScanner
     Content scanner.
protected CurrentEntity fCurrentEntity
     Current entity.
final protected Stack fCurrentEntityStack
     The current entity stack.
protected String fDefaultIANAEncoding
     Default encoding.
protected String fDoctypePubid
     Doctype declaration public identifier.
protected String fDoctypeSysid
     Doctype declaration system identifier.
protected XMLDocumentHandler fDocumentHandler
     The document handler.
protected int fElementCount
     Element count.
protected int fElementDepth
     Element depth.
protected int fEndColumnNumber
     Ending column number.
protected int fEndLineNumber
     Ending line number.
protected HTMLErrorReporter fErrorReporter
     Error reporter.
protected boolean fFixWindowsCharRefs
     Fix Microsoft Windows® character entity references.
protected String fIANAEncoding
     Auto-detected IANA encoding.
protected boolean fIgnoreSpecifiedCharset
     Ignore specified character set.
protected boolean fInsertDoctype
     Insert document type declaration.
protected boolean fIso8859Encoding
     True if the encoding matches "ISO-8859-*".
protected String fJavaEncoding
     Auto-detected Java encoding.
protected short fNamesAttrs
     Modify HTML attribute names.
protected short fNamesElems
     Modify HTML element names.
protected boolean fNormalizeAttributes
     Normalize attribute values.
protected boolean fNotifyCharRefs
     Notify character entity references.
protected boolean fNotifyHtmlBuiltinRefs
     Notify HTML built-in general entity references.
protected boolean fNotifyXmlBuiltinRefs
     Notify XML built-in general entity references.
protected boolean fOverrideDoctype
     Override doctype declaration public and system identifiers.
protected boolean fReportErrors
     Report errors.
protected Scanner fScanner
     The current scanner.
protected short fScannerState
     The current scanner state.
protected boolean fScriptStripCDATADelims
     Strip CDATA delimiters from SCRIPT tags.
protected boolean fScriptStripCommentDelims
     Strip comment delimiters from SCRIPT tags.
protected SpecialScanner fSpecialScanner
     Special scanner used for elements whose content needs to be scanned as plain text, ignoring markup such as elements and entity references.
final protected XMLString fString
     String.
final protected XMLStringBuffer fStringBuffer
     String buffer.
protected boolean fStyleStripCDATADelims
     Strip CDATA delimiters from STYLE tags.
protected boolean fStyleStripCommentDelims
     Strip comment delimiters from STYLE tags.


Method Summary
protected static boolean builtinXmlRef(String name)
     Returns true if the name is a built-in XML general entity reference.
public void cleanup(boolean closeall)
     Cleans up used resources.
public void evaluateInputSource(XMLInputSource inputSource)
     Immediately evaluates an input source and adds the new content (e.g. the output written by an embedded script).
public static String expandSystemId(String systemId, String baseSystemId)
     Expands a system id and returns the system id as a URI, if it can be expanded.
protected static String fixURI(String str)
     Fixes a platform dependent filename to standard URI form.
protected int fixWindowsCharacter(int origChar)
     Fixes Microsoft Windows® specific characters.
public String getBaseSystemId()
     Returns the base system identifier.
public int getCharacterOffset()
     Returns the character offset.
public int getColumnNumber()
     Returns the current column number.
public XMLDocumentHandler getDocumentHandler()
     Returns the document handler.
public String getEncoding()
     Returns the encoding.
public String getExpandedSystemId()
     Returns the expanded system identifier.
public Boolean getFeatureDefault(String featureId)
     Returns the default state for a feature.
public int getLineNumber()
     Returns the current line number.
public String getLiteralSystemId()
     Returns the literal system identifier.
final protected static short getNamesValue(String value)
     Converts HTML names string value to constant value.
public Object getPropertyDefault(String propertyId)
     Returns the default state for a property.
public String getPublicId()
     Returns the public identifier.
public String[] getRecognizedFeatures()
     Returns recognized features.
public String[] getRecognizedProperties()
     Returns recognized properties.
protected static String getValue(XMLAttributes attrs, String aname)
     Returns the value of the specified attribute, ignoring case.
public String getXMLVersion()
     Returns the XML version.
boolean isEncodingCompatible(String encoding1, String encoding2)
     Two encodings are compatible only if both can read the meta tag that specifies the new encoding.
protected int load(int offset)
     Loads a new chunk of data into the buffer and returns the number of characters loaded or -1 if no additional characters were loaded.
final protected Augmentations locationAugs()
     Returns an augmentations object with a location item added.
final protected static String modifyName(String name, short mode)
     Modifies the given name based on the specified mode.
public void pushInputSource(XMLInputSource inputSource)
     Pushes an input source onto the current entity stack.
protected int read()
     Reads a single character.
public void reset(XMLComponentManager manager)
     Resets the component.
final protected XMLResourceIdentifier resourceId()
     Returns an empty resource identifier.
protected void scanDoctype()
     Scans a DOCTYPE line.
public boolean scanDocument(boolean complete)
     Scans the document.
protected int scanEntityRef(XMLStringBuffer str, boolean content)
     Scans an entity reference.
protected String scanLiteral()
     Scans a quoted literal.
protected String scanName()
     Scans a name.
public void setDocumentHandler(XMLDocumentHandler handler)
     Sets the document handler.
public void setFeature(String featureId, boolean state)
     Sets a feature.
public void setInputSource(XMLInputSource source)
     Sets the input source.
public void setProperty(String propertyId, Object value)
     Sets a property.
protected void setScanner(Scanner scanner)
     Sets the scanner.
protected void setScannerState(short state)
     Sets the scanner state.
protected boolean skip(String s, boolean caseSensitive)
     Returns true if the specified text is present and is skipped.
protected boolean skipMarkup(boolean balance)
     Skips markup.
protected int skipNewlines()
     Skips newlines and returns the number of newlines skipped.
protected int skipNewlines(int maxlines)
     Skips newlines and returns the number of newlines skipped.
protected boolean skipSpaces()
     Skips whitespace.
final protected Augmentations synthesizedAugs()
     Returns an augmentations object with a synthesized item added.

Field Detail
AUGMENTATIONS
final protected static String AUGMENTATIONS
Include infoset augmentations.

CDATA_SECTIONS
final public static String CDATA_SECTIONS
Scan CDATA sections.

DEBUG_CALLBACKS
final protected static boolean DEBUG_CALLBACKS
Set to true to debug callbacks.

DEFAULT_BUFFER_SIZE
final protected static int DEFAULT_BUFFER_SIZE
Default buffer size.

DEFAULT_ENCODING
final protected static String DEFAULT_ENCODING
Default encoding.

DOCTYPE_PUBID
final protected static String DOCTYPE_PUBID
Doctype declaration public identifier.

DOCTYPE_SYSID
final protected static String DOCTYPE_SYSID
Doctype declaration system identifier.

ERROR_REPORTER
final protected static String ERROR_REPORTER
Error reporter.

FIX_MSWINDOWS_REFS
final public static String FIX_MSWINDOWS_REFS
Fix Microsoft Windows® character entity references.

HTML_4_01_FRAMESET_PUBID
final public static String HTML_4_01_FRAMESET_PUBID
HTML 4.01 frameset public identifier ("-//W3C//DTD HTML 4.01 Frameset//EN").

HTML_4_01_FRAMESET_SYSID
final public static String HTML_4_01_FRAMESET_SYSID
HTML 4.01 frameset system identifier ("http://www.w3.org/TR/html4/frameset.dtd").

HTML_4_01_STRICT_PUBID
final public static String HTML_4_01_STRICT_PUBID
HTML 4.01 strict public identifier ("-//W3C//DTD HTML 4.01//EN").

HTML_4_01_STRICT_SYSID
final public static String HTML_4_01_STRICT_SYSID
HTML 4.01 strict system identifier ("http://www.w3.org/TR/html4/strict.dtd").

HTML_4_01_TRANSITIONAL_PUBID
final public static String HTML_4_01_TRANSITIONAL_PUBID
HTML 4.01 transitional public identifier ("-//W3C//DTD HTML 4.01 Transitional//EN").

HTML_4_01_TRANSITIONAL_SYSID
final public static String HTML_4_01_TRANSITIONAL_SYSID
HTML 4.01 transitional system identifier ("http://www.w3.org/TR/html4/loose.dtd").



IGNORE_SPECIFIED_CHARSET
final public static String IGNORE_SPECIFIED_CHARSET
Ignore specified charset found in the <meta http-equiv='Content-Type' content='text/html;charset=…'> tag.
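
For illustration, the charset this feature ignores is the one carried in the content attribute of such a meta tag. The following is a minimal sketch of pulling that value out of a Content-Type string; the class and method names are hypothetical and this is not NekoHTML's own parsing code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NekoHTML: extracts the charset name from
// a Content-Type value such as "text/html;charset=UTF-8".
final class MetaCharsetSketch {
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\"';\\s]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the charset parameter, or null when none is present. */
    static String charsetFromContentType(String content) {
        if (content == null) {
            return null;
        }
        Matcher matcher = CHARSET.matcher(content);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

A scanner honoring the feature would simply not act on the returned name and keep decoding with the encoding it already chose.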



INSERT_DOCTYPE
final public static String INSERT_DOCTYPE
Insert document type declaration.

NAMES_ATTRS
final protected static String NAMES_ATTRS
Modify HTML attribute names: { "upper", "lower", "default" }.

NAMES_ELEMS
final protected static String NAMES_ELEMS
Modify HTML element names: { "upper", "lower", "default" }.

NAMES_LOWERCASE
final protected static short NAMES_LOWERCASE
Lowercase HTML names.

NAMES_NO_CHANGE
final protected static short NAMES_NO_CHANGE
Don't modify HTML names.

NAMES_UPPERCASE
final protected static short NAMES_UPPERCASE
Uppercase HTML names.

NORMALIZE_ATTRIBUTES
final protected static String NORMALIZE_ATTRIBUTES
Normalize attribute values.

NOTIFY_CHAR_REFS
final public static String NOTIFY_CHAR_REFS
Notify character entity references (e.g. &#32;, &#x20;, etc).
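
To make the two forms concrete, here is a minimal sketch of decoding a numeric character reference in decimal (&#32;) or hexadecimal (&#x20;) form. The helper is hypothetical and only illustrates the syntax; it is not how the scanner itself reads references.

```java
// Hypothetical helper, not NekoHTML code: decodes a numeric character
// reference such as "&#32;" (decimal) or "&#x20;" (hexadecimal).
final class CharRefSketch {
    /** Returns the code point, or -1 if the text is not a numeric reference. */
    static int decode(String ref) {
        if (ref == null || !ref.startsWith("&#") || !ref.endsWith(";")) {
            return -1;
        }
        String digits = ref.substring(2, ref.length() - 1);
        boolean hex = digits.startsWith("x") || digits.startsWith("X");
        try {
            int value = hex
                    ? Integer.parseInt(digits.substring(1), 16)
                    : Integer.parseInt(digits, 10);
            return Character.isValidCodePoint(value) ? value : -1;
        } catch (NumberFormatException e) {
            return -1; // empty or malformed digits
        }
    }
}
```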



NOTIFY_HTML_BUILTIN_REFS
final public static String NOTIFY_HTML_BUILTIN_REFS
Notify handler of built-in entity references (e.g. &nobr;, &copy;, etc).

Note: This includes the five pre-defined XML general entities.

NOTIFY_XML_BUILTIN_REFS
final public static String NOTIFY_XML_BUILTIN_REFS
Notify handler of built-in entity references (e.g. &amp;, &lt;, etc).

Note: This only applies to the five pre-defined XML general entities: "amp", "lt", "gt", "quot", and "apos". This is done for compatibility with the Xerces feature.

To be notified of the built-in entity references in HTML, set the http://cyberneko.org/html/features/scanner/notify-builtin-refs feature to true.

OVERRIDE_DOCTYPE
final public static String OVERRIDE_DOCTYPE
Override doctype declaration public and system identifiers.

REPORT_ERRORS
final protected static String REPORT_ERRORS
Report errors.

SCRIPT_STRIP_CDATA_DELIMS
final public static String SCRIPT_STRIP_CDATA_DELIMS
Strip XHTML CDATA delimiters ("<![CDATA[" and "]]>") from SCRIPT tag contents.

SCRIPT_STRIP_COMMENT_DELIMS
final public static String SCRIPT_STRIP_COMMENT_DELIMS
Strip HTML comment delimiters ("<!--" and "-->") from SCRIPT tag contents.
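
The effect of the two SCRIPT stripping features can be pictured with a small sketch. It assumes the delimiters wrap the whole element content (ignoring surrounding whitespace); the class and its methods are hypothetical and do not reproduce the scanner's actual logic.

```java
// Hypothetical helper, not NekoHTML code: removes one wrapping pair of
// delimiters from SCRIPT or STYLE content when both ends are present.
final class StripDelimsSketch {
    static String strip(String text, String open, String close) {
        String trimmed = text.trim();
        if (trimmed.length() >= open.length() + close.length()
                && trimmed.startsWith(open) && trimmed.endsWith(close)) {
            return trimmed.substring(open.length(),
                    trimmed.length() - close.length());
        }
        return text; // no wrapping pair: leave the content untouched
    }

    static String stripCdata(String text) {
        return strip(text, "<![CDATA[", "]]>");
    }

    static String stripComment(String text) {
        return strip(text, "<!--", "-->");
    }
}
```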



STATE_CONTENT
final protected static short STATE_CONTENT
State: content.

STATE_END_DOCUMENT
final protected static short STATE_END_DOCUMENT
State: end document.

STATE_MARKUP_BRACKET
final protected static short STATE_MARKUP_BRACKET
State: markup bracket.

STATE_START_DOCUMENT
final protected static short STATE_START_DOCUMENT
State: start document.

STYLE_STRIP_CDATA_DELIMS
final public static String STYLE_STRIP_CDATA_DELIMS
Strip XHTML CDATA delimiters ("<![CDATA[" and "]]>") from STYLE tag contents.

STYLE_STRIP_COMMENT_DELIMS
final public static String STYLE_STRIP_COMMENT_DELIMS
Strip HTML comment delimiters ("<!--" and "-->") from STYLE tag contents.

SYNTHESIZED_ITEM
final protected static HTMLEventInfo SYNTHESIZED_ITEM
Synthesized event info item.



fAugmentations
protected boolean fAugmentations
Augmentations.

fBeginColumnNumber
protected int fBeginColumnNumber
Beginning column number.

fBeginLineNumber
protected int fBeginLineNumber
Beginning line number.

fByteStream
protected PlaybackInputStream fByteStream
The playback byte stream.

fCDATASections
protected boolean fCDATASections
CDATA sections.

fContentScanner
protected Scanner fContentScanner
Content scanner.

fCurrentEntity
protected CurrentEntity fCurrentEntity
Current entity.

fCurrentEntityStack
final protected Stack fCurrentEntityStack
The current entity stack.

fDefaultIANAEncoding
protected String fDefaultIANAEncoding
Default encoding.

fDoctypePubid
protected String fDoctypePubid
Doctype declaration public identifier.

fDoctypeSysid
protected String fDoctypeSysid
Doctype declaration system identifier.

fDocumentHandler
protected XMLDocumentHandler fDocumentHandler
The document handler.

fElementCount
protected int fElementCount
Element count.

fElementDepth
protected int fElementDepth
Element depth.

fEndColumnNumber
protected int fEndColumnNumber
Ending column number.

fEndLineNumber
protected int fEndLineNumber
Ending line number.

fErrorReporter
protected HTMLErrorReporter fErrorReporter
Error reporter.

fFixWindowsCharRefs
protected boolean fFixWindowsCharRefs
Fix Microsoft Windows® character entity references.

fIANAEncoding
protected String fIANAEncoding
Auto-detected IANA encoding.

fIgnoreSpecifiedCharset
protected boolean fIgnoreSpecifiedCharset
Ignore specified character set.



fInsertDoctype
protected boolean fInsertDoctype
Insert document type declaration.

fIso8859Encoding
protected boolean fIso8859Encoding
True if the encoding matches "ISO-8859-*".

fJavaEncoding
protected String fJavaEncoding
Auto-detected Java encoding.

fNamesAttrs
protected short fNamesAttrs
Modify HTML attribute names.

fNamesElems
protected short fNamesElems
Modify HTML element names.

fNormalizeAttributes
protected boolean fNormalizeAttributes
Normalize attribute values.

fNotifyCharRefs
protected boolean fNotifyCharRefs
Notify character entity references.

fNotifyHtmlBuiltinRefs
protected boolean fNotifyHtmlBuiltinRefs
Notify HTML built-in general entity references.

fNotifyXmlBuiltinRefs
protected boolean fNotifyXmlBuiltinRefs
Notify XML built-in general entity references.

fOverrideDoctype
protected boolean fOverrideDoctype
Override doctype declaration public and system identifiers.

fReportErrors
protected boolean fReportErrors
Report errors.

fScanner
protected Scanner fScanner
The current scanner.

fScannerState
protected short fScannerState
The current scanner state.

fScriptStripCDATADelims
protected boolean fScriptStripCDATADelims
Strip CDATA delimiters from SCRIPT tags.

fScriptStripCommentDelims
protected boolean fScriptStripCommentDelims
Strip comment delimiters from SCRIPT tags.

fSpecialScanner
protected SpecialScanner fSpecialScanner
Special scanner used for elements whose content needs to be scanned as plain text, ignoring markup such as elements and entity references. For example: <SCRIPT> and <COMMENT>.

fString
final protected XMLString fString
String.

fStringBuffer
final protected XMLStringBuffer fStringBuffer
String buffer.

fStyleStripCDATADelims
protected boolean fStyleStripCDATADelims
Strip CDATA delimiters from STYLE tags.

fStyleStripCommentDelims
protected boolean fStyleStripCommentDelims
Strip comment delimiters from STYLE tags.





Method Detail
builtinXmlRef
protected static boolean builtinXmlRef(String name)
Returns true if the name is a built-in XML general entity reference.
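
As an illustration of the check, the five pre-defined XML general entities are "amp", "lt", "gt", "quot", and "apos". A minimal sketch follows; the set-based lookup and the case-sensitive comparison are assumptions, not a copy of the method body.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch only: membership test against the five pre-defined XML entities.
final class BuiltinXmlRefSketch {
    private static final Set<String> XML_BUILTINS =
            new HashSet<>(Arrays.asList("amp", "lt", "gt", "quot", "apos"));

    /** Case-sensitive by assumption; XML entity names are case-sensitive. */
    static boolean builtinXmlRef(String name) {
        return XML_BUILTINS.contains(name);
    }
}
```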



cleanup
public void cleanup(boolean closeall)
Cleans up used resources. For example, if scanning is terminated early, then this method ensures all remaining open streams are closed.
Parameters:
  closeall - Close all streams, including the original. This is used in cases when the application has opened the original document stream and should be responsible for closing it.

evaluateInputSource
public void evaluateInputSource(XMLInputSource inputSource)
Immediately evaluates an input source and adds the new content (e.g. the output written by an embedded script).
Parameters:
  inputSource - The new input source to start evaluating.
See Also:   HTMLScanner.pushInputSource(XMLInputSource)

expandSystemId
public static String expandSystemId(String systemId, String baseSystemId)
Expands a system id and returns the system id as a URI, if it can be expanded. A return value of null means that the identifier is already expanded. An exception thrown indicates a failure to expand the id.
Parameters:
  systemId - The systemId to be expanded.
Returns:
  The URI string representing the expanded system identifier. A null value indicates that the given system identifier is already expanded.
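
The contract described above (null when already expanded, an exception on failure) can be sketched with java.net.URI. This is an assumption-laden illustration rather than the library's algorithm: it treats any absolute URI as "already expanded" and requires a non-null base.

```java
import java.net.URI;
import java.util.Objects;

// Sketch only: resolves a relative system id against a base with
// java.net.URI; the real method may handle more cases (e.g. file names).
final class ExpandSystemIdSketch {
    /**
     * Returns the expanded identifier, or null when systemId is already
     * absolute. URI.create throws IllegalArgumentException on bad syntax.
     */
    static String expand(String systemId, String baseSystemId) {
        URI uri = URI.create(systemId);
        if (uri.isAbsolute()) {
            return null; // already expanded
        }
        Objects.requireNonNull(baseSystemId, "baseSystemId");
        return URI.create(baseSystemId).resolve(uri).toString();
    }
}
```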



fixURI
protected static String fixURI(String str)
Fixes a platform dependent filename to standard URI form.
Parameters:
  str - The string to fix.
Returns:
  The fixed URI string.
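
A rough sketch of what such a fix can look like is shown below. The two-argument form is hypothetical (the separator is passed in so the example behaves the same on every platform), and the drive-letter rule is an assumption about typical handling, not a statement of what this method does.

```java
// Sketch only: converts a platform file name to URI-style path form.
final class FixUriSketch {
    static String fixURI(String str, char separator) {
        // Use forward slashes regardless of the platform separator.
        String fixed = str.replace(separator, '/');
        // Assumed rule: "C:/dir/file" becomes "/C:/dir/file".
        if (fixed.length() >= 2 && fixed.charAt(1) == ':'
                && Character.isLetter(fixed.charAt(0))) {
            fixed = "/" + fixed;
        }
        return fixed;
    }
}
```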



fixWindowsCharacter
protected int fixWindowsCharacter(int origChar)
Fixes Microsoft Windows® specific characters.

Details about this common problem can be found at http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
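
The problem in question is that pages often use character references in the range 0x80 to 0x9F, meaning the Windows-1252 characters at those positions rather than the Unicode control characters. A minimal sketch of the remapping follows; the table is the standard Windows-1252 assignment, while the class itself is illustrative and not taken from NekoHTML.

```java
// Sketch only: maps Windows-1252 code positions 0x80..0x9F to Unicode.
final class WindowsCharSketch {
    // Unicode code points for bytes 0x80..0x9F; 0 marks a position that
    // Windows-1252 leaves undefined.
    private static final int[] CP1252 = {
        0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
        0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
    };

    /** Returns the remapped character, or origChar when no fix applies. */
    static int fixWindowsCharacter(int origChar) {
        if (origChar < 0x80 || origChar > 0x9F) {
            return origChar;
        }
        int mapped = CP1252[origChar - 0x80];
        return mapped != 0 ? mapped : origChar;
    }
}
```

For example, &#147; (0x93) would come out as U+201C, the left double quotation mark.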




getBaseSystemId
public String getBaseSystemId()
Returns the base system identifier.

getCharacterOffset
public int getCharacterOffset()
Returns the character offset.

getColumnNumber
public int getColumnNumber()
Returns the current column number.

getDocumentHandler
public XMLDocumentHandler getDocumentHandler()
Returns the document handler.

getEncoding
public String getEncoding()
Returns the encoding.

getExpandedSystemId
public String getExpandedSystemId()
Returns the expanded system identifier.

getFeatureDefault
public Boolean getFeatureDefault(String featureId)
Returns the default state for a feature.

getLineNumber
public int getLineNumber()
Returns the current line number.

getLiteralSystemId
public String getLiteralSystemId()
Returns the literal system identifier.

getNamesValue
final protected static short getNamesValue(String value)
Converts HTML names string value to constant value.
See Also:   HTMLScanner.NAMES_NO_CHANGE
See Also:   HTMLScanner.NAMES_LOWERCASE
See Also:   HTMLScanner.NAMES_UPPERCASE



getPropertyDefault
public Object getPropertyDefault(String propertyId)
Returns the default state for a property.

getPublicId
public String getPublicId()
Returns the public identifier.

getRecognizedFeatures
public String[] getRecognizedFeatures()
Returns recognized features.

getRecognizedProperties
public String[] getRecognizedProperties()
Returns recognized properties.

getValue
protected static String getValue(XMLAttributes attrs, String aname)
Returns the value of the specified attribute, ignoring case.

getXMLVersion
public String getXMLVersion()
Returns the XML version.

isEncodingCompatible
boolean isEncodingCompatible(String encoding1, String encoding2)
Two encodings are compatible only if both can read the meta tag that specifies the new encoding. This means that the byte representation of some minimal HTML markup must be the same in both encodings.
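
One way to picture this test is to encode a short piece of markup in both charsets and compare the bytes. The sketch below does exactly that; the probe string is an assumption chosen for illustration, and the method body is not taken from NekoHTML.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Sketch only: two encodings count as compatible here when a sample meta
// tag prefix encodes to identical bytes in both.
final class EncodingCompatSketch {
    // Assumed probe; any minimal markup needed to reach the charset works.
    private static final String PROBE =
            "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=";

    /** Charset.forName throws if either name is unknown or unsupported. */
    static boolean isEncodingCompatible(String encoding1, String encoding2) {
        byte[] first = PROBE.getBytes(Charset.forName(encoding1));
        byte[] second = PROBE.getBytes(Charset.forName(encoding2));
        return Arrays.equals(first, second);
    }
}
```

Under this sketch UTF-8 and ISO-8859-1 are compatible because the probe is pure ASCII, while UTF-8 and UTF-16BE are not.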



load
protected int load(int offset) throws IOException
Loads a new chunk of data into the buffer and returns the number of characters loaded or -1 if no additional characters were loaded.
Parameters:
  offset - The offset at which new characters should be loaded.
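
The return convention (count of characters loaded, or -1 when nothing more was loaded) can be sketched with a plain character buffer over a java.io.Reader. The class and field names are hypothetical; the real scanner keeps this state in its current entity.

```java
import java.io.IOException;
import java.io.Reader;

// Sketch only: refills a fixed-size buffer starting at a given offset.
final class BufferLoadSketch {
    final char[] buffer;
    int length; // number of valid characters currently in the buffer
    private final Reader reader;

    BufferLoadSketch(Reader reader, int size) {
        this.reader = reader;
        this.buffer = new char[size];
    }

    /** Returns the number of characters loaded, or -1 if none were. */
    int load(int offset) throws IOException {
        int count = reader.read(buffer, offset, buffer.length - offset);
        if (count > 0) {
            length = offset + count;
            return count;
        }
        length = offset; // nothing new; keep only the preserved prefix
        return -1;
    }
}
```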



locationAugs
final protected Augmentations locationAugs()
Returns an augmentations object with a location item added.

modifyName
final protected static String modifyName(String name, short mode)
Modifies the given name based on the specified mode.
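
A minimal sketch of how the names modes fit together, assuming the property strings "upper", "lower", and "default" map to the three NAMES_* constants. The constant values below are placeholders (this page does not show the real ones), and the locale choice is an assumption.

```java
import java.util.Locale;

// Sketch only: string value -> mode constant -> modified name.
final class NamesSketch {
    // Placeholder values; the actual constants' values are not documented here.
    static final short NAMES_NO_CHANGE = 0;
    static final short NAMES_UPPERCASE = 1;
    static final short NAMES_LOWERCASE = 2;

    /** Any value other than "upper" or "lower" means no change. */
    static short getNamesValue(String value) {
        if ("lower".equals(value)) {
            return NAMES_LOWERCASE;
        }
        if ("upper".equals(value)) {
            return NAMES_UPPERCASE;
        }
        return NAMES_NO_CHANGE;
    }

    static String modifyName(String name, short mode) {
        switch (mode) {
            case NAMES_UPPERCASE:
                return name.toUpperCase(Locale.ENGLISH);
            case NAMES_LOWERCASE:
                return name.toLowerCase(Locale.ENGLISH);
            default:
                return name;
        }
    }
}
```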



pushInputSource
public void pushInputSource(XMLInputSource inputSource)
Pushes an input source onto the current entity stack. This enables the scanner to transparently scan new content (e.g. the output written by an embedded script). At the end of the current entity, the scanner returns to where it left off at the time this entity source was pushed.

Note: This functionality is experimental at this time and is subject to change in future releases of NekoHTML.
Parameters:
  inputSource - The new input source to start scanning.
See Also:   HTMLScanner.evaluateInputSource(XMLInputSource)
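
The push-and-resume behavior can be modeled with a stack of readers. The sketch below is a simplified stand-in that uses java.io.Reader in place of XMLInputSource; the class is hypothetical and leaves out everything the real scanner tracks per entity (buffers, line numbers, encodings).

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: reads from the most recently pushed source first, then
// resumes the suspended one where it left off.
final class EntityStackSketch {
    private final Deque<Reader> suspended = new ArrayDeque<>();
    private Reader current;

    EntityStackSketch(Reader document) {
        this.current = document;
    }

    /** Suspends the current source and starts reading the new one. */
    void push(Reader source) {
        suspended.push(current);
        current = source;
    }

    /** Returns the next character, or -1 once every source is exhausted. */
    int read() throws IOException {
        int c = current.read();
        while (c == -1 && !suspended.isEmpty()) {
            current.close();
            current = suspended.pop();
            c = current.read();
        }
        return c;
    }
}
```

Reading "a" from a document "ab", pushing "XY", and then reading to the end yields "aXYb".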




read
protected int read() throws IOException
Reads a single character.

reset
public void reset(XMLComponentManager manager) throws XMLConfigurationException
Resets the component.

resourceId
final protected XMLResourceIdentifier resourceId()
Returns an empty resource identifier.

scanDoctype
protected void scanDoctype() throws IOException
Scans a DOCTYPE line.

scanDocument
public boolean scanDocument(boolean complete) throws XNIException, IOException
Scans the document.

scanEntityRef
protected int scanEntityRef(XMLStringBuffer str, boolean content) throws IOException
Scans an entity reference.

scanLiteral
protected String scanLiteral() throws IOException
Scans a quoted literal.

scanName
protected String scanName() throws IOException
Scans a name.

setDocumentHandler
public void setDocumentHandler(XMLDocumentHandler handler)
Sets the document handler.

setFeature
public void setFeature(String featureId, boolean state) throws XMLConfigurationException
Sets a feature.

setInputSource
public void setInputSource(XMLInputSource source) throws IOException
Sets the input source.

setProperty
public void setProperty(String propertyId, Object value) throws XMLConfigurationException
Sets a property.

setScanner
protected void setScanner(Scanner scanner)
Sets the scanner.

setScannerState
protected void setScannerState(short state)
Sets the scanner state.

skip
protected boolean skip(String s, boolean caseSensitive) throws IOException
Returns true if the specified text is present and is skipped.
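
A minimal sketch of this "match and consume" behavior over an in-memory string. The class is hypothetical; the real method works on the scanner's buffer and can throw IOException while loading more input.

```java
// Sketch only: consumes the given text at the current offset when it matches.
final class SkipSketch {
    private final String input;
    private int offset;

    SkipSketch(String input) {
        this.input = input;
    }

    boolean skip(String s, boolean caseSensitive) {
        // regionMatches takes an ignoreCase flag, hence the negation.
        if (input.regionMatches(!caseSensitive, offset, s, 0, s.length())) {
            offset += s.length();
            return true;
        }
        return false; // offset is left unchanged on a mismatch
    }

    int offset() {
        return offset;
    }
}
```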



skipMarkup
protected boolean skipMarkup(boolean balance) throws IOException
Skips markup.

skipNewlines
protected int skipNewlines() throws IOException
Skips newlines and returns the number of newlines skipped.

skipNewlines
protected int skipNewlines(int maxlines) throws IOException
Skips newlines and returns the number of newlines skipped.
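
The two overloads can be sketched over an in-memory string as follows. Counting "\r\n" as a single newline is an assumption made for this illustration, as is the class itself; the actual methods read from the scanner's buffer.

```java
// Sketch only: counts and consumes leading newlines, up to a limit.
final class SkipNewlinesSketch {
    private final String input;
    private int offset;

    SkipNewlinesSketch(String input) {
        this.input = input;
    }

    int skipNewlines() {
        return skipNewlines(Integer.MAX_VALUE);
    }

    int skipNewlines(int maxlines) {
        int newlines = 0;
        while (newlines < maxlines && offset < input.length()) {
            char c = input.charAt(offset);
            if (c == '\r') {
                offset++;
                // Assumption: "\r\n" counts as one newline.
                if (offset < input.length() && input.charAt(offset) == '\n') {
                    offset++;
                }
            } else if (c == '\n') {
                offset++;
            } else {
                break;
            }
            newlines++;
        }
        return newlines;
    }

    int offset() {
        return offset;
    }
}
```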



skipSpaces
protected boolean skipSpaces() throws IOException
Skips whitespace.

synthesizedAugs
final protected Augmentations synthesizedAugs()
Returns an augmentations object with a synthesized item added.



Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
