| java.lang.Object java.nio.charset.CharsetDecoder
CharsetDecoder | abstract public class CharsetDecoder | | A converter that decodes a byte sequence in some charset into a sequence of
16-bit Unicode characters.
The input byte sequence is wrapped in a
java.nio.ByteBuffer and the output character sequence is written to a
java.nio.CharBuffer.
A decoder instance should be used in the following sequence, which is referred
to as a decoding operation:
- Invoke the
CharsetDecoder.reset() method to reset the decoder if it has been used
before;
- Invoke the
CharsetDecoder.decode(ByteBuffer,CharBuffer,boolean) method repeatedly until no additional input is needed, with the
endOfInput
parameter set to false, refilling the input buffer and draining the
output buffer between invocations;
- Invoke the
CharsetDecoder.decode(ByteBuffer,CharBuffer,boolean) method one last time, with the
endOfInput parameter set
to true;
- Invoke the
CharsetDecoder.flush(CharBuffer) method to flush the
output.
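The four steps above can be sketched as follows. This is only an illustration under stated assumptions: the class name ChunkedDecodeSketch, its helper methods, the choice of UTF-8, and the deliberately tiny output buffer are all hypothetical and not part of this class's API.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

final class ChunkedDecodeSketch {

    /** Decodes the chunks as one continuous UTF-8 byte stream. */
    static String decodeChunks(byte[][] chunks) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer out = CharBuffer.allocate(2); // tiny on purpose, to force OVERFLOW
        StringBuilder text = new StringBuilder();
        ByteBuffer in = ByteBuffer.allocate(0);
        try {
            decoder.reset();                                     // step 1
            for (byte[] chunk : chunks) {                        // step 2
                // Refill: keep bytes left over from an incomplete sequence.
                ByteBuffer refilled = ByteBuffer.allocate(in.remaining() + chunk.length);
                refilled.put(in).put(chunk).flip();
                in = refilled;
                decodeUntilUnderflow(decoder, in, out, text, false);
            }
            decodeUntilUnderflow(decoder, in, out, text, true);  // step 3
            while (decoder.flush(out).isOverflow()) {            // step 4
                drain(out, text);
            }
            drain(out, text);
            return text.toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Calls decode until it reports UNDERFLOW, draining the output on OVERFLOW. */
    private static void decodeUntilUnderflow(CharsetDecoder decoder, ByteBuffer in,
            CharBuffer out, StringBuilder text, boolean endOfInput)
            throws CharacterCodingException {
        while (true) {
            CoderResult result = decoder.decode(in, out, endOfInput);
            if (result.isUnderflow()) {
                return;
            }
            if (result.isOverflow()) {
                drain(out, text);
            } else {
                result.throwException(); // malformed or unmappable input
            }
        }
    }

    /** Moves the decoded characters into the builder and empties the buffer. */
    private static void drain(CharBuffer out, StringBuilder text) {
        out.flip();
        text.append(out);
        out.clear();
    }
}
```

Because leftover bytes are carried into the next input buffer, a multi-byte sequence that is split across two chunks is still decoded as one character.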
The
CharsetDecoder.decode(ByteBuffer,CharBuffer,boolean) method
converts as many bytes as possible and stops only when the
input bytes have run out, the output buffer is full, or an
error has occurred. A
CoderResult instance is
returned to indicate why decoding stopped; the invoker can inspect the result
and choose a further action, such as refilling the input buffer,
draining the output buffer, or recovering from the error and trying again.
There are two common decoding errors. The first is malformed input, which is
reported when the input byte sequence is illegal for the current
charset. The second is an unmappable character, which is reported when a
legal input byte sequence cannot be mapped to its Unicode character
equivalent.
These errors can be handled in three ways. The default is to report the
error to the invoker through a
CoderResult instance; the
alternatives are to ignore the error or to replace the erroneous input with the
replacement string. The replacement string is "\uFFFD" by default and can be
changed by invoking the
CharsetDecoder.replaceWith(String) method. The
invoker chooses how each error type is handled by passing a
CodingErrorAction instance to the
CharsetDecoder.onMalformedInput(CodingErrorAction) and
CharsetDecoder.onUnmappableCharacter(CodingErrorAction) methods.
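The three ways of handling errors can be compared side by side. The sketch below is illustrative only: the class ErrorActionSketch and its method decodeWith are hypothetical names, UTF-8 is an arbitrary choice of charset, and returning an "error: ..." string is just a convenient way to show the reported exception.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ErrorActionSketch {

    /**
     * Decodes UTF-8 bytes using the same action for both error types.
     * A null replacement keeps the decoder's default replacement string.
     */
    static String decodeWith(CodingErrorAction action, String replacement, byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        if (replacement != null) {
            decoder.replaceWith(replacement);
        }
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Only reachable when the action is REPORT.
            return "error: " + e.getClass().getSimpleName();
        }
    }
}
```

For the input bytes 'a', 0xFF, 'b' (0xFF is never valid in UTF-8), REPORT surfaces a MalformedInputException, IGNORE drops the offending byte, and REPLACE substitutes the replacement string for it.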
This class is abstract and encapsulates the operations of the
decoding process that are common to all charsets. A decoder for a specific charset should extend
this class and need only implement the
CharsetDecoder.decodeLoop(ByteBuffer,CharBuffer) method, which contains the basic
decoding loop. If a subclass maintains internal state, it should also override the
CharsetDecoder.implFlush(CharBuffer) and
CharsetDecoder.implReset() methods.
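A minimal subclass might look like the sketch below. Everything in it is hypothetical: SevenBitDecoder is a toy decoder invented for illustration, and passing US_ASCII as the owning charset is an assumption made to keep the example short (a real decoder would normally be created by its own Charset). It keeps no internal state, so implFlush and implReset are not overridden.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class CustomDecoderSketch {

    /** Hypothetical decoder: accepts 7-bit bytes only, one char per byte. */
    static final class SevenBitDecoder extends CharsetDecoder {
        SevenBitDecoder() {
            // One char per byte on average and at most.
            super(StandardCharsets.US_ASCII, 1.0f, 1.0f);
        }

        @Override
        protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
            while (in.hasRemaining()) {
                if (!out.hasRemaining()) {
                    return CoderResult.OVERFLOW;
                }
                byte b = in.get(in.position()); // peek without consuming
                if (b < 0) {
                    // High bit set: report one malformed byte at the current position.
                    return CoderResult.malformedForLength(1);
                }
                in.get(); // consume the byte
                out.put((char) b);
            }
            return CoderResult.UNDERFLOW;
        }
    }

    /** Decodes with the toy decoder, replacing malformed bytes. */
    static String decodeReplacing(byte[] bytes) {
        CharsetDecoder decoder = new SevenBitDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The loop only reports the malformed byte; substituting the replacement string and skipping the bad input is left to the inherited decode method, which applies the configured error action.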
This class is not thread-safe.
See Also: java.nio.charset.Charset, java.nio.charset.CharsetEncoder |
Constructor Summary | |
protected | CharsetDecoder(Charset charset, float averageCharsPerByte, float maxCharsPerByte) Constructs a new CharsetDecoder using the given
Charset, the average and maximum number of characters
created by this decoder for one input byte, and the default replacement
string "\uFFFD". |
Method Summary | |
final public float | averageCharsPerByte() |
final public Charset | charset() Gets the Charset that created this decoder. |
final public CharBuffer | decode(ByteBuffer in) A facade method for a complete decoding operation. This method decodes the remaining byte sequence of the given byte buffer into a new character buffer. |
final public CoderResult | decode(ByteBuffer in, CharBuffer out, boolean endOfInput) Decodes bytes starting at the current position of the given input buffer and writes the equivalent character sequence into the given output buffer from its current position. The buffers' positions change as bytes are read and characters are written, but their limits and marks are left intact. A CoderResult instance is returned to indicate why decoding stopped. |
abstract protected CoderResult | decodeLoop(ByteBuffer in, CharBuffer out) Decodes bytes into characters. |
public Charset | detectedCharset() Gets the charset detected by this decoder. This method is optional. If this decoder implements an auto-detecting charset, it returns the detected charset from this method once it is available. |
final public CoderResult | flush(CharBuffer out) Flushes this decoder. This method calls CharsetDecoder.implFlush(CharBuffer). |
protected CoderResult | implFlush(CharBuffer out) Flushes this decoder. |
protected void | implOnMalformedInput(CodingErrorAction newAction) Notifies this decoder that the CodingErrorAction specified for malformed input has been changed. |
protected void | implOnUnmappableCharacter(CodingErrorAction newAction) Notifies this decoder that the CodingErrorAction specified for unmappable characters has been changed. |
protected void | implReplaceWith(String newReplacement) Notifies this decoder that its replacement has been changed. |
protected void | implReset() Resets this decoder's charset-related state. |
public boolean | isAutoDetecting() Indicates whether this decoder implements an auto-detecting charset. |
public boolean | isCharsetDetected() Indicates whether this decoder has detected a charset. This method is optional. If this decoder implements an auto-detecting charset, this method may start to return true during a decoding operation to indicate that a charset has been detected in the input bytes and that the charset can be retrieved by invoking the CharsetDecoder.detectedCharset() method. Note that a decoder that implements an auto-detecting charset may still succeed in decoding a portion of the given input even when it is unable to detect the charset. |
public CodingErrorAction | malformedInputAction() Gets this decoder's CodingErrorAction for malformed input encountered during the decoding process. |
final public float | maxCharsPerByte() |
final public CharsetDecoder | onMalformedInput(CodingErrorAction newAction) Sets this decoder's action for malformed input errors. |
final public CharsetDecoder | onUnmappableCharacter(CodingErrorAction newAction) Sets this decoder's action for unmappable character errors. |
final public CharsetDecoder | replaceWith(String newReplacement) Sets a new replacement value. |
final public String | replacement() |
final public CharsetDecoder | reset() Resets this decoder. |
public CodingErrorAction | unmappableCharacterAction() Gets this decoder's CodingErrorAction for unmappable characters encountered during the decoding process. |
CharsetDecoder | protected CharsetDecoder(Charset charset, float averageCharsPerByte, float maxCharsPerByte) | | Constructs a new CharsetDecoder using the given
Charset, the average and maximum number of characters
created by this decoder for one input byte, and the default replacement
string "\uFFFD".
Parameters: charset - the Charset that creates this decoder
Parameters: averageCharsPerByte - the average number of characters created by this decoder for one input byte; must be positive
Parameters: maxCharsPerByte - the maximum number of characters created by this decoder for one input byte; must be positive
throws: IllegalArgumentException - if averageCharsPerByte or maxCharsPerByte is not positive |
averageCharsPerByte | final public float averageCharsPerByte() | | Gets the average number of characters created by this decoder for a single
input byte.
Returns: the average number of characters created by this decoder for a single input byte |
charset | final public Charset charset() | | Gets the Charset that created this decoder.
Returns: the Charset that created this decoder |
decode | final public CharBuffer decode(ByteBuffer in) throws CharacterCodingException | | A facade method for a complete decoding operation.
This method decodes the remaining byte sequence of the given byte buffer
into a new character buffer. It performs a complete decoding
operation: it first resets the decoder, then decodes, and finally flushes.
This method should not be invoked while another decoding operation is ongoing.
Parameters: in - the input buffer
Returns: a new CharBuffer containing the characters produced by this decoding operation. The buffer's position will be zero and its limit will follow the last character produced
throws: IllegalStateException - if another decoding operation is ongoing
throws: MalformedInputException - if an illegal input byte sequence for this charset is encountered and the action for malformed input is CodingErrorAction.REPORT
throws: UnmappableCharacterException - if a legal but unmappable input byte sequence for this charset is encountered and the action for unmappable characters is CodingErrorAction.REPORT. Unmappable means that the byte sequence at the input buffer's current position cannot be mapped to a Unicode character sequence.
throws: CharacterCodingException - if another exception occurs during the decoding operation |
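A short usage sketch of the facade method follows. The class OneShotDecodeSketch and its method decodeUtf8 are hypothetical names, and returning null for undecodable input is a simplification chosen for this example rather than a recommended convention.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

final class OneShotDecodeSketch {

    /** Decodes all bytes in one call; returns null if the input is not valid UTF-8. */
    static CharBuffer decodeUtf8(byte[] bytes) {
        // A decoder obtained from Charset.newDecoder() reports errors by default.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try {
            // Resets, decodes and flushes internally; no manual loop is needed.
            return decoder.decode(ByteBuffer.wrap(bytes));
        } catch (CharacterCodingException e) {
            return null;
        }
    }
}
```

The returned buffer is ready for reading: its position is zero and its limit equals the number of characters decoded.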
decode | final public CoderResult decode(ByteBuffer in, CharBuffer out, boolean endOfInput) | | Decodes bytes starting at the current position of the given input buffer
and writes the equivalent character sequence into the given output buffer
from its current position.
The buffers' positions change as bytes are read and characters are written,
but their limits and marks are left intact.
A CoderResult instance is returned according to the
following rules:
-
CoderResult.OVERFLOW indicates that,
even though not all of the input has been processed, the output
buffer has no room for more characters. When this
result is returned, this method should be called again with an
out argument that has space remaining.
-
CoderResult.UNDERFLOW indicates that
as many bytes as possible in the input buffer have been decoded. If there
is no further input and no bytes remain in the input buffer, this
operation may be regarded as complete. Otherwise, this method should be
called again with additional input.
- A malformed-input result (see
CoderResult.malformedForLength(int))
indicates that malformed input was encountered. The erroneous
bytes start at the input buffer's position, and their number is given by the
result's
CoderResult.length() method. This kind of result is
returned only if the malformed-input action is
CodingErrorAction.REPORT.
- An unmappable-character result (see
CoderResult.unmappableForLength(int)) indicates that an unmappable character was encountered. The
erroneous bytes start at the input buffer's position, and their number
is given by the result's
CoderResult.length() method. This kind of
result is returned only if the unmappable-character action is
CodingErrorAction.REPORT.
The endOfInput parameter indicates whether the invoker can
provide further input. It must be true if and only if the bytes
in the current input buffer are all the remaining input for this decoding operation. Note
that it is common, and not an error, for the invoker to pass false and
then find that no more input is available. Passing true in several
consecutive invocations while more input is still to come, however, may cause
errors, because any remaining input will be treated as malformed input.
This method invokes the
CharsetDecoder.decodeLoop(ByteBuffer,CharBuffer) method to
perform the basic decoding logic for the specific charset.
Parameters: in - the input buffer
Parameters: out - the output buffer
Parameters: endOfInput - true if all the input bytes have been provided
Returns: a CoderResult instance that indicates the reason for termination
throws: IllegalStateException - if decoding has started or no more input is needed in this decoding process
throws: CoderMalfunctionError - if the CharsetDecoder.decodeLoop(ByteBuffer,CharBuffer) method threw a BufferUnderflowException or BufferOverflowException |
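The effect of endOfInput and the information carried by an error result can be observed with a single call. This is a sketch under stated assumptions: CoderResultSketch and describe are hypothetical names, UTF-8 is an arbitrary choice, and the returned summary string is invented purely for display.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

final class CoderResultSketch {

    /** Runs one decode call and summarizes the result and buffer positions. */
    static String describe(byte[] bytes, boolean endOfInput) {
        // Errors are reported through the CoderResult because REPORT is the default.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer in = ByteBuffer.wrap(bytes);
        CharBuffer out = CharBuffer.allocate(bytes.length + 1);
        CoderResult result = decoder.decode(in, out, endOfInput);

        String kind;
        if (result.isUnderflow()) {
            kind = "UNDERFLOW";
        } else if (result.isOverflow()) {
            kind = "OVERFLOW";
        } else if (result.isMalformed()) {
            kind = "MALFORMED length=" + result.length();
        } else {
            kind = "UNMAPPABLE length=" + result.length();
        }
        // in.position() points at the first byte that was not consumed.
        return kind + " position=" + in.position() + " decoded=" + out.position();
    }
}
```

With the bytes 'a', 0xE2, 0x82 (a truncated three-byte sequence), passing false yields UNDERFLOW with the two trailing bytes left in the input buffer, while passing true turns the same leftover bytes into a malformed-input result of length 2.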
decodeLoop | abstract protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) | | Decodes bytes into characters. This method is called by the
CharsetDecoder.decode(ByteBuffer,CharBuffer,boolean) method.
This method implements the essential decoding operation. It does not
stop decoding until all the input bytes have been read, the output
buffer is full, or an error is encountered. It then returns
a CoderResult object indicating the result of the current
decoding operation. The rules for constructing the CoderResult
are the same as for
CharsetDecoder.decode(ByteBuffer,CharBuffer,boolean).
When an error is encountered during decoding, most implementations
of this method return a corresponding result object to the
CharsetDecoder.decode(ByteBuffer,CharBuffer,boolean) method, while some
performance-optimized implementations may handle the error and
carry out the error action themselves.
The buffers are scanned from their current positions, and their positions
are updated accordingly, while their marks and limits are left
intact. At most
in.remaining() bytes
will be read, and at most
out.remaining() characters
will be written.
Note that some implementations may pre-scan the input buffer and return
CoderResult.UNDERFLOW until they receive sufficient input.
Parameters: in - the input buffer
Parameters: out - the output buffer
Returns: a CoderResult instance indicating the result |
detectedCharset | public Charset detectedCharset() | | Gets the charset detected by this decoder. This method is optional.
If this decoder implements an auto-detecting charset, it returns the
detected charset from this method once it is available. The returned
charset will be the same for the rest of the decoding operation.
If insufficient bytes have been read to determine the charset, an
IllegalStateException is thrown.
The default implementation always throws
UnsupportedOperationException, so it should be overridden
by subclasses if needed.
Returns: the charset detected by this decoder, or null if it is not yet determined
throws: UnsupportedOperationException - if this decoder does not implement an auto-detecting charset
throws: IllegalStateException - if insufficient bytes have been read to determine the charset |
flush | final public CoderResult flush(CharBuffer out) | | Flushes this decoder.
This method calls
CharsetDecoder.implFlush(CharBuffer). Some
decoders may need to write characters to the output buffer once they
have read all input bytes; subclasses can override
CharsetDecoder.implFlush(CharBuffer) to perform that write.
At most
out.remaining() characters are written. If a decoder needs to
write more characters than the output buffer has room for,
CoderResult.OVERFLOW is returned, and this method
must be called again with a character buffer that has more space.
Otherwise this method returns CoderResult.UNDERFLOW,
which means that the decoding operation has completed successfully.
During the flush, the output buffer's position changes
accordingly, while its mark and limit are left intact.
Parameters: out - the given output buffer
Returns: CoderResult.UNDERFLOW or CoderResult.OVERFLOW
throws: IllegalStateException - if this decoder has not read all input bytes during the current decoding operation, that is, if this method is called neither after decode(ByteBuffer) nor after decode(ByteBuffer, CharBuffer, boolean) with true as the last argument |
implFlush | protected CoderResult implFlush(CharBuffer out) | | Flushes this decoder. The default implementation does nothing and always returns
CoderResult.UNDERFLOW; this method can be overridden
if needed.
Parameters: out - the output buffer
Returns: CoderResult.UNDERFLOW or CoderResult.OVERFLOW |
implOnMalformedInput | protected void implOnMalformedInput(CodingErrorAction newAction) | | Notifies this decoder that the CodingErrorAction specified for
malformed input has been changed. The default implementation does
nothing; this method can be overridden if needed.
Parameters: newAction - the new action |
implOnUnmappableCharacter | protected void implOnUnmappableCharacter(CodingErrorAction newAction) | | Notifies this decoder that the CodingErrorAction specified for
unmappable characters has been changed. The default implementation does
nothing; this method can be overridden if needed.
Parameters: newAction - the new action |
implReplaceWith | protected void implReplaceWith(String newReplacement) | | Notifies this decoder that its replacement has been changed. The default
implementation does nothing; this method can be overridden if needed.
Parameters: newReplacement - the new replacement string |
implReset | protected void implReset() | | Resets this decoder's charset-related state. The default implementation does
nothing; this method can be overridden if needed. |
isAutoDetecting | public boolean isAutoDetecting() | | Indicates whether this decoder implements an auto-detecting charset.
Returns: true if this decoder implements an auto-detecting charset |
isCharsetDetected | public boolean isCharsetDetected() | | Indicates whether this decoder has detected a charset. This method is optional.
If this decoder implements an auto-detecting charset, this method
may start to return true during a decoding operation to indicate that a
charset has been detected in the input bytes and that the charset can be
retrieved by invoking the
CharsetDecoder.detectedCharset() method.
Note that a decoder that implements an auto-detecting charset may still
succeed in decoding a portion of the given input even when it is unable
to detect the charset. For this reason, users should be aware that a
false return value does not indicate that no decoding took
place.
The default implementation always throws an
UnsupportedOperationException; it should be overridden by
subclasses if needed.
Returns: true if this decoder has detected a charset
throws: UnsupportedOperationException - if this decoder does not implement an auto-detecting charset |
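Because the default implementations of isCharsetDetected and detectedCharset throw UnsupportedOperationException, callers may want to check isAutoDetecting first. The guard below is a minimal sketch; DetectionSketch and detectionStatus are hypothetical names, and the status strings are invented for illustration.

```java
import java.nio.charset.CharsetDecoder;

final class DetectionSketch {

    /** Describes the detection state without triggering UnsupportedOperationException. */
    static String detectionStatus(CharsetDecoder decoder) {
        if (!decoder.isAutoDetecting()) {
            // The optional detection methods are not supported by this decoder.
            return "not auto-detecting";
        }
        return decoder.isCharsetDetected()
                ? "detected " + decoder.detectedCharset().name()
                : "not yet detected";
    }
}
```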
malformedInputAction | public CodingErrorAction malformedInputAction() | | Gets this decoder's CodingErrorAction for malformed input
encountered during the decoding process.
Returns: this decoder's CodingErrorAction for malformed input encountered during the decoding process |
maxCharsPerByte | final public float maxCharsPerByte() | | Gets the maximum number of characters that this decoder can create
for one input byte. The value is always positive.
Returns: the maximum number of characters that this decoder can create for one input byte; always positive |
replaceWith | final public CharsetDecoder replaceWith(String newReplacement) | | Sets a new replacement value.
This method first checks the validity of the given replacement, then changes
the replacement value, and finally calls the
CharsetDecoder.implReplaceWith(String) method with the
new replacement as its argument.
Parameters: newReplacement - the replacement string; cannot be null or empty
Returns: this decoder
throws: IllegalArgumentException - if the given replacement does not satisfy the requirement mentioned above |
replacement | final public String replacement() | | Gets the replacement string, which is never null or empty.
Returns: the replacement string; never null or empty |
reset | final public CharsetDecoder reset() | | Resets this decoder. This method resets the internal state and then calls
implReset() to reset any state specific to the
charset.
Returns: this decoder |
unmappableCharacterAction | public CodingErrorAction unmappableCharacterAction() | | Gets this decoder's CodingErrorAction for unmappable
characters encountered during the decoding process.
Returns: this decoder's CodingErrorAction for unmappable characters encountered during the decoding process |
|
|