Showing posts with the label Text Encode. Show all posts

Tuesday, December 13, 2011

[Java] Encoding problem with DataOutputStream's writeBytes(String s)!!

Java encoding notes:

Java's DataOutputStream.writeBytes(String s) method encodes Chinese text incorrectly. Its implementation is:


public final void writeBytes(String s) throws IOException {
    int len = s.length();
    for (int i = 0 ; i < len ; i++) {
        out.write((byte)s.charAt(i));
    }
    incCount(len);
}


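For example, pass the string "你好" as the argument. The expression (byte)s.charAt(i) is where things go wrong:
a Java char is 16 bits wide, so one char can hold one Chinese character, but casting it to byte discards the high 8 bits.
As a result, the Chinese characters cannot be written to the output stream intact.
Wherever Chinese characters might be written, it is better to convert the string to a byte array first and then write it with write(byte[] b). For example: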

String s = "你好";

write(s.getBytes());

Note: if no encoding is specified, getBytes() uses the platform's default charset.
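To make the difference concrete, here is a minimal sketch that writes the same two-character string both ways into an in-memory buffer and compares the results. The class and method names are hypothetical, and UTF-8 is only an assumed target encoding. The string is written with \u escapes so the result does not depend on the source file's encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriteBytesDemo {

    // Writes s with DataOutputStream.writeBytes and returns the bytes produced.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(s); // keeps only the low 8 bits of each char
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Encodes s as UTF-8 (an assumption for this sketch) and writes the byte array.
    static byte[] viaWrite(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String s = "\u4F60\u597D"; // U+4F60 U+597D, the two-character greeting from the text

        byte[] lossy = viaWriteBytes(s);
        byte[] full = viaWrite(s);

        // One byte per char: only 0x60 and 0x7D survive from U+4F60 and U+597D.
        System.out.println("writeBytes length: " + lossy.length);
        // Three bytes per character in UTF-8.
        System.out.println("write(getBytes) length: " + full.length);
        System.out.println("round trip ok: "
                + new String(full, StandardCharsets.UTF_8).equals(s));
    }
}
```

The writeBytes output cannot be decoded back to the original string, because the high byte of every char is already gone by the time it reaches the stream.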

Update 2012/02/09:
When using DataOutputStream to simulate a form POST upload, switching to the write method is what fixed the Chinese encoding problem.

  /*
  --boundary\r\n
  Content-Disposition: form-data; name=""; filename=""\r\n
  Content-Type: \r\n
  \r\n
  \r\n
  */ 
  this.dataOutputStream.writeBytes(this.PREFIX);
  this.dataOutputStream.writeBytes(this.boundary);
  this.dataOutputStream.writeBytes(this.CRLF);
  this.dataOutputStream.writeBytes("Content-Disposition: form-data; name=\"");
//  this.dataOutputStream.writeBytes(fieldName);
  this.dataOutputStream.write(fieldName.getBytes());
  this.dataOutputStream.writeBytes("\"; filename=\"");
  // writeBytes does not support Chinese characters, so write the encoded bytes instead
//  this.dataOutputStream.writeBytes(fileName);
  this.dataOutputStream.write(fileName.getBytes());
  this.dataOutputStream.writeBytes("\"");
  this.dataOutputStream.writeBytes(this.CRLF);
  if(mimeType != null){
   this.dataOutputStream.writeBytes("Content-Type:");
   this.dataOutputStream.writeBytes(mimeType);
   this.dataOutputStream.writeBytes(this.CRLF);
   this.dataOutputStream.writeBytes(this.CRLF);
  }
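One way to avoid mixing writeBytes and write calls is to build the whole part header as a String and encode it once. The following is a minimal sketch, not the code used above: the class name, method name, and parameters are hypothetical, UTF-8 is assumed to be the encoding the server expects, and quotes inside the names are not escaped. Unlike the excerpt above, this sketch always emits the blank line that ends the headers, even when mimeType is null.

```java
import java.nio.charset.StandardCharsets;

public class MultipartHeaderSketch {

    private static final String PREFIX = "--";
    private static final String CRLF = "\r\n";

    // Builds the header of one file part and encodes it in a single step,
    // so no character ever passes through writeBytes.
    static byte[] partHeader(String boundary, String fieldName,
                             String fileName, String mimeType) {
        StringBuilder header = new StringBuilder();
        header.append(PREFIX).append(boundary).append(CRLF);
        header.append("Content-Disposition: form-data; name=\"")
              .append(fieldName)
              .append("\"; filename=\"")
              .append(fileName)
              .append("\"")
              .append(CRLF);
        if (mimeType != null) {
            header.append("Content-Type: ").append(mimeType).append(CRLF);
        }
        header.append(CRLF); // blank line separates the headers from the part body
        // UTF-8 is an assumption here; use whatever the receiving server expects.
        return header.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical boundary and names, for illustration only.
        byte[] bytes = partHeader("boundary123", "upload", "\u4F60\u597D.txt", "text/plain");
        System.out.println(bytes.length + " header bytes");
    }
}
```

The returned array can then be passed to dataOutputStream.write(byte[]) in one call.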



Update 2012/03/12:
Today a different case still produced garbled Chinese, so it is safer to specify the encoding you want when calling getBytes:
getBytes(Charset.forName("utf-8"))
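The reason an explicit charset matters is that the same string turns into different bytes depending on the encoding. Here is a small sketch (class and method names are hypothetical) that encodes the same two characters with three standard charsets. With the no-argument getBytes(), which of these behaviors you get depends on the machine's default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {

    // Returns the bytes of s in the given charset.
    static byte[] encode(String s, Charset charset) {
        return s.getBytes(charset);
    }

    public static void main(String[] args) {
        String s = "\u4F60\u597D"; // U+4F60 U+597D

        // UTF-8 uses three bytes for each of these characters.
        System.out.println("UTF-8:      " + encode(s, StandardCharsets.UTF_8).length);
        // UTF-16BE uses two bytes for each of these characters.
        System.out.println("UTF-16BE:   " + encode(s, StandardCharsets.UTF_16BE).length);
        // ISO-8859-1 cannot represent them, so each becomes the replacement byte '?'.
        System.out.println("ISO-8859-1: " + encode(s, StandardCharsets.ISO_8859_1).length);
    }
}
```

StandardCharsets.UTF_8 is equivalent to Charset.forName("utf-8") and avoids the name lookup.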

Reference:
Java's DataOutputStream writeBytes(String s) method, when writing to
The problem with Java's String.getBytes()

Monday, December 12, 2011

[Java] InputStreamReader

An InputStream is a binary stream, so there is no encoding. 
When you create the Reader, you need to know what character encoding to use, and that would depend on what the program you called produces (Java will not convert it in any way).


If you do not specify anything for InputStreamReader, it will use the platform default encoding, which may not be appropriate. There is another constructor that allows you to specify the encoding.


If you know what encoding to use (and you really have to know):

new InputStreamReader(process.getInputStream(), "UTF-8") // for example
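As a rough illustration of why the encoding has to be known, the sketch below decodes the same bytes with two different charsets. A ByteArrayInputStream stands in for process.getInputStream(), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetDemo {

    // Reads all of data through an InputStreamReader using the given charset.
    static String decode(byte[] data, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[256];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4F60\u597D"; // U+4F60 U+597D
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the producer used recovers the text.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(original));
        // Decoding with the wrong charset yields one garbled char per byte.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length());
    }
}
```

The bytes are identical in both calls; only the charset passed to the reader decides whether the text comes out readable.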
