본문 바로가기

Android/URLConnection, Multipart upload

안드로이드 Multipart 업로드 관련명세

안드로이드에서 Multipart form-data를 웹서버로 전송하기 위한 관련 명세

안드로이드에서 URLConnection을 이용하여 텍스트와 동영상 등을 서버측으로 전송하기 위해서는 우선 HTTP의 관련 명세를 확인하고 그 명세에 따라 프로그램을 작성해야 하므로 W3C에서 Multipart form-data에 대한 명세를 찾아 보았다.



Multipart form-data와 관련한 명세 http://www.w3.org/TR/html401/interact/forms.html

17.13.4 Form content types

The enctype attribute of the FORM element specifies the content type used to encode the form data set for submission to the server. User agents must support the content types listed below. Behavior for other content types is unspecified.

Please also consult the section on escaping ampersands in URI attribute values.

application/x-www-form-urlencoded  

This is the default content type. Forms submitted with this content type must be encoded as follows:

  1. Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., `%0D%0A').
  2. The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'.

multipart/form-data  

Note. Please consult [RFC2388] for additional information about file uploads, including backwards compatibility issues, the relationship between "multipart/form-data" and other content types, performance issues, etc.

Please consult the appendix for information about security issues for forms.

The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters. The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.

The content "multipart/form-data" follows the rules of all multipart MIME data streams as outlined in [RFC2045]. The definition of "multipart/form-data" is available at the [IANA] registry.

A "multipart/form-data" message contains a series of parts, each representing a successful control. The parts are sent to the processing agent in the same order the corresponding controls appear in the document stream. Part boundaries should not occur in any of the data; how this is done lies outside the scope of this specification.

As with all multipart MIME types, each part has an optional "Content-Type" header that defaults to "text/plain". User agents should supply the "Content-Type" header, accompanied by a "charset" parameter.

Each part is expected to contain:

  1. a "Content-Disposition" header whose value is "form-data".
  2. a name attribute specifying the control name of the corresponding control. Control names originally encoded in non-ASCII character sets may be encoded using the method outlined in [RFC2045].

Thus, for example, for a control named "mycontrol", the corresponding part would be specified:

Content-Disposition: form-data; name="mycontrol"

As with all MIME transmissions, "CR LF" (i.e., `%0D%0A') is used to separate lines of data.

Each part may be encoded and the "Content-Transfer-Encoding" header supplied if the value of that part does not conform to the default (7BIT) encoding (see[RFC2045], section 6)

If the contents of a file are submitted with a form, the file input should be identified by the appropriate content type (e.g., "application/octet-stream"). If multiple files are to be returned as the result of a single form entry, they should be returned as "multipart/mixed" embedded within the "multipart/form-data".

The user agent should attempt to supply a file name for each submitted file. The file name may be specified with the "filename" parameter of the 'Content-Disposition: form-data' header, or, in the case of multiple files, in a 'Content-Disposition: file' header of the subpart. If the file name of the client's operating system is not in US-ASCII, the file name might be approximated or encoded using the method of [RFC2045]. This is convenient for those cases where, for example, the uploaded files might contain references to each other (e.g., a TeX file and its ".sty" auxiliary style description).

The following example illustrates "multipart/form-data" encoding. Suppose we have the following form:

 <FORM action="http://server.com/cgi/handle"
       enctype="multipart/form-data"
       method="post">
   <P>
   What is your name? <INPUT type="text" name="submit-name"><BR>
   What files are you sending? <INPUT type="file" name="files"><BR>
   <INPUT type="submit" value="Send"> <INPUT type="reset">
 </FORM>

If the user enters "Larry" in the text input, and selects the text file "file1.txt", the user agent might send back the following data:

   Content-Type: multipart/form-data; boundary=AaB03x

   --AaB03x
   Content-Disposition: form-data; name="submit-name"

   Larry
   --AaB03x
   Content-Disposition: form-data; name="files"; filename="file1.txt"
   Content-Type: text/plain

   ... contents of file1.txt ...
   --AaB03x--

If the user selected a second (image) file "file2.gif", the user agent might construct the parts as follows:

   Content-Type: multipart/form-data; boundary=AaB03x

   --AaB03x
   Content-Disposition: form-data; name="submit-name"

   Larry
   --AaB03x
   Content-Disposition: form-data; name="files"
   Content-Type: multipart/mixed; boundary=BbC04y

   --BbC04y
   Content-Disposition: file; filename="file1.txt"
   Content-Type: text/plain

   ... contents of file1.txt ...
   --BbC04y
   Content-Disposition: file; filename="file2.gif"
   Content-Type: image/gif
   Content-Transfer-Encoding: binary

   ...contents of file2.gif...
   --BbC04y-- 
   --AaB03x--