2 8.4. Binary Data Types #
4 8.4.1. bytea Hex Format
5 8.4.2. bytea Escape Format
7 The bytea data type allows storage of binary strings; see Table 8.6.
9 Table 8.6. Binary Data Types
10 Name Storage Size Description
11 bytea 1 or 4 bytes plus the actual binary string variable-length binary
14 A binary string is a sequence of octets (or bytes). Binary strings are
15 distinguished from character strings in two ways. First, binary strings
16 specifically allow storing octets of value zero and other
17 “non-printable” octets (usually, octets outside the decimal range 32 to
18 126). Character strings disallow zero octets, and also disallow any
19 other octet values and sequences of octet values that are invalid
20 according to the database's selected character set encoding. Second,
21 operations on binary strings process the actual bytes, whereas the
22 processing of character strings depends on locale settings. In short,
23 binary strings are appropriate for storing data that the programmer
24 thinks of as “raw bytes”, whereas character strings are appropriate for
27 The bytea type supports two formats for input and output: “hex” format
28 and PostgreSQL's historical “escape” format. Both of these are always
29 accepted on input. The output format depends on the configuration
30 parameter bytea_output; the default is hex. (Note that the hex format
31 was introduced in PostgreSQL 9.0; earlier versions and some tools don't
34 The SQL standard defines a different binary string type, called BLOB or
35 BINARY LARGE OBJECT. The input format is different from bytea, but the
36 provided functions and operators are mostly the same.
38 8.4.1. bytea Hex Format #
40 The “hex” format encodes binary data as 2 hexadecimal digits per byte,
41 most significant nibble first. The entire string is preceded by the
42 sequence \x (to distinguish it from the escape format). In some
43 contexts, the initial backslash may need to be escaped by doubling it
44 (see Section 4.1.2.1). For input, the hexadecimal digits can be either
45 upper or lower case, and whitespace is permitted between digit pairs
46 (but not within a digit pair nor in the starting \x sequence). The hex
47 format is compatible with a wide range of external applications and
48 protocols, and it tends to be faster to convert than the escape format,
49 so its use is preferred.
52 SET bytea_output = 'hex';
54 SELECT '\xDEADBEEF'::bytea;
59 8.4.2. bytea Escape Format #
61 The “escape” format is the traditional PostgreSQL format for the bytea
62 type. It takes the approach of representing a binary string as a
63 sequence of ASCII characters, while converting those bytes that cannot
64 be represented as an ASCII character into special escape sequences. If,
65 from the point of view of the application, representing bytes as
66 characters makes sense, then this representation can be convenient. But
67 in practice it is usually confusing because it fuzzes up the
68 distinction between binary strings and character strings, and also the
69 particular escape mechanism that was chosen is somewhat unwieldy.
70 Therefore, this format should probably be avoided for most new
73 When entering bytea values in escape format, octets of certain values
74 must be escaped, while all octet values can be escaped. In general, to
75 escape an octet, convert it into its three-digit octal value and
76 precede it by a backslash. Backslash itself (octet decimal value 92)
77 can alternatively be represented by double backslashes. Table 8.7 shows
78 the characters that must be escaped, and gives the alternative escape
79 sequences where applicable.
81 Table 8.7. bytea Literal Escaped Octets
82 Decimal Octet Value Description Escaped Input Representation Example
84 0 zero octet '\000' '\000'::bytea \x00
85 39 single quote '''' or '\047' ''''::bytea \x27
86 92 backslash '\\' or '\134' '\\'::bytea \x5c
87 0 to 31 and 127 to 255 “non-printable” octets '\xxx' (octal value)
90 The requirement to escape non-printable octets varies depending on
91 locale settings. In some instances you can get away with leaving them
94 The reason that single quotes must be doubled, as shown in Table 8.7,
95 is that this is true for any string literal in an SQL command. The
96 generic string-literal parser consumes the outermost single quotes and
97 reduces any pair of single quotes to one data character. What the bytea
98 input function sees is just one single quote, which it treats as a
99 plain data character. However, the bytea input function treats
100 backslashes as special, and the other behaviors shown in Table 8.7 are
101 implemented by that function.
103 In some contexts, backslashes must be doubled compared to what is shown
104 above, because the generic string-literal parser will also reduce pairs
105 of backslashes to one data character; see Section 4.1.2.1.
107 Bytea octets are output in hex format by default. If you change
108 bytea_output to escape, “non-printable” octets are converted to their
109 equivalent three-digit octal value and preceded by one backslash. Most
110 “printable” octets are output by their standard representation in the
111 client character set, e.g.:
112 SET bytea_output = 'escape';
114 SELECT 'abc \153\154\155 \052\251\124'::bytea;
119 The octet with decimal value 92 (backslash) is doubled in the output.
120 Details are in Table 8.8.
122 Table 8.8. bytea Output Escaped Octets
123 Decimal Octet Value Description Escaped Output Representation Example
125 92 backslash \\ '\134'::bytea \\
126 0 to 31 and 127 to 255 “non-printable” octets \xxx (octal value)
128 32 to 126 “printable” octets client character set representation
131 Depending on the front end to PostgreSQL you use, you might have
132 additional work to do in terms of escaping and unescaping bytea
133 strings. For example, you might also have to escape line feeds and
134 carriage returns if your interface automatically translates these.