Network Working Group E. Wilde
Internet-Draft EMC
Updates: 5261 (if approved) November 7, 2013
Intended status: Informational
Expires: May 11, 2014
A Media Type for XML Patch Operations
draft-wilde-xml-patch-07
Abstract
The XML Patch media type "application/xml-patch+xml" defines an XML
document structure for expressing a sequence of patch operations that
are applied to an XML document. The XML Patch document format's
foundations are defined in RFC 5261, this specification defines a
document format and a media type registration, so that XML Patch
documents can be labeled with a media type, for example in HTTP
conversations.
In addition to the media type registration, this specification also
updates RFC 5261 in some aspects, limiting these updates to cases
where RFC 5261 needed to be fixed, or was hard to understand.
Note to Readers
This draft should be discussed on the apps-discuss mailing list [1].
Online access to all versions and files is available on github [2].
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 11, 2014.
Copyright Notice
Wilde Expires May 11, 2014 [Page 1]
Internet-Draft XML Patch November 2013
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Patch Documents . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Patch Document Format . . . . . . . . . . . . . . . . . . 3
2.2. Patch Examples . . . . . . . . . . . . . . . . . . . . . . 5
3. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 5
4. Security Considerations . . . . . . . . . . . . . . . . . . . 6
5. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 6
6. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 7
6.1. From -06 to -07 . . . . . . . . . . . . . . . . . . . . . 7
6.2. From -05 to -06 . . . . . . . . . . . . . . . . . . . . . 7
6.3. From -04 to -05 . . . . . . . . . . . . . . . . . . . . . 7
6.4. From -03 to -04 . . . . . . . . . . . . . . . . . . . . . 7
6.5. From -02 to -03 . . . . . . . . . . . . . . . . . . . . . 8
6.6. From -01 to -02 . . . . . . . . . . . . . . . . . . . . . 8
6.7. From -00 to -01 . . . . . . . . . . . . . . . . . . . . . 8
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 8
7.1. Normative References . . . . . . . . . . . . . . . . . . . 8
7.2. Non-Normative References . . . . . . . . . . . . . . . . . 9
Appendix A. Implementation Hints . . . . . . . . . . . . . . . . 10
A.1. Matching Namespaces . . . . . . . . . . . . . . . . . . . 10
A.2. Patching Namespaces . . . . . . . . . . . . . . . . . . . 11
Appendix B. Implementation Status . . . . . . . . . . . . . . . . 13
Appendix C. ABNF for RFC 5261 . . . . . . . . . . . . . . . . . . 13
Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 14
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 14
Wilde Expires May 11, 2014 [Page 2]
Internet-Draft XML Patch November 2013
1. Introduction
The Extensible Markup Language (XML) [RFC3023] is a common format for
the exchange and storage of structured data. HTTP PATCH [RFC5789]
extends HTTP [RFC2616] with a method to perform partial modifications
to resources. HTTP PATCH requires that patch documents be sent along
with the request, and it is therefore useful for there to be
standardized patch document formats (identified by media types) for
popular media types.
The XML Patch media type "application/xml-patch+xml" is an XML
document structure for expressing a sequence of operations to apply
to a target XML document, suitable for use with the HTTP PATCH
method. Servers can freely choose which patch formats they want to
accept, and "application/xml-patch+xml" could be a simple default
format that can be used unless a server decides to use a different
(maybe more sophisticated) patch format for XML.
The format for patch documents is based on the XML Patch Framework
defined in RFC 5261 [RFC5261]. While RFC 5261 does define a concrete
syntax as well as the media type "application/patch-ops-error+xml"
for error documents, it only defines XML Schema (XSD)
[W3C.REC-xmlschema-1-20041028] types for patch operations. The
concrete document format and the media type for patch operations are
defined in an XSD defined in this specification.
This specification relies on RFC 5261, but also requires that the
known errata are taken into account. The main reason for the errata
are the problematic ways in which RFC 5261 relies on XPath as the
expression language for selecting the location of a patch, while at
the same time XPath's data model does not contain sufficient
information to determine whether such a selector indeed can be used
for a patch operation, or should result in an error. Specifically,
the problem occurs with namespaces, where XPath does not expose
namespace declaration attributes, while the patch model needs them to
determine whether a namespace patch is allowed or not. Appendix A
contains more information about the general problem, and the errata
themselves are available through the regular IETF errata system.
2. Patch Documents
2.1. Patch Document Format
The XML patch document format is based on a simple schema that uses a
"patch" element as the document element, and allows an arbitrary
sequence of "add", "remove", and "replace" elements as the children
of the document element. These children follow the semantics defined
Wilde Expires May 11, 2014 [Page 3]
Internet-Draft XML Patch November 2013
in RFC 5261, which means that each element is treated as an
individual patch operation, and the result of each patch operation is
a patched XML document that is the target XML document for the next
patch operation.
The following example patch document uses the example from RFC 5261
section A.18; it uses a "patch" element and a new XML namespace. It
shows the general structure of an XML patch document, as well as an
example of each operation.
Patched doc
new attr
As this example demonstrates, both the document element "patch" and
the patch operation elements are in the same XML namespace. This is
the result of RFC 5261 only defining types for the patch operation
elements, which then can be reused in schemas to define concrete
patch elements.
RFC 5261 defines an XML Schema (XSD) [W3C.REC-xmlschema-1-20041028]
for the patch operation types. The following schema for the XML
Patch media type is based on the types defined in RFC 5261, which are
imported as "rfc5261.xsd" in the following schema. The schema
defines a "patch" document element, and then allows an unlimited (and
possibly empty) sequence of the "add", "remove", and "replace"
operation elements, which are directly based on the respective types
from the schema defined in RFC 5261.
Wilde Expires May 11, 2014 [Page 4]
Internet-Draft XML Patch November 2013
2.2. Patch Examples
Since the semantics of the XML patch operations are defined by RFC
5261, please refer to the numerous examples in that specification for
concrete XML patch document examples. Most importantly, the examples
in RFC 5261 can be taken literally as examples for the XML Patch
media type, as long as it is assumed that the XML namespace for the
operation elements in these examples is the URI "urn:ietf:rfc:XXXX".
3. IANA Considerations
The Internet media type [RFC6838] for an XML Patch Document is
application/xml-patch+xml.
Type name: application
Subtype name: xml-patch+xml
Required parameters: none
Optional parameters:
charset: Same as charset parameter for the media type
"application/xml" as specified in RFC 3023 [RFC3023].
Encoding considerations: Same as encoding considerations of media
type "application/xml" as specified in RFC 3023 [RFC3023].
Security considerations: This media type has all of the security
considerations described in RFC 3023 [RFC3023], RFC 5261
[RFC5261], and RFC 3470 [RFC3470], plus those listed in Section 4.
Wilde Expires May 11, 2014 [Page 5]
Internet-Draft XML Patch November 2013
Interoperability considerations: N/A
Published specification: RFC XXXX
Applications that use this media type: Applications that
manipulate XML documents.
Additional information:
Magic number(s): N/A
File extension(s): XML documents often use ".xml" as the file
extension, and this media type does not propose a specific
extension other than this generic one.
Macintosh file type code(s): TEXT
Person & email address to contact for further information: Erik
Wilde
Intended usage: COMMON
Restrictions on usage: none
Author: Erik Wilde
Change controller: IETF
4. Security Considerations
Parsing XML may entail including information from external sources
through XML's mechanism of external entities. Implementations
therefore should be aware of the fact that standard parsers may
resolve external entities, and thus include external information as a
result of applying patch operations to an XML document.
5. Open Issues
Note to RFC Editor: Please remove this section before publication.
o "ncname" in the XSD grammar is defined as "\i\c*", but so far has
been translated into "1*%x00-ffffffff" in the ABNF grammar.
Ideally, the grammar should reflect how XSD defines these multi-
character escapes, but they map into rather complicated character
ranges in XML itself (such as
http://www.w3.org/TR/xml/#NT-Letter).
Wilde Expires May 11, 2014 [Page 6]
Internet-Draft XML Patch November 2013
6. Change Log
Note to RFC Editor: Please remove this section before publication.
6.1. From -06 to -07
o Moving category back to "info" (from "std"), because the errata to
RFC 5261 are now approved separately.
o Removing the section with "Updates to RFC 5261" because that's
done via errata now.
o Adding reference to RFC 3470 to "Security Considerations".
o Updating the ABNF to correctly only allow lowercase characters in
the string parts.
6.2. From -05 to -06
o Updating "Implementation Status" section to refer to RFC 6982
[RFC6982].
o Properly listing "charset" as an optional media type parameter
(was ill-formatted before).
o Adding corrections from Tony Hansen's review, including document
structure (section/appendix order), and improvements of the ABNF
grammar.
o Moving category back to "std" (from "info"), because that's was
needed for an RFC that is updating an RFC that has been published
on the standards track.
6.3. From -04 to -05
o Improved formatting of XML/XSD and ABNF code.
o Moving category from "std" to "info" (intended to become an
informational RFC).
6.4. From -03 to -04
o Added text and section "Updates to RFC 5261" about updating RFC
5261 (instead of relying on errata).
Wilde Expires May 11, 2014 [Page 7]
Internet-Draft XML Patch November 2013
6.5. From -02 to -03
o Added section on "Implementation Status" (Appendix B).
o Improved "Implementation Hints" (Appendix A).
6.6. From -01 to -02
o Textual edits.
o Added section on "Implementation Hints" (Appendix A).
6.7. From -00 to -01
o Removed Mark Nottingham from author list.
o Changed media type name to application/xml-patch+xml (added suffix
per draft-ietf-appsawg-media-type-suffix-regs)
o Added ABNF grammar derived from XSD (Appendix C)
7. References
7.1. Normative References
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
November 1996.
[RFC3023] Murata, M., St. Laurent, S., and D. Kohn, "XML Media
Types", RFC 3023, January 2001.
[RFC3470] Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for
the Use of Extensible Markup Language (XML) within IETF
Protocols", BCP 70, RFC 3470, January 2003.
[RFC5261] Urpalainen, J., "An Extensible Markup Language (XML) Patch
Operations Framework Utilizing XML Path Language (XPath)
Selectors", RFC 5261, September 2008.
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
Specifications and Registration Procedures", BCP 13,
RFC 6838, January 2013.
Wilde Expires May 11, 2014 [Page 8]
Internet-Draft XML Patch November 2013
7.2. Non-Normative References
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[RFC5789] Dusseault, L. and J. Snell, "PATCH Method for HTTP",
RFC 5789, March 2010.
[RFC6982] Sheffer, Y. and A. Farrel, "Improving Awareness of Running
Code: The Implementation Status Section", RFC 6982,
July 2013.
[W3C.REC-DOM-Level-3-Core-20040407]
Robie, J., Wood, L., Champion, M., Hegaret, P., Nicol, G.,
Le Hors, A., and S. Byrne, "Document Object Model (DOM)
Level 3 Core Specification", World Wide Web Consortium
Recommendation REC-DOM-Level-3-Core-20040407, April 2004,
.
[W3C.REC-xml-20081126]
Sperberg-McQueen, C., Yergeau, F., Paoli, J., Maler, E.,
and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth
Edition)", World Wide Web Consortium Recommendation REC-
xml-20081126, November 2008,
.
[W3C.REC-xml-names-20091208]
Hollander, D., Layman, A., Bray, T., Tobin, R., and H.
Thompson, "Namespaces in XML 1.0 (Third Edition)", World
Wide Web Consortium Recommendation REC-xml-names-20091208,
December 2009,
.
[W3C.REC-xmlschema-1-20041028]
Thompson, H., Beech, D., Maloney, M., and N. Mendelsohn,
"XML Schema Part 1: Structures Second Edition", World Wide
Web Consortium Recommendation REC-xmlschema-1-20041028,
October 2004,
.
[W3C.REC-xpath-19991116]
DeRose, S. and J. Clark, "XML Path Language (XPath)
Version 1.0", World Wide Web Consortium
Recommendation REC-xpath-19991116, November 1999,
.
[W3C.REC-xpath20-20101214]
Wilde Expires May 11, 2014 [Page 9]
Internet-Draft XML Patch November 2013
Boag, S., Berglund, A., Kay, M., Simeon, J., Robie, J.,
Chamberlin, D., and M. Fernandez, "XML Path Language
(XPath) 2.0 (Second Edition)", World Wide Web Consortium
Recommendation REC-xpath20-20101214, December 2010,
.
URIs
[1]
[2]
Appendix A. Implementation Hints
This section is informative. It described some issues that might be
interesting for implementers, but it might also be interesting for
users of XML Patch that want to understand some of the differences
between standard XPath 1.0 processing, and the processing model of
selectors in RFC 5261.
A.1. Matching Namespaces
RFC 5261 defines standard rules for matching prefixed names in
expressions: Any prefixes are interpreted according to the namespace
bindings of the diff document (the document that the expression is
applied against). This means that each prefixed name can be
interpreted in the context of the diff document.
For unprefixed names in expressions, the rules depart from XPath 1.0
[W3C.REC-xpath-19991116]. XPath 1.0 defines that unprefixed names in
expressions match namespace-less names (i.e., there is no "default
namespace" for names used in XPath 1.0 expressions). RFC 5261
requires, however, that unprefixed names in expressions must use the
default namespace of the diff document (if there is one). This means
that it is not possible to simply take a selector from a patch
document and evaluate it in the context of the diff document
according to the rules of XPath 1.0, because this would interpret
unprefixed names incorrectly. As a consequence, it is not possible
to simply take an XPath 1.0 processor and evaluate XMPL Patch
selectors in the context of the diff document.
As an extension of XPath 1.0's simple model, XPath 2.0
[W3C.REC-xpath20-20101214] specifies different processing rules for
unprefixed names: They are matched against the URI of the "default
element/type namespace", which is defined as part of an expression's
static context. In some XPath 2.0 applications, this can be set;
XSLT 2.0 for example has the ability to define an "xpath-default-
Wilde Expires May 11, 2014 [Page 10]
Internet-Draft XML Patch November 2013
namespace", which then will be used to match unprefixed names in
expressions. Thus, by using an XPath 2.0 implementation that allows
to set this URI, and setting it to the default namespace of the diff
document (or leaving it undefined if there is no such default
namespace), it is possible to use an out-of-the-box XPath 2.0
implementation for evaluating XML Patch selectors.
Please keep in mind, however, that evaluating selectors is only one
part of applying patches. When it comes to applying the actual patch
operation, neither XPath 1.0 nor XPath 2.0 are sufficient because
they do not preserve some of the information from the XML syntax
(specifically: namespace declarations) that is required to correctly
apply patch operations. The following section describes this issue
in more detail.
Please note that RFC 5261's Section 4.2.2 on namespace matching
explains XPath 2.0's rules incorrectly
. For this reason,
an erratum is available for Section 4.2.2 of RFC 5261.
A.2. Patching Namespaces
One of the issues when patching namespaces based on XPath is that
XPath exposes namespaces differently than the XML 1.0
[W3C.REC-xml-20081126] syntax for XML Namespaces
[W3C.REC-xml-names-20091208]. In the XML syntax, a namespace is
declared with an attribute using the reserved name or prefix "xmlns",
and this results in this namespace being available recursively
through the document tree. In XPath, the namespace declaration is
not exposed as an attribute (i.e., the attribute, although
syntactically an XML attribute, is not accessible in XPath), but the
resulting namespace nodes are exposed recursively through the tree.
RFC 5261 uses the terms "namespace declaration" and "namespace"
almost interchangeably, but it is important to keep in mind that the
namespace declaration is an XML syntax construct that is unavailable
in XPath, while the namespace itself is a logical construct that is
not visible in the XML syntax, but a result of a namespace
declaration. The intent of RFC 5261 is to patch namespaces as if
namespace declarations were patched, and thus it only allows patching
namespace nodes on the element nodes where the namespace has been
declared.
Patching namespaces in XML Patch is supposed to "emulate" the effect
of actually changing the namespace declaration (which is why a
namespace can only be patched at the element where it has been
declared). Therefore, when patching a namespace, even though XPath's
"namespace" axis is used, implementations have to make sure that not
Wilde Expires May 11, 2014 [Page 11]
Internet-Draft XML Patch November 2013
only the single selected namespace node is being patched, but that
all namespaces nodes resulting from the namespace declaration of this
namespace are also patched accordingly.
This means that an implementation might have to descend into the
tree, matching all namespace nodes with the selected prefix/URI pair
recursively, until it encounters leaf elements or namespace
declarations with the same prefix it is patching. Determining this
requires access to the diff document beyond XPath, because in XPath
itself namespace declarations are not represented, and thus such a
recursive algorithm wouldn't know when to stop. Consider the
following document:
If this document is patched with a selector of /x/namespace::a, then
only the namespace node on element x should be patched, even though
the namespace node on element y has the same prefix/URI combination
as the one on element x. However, determining that the repeated
namespace declaration was present at all on element y is impossible
when using XPath alone, which means that implementations must have an
alternative way to determine the difference between the document
above, and this one:
In this second example, patching with a selector of /x/namespace::a
should indeed change the namespace nodes on elements x and y, because
they both have been derived from the same namespace declaration.
The conclusion of these considerations is that for implementing XML
Patch, access closer to the XML syntax (specifically: access to
namespace declarations) is necessary. As a result, implementations
attempting to exclusively use the XPath model for implementing XML
Patch will fail to correctly address certain edge cases (such as the
one shown above).
Note that XPath's specific limitations do not mean that it is
impossible to use XML technologies other than XPath. The Document
Object Model (DOM) [W3C.REC-DOM-Level-3-Core-20040407], for example,
does expose namespace declaration attributes as regular attributes in
the document tree, and thus could be used to differentiate between
the two variants shown above.
Please note that RFC 5261's Section 4.4.3 on replacing namespaces
mixes the terms "namespace declaration" and "namespace". For this
Wilde Expires May 11, 2014 [Page 12]
Internet-Draft XML Patch November 2013
reason, an erratum is available for Section 4.4.3 of RFC 5261.
Appendix B. Implementation Status
Note to RFC Editor: Please remove this section before publication.
This section records the status of known implementations of the
protocol defined by this specification at the time of posting of this
Internet-Draft, and is based on a proposal described in RFC 6982
[RFC6982]. The description of implementations in this section is
intended to assist the IETF in its decision processes in progressing
drafts to RFCs. Please note that the listing of any individual
implementation here does not imply endorsement by the IETF.
Furthermore, no effort has been spent to verify the information
presented here that was supplied by IETF contributors. This is not
intended as, and must not be construed to be, a catalog of available
implementations or their features. Readers are advised to note that
other implementations may exist.
According to RFC 6982, "this will allow reviewers and working groups
to assign due consideration to documents that have the benefit of
running code, which may serve as evidence of valuable experimentation
and feedback that have made the implemented protocols more mature.
It is up to the individual working groups to use this information as
they see fit".
EMC: EMC's IIG unit has implemented the selector part of the
specification, which is the trickiest part (see Appendix A.1 for
an explanation). By reusing an existing XPath 1.0 implementation
and changing it to match the changed default namespace processing
model, the required behavior is fairly easy to implement. This
does, however, require that the implementation is available in
source code, and also does require some changes to the
implementation's code. The resulting implementation is closed
source and will be made available, if released, as part of EMC's
XML database product xDB
.
Appendix C. ABNF for RFC 5261
RFC 5261 [RFC5261] does not contain an ABNF grammar for the allowed
subset of XPath expressions, but includes an XSD-based grammar in its
type definition for operation types). In order to make
implementation easier, this appendix contains an ABNF grammar that
has been derived from the XSD expressions in RFC 5261. In the
following grammar, "xpath" is the definition for the allowed XPath
Wilde Expires May 11, 2014 [Page 13]
Internet-Draft XML Patch November 2013
expressions for remove and replace operations, and "xpath-add" is the
definition for the allowed XPath expressions for add operations. The
names of all grammar productions are the ones used in the XSD-based
grammar of RFC 5261.
anychar = %x00-ffffffff
ncname = 1*%x00-ffffffff
qname = [ ncname ":" ] ncname
aname = "@" qname
pos = "[" 1*DIGIT "]"
attr = ( "[" aname "='" 0*anychar "']" ) /
( "[" aname "=" DQUOTE 0*anychar DQUOTE "]" )
valueq = "[" ( qname / "." ) "=" DQUOTE 0*anychar DQUOTE "]"
value = ( "[" ( qname / "." ) "='" 0*anychar "']" ) / valueq
cond = attr / value / pos
step = ( qname / "*" ) 0*cond
piq = %x70.72.6f.63.65.73.73.69.6e.67.2d.69.6e.73.74.72.75.63.74.69.6f.6e
; "processing-instruction", case-sensitive
"(" [ DQUOTE ncname DQUOTE ] ")"
pi = ( %x70.72.6f.63.65.73.73.69.6e.67.2d.69.6e.73.74.72.75.63.74.69.6f.6e
; "processing-instruction", case-sensitive
"(" [ "'" ncname "'" ] ")" ) / piq
id = ( %x69.64 ; "id", case-sensitive
"(" [ "'" ncname "'" ] ")" ) /
( %x69.64 ; "id", case-sensitive
"(" [ DQUOTE ncname DQUOTE ] ")" )
com = %x63.6f.6d.6d.65.6e.74 ; "comment", case-sensitive
"()"
text = %x74.65.78.74 ; "text", case-sensitive
"()"
nspa = %x6e.61.6d.65.73.70.61.63.65; "namespace", case-sensitive
"::" ncname
cnodes = ( text / com / pi ) [ pos ]
child = cnodes / step
last = child / aname / nspa
xpath = [ "/" ] ( ( id [ 0*( "/" step ) "/" last ] ) /
( 0*( step "/" ) last ) )
xpath-add = [ "/" ] ( ( id [ 0*( "/" step ) "/" child ] ) /
( 0*( step "/" ) child ) )
Appendix D. Acknowledgements
Thanks for comments and suggestions provided by Bas de Bakker, Tony
Hansen, Bjoern Hoehrmann, and Julian Reschke.
Wilde Expires May 11, 2014 [Page 14]
Internet-Draft XML Patch November 2013
Author's Address
Erik Wilde
EMC
6801 Koll Center Parkway
Pleasanton, CA 94566
U.S.A.
Phone: +1-925-6006244
Email: erik.wilde@emc.com
URI: http://dret.net/netdret/
Wilde Expires May 11, 2014 [Page 15]