Discussion:
[xmlsec] canonicalization and transcoding
Slava Kostin
2003-03-29 03:26:04 UTC
Permalink
Hello.

It's me again. :)

Should I canonicalize the document myself using the libxml2 API, or can
XMLSec do it transparently for me before signing?
Canonicalization also implies conversion from any other encoding to
UTF-8. Should I do that conversion myself with iconv, or do I only need
to learn how to use libXMLSec correctly?

Best regards,
Slava Kostin
Aleksey Sanin
2003-03-29 03:44:55 UTC
Permalink
XMLSec and LibXML take care of canonicalization and all the other stuff.
Please check examples for more details about creating signatures.

Aleksey
_______________________________________________
xmlsec mailing list
http://www.aleksey.com/mailman/listinfo/xmlsec
Slava Kostin
2003-03-29 06:03:32 UTC
Permalink
Hello, Aleksey Sanin.

Saturday, March 29, 2003, 5:44:55 you wrote about "[xmlsec] canonicalization and transcoding":

AS> XMLSec and LibXML take care of canonicalization and all the other stuff.
AS> Please check examples for more details about creating signatures.

Do you mean that the dsig1.c example does canonicalization and transcoding?
I tried changing the declaration to encoding="Windows-1251" and added two
empty-element tags (<InnerTag attr1="10"/>). After processing, the tags
are still unpaired and in the same lexical order as they were. The
encoding has not changed either.
Should I declare a DTD before trying to sign the document?

Best regards,
Slava Kostin
Aleksey Sanin
2003-03-29 06:26:38 UTC
Permalink
I am not sure what you mean by transcoding as applied to XML, but I
believe you are mixing up canonicalization with something different.
Canonicalization (C14N) is the process of transforming an XML document,
or part of an XML document, into a binary stream. You *have* to do C14N
in order to sign or digest XML data, simply because digests and
signatures work on binary data only. Currently, several C14N algorithms
are defined by the W3C, and all of them are implemented in the xmlsec
library.

Next, there is no reason why an XML parser should replace
<foo />
with
<foo></foo>
"<foo/>" is perfectly valid XML. There is also no reason for an XML
parser to sort nodes (moreover, a parser that does so is actually not an
XML parser at all :) ).

Also, when you specify an encoding in the XML document, LibXML is smart
enough to do the correct encoding conversion automatically when it reads
or writes the document. Internally, all strings are UTF-8 (see the
libxml documentation), and the signatures/digests are calculated over
UTF-8 data, as the specification requires. However, when the resulting
document is dumped to output in the example you mentioned, it is
converted back to the encoding specified in the document. Of course, you
can force libxml to write the document in any other encoding, but this
is beyond the scope of the xmlsec library examples.
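
To illustrate the encoding point, here is a minimal sketch in Python. It
is not xmlsec or libxml2 code: it assumes the standard library's
xml.etree.ElementTree.canonicalize (which implements C14N 2.0 rather
than the C14N 1.0 algorithm discussed in this thread) is a close enough
stand-in for a document this simple. The sample document and the helper
name digest_input are made up for the example.

```python
from xml.etree.ElementTree import canonicalize

# Cyrillic sample text, written with escapes so this snippet stays ASCII.
BODY = "<doc>\u041f\u0440\u0438\u0432\u0435\u0442</doc>"

# The same document serialized in two different encodings.
as_cp1251 = ('<?xml version="1.0" encoding="windows-1251"?>' + BODY).encode("cp1251")
as_utf8 = ('<?xml version="1.0" encoding="UTF-8"?>' + BODY).encode("utf-8")


def digest_input(raw_document: bytes) -> bytes:
    """Parse raw bytes (the parser honours the declared encoding) and
    return the canonical form as UTF-8 octets, ready to be hashed."""
    return canonicalize(raw_document).encode("utf-8")


# The files differ byte for byte, but the octets that would be digested
# should be identical.
print("raw bytes equal:      ", as_cp1251 == as_utf8)
print("canonical bytes equal:", digest_input(as_cp1251) == digest_input(as_utf8))
```

The caller never runs iconv; the transcoding happens inside the parser,
which is the behaviour described above for LibXML.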

You can use the "xmlsec" command line utility to look "inside" the
signature process. Try the '--store-references' or '--store-signatures'
options when verifying a document. They print out the binary stream just
before the digest or signature is calculated. You can also get access to
the same data from your application (check the xmlsec utility sources
for details).

Aleksey
Slava Kostin
2003-03-29 13:51:58 UTC
Permalink
Hello, Aleksey Sanin.

I'm really sorry, but let me distract you once more...

AS> Next, there is no reason why XML parser should replace
AS> <foo />
AS> with
AS> <foo></foo>
AS> The "<foo/>" is a perfectly valid XML.

[...]

AS> There is also no reason for XML parser to sort nodes (moreover,
AS> the parser that does it is actually not an XML parser at all :) ).

Canonical XML
Version 1.0
W3C Recommendation 15 March 2001

http://www.w3.org/TR/xml-c14n

3.3. Start and End Tags

Doesn't this section describe exactly such a process?

Demonstrates:

- *Empty element conversion to start-end tag pair*
- Normalization of whitespace in start and end tags
- Relative order of namespace and attribute axes
- *Lexicographic ordering of namespace and attribute axes*
- Retention of namespace prefixes from original document
- Elimination of superfluous namespace declarations
- Addition of default attribute

AS> Of course, you can force libxml to write document in any other
AS> encoding but this is beyond the limits of the xmlsec library
AS> examples.

Thank you.

Best regards,
Slava Kostin
Aleksey Sanin
2003-03-29 19:05:03 UTC
Permalink
You are absolutely right! But this happens internally and does not
affect your document. For example, I have the following template file
with an enveloped signature (some lines skipped):

[***@lsh examples]$ cat test.xml
<Envelope xmlns="urn:envelope">
<Data>
Hello, World!
<test />
</Data>
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
....
</Signature>
</Envelope>

Now I am signing it with the xmlsec utility, using the
'--store-references' option to see what *exactly* was signed (it outputs
a lot of data, so we save our document to a file with the '--output'
option and, as usual, skip some lines):

[***@lsh examples]$ ../apps/xmlsec sign --store-references --privkey ./rsakey.pem --output ./test-res.xml ./test.xml
...
== PreDigest data - start buffer:
<Envelope xmlns="urn:envelope">
<Data>
Hello, World!
<test></test>
</Data>

</Envelope>
== PreDigest data - end buffer
...

As you can see, before calculating the digest we did C14N as it is
described (for example, we inserted the missing end tag). However, the
resulting document does not have this tag, because whoever verifies the
signature *MUST* perform the same C14N internally:

[***@lsh examples]$ cat test-res.xml
<Envelope xmlns="urn:envelope">
<Data>
Hello, World!
<test />
</Data>
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
....
</Signature>
</Envelope>

Try it out yourself! Use the '--store-references' and
'--store-signatures' options to see what binary data goes into the
digest or signature.



Aleksey
Rich Salz
2003-03-29 14:20:18 UTC
Permalink
Let me try.

The XML spec allows variations; for example, the ordering of attributes
doesn't matter.
<a foo='bar' bar='foo'/>
and
<a bar='foo' foo='bar'/>
are equivalent, *as far as XML is concerned.*

When you generate a document, you can make your output be in some "official"
order. But you have no guarantee that if another XML program reads your
document and makes new output, that your original order will be preserved.

That is legal behavior.

For cryptography, however, an official ordering is needed; this is called
canonicalization, or C14N (count the letters... :). It does things like
specify the order of attributes, specify an encoding, etc. In short, it
picks "one way" to do everything where XML allows variations. This is
important, so that the hash will always be the same, no matter which XML
software processes the document.
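
A minimal sketch of that idea in Python, under a few assumptions: it
uses the standard library's xml.etree.ElementTree.canonicalize (C14N
2.0) in place of an XML DSIG implementation, SHA-256 is chosen only for
illustration, and the helper name c14n_digest is hypothetical.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# The two equivalent spellings from the example above.
first = "<a foo='bar' bar='foo'/>"
second = "<a bar='foo' foo='bar'/>"


def raw_digest(xml_text: str) -> str:
    """Hash the text exactly as written, with no canonicalization."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def c14n_digest(xml_text: str) -> str:
    """Canonicalize first, then hash the UTF-8 octets of the result."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


print("raw digests equal: ", raw_digest(first) == raw_digest(second))
print("c14n digests equal:", c14n_digest(first) == c14n_digest(second))
print("canonical form:    ", canonicalize(first))
```

Hashing the raw text gives two different digests; hashing the canonical
form gives one, because canonicalization fixes the attribute order.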

For portability, you should not rely on all programs keeping your XML
document in the same ordering, etc. You should use an XML DSIG transform,
in particular the standard XML C14N or XML Exclusive C14N.

Hope this helps.
/r$