Guile provides a standard data type for Uniform Resource Identifiers (URIs), as defined in RFC 3986.
The generic URI syntax is as follows:
URI-reference := [scheme ":"] ["//" [userinfo "@"] host [":" port]] path \
[ "?" query ] [ "#" fragment ]
For example, in the URI, ‘http://www.gnu.org/help/’, the
scheme is http, the host is www.gnu.org, the path is
/help/, and there is no userinfo, port, query, or fragment.
Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form username:passwd. But
since passwords do not belong in URIs, the RFC does not want to condone
this practice, so it calls anything before the @ sign
userinfo.
(use-modules (web uri))
The following procedures can be found in the (web uri)
module. Load it into your Guile, using a form like the above, to have
access to them.
The most common way to build a URI from Scheme is with the
build-uri function.
Scheme Procedure: build-uri scheme [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Construct a URI. scheme should be a symbol, port either a
positive, exact integer or #f, and the rest of the fields are
either strings or #f. If validate? is true, also run some
consistency checks to make sure that the constructed URI is valid.
Scheme Procedure: uri? obj
Return #t if obj is a URI.
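As a minimal sketch of the constructor, the following builds the example URI from the start of this section, plus a second URI with an explicit port and query. The example.org host, the port number, and the variable names are placeholders chosen for illustration; uri->string is used only to show the result.

```scheme
(use-modules (web uri))

;; Build http://www.gnu.org/help/ from its components.
(define gnu-help
  (build-uri 'http #:host "www.gnu.org" #:path "/help/"))

;; A hypothetical URI with an explicit, non-default port and a query.
(define search
  (build-uri 'https
             #:host "example.org"
             #:port 8443
             #:path "/search"
             #:query "q=guile"))

(write (uri? gnu-help))           ; #t
(newline)
(write (uri->string gnu-help))    ; "http://www.gnu.org/help/"
(newline)
(write (uri->string search))      ; "https://example.org:8443/search?q=guile"
(newline)
```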
In Guile, URIs are represented as URI records, with a number of associated accessors.
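Scheme Procedure: uri-scheme uri
Scheme Procedure: uri-userinfo uri
Scheme Procedure: uri-host uri
Scheme Procedure: uri-port uri
Scheme Procedure: uri-path uri
Scheme Procedure: uri-query uri
Scheme Procedure: uri-fragment uri
Field accessors for the URI record type. The URI scheme will be a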
symbol, or #f if the object is a relative-ref (see below). The
port will be either a positive, exact integer or #f, and the rest
of the fields will be either strings or #f if not present.
Scheme Procedure: string->uri string
Parse string into a URI object. Return #f if the string
could not be parsed.
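To illustrate parsing and the field accessors together, here is a short sketch that parses the example URI from the start of this section and prints each component. The variable names u and not-a-uri are only illustrative.

```scheme
(use-modules (web uri))

(define u (string->uri "http://www.gnu.org/help/"))

;; Print each component; absent components are #f.
(write (uri-scheme u))   (newline)   ; http (a symbol)
(write (uri-userinfo u)) (newline)   ; #f
(write (uri-host u))     (newline)   ; "www.gnu.org"
(write (uri-port u))     (newline)   ; #f, no port was written
(write (uri-path u))     (newline)   ; "/help/"
(write (uri-query u))    (newline)   ; #f
(write (uri-fragment u)) (newline)   ; #f

;; A string with no scheme is not a full URI, so the parser returns #f.
(define not-a-uri (string->uri "/foo.css"))
(write not-a-uri) (newline)          ; #f
```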
Scheme Procedure: uri->string uri [#:include-fragment?=#t]
Serialize uri to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization. If include-fragment? is given as false, the resulting string will omit the fragment (if any).
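The two serialization rules can be sketched as follows. This assumes that 80 is registered as the default port for http, which is the case in a stock Guile; the example.org URIs and variable names are placeholders.

```scheme
(use-modules (web uri))

(define with-default-port
  (string->uri "http://example.org:80/index.html"))
(define with-fragment
  (string->uri "http://example.org:8080/doc#intro"))

;; Port 80 is the default for http, so it is dropped on output.
(write (uri->string with-default-port))
(newline)   ; "http://example.org/index.html"

;; A non-default port is kept; the fragment is kept unless asked otherwise.
(write (uri->string with-fragment))
(newline)   ; "http://example.org:8080/doc#intro"
(write (uri->string with-fragment #:include-fragment? #f))
(newline)   ; "http://example.org:8080/doc"
```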
Scheme Procedure: declare-default-port! scheme port
Declare a default port for the given URI scheme.
Scheme Procedure: uri-decode str [#:encoding="utf-8"] [#:decode-plus-to-space? #t]
Percent-decode the given str, according to encoding, which should be the name of a character encoding.
Note that this function should not generally be applied to a full URI
string. For paths, use split-and-decode-uri-path instead. For
query strings, split the query on & and = boundaries, and
decode the components separately.
Note also that percent-encoded strings encode bytes, not
characters. There is no guarantee that a given byte sequence is a valid
string encoding. Therefore this routine may signal an error if the
decoded bytes are not valid for the given encoding. Pass #f for
encoding if you want decoded bytes as a bytevector directly.
See set-port-encoding!, for more information on
character encodings.
If decode-plus-to-space? is true, which is the default, also
replace instances of the plus character ‘+’ with a space character.
This is needed when parsing application/x-www-form-urlencoded
data.
Returns a string of the decoded characters, or a bytevector if
encoding was #f.
Scheme Procedure: uri-encode str [#:encoding="utf-8"] [#:unescaped-chars]
Percent-encode any character not in the character set unescaped-chars.
The default character set includes alphanumerics from ASCII, as well as
the special characters ‘-’, ‘.’, ‘_’, and ‘~’. Any
other character will be percent-encoded, by writing out the character to
a bytevector within the given encoding, then encoding each byte as
%HH, where HH is the hexadecimal representation of
the byte.
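The following sketch exercises percent-encoding and percent-decoding on short ASCII strings, including the plus-to-space option and the bytevector result. It also shows one way to decode a query string component by component, as recommended above. decode-query is a hypothetical helper written for this example, not part of (web uri), and it does not handle edge cases such as a key without a value.

```scheme
(use-modules (web uri))

;; '%' and ' ' are outside the default unescaped set, so both are escaped.
(define encoded (uri-encode "100% sure"))       ; "100%25%20sure"
(define decoded (uri-decode encoded))           ; "100% sure"

;; '+' becomes a space by default; this can be switched off.
(define plus-as-space (uri-decode "a+b"))                            ; "a b"
(define plus-kept (uri-decode "a+b" #:decode-plus-to-space? #f))     ; "a+b"

;; With #:encoding #f the raw bytes are returned as a bytevector.
(define raw (uri-decode "%41%42" #:encoding #f))   ; bytes 65 and 66

;; Hypothetical helper: split a query on & and =, then decode each piece.
(define (decode-query query)
  (map (lambda (pair)
         (map uri-decode (string-split pair #\=)))
       (string-split query #\&)))

(write (decode-query "q=guile+scheme&lang=en"))
(newline)   ; (("q" "guile scheme") ("lang" "en"))
```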
Scheme Procedure: split-and-decode-uri-path path
Split path into its components, and decode each component, removing empty components.
For example, "/foo/bar%20baz/" decodes to the two-element list,
("foo" "bar baz").
Scheme Procedure: encode-and-join-uri-path parts
URI-encode each element of parts, which should be a list of
strings, and join the parts together with / as a delimiter.
For example, the list ("scrambled eggs" "biscuits&gravy") encodes
as "scrambled%20eggs/biscuits%26gravy".
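The two path helpers are inverses of each other up to the leading slash, which a short sketch can show. The example.org host and the variable names are placeholders. Note that the joined string has no leading /, so one is prepended before it is used as the path of a URI that has a host.

```scheme
(use-modules (web uri))

(define parts (split-and-decode-uri-path "/foo/bar%20baz/"))
(write parts) (newline)    ; ("foo" "bar baz")

(define joined
  (encode-and-join-uri-path '("scrambled eggs" "biscuits&gravy")))
(write joined) (newline)   ; "scrambled%20eggs/biscuits%26gravy"

;; Use the joined components as the path of a full URI.
(define recipe
  (build-uri 'http
             #:host "example.org"
             #:path (string-append "/" joined)))
(write (uri->string recipe)) (newline)

;; Splitting the path again recovers the original components.
(write (split-and-decode-uri-path (uri-path recipe))) (newline)
```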
As we noted above, not all URI objects have a scheme. You might have
noted in the “generic URI syntax” example that the left-hand side of
that grammar definition was URI-reference, not URI. A
URI-reference is a generalization of a URI where the scheme is
optional. If no scheme is specified, it is taken to be relative to some
other related URI. A common use of URI references is when you want to
be vague regarding the choice of HTTP or HTTPS – serving a web page
referring to /foo.css will use HTTPS if loaded over HTTPS, or
HTTP otherwise.
Scheme Procedure: build-uri-reference [#:scheme=#f] [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Like build-uri, but with an optional scheme.
Scheme Procedure: uri-reference? obj
Return #t if obj is a URI-reference. This is the most
general URI predicate, as it includes not only full URIs that have
schemes (those that match uri?) but also URIs without schemes.
It’s also possible to build a relative-ref: a URI-reference that explicitly lacks a scheme.
Scheme Procedure: build-relative-ref [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Like build-uri, but with no scheme.
Scheme Procedure: relative-ref? obj
Return #t if obj is a “relative-ref”: a URI-reference
that has no scheme. Every URI-reference will either match uri?
or relative-ref? (but not both).
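As a sketch of how the three predicates partition URI-references, the following builds one relative-ref and one full URI and classifies both. It assumes a Guile version in which uri? rejects scheme-less references, as this section describes; classify is a hypothetical helper and example.org a placeholder host.

```scheme
(use-modules (web uri))

;; A scheme-less reference, such as one a web page might use for a stylesheet.
(define css (build-relative-ref #:path "/foo.css"))

;; The general constructor, here given a scheme, yields a full URI.
(define full
  (build-uri-reference #:scheme 'https
                       #:host "example.org"
                       #:path "/foo.css"))

;; Hypothetical helper: collect the answers of the three predicates.
(define (classify obj)
  (list (uri-reference? obj) (uri? obj) (relative-ref? obj)))

(write (classify css))  (newline)    ; (#t #f #t)
(write (classify full)) (newline)    ; (#t #t #f)
(write (uri->string css)) (newline)  ; "/foo.css"
```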
In case it’s not clear from the above, the most general of these URI
types is the URI-reference, with build-uri-reference as the most
general constructor. build-uri and build-relative-ref
enforce specific restrictions on the URI-reference. The most
generic URI parser is then string->uri-reference, and there is
also a parser for when you know that you want a relative-ref.
Note that uri? will only return #t for URI objects that
have schemes; that is, it rejects relative-refs.
Scheme Procedure: string->uri-reference string
Parse string into a URI object, while not requiring a scheme.
Return #f if the string could not be parsed.
Scheme Procedure: string->relative-ref string
Parse string into a URI object, while asserting that no scheme is
present. Return #f if the string could not be parsed.
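The difference between the three parsers shows up when each is given the same pair of inputs. This is a sketch with placeholder strings and variable names; each parser returns #f for input outside the class it accepts.

```scheme
(use-modules (web uri))

(define absolute "https://example.org/foo.css")
(define relative "/foo.css")

;; string->uri-reference accepts both forms.
(define ref-a (string->uri-reference absolute))
(define ref-r (string->uri-reference relative))
(write (list (uri? ref-a) (relative-ref? ref-r)))
(newline)   ; (#t #t)

;; string->uri requires a scheme.
(write (string->uri relative))
(newline)   ; #f

;; string->relative-ref requires that no scheme be present.
(write (string->relative-ref absolute))
(newline)   ; #f
(write (uri->string (string->relative-ref relative)))
(newline)   ; "/foo.css"
```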