Netlib URL Parsing

A url can contain a lot of information: the scheme/protocol to use to retrieve it, the server to connect to, the file to retrieve, etc. Netlib code constantly queries urls for this information. Netlib uses the loosely defined url parsing api to simplify and standardize these queries. All parsing routines, among other utility functions, can be found in either mkparse.c or mkutils.c.

char* net_ReduceURL(char*)
Arguments:
char* - the url to be reduced.
Returns:
char* - the same pointer passed in, whose chars may have been modified.

Reduces a url, collapsing directories as necessary (e.g. "http://abc/../xyz.html" becomes "http://abc/xyz.html"). Modifies the char* passed in and does not reallocate memory.
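The in-place collapsing described above can be sketched as follows. This is a minimal illustration, not netlib's implementation: demo_reduce_url is a hypothetical name, and the sketch assumes an absolute url of the form "scheme://host/path" and only handles "/../" segments.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not netlib's net_ReduceURL): collapses "/../"
 * segments in place and returns the same pointer it was given.
 * Assumption: url looks like "scheme://host/path". */
static char *demo_reduce_url(char *url)
{
    assert(url != NULL);

    /* Skip "scheme://host" so the host is never treated as a directory. */
    char *scheme_end = strstr(url, "://");
    if (scheme_end == NULL)
        return url;
    char *path = strchr(scheme_end + 3, '/');
    if (path == NULL)
        return url;                 /* no path, nothing to collapse */

    char *dots;
    while ((dots = strstr(path, "/../")) != NULL) {
        /* Walk back to the start of the directory that precedes "/../". */
        char *prev = dots;
        while (prev > path && prev[-1] != '/')
            prev--;
        if (prev > path)
            prev--;                 /* step onto that directory's leading slash */

        /* Drop "<dir>/.." by sliding the tail down; memmove allows overlap. */
        memmove(prev, dots + 3, strlen(dots + 3) + 1);
    }
    return url;
}
```

Because the result is never longer than the input, the caller's buffer can be reused without reallocation, which matches the behavior described for net_ReduceURL.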

char* NET_UnEscape(char*)
Arguments:
char* - url to be unescaped.
Returns:
char* - the unescaped url.

Does not allocate any memory; the original char* passed in is modified.

char* NET_MakeTargetURL (char*,
char*,
char**)
Arguments:
char* - the base url
char* - the source url
char** - [out] the new relative url
Returns:
char* - the new target url

char* NET_MakeAbsoluteURL(char*,
char*)
Arguments:
char* - the base url against which the second arg is resolved.
char* - the (possibly relative) url to be made absolute.
Returns:
char* - a malloc'd char* that points to the new absolute url based on the urls passed in (e.g. if "http://abc/" and "xyz.html" are passed in, "http://abc/xyz.html" is returned).

time_t NET_ParseDate(char*)
Arguments:
char* - a string representation of a date
Returns:
time_t - the time_t representation of the date passed in.

Date parsing is based on RFC 850 and RFC 822.
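As a rough illustration of what parsing these two date styles involves, here is a minimal sketch. It is not netlib's parser: demo_parse_date and demo_days_from_civil are hypothetical names, the input is assumed to be in GMT with canonical month capitalization, two-digit years below 70 are assumed to mean 20xx, and returning 0 on failure is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Days from 1970-01-01 to the given Gregorian date (month is 1..12). */
static long demo_days_from_civil(int y, int m, int d)
{
    y -= m <= 2;                            /* count Jan/Feb with the previous year */
    long era = (y >= 0 ? y : y - 399) / 400;
    long yoe = y - era * 400;               /* year of era, 0..399 */
    long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Hypothetical parser: accepts "Sun, 06 Nov 1994 08:49:37 GMT" (RFC 822 style)
 * and "Sunday, 06-Nov-94 08:49:37 GMT" (RFC 850 style). Returns 0 on failure. */
static time_t demo_parse_date(const char *s)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char mon[4] = "";
    int day, year, hh, mm, ss;

    assert(s != NULL);
    const char *p = strchr(s, ',');         /* skip the weekday name, if any */
    p = p ? p + 1 : s;

    if (sscanf(p, " %d %3s %d %d:%d:%d", &day, mon, &year, &hh, &mm, &ss) != 6 &&
        sscanf(p, " %d-%3[A-Za-z]-%d %d:%d:%d", &day, mon, &year, &hh, &mm, &ss) != 6)
        return (time_t)0;

    /* Month names start with a capital, so a match is always 3-byte aligned. */
    const char *hit = strstr(months, mon);
    if (strlen(mon) != 3 || hit == NULL || (hit - months) % 3 != 0)
        return (time_t)0;
    int month = (int)(hit - months) / 3 + 1;

    if (year < 70)                          /* assumption: 00..69 means 2000..2069 */
        year += 2000;
    else if (year < 100)
        year += 1900;

    long days = demo_days_from_civil(year, month, day);
    return (time_t)(days * 86400L + hh * 3600L + mm * 60L + ss);
}
```

Both sample strings in the comment describe the same instant, so the sketch returns the same time_t for each.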

char* NET_ParseURL(char*, int)
Arguments:
char* - the url to parse
int - an integer whose bit fields are referenced to determine which portion of the url to return (see list of bit flags below)
Returns:
char* - a malloc'd char* that contains the portion of the url specified. NULL is returned only if there was a memory allocation problem; if the requested part is not found in the url, an empty string is returned instead. It is the user's responsibility to check the returned string for the null byte.

These bit flags are defined in ns/include/net.h:

  • GET_ALL_PARTS - returns the entire url
  • GET_PASSWORD_PART - returns the password
  • GET_USERNAME_PART - returns the username
  • GET_PROTOCOL_PART - returns the protocol
  • GET_HOST_PART - returns the host
  • GET_PATH_PART - returns everything after the first slash
  • GET_HASH_PART - returns everything after the hash
  • GET_SEARCH_PART - returns everything after the question mark
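The flag-driven selection and the return convention (malloc'd result, empty string when a part is missing) can be sketched like this. Everything here is hypothetical: the DEMO_ flag values are made up (the real ones are in net.h), demo_parse_url handles only three parts, and it assumes the delimiter itself ('?' or '#') is not part of the returned text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values for illustration only. */
#define DEMO_GET_HOST_PART   0x01
#define DEMO_GET_SEARCH_PART 0x02
#define DEMO_GET_HASH_PART   0x04

/* Hypothetical, simplified stand-in for NET_ParseURL. Returns a malloc'd
 * string holding the requested parts in url order, "" if none are present,
 * or NULL only when allocation fails. The caller frees the result. */
static char *demo_parse_url(const char *url, int parts)
{
    assert(url != NULL);

    /* The parts are disjoint substrings, so the input length is enough. */
    char *out = malloc(strlen(url) + 1);
    if (out == NULL)
        return NULL;
    out[0] = '\0';

    const char *sep = strstr(url, "://");
    const char *hash = strchr(url, '#');
    const char *query = strchr(url, '?');
    if (query != NULL && hash != NULL && query > hash)
        query = NULL;               /* a '?' inside the hash part is not a search */

    if ((parts & DEMO_GET_HOST_PART) && sep != NULL)
        strncat(out, sep + 3, strcspn(sep + 3, "/?#"));
    if ((parts & DEMO_GET_SEARCH_PART) && query != NULL)
        strncat(out, query + 1, strcspn(query + 1, "#"));
    if ((parts & DEMO_GET_HASH_PART) && hash != NULL)
        strcat(out, hash + 1);
    return out;
}
```

A caller would first test the result against NULL (allocation failure), then test its first byte against '\0' to see whether the requested part was present, and finally free() the string.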
char* NET_EscapeBytes (const char*,
int32, int, int32*)
Arguments:
const char* - bytes to be escaped
int32 - the string length of the bytes passed in
int - the mask to use
int32* - [out] param. Will be set to the string length of the resulting escaped char* that is returned.
Returns:
char* - an alloc'd char* that is the escaped representation of the char* passed in.

char* NET_Escape (const char*, int)
Arguments:
const char* - bytes to be escaped
int - the mask to use
Returns:
char* - an alloc'd char* that is the escaped representation of the char* passed in.

This function simply wraps a call to NET_EscapeBytes().
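The wrapper relationship can be sketched as below. This is an illustration under stated assumptions, not netlib's code: the demo_ names are hypothetical, the mask argument is omitted, and the set of "safe" bytes (alphanumerics plus "-_.~") is an assumption that stands in for whatever set a given mask selects.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: these bytes pass through unescaped; real masks may differ. */
static int demo_is_safe(unsigned char c)
{
    return isalnum(c) || (c != '\0' && strchr("-_.~", c) != NULL);
}

/* Hypothetical length-based escaper: returns a malloc'd, NUL-terminated
 * string and, if out_len is non-NULL, stores the escaped length there. */
static char *demo_escape_bytes(const char *in, size_t len, size_t *out_len)
{
    static const char hex[] = "0123456789ABCDEF";
    assert(in != NULL);

    char *out = malloc(len * 3 + 1);    /* worst case: every byte becomes %XX */
    if (out == NULL)
        return NULL;

    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (demo_is_safe(c)) {
            out[n++] = (char)c;
        } else {
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        }
    }
    out[n] = '\0';
    if (out_len != NULL)
        *out_len = n;
    return out;
}

/* String convenience wrapper: measures the input, then delegates. */
static char *demo_escape(const char *str)
{
    assert(str != NULL);
    return demo_escape_bytes(str, strlen(str), NULL);
}
```

Keeping the length-based routine as the single implementation means binary data with embedded null bytes can be escaped too, while the string wrapper stays a one-liner.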

int32 NET_EscapedSize (const char*, int)
Arguments:
const char* - url to find out the size of the escaped version of
int - the mask to use
Returns:
int32 - the size of the url that would result from a call to NET_EscapeBytes or NET_Escape

int32 NET_UnEscapeBytes (char*, int32)
Arguments:
char* - the bytes to be unescaped
int32 - the string length of the bytes passed in
Returns:
int32 - the string length of the new, unescaped char*.

Does not allocate memory; the char* passed in is modified.
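In-place unescaping works because the decoded output is never longer than the input. A minimal sketch, assuming "%XX" hex escapes and assuming malformed escapes are copied through unchanged (demo_unescape_bytes is a hypothetical name, not netlib's function):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of a hex digit; the caller guarantees isxdigit(c). */
static int demo_hex_value(unsigned char c)
{
    if (isdigit(c))
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Hypothetical in-place decoder: rewrites buf and returns the new length.
 * The write index never passes the read index, so no second buffer is needed. */
static size_t demo_unescape_bytes(char *buf, size_t len)
{
    assert(buf != NULL);

    size_t r = 0, w = 0;
    while (r < len) {
        if (buf[r] == '%' && r + 2 < len &&
            isxdigit((unsigned char)buf[r + 1]) &&
            isxdigit((unsigned char)buf[r + 2])) {
            int hi = demo_hex_value((unsigned char)buf[r + 1]);
            int lo = demo_hex_value((unsigned char)buf[r + 2]);
            buf[w++] = (char)(hi * 16 + lo);
            r += 3;
        } else {
            buf[w++] = buf[r++];    /* ordinary byte or malformed escape */
        }
    }
    if (w < len)
        buf[w] = '\0';              /* terminate only when the data shrank */
    return w;
}
```

Returning the new length, rather than relying on a terminator, lets the caller handle data in which an escape decodes to a null byte.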

XP_Bool NET_IsHTTP_URL (const char*)
Arguments:
const char* - url to be checked
Returns:
XP_Bool - true if the url is of type HTTP_TYPE_URL or SECURE_HTTP_TYPE_URL.

This function simply calls NET_URL_Type() and checks its return value.

int NET_MakeRelativeURL (char *base_url,
char *input_url,
char **relative_url)
Arguments:
char* - the base url to which the input url is made relative.
char* - the input url to be made relative to the base url
char** - [out] the new relative url
Returns:
int - an integer indicating the result of the operation. Return values are defined in ns/include/net.h and are:
  • NET_URL_SAME_DIRECTORY - Only the filename differs.
  • NET_URL_SAME_DEVICE - The two urls are on the same device. The path differs.
  • NET_URL_NOT_ON_SAME_DEVICE - The urls are on differing devices.
  • NET_URL_FAIL - The operation failed.

Bool NET_IsURLSecure (char*)
Arguments:
char* - the url to be checked for security
Returns:
Bool - true if the url is considered secure, false otherwise.

This function simply calls NET_URL_Type() and compares its return value.

int NET_URL_Type (const char*)
Arguments:
const char* - the url to be identified.
Returns:
int - the url type as defined in net.h, zero if the type is not recognized.

Example return values:

  • HTTP_TYPE_URL
  • FTP_TYPE_URL

Judson Valeski, 1998