URL validation
I am trying to validate a URL against the grammar in section 5 of http://www.cse.ohio-state.edu/cgi-bin/rfc/rfc1738.html. This is what I have come up with:
const string RE_ALPHA = "([a-z]|[A-Z])";
const string RE_DIGIT = "[0-9]";
const string RE_DIGITS = RE_DIGIT + "+";
const string RE_HEX = "(" + RE_DIGIT + "|[A-F]|[a-f])";
const string RE_ALPHA_DIGIT = "(" + RE_ALPHA + "|" + RE_DIGITS + ")";
const string RE_UNRESERVED = "(" + RE_ALPHA + "|" + RE_DIGIT + "|[$-_.+]|[!*'(),])";
const string RE_DOMAIN_LABEL = "" + RE_ALPHA_DIGIT + "|" + RE_ALPHA_DIGIT + "(" + RE_ALPHA_DIGIT + "|-)*" + RE_ALPHA_DIGIT;
const string RE_TOP_LABEL = "(" + RE_ALPHA + "|" + RE_ALPHA + "(" + RE_ALPHA_DIGIT + "|-)*" + RE_ALPHA_DIGIT + ")";
const string RE_HOST_NUMBER = "(" + RE_DIGITS + "." + RE_DIGITS + "." + RE_DIGITS + "." + RE_DIGITS + ")";
const string RE_HOSTNAME = "(" + RE_DOMAIN_LABEL + ".)*" + RE_TOP_LABEL;
const string RE_UCHAR = RE_UNRESERVED + "|%" + RE_HEX + RE_HEX;
const string RE_HSEGMENT = "(" + RE_UCHAR + "|[;:@&=])*";
const string RE_HPATH = RE_HSEGMENT + "(/" + RE_HSEGMENT + ")*";
const string RE_SEARCH = "(" + RE_UCHAR + "|[;:@&=])*";
const string RE_HOST = "^(" + RE_HOSTNAME + "[:/]){1}?"; //+ RE_HOST_NUMBER + "|" +
const string RE_PORT = "^(:" + RE_DIGITS + "/)?";
const string RE_PATH = "^(/" + RE_HPATH + "([?]" + RE_SEARCH + ")?){1}?";
But it does not work as intended, because the following is "accepted": www.d+r.dk. How do you make it so that, for example, only the letters A-Z are accepted in a string?
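
A minimal sketch of one way to get there. It assumes the patterns are tested with Regex.IsMatch, which the post does not show. Three facts about .NET regular expressions are relevant: IsMatch succeeds if the pattern matches anywhere in the input unless it is anchored at both ends, an unescaped "." matches any character, and "$-_" inside brackets is a range from "$" to "_" rather than three separate characters. The class HostCheck, the constant RE_LETTERS_ONLY and the two helper methods are hypothetical names used only for illustration. The label patterns are simplified rewrites of the hostname part of the grammar, not the original constants.

```csharp
using System;
using System.Text.RegularExpressions;

static class HostCheck
{
    // Letters only. ^ and $ force the whole string to match, not just a part of it.
    // Note: in .NET, $ also matches just before a trailing "\n"; use \z if that matters.
    const string RE_LETTERS_ONLY = "^[A-Za-z]+$";

    const string RE_ALPHA = "[A-Za-z]";
    const string RE_ALPHA_DIGIT = "[A-Za-z0-9]";

    // domainlabel = alphadigit | alphadigit *[ alphadigit | "-" ] alphadigit
    // Wrapped in parentheses so it can be combined safely with other patterns.
    const string RE_DOMAIN_LABEL =
        "(" + RE_ALPHA_DIGIT + "([A-Za-z0-9-]*" + RE_ALPHA_DIGIT + ")?)";

    // toplabel = alpha | alpha *[ alphadigit | "-" ] alphadigit
    const string RE_TOP_LABEL =
        "(" + RE_ALPHA + "([A-Za-z0-9-]*" + RE_ALPHA_DIGIT + ")?)";

    // hostname = *[ domainlabel "." ] toplabel
    // "\\." is a literal dot; a bare "." would match any character.
    const string RE_HOSTNAME = "(" + RE_DOMAIN_LABEL + "\\.)*" + RE_TOP_LABEL;

    public static bool IsLettersOnly(string s)
    {
        return Regex.IsMatch(s, RE_LETTERS_ONLY);
    }

    public static bool IsHostname(string s)
    {
        // Anchor at both ends so that a valid fragment such as "dk"
        // inside an invalid string is not enough for a match.
        return Regex.IsMatch(s, "^" + RE_HOSTNAME + "$");
    }

    static void Main()
    {
        Console.WriteLine(IsHostname("www.dr.dk"));   // True
        Console.WriteLine(IsHostname("www.d+r.dk"));  // False
        Console.WriteLine(IsLettersOnly("abcXYZ"));   // True
        Console.WriteLine(IsLettersOnly("abc1"));     // False
    }
}
```

If the same issues apply to the original constants, the corresponding changes would be to write "\\." instead of "." in RE_HOST_NUMBER and RE_HOSTNAME, to write "[$_.+-]" instead of "[$-_.+]" in RE_UNRESERVED, and to put parentheses around RE_DOMAIN_LABEL and RE_UCHAR so the "|" inside them does not spill over into the surrounding pattern.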
