Tue, 28 Oct 2014 11:19 am
GRAB function of the Republika crawler: the "must return 200" check is set to OFF, because some links redirect and return status code 302.
import subprocess

def grab(list, label):
    # Base curl command: browser User-Agent, 1000 s timeout, silent mode,
    # and the HTTP status code written to stdout.
    parameter = ["curl", "-H", "User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0 FirePHP/0.7.4", "--max-time", "1000", "--write-out", "%{http_code}", "--silent"]
    # Append the caller-supplied curl arguments.
    for l in list:
        parameter.append(l)
    default_code = "000"
    reconnect = ""
    proc = subprocess.Popen(parameter, stdout=subprocess.PIPE)
    (out, err) = proc.communicate()
    print(label)
    # Disabled: retry loop that insisted on a 200 status code.
    '''
    while default_code != "200":
        proc = subprocess.Popen(parameter, stdout=subprocess.PIPE)
        (out, err) = proc.communicate()
        if str(out) == "200":
            default_code = "200"  # OK
            info = "OK"
        elif str(out) == "000":
            info = "FAIL"
            reconnect = "reconnect"
        elif str(out) == "302":
            info = "REDIRECT"
            reconnect = "reconnect"
        else:
            info = "..."
            reconnect = "reconnect"
        print(label + ": " + str(out) + " " + info + " " + reconnect)
    '''

#webkoe #script #python
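One possible way to bring the retry back without looping forever on redirects is to accept 302 alongside 200 and cap the number of attempts. The sketch below is only an illustration written for Python 3: the names `ACCEPTED`, `classify`, `run_curl`, `grab_with_retry`, `max_attempts`, and `fetch` are hypothetical and not part of the original crawler, and the choice to treat 302 as success is an assumption drawn from the note above.

```python
import subprocess

# Assumption: 302 is accepted next to 200, since some links redirect.
ACCEPTED = {"200", "302"}

def classify(code):
    """Map a curl %{http_code} string to the labels used in the disabled loop."""
    if code == "200":
        return "OK"
    if code == "000":
        return "FAIL"
    if code == "302":
        return "REDIRECT"
    return "..."

def run_curl(args):
    """Run curl with the extra arguments and return the 3-digit status code."""
    cmd = ["curl", "--max-time", "1000",
           "--write-out", "%{http_code}", "--silent"] + list(args)
    out, _ = subprocess.Popen(cmd, stdout=subprocess.PIPE).communicate()
    # --write-out prints the code after any body, so keep the last 3 characters.
    return out.decode(errors="replace")[-3:]

def grab_with_retry(args, label, max_attempts=3, fetch=run_curl):
    """Retry until an accepted status code appears or attempts run out."""
    code = "000"
    for attempt in range(1, max_attempts + 1):
        code = fetch(args)
        print("%s: %s %s (attempt %d)" % (label, code, classify(code), attempt))
        if code in ACCEPTED:
            break
    return code
```

The `fetch` parameter exists only so the retry logic can be exercised without network access. If the goal is to reach the final page rather than record the redirect, curl's `--location` flag could be passed in `args` instead, so that curl follows the redirect itself.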
Additional Info: