Fetching Taobao IP Address Data with Python
Published: 2019-06-07


# Python 2 code: it relies on urllib.urlopen, the Queue module and "print >>".
import json
import logging
import urllib
import Queue

# Stop flag checked by the threads below; set it to True to make them exit.
cancel = False


def fetch(ip):
    """Look up one IP address; return (1, fields) on success, (0, fields) on failure."""
    url = 'http://ip.taobao.com/service/getIpInfo.php?ip=' + ip
    result = []
    try:
        response = urllib.urlopen(url).read()
        jsondata = json.loads(response)
        if jsondata[u'code'] == 0:
            result.append(jsondata[u'data'][u'ip'].encode('utf-8'))
            result.append(jsondata[u'data'][u'country'].encode('utf-8'))
            result.append(jsondata[u'data'][u'country_id'].encode('utf-8'))
            result.append(jsondata[u'data'][u'area'].encode('utf-8'))
            result.append(jsondata[u'data'][u'area_id'].encode('utf-8'))
            result.append(jsondata[u'data'][u'region'].encode('utf-8'))
            result.append(jsondata[u'data'][u'region_id'].encode('utf-8'))
            result.append(jsondata[u'data'][u'city'].encode('utf-8'))
            result.append(jsondata[u'data'][u'city_id'].encode('utf-8'))
            result.append(jsondata[u'data'][u'county'].encode('utf-8'))
            result.append(jsondata[u'data'][u'county_id'].encode('utf-8'))
            result.append(jsondata[u'data'][u'isp'].encode('utf-8'))
            result.append(jsondata[u'data'][u'isp_id'].encode('utf-8'))
        else:
            return 0, result
    except Exception:
        # Covers network errors as well as malformed or incomplete JSON.
        logging.exception("Url open failed:" + url)
        return 0, result
    return 1, result


def worker(ratelimit, jobs, results, progress):
    """Take IPs from the jobs queue, fetch them, and push result lines to results."""
    global cancel
    while not cancel:
        try:
            ratelimit.ratecontrol()
            ip = jobs.get(timeout=2)  # Wait up to 2 seconds for a job
            ok, result = fetch(ip)
            if not ok:
                logging.error("Fetch information failed, ip:{}".format(ip))
                progress.put("")  # Report progress even if the fetch failed
            elif result is not None:
                results.put(" ".join(result))
            jobs.task_done()  # Mark one job as done
        except Queue.Empty:
            pass
        except Exception:
            logging.exception("Unknown Error!")


def process(target, results, progress):
    """Write each result line to target and report progress for it."""
    global cancel
    while not cancel:
        try:
            line = results.get(timeout=5)
        except Queue.Empty:
            pass
        else:
            print >>target, line
            progress.put("")
            results.task_done()


def progproc(progressbar, count, progress):
    """
    Since ProgressBar is not a thread-safe class, the counting is done through
    a Queue, as in the other two threads, and this thread alone prints the
    progress bar. It prints to stderr, which does not conflict with the default
    result output (stdout).
    """
    idx = 1
    while True:
        try:
            progress.get(timeout=5)
        except Queue.Empty:
            pass
        else:
            progressbar.update(idx)
            idx += 1
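
worker() calls ratelimit.ratecontrol() before every lookup, but the rate limiter itself is not part of the listing. The following is a minimal sketch of what such an object might look like, assuming a fixed maximum request rate shared by all worker threads. The class name RateLimit and its max_per_second parameter are hypothetical; only the ratecontrol() method name comes from the code above. The sketch uses nothing version-specific, so it should run on both Python 2 and Python 3.

```python
import threading
import time


class RateLimit(object):
    """Hypothetical stand-in for the `ratelimit` object passed to worker()."""

    def __init__(self, max_per_second):
        # Minimum spacing between two consecutive calls, in seconds.
        self.interval = 1.0 / max_per_second
        self.lock = threading.Lock()
        # Earliest time at which the next call is allowed to proceed.
        self.next_time = time.time()

    def ratecontrol(self):
        """Block the calling thread until it is allowed to send a request."""
        with self.lock:
            now = time.time()
            # Reserve the next free slot for this caller.
            scheduled = max(now, self.next_time)
            self.next_time = scheduled + self.interval
            wait = scheduled - now
        # Sleep outside the lock so other threads can reserve their slots.
        if wait > 0:
            time.sleep(wait)
```

An instance such as RateLimit(10) could then be passed as the ratelimit argument of each worker thread. Because every caller reserves its slot under the lock and sleeps only afterwards, the threads together stay at or below the configured rate instead of each thread being limited separately.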

 

Reposted from: https://www.cnblogs.com/chenjingyi/p/5794736.html
