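This script looks up the home location (归属地) and related information for each phone number in a given list, implemented in Python.
Phone number file: phone.txt (one number per line)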
18815484184
18818701639
18818773287
18818791154
18819026693
18820160604
18823376260
18823669247
18823834556
18824635390
18824722564
18824724252
18824728654
18824731004
18824734215
18824766242
18824932474
18825243001
18825255219
18825269277
18825276414
18825287578
18826014855
18826017814
18826532860
18826573310
18833526414
18837925448
18846911049
18875909323
18876361443
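The script sends whatever each line contains straight to the lookup URL, so it may be worth filtering the file first. The following is a minimal sketch, assuming the file holds mainland China mobile numbers (11 digits, a leading 1, and a second digit from 3 to 9). The names `MOBILE_RE` and `clean_numbers` are made up for this example and are not part of the original script.

```python
import re

# Assumption: mainland China mobile numbers are 11 ASCII digits,
# start with 1, and have a second digit between 3 and 9.
MOBILE_RE = re.compile(r"1[3-9][0-9]{9}")


def clean_numbers(lines):
    """Return the valid numbers from lines, de-duplicated, in original order."""
    seen = set()
    numbers = []
    for line in lines:
        num = line.strip()
        # fullmatch rejects anything with extra characters before or after
        if MOBILE_RE.fullmatch(num) and num not in seen:
            seen.add(num)
            numbers.append(num)
    return numbers


# In-memory demo: blank lines, short strings and duplicates are dropped
sample = ["18815484184\n", " 18818701639 \n", "", "12345", "18815484184\n"]
print(clean_numbers(sample))  # ['18815484184', '18818701639']
```

An open file object can be passed directly, for example `clean_numbers(open("phone.txt"))`, because iterating over a file yields its lines.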
Python implementation:
# coding=UTF-8
# Get provider information by phone number (Python 3)
import re
from urllib.request import urlopen


# Get the HTML source code for a URL
def getPageCode(url, encoding="utf-8"):
    with urlopen(url) as file:
        text = file.read()
    # read() returns bytes; the right encoding depends on what the site sends
    return text.decode(encoding, errors="replace")


# Parse the HTML source code to get provider information
def parseString(src):
    # Fields, in order: home location, card type, carrier, area code, postal code.
    # ".+?" is non-greedy so each match stops at the first "<br />".
    pat = [
        r'(?<=归属地:</span>).+?(?=<br />)',
        r'(?<=卡类型:</span>).+?(?=<br />)',
        r'(?<=运营商:</span>).+?(?=<br />)',
        r'(?<=区号:</span>)\d+(?=<br />)',
        r'(?<=邮编:</span>)\d+(?=<br />)',
    ]
    item = []
    for p in pat:
        m = re.search(p, src)
        if m:
            item.append(m.group(0))
    return item


# Get provider information for one phone number and append it to result
def getProvider(phoneNum, result):
    url = "http://www.sjgsd.com/n/?q=%s" % phoneNum
    text = getPageCode(url)
    item = parseString(text)
    result.append((phoneNum, item))


# Write the result to a file, one phone number per line
def writeResult(result):
    with open("result.log", "w", encoding="utf-8") as f:
        for num, item in result:
            f.write("%s:\t" % num)
            for i in item:
                f.write("%s,\t" % i)
            f.write("\n")


if __name__ == "__main__":
    result = []
    with open("phone.txt", "r") as f:
        for line in f:
            phoneNum = line.strip(" \t\r\n")
            if not phoneNum:
                continue  # skip blank lines
            getProvider(phoneNum, result)
            print("%s is finished" % phoneNum)
    writeResult(result)
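The parsing step can be tried offline, without querying the site. The sketch below applies the same lookbehind/lookahead idea to a made-up HTML fragment. The `SAMPLE_HTML` markup and its values, the English keys in `FIELDS`, and the function name `parse_fields` are all assumptions for illustration; the real page may be laid out differently. It uses the non-greedy `.+?` so that a match stops at the first `<br />` even when several fields sit on one line.

```python
import re

# English keys chosen for this sketch, mapped to the labels used in the patterns
FIELDS = {
    "location": "归属地",
    "card_type": "卡类型",
    "carrier": "运营商",
    "area_code": "区号",
    "postal_code": "邮编",
}

# Hypothetical sample of the markup the patterns expect (not real site output)
SAMPLE_HTML = (
    "<span>归属地:</span>广东 深圳<br />"
    "<span>卡类型:</span>移动188卡<br />"
    "<span>运营商:</span>中国移动<br />"
    "<span>区号:</span>0755<br />"
    "<span>邮编:</span>518000<br />"
)


def parse_fields(html):
    """Return a dict of the fields found in html; missing fields are omitted."""
    found = {}
    for key, label in FIELDS.items():
        # Lookbehind anchors on "<label>:</span>"; lookahead stops before "<br />"
        pattern = "(?<=" + label + ":</span>).+?(?=<br />)"
        m = re.search(pattern, html)
        if m:
            found[key] = m.group(0).strip()
    return found


info = parse_fields(SAMPLE_HTML)
for key, value in info.items():
    print("%s: %s" % (key, value))
```

Returning a dict rather than a positional list keeps the fields labeled when one of them is missing from the page, at the cost of changing the output shape compared with the list-based script.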
This article is reposted from 027ryan's 51CTO blog. Original link: http://blog.51cto.com/ucode/1735183. To repost it, please contact the original author.